Conntour Raises $7M to Turn Security Cameras into Search Engines

Conntour has emerged from stealth with a $7 million seed round backed by investors including General Catalyst, Y Combinator, SV Angel, and Liquid 2 Ventures. The company is positioning itself around a simple but ambitious idea: security teams should be able to search video footage as easily as they search the web.
The platform introduces natural language querying across camera networks, allowing users to describe what they are looking for rather than relying on predefined filters or categories.
From Passive Cameras to Searchable Intelligence
Traditional video surveillance systems are built on rigid rules. Operators must define what to detect in advance—specific objects, movements, or behaviors. This approach often results in missed incidents and hours of manual review when something unexpected happens.
Conntour replaces that model with a more flexible interface. Instead of configuring alerts ahead of time, users can type queries such as “a person leaving a bag unattended” or “a van near the loading dock yesterday,” and the system retrieves relevant footage.
This marks a shift from monitoring to querying. Video is no longer something to watch—it becomes something that can be explored and interrogated on demand.
Built for Real-World Complexity
One of the core challenges in surveillance is that real-world situations rarely fit into neat categories. Suspicious behavior is often contextual, involving sequences of actions rather than a single detectable object.
Conntour’s system is designed to handle this ambiguity. It operates across both live feeds and historical footage, enabling real-time alerts as well as rapid post-incident investigation. The platform also works with existing camera infrastructure and can be deployed fully on-premises, which is critical for environments where data cannot leave secure networks.
The interface is built for usability, allowing non-technical operators to interact with complex systems without needing to configure detection rules or understand underlying models.
Early Traction in High-Stakes Environments
The platform is already deployed in homeland security operations in Singapore, suggesting early adoption in environments where accuracy and speed are critical.
The founding team’s background in intelligence and high-tech systems appears to influence the product’s design, particularly its focus on operational efficiency. The company claims the platform enables a single operator to monitor thousands of cameras while dramatically reducing the time required to investigate incidents.
Compared to traditional video analytics systems, the platform reports significant operational improvements:
- Up to 90% reduction in manual video review time
- Up to 80% fewer missed events
- Up to 70% reduction in false alarms
- The ability for one operator to oversee thousands of cameras
These gains come from replacing rule-based workflows with systems that interpret context and intent more dynamically.
Where This Technology Could Lead
What Conntour is building points to a broader shift in how video data is interpreted—not just faster analysis, but a fundamentally different interaction model. Instead of designing systems around predefined detection rules, the burden moves toward understanding intent expressed in natural language.
That shift has implications beyond security. If systems can reliably interpret open-ended queries like “someone leaving an object behind” or “unusual movement near an entrance,” it suggests a move toward semantic video understanding—where context, relationships, and sequences matter as much as individual objects.
At scale, this could reshape how organizations use video archives. Footage becomes an indexed dataset that can be queried dynamically rather than passively stored. In environments like transportation hubs, logistics networks, or public infrastructure, this could change how incidents are reconstructed, audited, and potentially anticipated.
Under the Hood: From Detection to Understanding
Traditional systems rely on object detection models trained to recognize specific categories such as people or vehicles. While effective in controlled scenarios, these models struggle when queries fall outside predefined labels.
Conntour’s approach likely involves building richer visual representations—often referred to as embeddings—that capture not just objects, but attributes, relationships, and changes over time. Natural language queries can then be mapped into the same representation space, allowing the system to match intent with visual data.
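The core retrieval pattern described above can be sketched in a few lines. The example below is a toy illustration, not Conntour's implementation: the hard-coded vectors stand in for embeddings that a real vision-language encoder (for frames) and text encoder (for queries) would produce in a shared space, and the frame names and scores are invented for illustration.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Toy frame embeddings standing in for the output of a real video encoder.
frame_embeddings = {
    "frame_001": [0.9, 0.1, 0.0],   # person carrying a bag
    "frame_002": [0.1, 0.9, 0.1],   # empty corridor
    "frame_003": [0.8, 0.2, 0.3],   # person setting a bag down
}

def search(query_embedding, index, top_k=2):
    """Rank frames by similarity to the query embedding and return the best matches."""
    scored = sorted(index.items(),
                    key=lambda kv: cosine(query_embedding, kv[1]),
                    reverse=True)
    return [frame for frame, _ in scored[:top_k]]

# A hypothetical text encoder would map "a person leaving a bag unattended"
# into the same vector space as the frames:
query = [0.85, 0.15, 0.25]
print(search(query, frame_embeddings))  # → ['frame_003', 'frame_001']
```

At production scale, the linear scan over frames would be replaced by an approximate nearest-neighbor index, but the matching principle (query and footage compared in one embedding space) is the same.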
Another key challenge is temporal reasoning. Many real-world queries involve sequences of events rather than single frames. Supporting this requires tracking entities across time and understanding interactions, not just identifying objects in isolation.
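One way to frame that temporal matching is as an ordered-pattern search over per-entity tracks. The sketch below assumes a hypothetical upstream tracker has already grouped detections into tracks; the observation tuples and labels are invented for illustration and say nothing about how Conntour models sequences internally.

```python
from collections import defaultdict

# Toy observations from a hypothetical tracker: (track_id, timestamp_seconds, label).
# Real systems would produce these from detection plus multi-object tracking.
observations = [
    (1, 0, "person_with_bag"),
    (1, 5, "person_with_bag"),
    (1, 9, "person_without_bag"),   # the bag was left behind
    (2, 0, "person_with_bag"),
    (2, 9, "person_with_bag"),
]

def matches_sequence(track, pattern):
    """True if the track's labels contain the pattern steps in order (gaps allowed)."""
    labels = iter(label for _, label in track)
    return all(any(label == step for label in labels) for step in pattern)

def find_tracks(obs, pattern):
    """Group observations into time-ordered tracks, then keep tracks matching the pattern."""
    tracks = defaultdict(list)
    for tid, t, label in sorted(obs, key=lambda o: o[1]):
        tracks[tid].append((t, label))
    return [tid for tid, track in tracks.items() if matches_sequence(track, pattern)]

# "a person leaving a bag unattended" expressed as an ordered label pattern:
print(find_tracks(observations, ["person_with_bag", "person_without_bag"]))  # → [1]
```

The point of the sketch is the shift it illustrates: answering such a query requires identity that persists across frames and an ordering constraint between states, which single-frame object detection cannot express.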
Constraints and Tradeoffs
Despite its potential, this type of system introduces new challenges. Processing large volumes of video with advanced models is computationally intensive, especially in on-premises deployments where resources are constrained.
Accuracy is another consideration. Open-ended queries introduce ambiguity, and systems must balance flexibility with precision to avoid false positives or missed detections. Unlike rule-based systems, natural-language-driven systems depend heavily on how well their models generalize to edge cases.
There are also governance implications. The ability to search for highly specific attributes or behaviors raises questions about oversight, access control, and appropriate use—particularly in sensitive or public environments.
Conntour’s launch highlights a shift away from rigid, rules-based surveillance toward systems that can interpret intent and context in real time.
If this model proves reliable, it could redefine how organizations interact with video data—moving from passive monitoring toward dynamic, query-driven intelligence.