Omer Bachar, Co-founder and CEO of Vetric – Interview Series

Omer Bachar, Co-founder and CEO of Vetric, is an Israeli entrepreneur and open-source intelligence (OSINT) focused technology leader who has built a career around solving complex data and intelligence challenges. Before founding Vetric in 2022, Bachar developed and monetized multiple web businesses as a teenager, gaining experience in web development, SEO, traffic monetization, and scalable digital operations. He later served in the Israel Defense Forces, where he helped modernize Military Police operations through automation, AI, data mining, and agile methodologies while leading teams and product initiatives. Following an earlier failed startup and an intensive research period in Thailand with his co-founders, Bachar launched Vetric with a focus on infrastructure rather than venture funding, growing the company bootstrapped while serving organizations that rely on scalable access to public web data. His work centers on WEBINT, OSINT, data collection, and digital threat intelligence, particularly in areas involving impersonation, fraud, and deepfake-related risks.
Vetric is a data infrastructure company that helps organizations collect, process, and operationalize public web data at scale. The company positions itself as a trusted infrastructure layer for trust, safety, cybersecurity, and intelligence teams that need reliable, real-time access to open web information. Rather than building a consumer-facing platform, Vetric focuses on managed APIs and flexible data pipelines designed to simplify large-scale public data collection across constantly changing online environments. The company has emphasized a “data for good” approach, supporting organizations working to identify impersonation campaigns, deepfakes, scams, and other digital threats. According to the company, it has grown from a small founding team into a rapidly expanding operation serving more than 100 organizations globally while remaining focused on scalable infrastructure and operational reliability.
You started building internet businesses as a teenager, later worked on modernizing the Israeli Military Police with AI and OSINT tools, and eventually co-founded Vetric.io after spending months researching problems in Thailand. How did those experiences shape your belief that data infrastructure, not just detection models, would become the real battleground in the fight against deepfakes and digital impersonation?
The thread across all three of those chapters was the same: every problem I thought was a detection problem turned out to be a data problem one layer down.
As a teenager finding vulnerabilities, the hard part was never the exploit — it was building the infrastructure to find the right targets at scale. In the Military Police, OSINT was the gap nobody was working on. Criminals were leaving evidence all over the open web, in plain sight, and we had almost no systematic way to reach it. Then in Thailand, we spent months talking to defense agencies, threat intel teams, brand protection vendors, fraud teams. Same pattern everywhere. Everyone wanted insights.
That’s when it clicked. Detection models are only as good as the data you feed them. If you can’t see the deepfake the moment it appears on the open web, your detector is irrelevant. The battleground isn’t whether your model can spot a synthetic face in a benchmark — it’s whether you have visibility into the surface where the attack actually lives. Public data infrastructure is the part nobody wants to build because it’s hard and unglamorous, but without it the rest of the stack is theater.
You’ve argued that the deepfake problem has fundamentally changed over the last 18 months. What specific technological shifts are making today’s AI-generated fraud more dangerous than earlier waves of scams and impersonation attacks?
Three things changed at the same time, and the combination is what makes this different.
First, multimodal models became good enough that you can generate convincing video, voice, and text from the same prompt in minutes. What used to require a small team and a few weeks now takes one person and a coffee.
Second, the cost dropped through the floor. A high-quality voice clone of a public figure used to require a serious GPU budget. Today it runs on consumer hardware, or a $20-a-month SaaS subscription. The cost is no longer a barrier.
Third — and this is the underappreciated piece — the distribution layer caught up. The fakes no longer sit on some obscure forum waiting to be discovered. They live on the open web, in the same public sources your customers and employees consume dozens of times a day, often with paid amplification on top. A fabricated video travels exactly as fast as anything else on the public internet.
In earlier waves, an attacker had to choose between quality, speed, scale, and reach. Now they can have all four for under $100. The asymmetry between what it costs to fabricate a convincing impersonation and what it costs an enterprise to detect, attribute, and respond to it is wider than it has ever been. That’s the shift.
Most public discussion around deepfakes still focuses on detection software. Why do you believe reliable, large-scale public data infrastructure is now just as important as the AI models themselves?
Because a detector that can’t see the threat is a research project, not a defense.
Most of the deepfake conversation focuses on whether the model can tell a synthetic face from a real one in a controlled benchmark. That’s table stakes. The operational problem is the harder one: a brand impersonation video goes up at 3 a.m., picks up 200,000 views before anyone notices, and your detection pipeline has to actually receive that content in real time, attribute it, and trigger a response. If you don’t have a stable, scalable way to ingest public data from the open web — the places where impersonation and fraud actually unfold — you have nothing to run your detector on.
There’s a related point we made recently in a piece on the Anodot supply-chain breach. Enterprises are starting to realize that the security posture of their data vendors is their own security posture. The same realization is coming for AI: the reliability of your detection stack is the reliability of your data pipeline. Models get the press. Infrastructure does the work. Treating those two as separate budget lines is a mistake the market is about to stop making.
Vetric grew to significant scale without outside funding while serving enterprises and intelligence-related organizations. How did bootstrapping influence the way you built the company and prioritized infrastructure reliability over growth-at-all-costs expansion?
Bootstrapping forces a kind of customer obsession that funded companies can avoid for a long time. We’ve always believed in giving value before growing fast — and for us that isn’t a slogan, it’s a financial reality.
Every dollar in our bank account is one a customer chose to give us in exchange for something that worked. We don’t have a $50 million cushion to bridge the gap between a product that almost works and one that does.
That changes every decision the company makes. It shaped our architecture as much as it shaped our roadmap. Our customers (intelligence agencies, digital risk protection teams, public safety teams, threat intel platforms) don’t just want a product that works.
That’s also why we drew the trust boundary the way we did. We push data out to customers; we never pull data from them. We don’t hold long-lived credentials into a customer’s cloud. When customers want delivery into their own AWS, we use scoped, write-only IAM roles with external IDs – the AWS-recommended pattern specifically designed to prevent the kind of vendor-as-attack-vector breach that just hit Anodot’s customers.
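For readers unfamiliar with the pattern described above, here is a minimal sketch of what a scoped, write-only IAM role with an external ID could look like on the customer side. It is an illustration under stated assumptions, not Vetric's actual configuration: the account ID, external ID, bucket name, and prefix are hypothetical placeholders, and the helper function names are invented for this example. The policy keys themselves (`sts:AssumeRole`, the `sts:ExternalId` condition, `s3:PutObject`) are standard AWS IAM constructs.

```python
import json


def build_trust_policy(vendor_account_id: str, external_id: str) -> dict:
    """Trust policy for a customer-owned role.

    Only principals in the vendor's AWS account may assume the role, and
    only when they present the agreed external ID.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{vendor_account_id}:root"},
                "Action": "sts:AssumeRole",
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


def build_write_only_policy(bucket: str, prefix: str) -> dict:
    """Permission policy that allows writing objects under one prefix.

    No read, list, or delete actions are granted, so the role cannot be
    used to pull data out of the customer's bucket.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}/*",
            }
        ],
    }


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    trust = build_trust_policy("111122223333", "example-external-id")
    permissions = build_write_only_policy("example-customer-bucket", "vendor-drop")
    print(json.dumps(trust, indent=2))
    print(json.dumps(permissions, indent=2))
```

In this sketch the customer creates the role and attaches both documents, and the vendor assumes it only for the duration of a delivery. The external ID guards against another party tricking the vendor into writing on its behalf (the "confused deputy" problem), and the single `s3:PutObject` action keeps the access one-directional.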
Deepfake-as-a-Service platforms are lowering the barrier to entry for attackers. Are we approaching a point where highly sophisticated fraud capabilities become fully commoditized and accessible to almost anyone?
We’re already there for the basic capabilities — and not just at the paid tier, at the free tier. We’re 12 to 18 months away from the more sophisticated ones. Look at what’s openly available right now. Models like Sora generate convincing video from a text prompt. Open-source face-swap projects on GitHub will take any reference image and put that face into a video, and the results genuinely look good — no specialist skill required, no payment, no underground marketplace.
The barrier to producing a high-quality fake is a free download and an afternoon of experimentation. There are also paid services that wrap a turnkey workflow around the same kinds of models for people who don’t want to install anything, but the open-source tier is the real story, because it’s the one you can’t price anyone out of. The person operating it doesn’t need to understand the underlying tech any more than someone running a phishing kit needs to understand SMTP.
What’s coming next is the templated, campaign-in-a-box tier — pre-built impersonation kits for specific brands, specific executives, specific election cycles, with the synthetic media, the fake accounts, the distribution schedule, and the targeting all bundled together. Once that exists at scale, the marginal cost of running an impersonation attack drops to roughly the cost of running a paid ad. The implication for defenders is uncomfortable. You can no longer assume the attacker is sophisticated or well-resourced. The attacker is anyone with a laptop and a grievance. That changes the volume and the variety of threats by an order of magnitude, and it redefines what “credible threat” means for any brand, executive, or public figure with an internet presence.
Many enterprises still rely on fragmented monitoring systems and delayed threat intelligence. What are the biggest weaknesses you see in how organizations currently collect and process public web data related to impersonation and fraud?
A few weaknesses keep showing up across customer conversations. The first is fragmentation. Most enterprises run one tool for brand monitoring, another for executive protection, another for fraud, another for security operations. Each pulls from different sources with different coverage, different latency, and different blind spots.
When an impersonation campaign breaks across several corners of the open web at the same time, the seams between those tools are exactly where the attacker hides. And underneath that fragmentation is a deeper issue: most of these tools optimize for the takedown. Pulling the fake profile is necessary, but on its own it’s whac-a-mole — another one pops up the next day. The teams getting real leverage are the ones doubling down on investigation: connecting the dots between accounts, infrastructure, and campaigns, so they’re dismantling the network behind the impersonation rather than swatting at one URL at a time.
The second is reliance on middleware that needs deep inbound access to customer systems. That’s the structural problem we wrote about in the context of the Anodot breach — vendors that require OAuth or API keys into your cloud become single points of failure for everyone downstream. SOC 2 won’t save you from that. The architecture either has the blast radius or it doesn’t.
The third is latency. A lot of what gets sold as “real-time threat intelligence” is actually on a four-to-twenty-four-hour delay because the underlying collection is batch-oriented. For impersonation and fraud, that’s an eternity. By the time the alert fires, the campaign has run its course.
The fourth is the misallocation between detection on one side and coverage and analysis on the other. Companies invest in tools that flag synthetic or suspicious content, but they underinvest in the coverage to actually see what’s being posted about them across the open web, and in the analysis capabilities to make sense of what they do see: who’s behind it, how the accounts connect, what infrastructure they share. Flagging is one step; coverage and analysis are what turn a flag into action. Buying detection without coverage is like buying a great smoke alarm and forgetting to plug it in.
You’ve spoken about handling billions of public data points every month. What technical challenges emerge when trying to build real-time infrastructure capable of identifying AI-driven threats at internet scale?
The hard problems aren’t the ones most people expect. The first thing to say is that building this kind of infrastructure is not the hard part — maintaining it is. You don’t build a pipeline like ours once and walk away. You need an entire team whose job is keeping it alive, and that’s the team we’ve built. The public sources we collect from push changes constantly, and any one of them can quietly break a collector. Most of the engineering effort over the lifetime of a product like this goes into maintenance, not greenfield development.
The second is that scaling is a treadmill. You improve the pipeline, you push it harder, and at some point the scale gets too much for what you built and things start to shake. Then you rebuild that layer — sometimes from scratch — for the next level. That cycle never really ends. Every order of magnitude in volume is effectively a different system, even if it looks like the same product from the outside.
The third is uptime, which sounds boring until you remember who our customers are. Some of them use our data to help prevent terror attacks. Real-time means you don’t get a grace period. Things will break — they always do — and what matters is that they’re up the rest of the time and fixed as fast as humanly possible when they’re not. We’ve engineered the on-call posture, the alerting, and the redundancy around that reality, not around a generic SaaS uptime number.
As synthetic media becomes more convincing, do you think society is moving toward a broader “trust crisis” where people begin doubting legitimate audio, video, and communications by default?
Closer than people want to admit. But I don’t think it ends in collapse — I think it ends in a different equilibrium.
The naive version of the trust crisis says: nobody will believe any audio or video, and the public sphere collapses. I don’t think that’s where this goes. Humans are remarkably adaptive about epistemic environments. We learned to read newspapers with skepticism, then television, then the web. We’ll learn to read AI-generated media with skepticism too, and that’s not entirely a bad outcome — a healthier baseline of “verify before you trust” is overdue.
What I do think we’ll see is a shift in what counts as evidence. Raw video and audio will become weaker forms of proof; provenance, cryptographic signing, chain-of-custody, and identity verification will become stronger ones. The institutions that figure out how to credibly verify their own communications — banks, governments, news organizations — will pull ahead. The ones that don’t will get impersonated into irrelevance.
The dangerous interval is the middle one, the next two to three years, where the fakes get convincing faster than the verification layer gets deployed. That’s the window where the damage gets done, and that’s the window we’re trying to give our customers the visibility to survive.
Thank you for the great interview. Readers who wish to learn more should visit Vetric.












