Mathis Joffre, Co-founder and Engineering Lead at Blaxel – Interview Series

Mathis Joffre, Co‑founder and Engineering Lead at Blaxel, is a seasoned infrastructure engineer who previously helped scale one of Europe’s largest cloud platforms at OVHcloud. At Blaxel, he leads the development of low-latency, scalable systems tailored for AI agents, and is a key contributor to the company’s open-source tools supporting performance-driven deployment.
Blaxel is a computing platform purpose-built for autonomous AI agents, enabling developers to build, test, and run agentic workflows without managing infrastructure. Its architecture includes ultra-fast microVMs, batch job execution, and a global gateway for routing and fallback. Blaxel prioritizes secure sandboxing, real-time observability, and seamless scalability to support production-grade agent deployments.
You spent three years working with AI and data infrastructure R&D at OVHcloud—what was the key moment or insight that inspired you to build Blaxel as a purpose‑built cloud for AI agents?
I realized, while working on AI Endpoints—one of OVHcloud’s flagship AI products—just how complex the next generation of cloud architectures and AI use cases will become. We’re moving from traditional chatbots to fully autonomous systems. This agentic revolution isn’t just about smarter applications; it’s forcing a rethink of everything from the software stack to data center architecture. That realization is what pushed me to build Blaxel.
Looking back at your early engineering path—from building networking tools at Orange Business to defining stacks at OVHcloud—how has that experience informed the architecture and philosophy of Blaxel?
I’d say: stay grounded. Even though this revolution can feel hypothetical or overhyped, the only way to make it real is to focus on concrete use cases and solve them well. That mindset shaped Blaxel from the beginning—we built it around our customers’ real-world needs, from code generation to video analytics. Instead of chasing trends, we wanted to deliver a purpose-built platform that gives agents exactly what they need to run effectively.
Can you walk us through the role of the Model Context Protocol (MCP) and multi-region model gateways? How does that enhance fault tolerance and scalability for agents?
Agents are all about context—their ability to access relevant information is key to acting effectively. MCP serves as our primary interface for integrating agents with our infrastructure because it addresses this challenge. Just as developers use REST APIs to connect apps in the SaaS world, they’ll now use the Model Context Protocol to provide specific, processable context to their agents.
But context alone isn’t enough—agents also rely on LLMs, such as those provided by OpenAI or Anthropic. Given the growing demand, these providers’ servers can occasionally become overwhelmed by traffic. That’s where multi-region model gateways come in.
Model gateways allow traffic to be rerouted dynamically to the nearest available LLM endpoint (in terms of latency), whether it’s OpenAI, Anthropic, or another provider. This not only improves response times but also ensures fault tolerance (by failing over to alternative providers) and scalability (by distributing load across multiple regions and models).
Blaxel supports developer tooling that agents themselves can call—what motivated designing APIs consumable by agents rather than humans? How do you see this evolving?
For me, OpenAI’s release of Operator was eye-opening—it made me realize that the future involves agents consuming infrastructure directly. Agents started by analyzing historical data and answering questions. Then they moved on to generating code. The next logical step is for them to deploy that code autonomously.
That’s why we believe agents need their own cloud—purpose-built around the idea that the future of IT operations will be driven by autonomous agents.
Reflecting on existing cloud providers and agent-hosting platforms (like Modal, RunPod, Replicate, etc.), where do you see the most common gaps when deploying agents at scale?
Most platforms today weren’t designed for persistent, stateful, autonomous agents—they were designed for stateless jobs or inference APIs. So you end up stitching together compute, memory, storage, and networking in ways that weren’t intended to support long-lived processes with memory, feedback loops, and complex I/O. The result is either brittle systems or high operational overhead. That’s the gap: we need infrastructure where agents are first-class citizens, not an afterthought.
What are the most common anti-patterns you see—and what do builders trip over when deploying autonomous agents in production versus dev/test?
The most common mistake is treating agents like functions: invoked, executed, then forgotten. In production, agents need to persist context, manage tools, and sometimes react to external signals in real time. People also underestimate how messy real-world environments are: flaky APIs, inconsistent data, unexpected state transitions. Builders often test in ideal conditions, but production requires robust observability, sandboxing, and recovery strategies.
Your roadmap includes features like snapshot forking, automatic failover, and deeper compute optimization. Which do you consider most transformative for agent-first systems?
Snapshot forking, hands down. It unlocks debugging, experimentation, and parallel reasoning patterns that just aren’t possible in conventional cloud environments. Imagine an agent reaching a decision point—it forks its sandbox into multiple branches, explores different outcomes in parallel, and then picks the best path forward. That kind of branching logic is native to agent workflows but completely foreign to traditional cloud runtimes. It fundamentally changes how we think about autonomy and control flow.
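The fork, explore, and pick pattern described here can be sketched in a few lines. This is an illustration under stated assumptions, not Blaxel's API: `fork_and_pick`, the branch functions, and the scoring rule are hypothetical, and `copy.deepcopy` of a dictionary stands in for snapshotting a real sandbox. The point is only that each branch works on its own isolated copy of the state, runs in parallel, and leaves the original snapshot untouched.

```python
import copy
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Sequence

State = dict  # assumed: sandbox state modeled as a plain dictionary


def fork_and_pick(
    snapshot: State,
    branches: Sequence[Callable[[State], State]],
    score: Callable[[State], float],
) -> State:
    """Run every branch on its own copy of the snapshot, return the best result."""

    def run(branch: Callable[[State], State]) -> State:
        forked = copy.deepcopy(snapshot)  # stand-in for forking a sandbox
        return branch(forked)

    with ThreadPoolExecutor() as pool:
        results = list(pool.map(run, branches))
    return max(results, key=score)


# Two hypothetical strategies the agent might explore at a decision point.
def add_cache(state: State) -> State:
    state["files"].append("cache.py")
    state["cost"] += 3
    return state


def small_patch(state: State) -> State:
    state["cost"] += 1
    return state


snapshot = {"files": ["main.py"], "cost": 0}
best = fork_and_pick(snapshot, [add_cache, small_patch], score=lambda s: -s["cost"])
print(best)
```

Here the lower-cost branch wins, and because each branch mutated only its own copy, the original snapshot is still available for further forks.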
Gartner predicts 75% of apps will use AI agents by 2028—how do you anticipate Blaxel evolving as AI agents become ubiquitous across industries?
As agents go mainstream, we expect Blaxel to evolve from being “infrastructure for AI agents” to being “the operating layer” they rely on—handling lifecycle, coordination, and even marketplace interactions. You won’t just deploy agents on Blaxel—you’ll compose them, monitor them, and have agents that manage other agents. We’re already seeing use cases emerge in finance, security, and enterprise automation that are pointing in that direction.
Do you envision a future where agents not only run applications—but manage and reconfigure infrastructure autonomously? What are the cultural and security implications of that shift?
Yes, and it’s both exciting and unnerving. Technically, it makes sense—agents can monitor system health, apply patches, optimize workloads. But culturally, it challenges how we think about control and trust in operations. Security-wise, it means rethinking permission models: not just who can act, but what an agent is allowed to become. We’re going to need new abstractions for verifiable autonomy and constrained self-improvement.
What’s the single biggest misconception about what makes agent-native infrastructure unique?
That it’s just about more GPUs or longer runtimes. Agent-native infra is about behavioral affordances—giving agents the ability to remember, explore, adapt, and recover. That requires changes across the stack: storage that tracks evolving state, execution models that support concurrency and branching, observability tuned for reasoning, not just latency. It’s a mindset shift, not just a resource bump.
Which technical regret or limitation from your time at OVHcloud are you most glad to fix at Blaxel?
At OVHcloud, a lot of what we built was constrained by legacy abstractions (VMs, containers, networks) optimized for human-driven workloads. We couldn’t easily break from those paradigms. With Blaxel, we’re starting clean. No need to pretend an agent is a batch job or a microservice. We can build primitives like memory, tools, and goals directly into the runtime, and that unlocks an entirely new design space.
Thank you for the great interview. Readers who wish to learn more should visit Blaxel.