Dr. Xianxin Guo, CEO and Co-Founder of Lumai – Interview Series

Dr. Xianxin Guo, CEO and Co-Founder of Lumai, is a physicist and deep-tech entrepreneur specializing in optical computing and AI hardware, with a PhD in quantum physics and nonlinear optics from the Hong Kong University of Science and Technology. He previously held research roles including a postdoctoral position at the University of Calgary and an 1851 Research Fellowship at the University of Oxford, where he contributed to advances in photonics and AI acceleration. Rising through Lumai from Head of Research to CEO, he is the primary inventor behind the company’s core technology and brings over a decade of experience at the intersection of physics, machine learning, and advanced computing systems.

Lumai is an Oxford University spinout developing next-generation AI processors based on 3D optical computing, using light instead of electricity to perform key AI calculations. Its technology is designed to accelerate matrix operations that underpin modern AI models, delivering significantly faster processing speeds while reducing energy consumption compared to traditional silicon-based GPUs. By integrating optical computation into existing data center environments, Lumai aims to enable more scalable and cost-efficient AI deployment, addressing the growing limitations around compute power and energy usage in large-scale AI systems.

You began your career in quantum physics and nonlinear optics, later becoming an 1851 Research Fellow at the University of Oxford before co-founding Lumai from your research. What was the pivotal moment when you realized optical computing could move from academic theory to a commercially viable company?

During my time at the University of Oxford, we were exploring how the properties of light in free space could be used to solve the kinds of matrix operations that underpin machine learning. Around the same time, the limitations of conventional hardware for AI were becoming more important. The convergence of the two, the challenges we had solved in our research and the need for more efficient compute, gave us the confidence that we could take our ideas and solve real-world problems.

We have come a long way from that initial research – at Lumai we have now built the world’s first optical computing system capable of running billion-parameter LLMs in real time.

Lumai is tackling one of the biggest bottlenecks in AI today, the energy and scalability limits of silicon-based computing. What specific limitations in traditional architectures pushed you toward a fundamentally different approach using light?

What pushed us was the limited trajectory of silicon solutions. With silicon, you are seeing incremental gains, but they come with disproportionate increases in power and complexity. The limitation of silicon scaling is primarily down to the physics – frequencies are not increasing, and the number of transistors that can be switched is limited by thermals. Leakage currents continue to be an issue. It is estimated that silicon contributes only a 25% year-on-year increase in performance.

At that point, it makes sense to ask whether a different physical medium might handle those operations more naturally, rather than continuing to push electrons harder.

Your work focuses on optical computing and machine learning. How does using photons instead of electrons fundamentally change the way computation happens at the hardware level?

With electrons, computation is inherently sequential and lossy – you are switching transistors, moving charge, generating heat. Every operation has a thermal cost, and that cost accumulates.

Photons behave very differently. Light travels without the same resistive losses, and critically, by using the properties of light, enormous numbers of matrix operations can be executed in parallel simply by structuring how beams of light interact through a physical medium. The computation is happening in the propagation of light itself, not in the switching of billions of gates.

Lumai’s technology leverages 3D optical processing and massive spatial parallelism. Can you explain how this architecture enables such dramatic improvements in throughput and efficiency compared to GPUs?

The goal is to perform dense matrix multiplication as efficiently and as fast as possible in a single cycle. Lumai’s approach does exactly this by using light in a three-dimensional volume, performing millions of operations simultaneously.

You simply cannot achieve that level of parallelism in 2D structures, where operations are processed across hundreds of cores requiring constant data movement. It is this inherent parallelism – combined with the fact that once you are in the light domain, operations can be performed without burning power – that drives both the throughput improvement and the dramatic reduction in energy per token.
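To make the parallelism point concrete, here is a minimal Python sketch that decomposes a dense matrix-vector product into its individual multiply-accumulate operations. It is an illustration only and does not describe Lumai's implementation; the function names `mac_count` and `matvec` are hypothetical. Every product in the inner loop is independent of the others, which is why, in principle, a sufficiently parallel engine could evaluate them together instead of one after another.

```python
def mac_count(rows: int, cols: int) -> int:
    """Number of multiply-accumulate operations in a rows x cols matrix-vector product."""
    return rows * cols


def matvec(matrix: list[list[float]], vector: list[float]) -> list[float]:
    """Reference matrix-vector product written as explicit multiply-accumulates."""
    result = []
    for row in matrix:
        acc = 0.0
        # Each weight * x product depends on no other product, so the
        # multiplications could all run at once; this loop runs them serially.
        for weight, x in zip(row, vector):
            acc += weight * x
        result.append(acc)
    return result


print(matvec([[1.0, 2.0], [3.0, 4.0]], [10.0, 20.0]))  # [50.0, 110.0]
print(mac_count(1024, 1024))  # 1048576
```

Even a single 1024 x 1024 matrix-vector product (an assumed size, chosen only for illustration) already contains over a million independent multiplications.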

Many AI infrastructure companies are still focused on training, yet Lumai is targeting inference. Why do you believe inference is the defining challenge of this next phase of AI?

Inference is where AI actually does something useful – every query answered, every agent task completed, every document generated. We have now entered the inference era, and demand is growing at a rate that training-focused hardware was never designed to absorb.

The economics are also different: inference runs continuously, across millions of users. Cost per token becomes the defining metric, and that is where the energy wall hits hardest.

What makes inference particularly well-suited to optical compute is that the prefill stage is heavily compute-bound. In this prefill stage of disaggregated inference, the full context is processed before a response is generated. This maps almost perfectly onto our optical engine and it is where we have focused first.
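A rough way to see why prefill is compute-bound is to compare arithmetic intensity (floating-point operations per byte moved) for one weight-matrix multiplication when the whole prompt is processed at once versus when a single token is generated. The sketch below is a simplified roofline-style estimate under stated assumptions, not a description of Lumai's system: the function name `arithmetic_intensity`, the 2048-token prompt, the 4096-wide square weight matrix, and the 16-bit values are all hypothetical, and attention, caching, and batching are ignored.

```python
def arithmetic_intensity(tokens: int, d_model: int, bytes_per_value: int = 2) -> float:
    """FLOPs per byte moved for a (tokens x d_model) @ (d_model x d_model) matmul."""
    flops = 2 * tokens * d_model * d_model  # one multiply and one add per MAC
    values_moved = (
        tokens * d_model       # read input activations
        + d_model * d_model    # read the weight matrix once
        + tokens * d_model     # write output activations
    )
    return flops / (values_moved * bytes_per_value)


prefill = arithmetic_intensity(tokens=2048, d_model=4096)  # whole prompt at once
decode = arithmetic_intensity(tokens=1, d_model=4096)      # one new token
print(f"prefill: {prefill:.0f} FLOPs/byte, decode: {decode:.2f} FLOPs/byte")
# prefill: 1024 FLOPs/byte, decode: 1.00 FLOPs/byte
```

Under these assumptions the weights are reused across every prompt token during prefill, so the work per byte moved is roughly a thousand times higher than in token-by-token generation, which is the sense in which prefill is dominated by raw matrix arithmetic.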

One of the long-standing challenges in optical computing has been stability and scalability. What were the key technical breakthroughs that allowed Lumai to overcome these barriers?

The challenge was never demonstrating that optics could perform computation – researchers had shown that in principle for years. The challenge was making it work at scale, outside the lab.

Two things mattered most. First, we use the same type of components already deployed in data centres today for communication and networking. No exotic materials, no speculative supply chain. Second, we made a deliberate architectural choice to use a hybrid design, combining the optical tensor engine with digital processing for system control and software.

Your system uses a hybrid approach combining optical and digital components. How important is this balance in making optical computing practical for real-world data centre deployment?

It is fundamental. Optical computing does not mean replacing everything with light. Digital systems are extraordinarily good at control, sequencing, and interfacing with the software ecosystem the industry has built up over decades. Our optical engine excels at the core mathematical operations that dominate inference compute. The hybrid architecture lets each component do what it does best.

From a deployment standpoint, this matters enormously. Lumai Iris integrates into existing data centre infrastructure, uses standard interfaces, and runs real models including Llama 8B and 70B today.

With the announcement of the Lumai Iris family, particularly the Iris Nova server, what does achieving real-time inference on billion-parameter models signal for the future of AI infrastructure?

It signals that optical compute has crossed from research into reality. Running billion-parameter models in real time is the proof point the industry needed. The Lumai Iris server family consists of three servers: Nova, Aura, and Tetra. Lumai Iris Nova, the first server in the family, is available for evaluation now, and we are already engaging with partners who want to put it to work against real inference workloads.

More broadly, it signals that the trajectory of AI infrastructure is about to change. The assumption has been that scaling inference means buying more GPUs, drawing more power, building larger data centres. Lumai Iris Nova shows there is another path – one that delivers dramatically more performance per kilowatt and a fundamentally different cost structure per token. As the Lumai Iris server family develops, the implications for how hyperscalers and enterprises think about compute procurement will be significant.

The press release highlights up to 90% lower energy consumption compared to traditional systems. How significant is this breakthrough in the context of the growing energy constraints facing global data centres?

The energy constraint is the defining infrastructure challenge of the AI era – power capacity is already a limiting factor on deployment plans, and we have hit the so-called power wall.

Against that backdrop, a 90% reduction in energy consumption changes the fundamental economics and feasibility of AI at scale. A single Lumai system can replace tens of power-hungry GPUs, which translates into a significant shift in what is achievable within a given power envelope.
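The arithmetic behind the power-envelope point can be sketched in a few lines of Python. This is a back-of-the-envelope illustration, not a measured result: the function name `throughput_gain` is hypothetical, the 90% figure is the "up to" value quoted in the question, and the calculation assumes the saving applies uniformly to energy per token with everything else held constant.

```python
def throughput_gain(energy_reduction: float) -> float:
    """Factor by which token output grows at a fixed energy budget.

    energy_reduction is the fractional cut in energy per token (0.90 = 90%).
    """
    if not 0.0 <= energy_reduction < 1.0:
        raise ValueError("energy_reduction must be in [0, 1)")
    remaining_fraction = 1.0 - energy_reduction  # energy per token still required
    return 1.0 / remaining_fraction


gain = throughput_gain(0.90)  # best-case figure from the press release
print(f"{gain:.1f}x tokens for the same energy")  # 10.0x tokens for the same energy
```

Read the other way, the same relationship means a fixed token volume would need about one tenth of the energy, again only if the full 90% reduction is realized.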

There is also a cost dimension: data centre build costs reflect power capacity, so a lower-power data centre costs less to build. Reducing energy consumption directly reduces cost per token – which is ultimately what makes AI economically viable at the scale the industry is building towards.

Looking ahead, as the industry begins to talk about a post-silicon era, how do you see optical computing evolving over the next decade, and what role will Lumai play in shaping that transition?

The post-silicon era is already beginning, and it is happening at the same time as the shift to the inference era and the continued demand for more performance at lower cost per token. Silicon will of course continue to play a role, but the assumption that every generation of compute improvement comes from advancing silicon nodes is no longer credible at the rate AI demands. We see optical compute being used in key parts of the stack where highly parallel, high-throughput processing is needed.

For Lumai, the roadmap is about continuing to push the density, efficiency, and capability of optical compute and rolling this out to data centres. The vision is a world where the energy cost of intelligence falls and where a megawatt-scale data centre can generate the same token volume as a gigawatt-scale facility does today.

That future is not distant speculation. We have built the first system that proves optical compute works at scale. Everything from here is engineering.

Thank you for the great interview. Readers who wish to learn more should visit Lumai.

Antoine is a visionary leader and founding partner of Unite.AI, driven by an unwavering passion for shaping and promoting the future of AI and robotics. A serial entrepreneur, he believes that AI will be as disruptive to society as electricity, and is often caught raving about the potential of disruptive technologies and AGI.

As a futurist, he is dedicated to exploring how these innovations will shape our world. In addition, he is the founder of Securities.io, a platform focused on investing in cutting-edge technologies that are redefining the future and reshaping entire sectors.