Elad Raz, CEO of NextSilicon – Interview Series

Elad Raz, CEO of NextSilicon, is an experienced entrepreneur and technology leader widely respected for his deep expertise in low-level systems, security, networking, and file-system development. Over a career spanning elite military engineering roles, senior software leadership, company building, and long-term investing, Raz has led complex, mission-critical projects across operating system internals and hardware-software integration. Prior to founding NextSilicon, he built and exited multiple technology companies, served in senior leadership roles at a leading semiconductor firm, and invested in a diverse portfolio of startups, combining hands-on engineering depth with strong execution and long-term strategic vision.

NextSilicon is an Israeli high-performance computing and semiconductor company founded in 2017 that is redefining compute architecture for demanding workloads such as AI and scientific computing. The company has developed a software-defined, intelligent compute platform designed to deliver high performance and efficiency without requiring developers to rewrite applications. By focusing on adaptability at the hardware level, NextSilicon aims to address fundamental bottlenecks in modern data centers and supercomputing environments, positioning itself as a next-generation alternative to traditional accelerators.

Can you tell us about your journey leading up to the founding of NextSilicon? What sparked the initial idea, and how did your early experiences with computing shape your vision?

I have been fascinated by computers since I was a kid. That fascination led me from tinkering with old Commodore 64s and Ataris (which I still collect to this day) to co-founding startups, and eventually to selling my previous company to Mellanox. But even with those early successes, I kept seeing the same challenge over and over again in this industry. As computational workloads have become more complex, traditional CPU and GPU architectures are reaching performance, power-efficiency, and scalability limits. Whether optimizing algorithms or running large-scale simulations, it became clear that current architectures were forcing workloads to adapt to the hardware – not the other way around.

The spark for NextSilicon came from this recurring challenge and raised the question: What if we could flip the script and build compute architectures that adapt to workloads, rather than forcing workloads to adapt to the hardware? My early exposure to algorithm design and hardware taught me that the real breakthrough would come from combining the two in real time. That is the foundation of our Intelligent Compute Architecture (ICA) and NextSilicon’s guiding vision since the beginning.

Maverick‑2 is described as an Intelligent Compute Accelerator that adapts to real-time workloads. How does its architecture differ from traditional GPUs or FPGAs, and what enables that level of adaptability?

CPUs and GPUs have transformed our world and served us well. But they were never designed to meet the demands of modern AI and high-performance computing (HPC) workloads in fields such as science, weather, energy, and defense. These workloads have complex data dependencies, memory access patterns, and computational patterns that today’s processors weren’t designed to handle. The result is bottlenecks that slow down innovation.

Maverick-2 differs by combining a reconfigurable dataflow engine with real-time software optimization. The hardware is configured based on your workload, not the other way around. In Maverick-2, data availability drives computation, rather than a program counter driving instruction execution as in traditional processors. This enables us to create software-defined virtual processing units that can be configured and reconfigured in real time to match specific workload patterns.
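To make the contrast with program-counter execution concrete, here is a minimal software sketch of dataflow-style scheduling. It is an illustration only, not NextSilicon's implementation: `run_dataflow`, the graph format, and the example node names are all hypothetical. Each node fires as soon as every one of its inputs is available, so the execution order falls out of the data dependencies rather than out of the order in which the nodes were written.

```python
from collections import deque
import operator


def run_dataflow(nodes, initial_values):
    """Toy dataflow scheduler (hypothetical, for illustration only).

    nodes: dict mapping node name -> (function, list of input names).
    initial_values: dict of values that are available from the start.
    Returns (all computed values, the order in which nodes fired).
    """
    values = dict(initial_values)
    missing = {}    # node name -> set of inputs it is still waiting for
    consumers = {}  # value name -> nodes waiting on that value
    for name, (_, inputs) in nodes.items():
        pending = {i for i in inputs if i not in values}
        missing[name] = pending
        for i in pending:
            consumers.setdefault(i, []).append(name)

    # Nodes whose inputs are all present can fire immediately.
    ready = deque(name for name, pending in missing.items() if not pending)
    order = []
    while ready:
        name = ready.popleft()
        func, inputs = nodes[name]
        values[name] = func(*(values[i] for i in inputs))
        order.append(name)
        # Producing this value may unblock downstream nodes.
        for consumer in consumers.get(name, []):
            missing[consumer].discard(name)
            if not missing[consumer]:
                ready.append(consumer)
    return values, order


# "prod" is listed first but cannot fire until "sum" has produced its value.
graph = {
    "prod": (operator.mul, ["sum", "c"]),
    "sum": (operator.add, ["a", "b"]),
    "neg": (operator.neg, ["a"]),
}
values, order = run_dataflow(graph, {"a": 2, "b": 3, "c": 4})
print(values["prod"], order)
```

Note that there is no instruction pointer anywhere in the loop: the only thing that decides what runs next is which inputs have arrived.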

The results speak for themselves: Maverick-2 delivers over 4x the performance-per-watt of GPUs and more than 20x that of CPUs, while reducing operational costs by more than half. As a result, researchers and engineers can run large, irregular simulations faster and more efficiently, unlocking insights and breakthroughs in a fraction of the time.

You’ve reported achieving over 4× performance-per-watt versus GPUs and more than 20× over high-end CPUs. What are the key innovations that drive those performance gains in real-world workloads?

The performance gains come from a few key innovations working together.

First, we departed from the Von Neumann model that has dominated computing for 80 years. Instead of sequential instruction execution, Maverick-2 uses a dataflow architecture where computation follows data availability. This is fundamentally better suited for irregular, memory-intensive workloads.

Second, our self-optimizing architecture generates software-defined processor cores in real time. The hardware adapts to each application’s needs without requiring code rewrites, so you get optimization without the overhead.

Third, and this is critical: we focus on sustained real-world performance, not theoretical peaks. Many architectures look great on paper but falter on actual workloads. Maverick-2 maintains efficiency across AI, HPC, and vector databases by continuously adapting to the workload’s needs.

Maverick‑2 supports C/C++, Fortran, OpenMP, and Kokkos out of the box without requiring code changes. How have developers responded to that compatibility, and what are your plans for supporting CUDA, ROCm, or popular AI frameworks?

Developers love that Maverick-2 is a genuine “drop-in replacement.” They can run their existing applications immediately without the porting barriers that plague this industry. We currently support C/C++, Fortran, OpenMP, and Kokkos out of the box, with CUDA, ROCm, and major AI frameworks like TensorFlow, JAX, PyTorch, and ONNX in active development. This eliminates vendor lock-in and costly code rewrites and allows customers to evaluate and adopt new architectures without disrupting their workflows.

How does Maverick‑2’s telemetry-driven system optimization work behind the scenes? What’s involved in profiling and reconfiguring the chip in real time?

Think of our system optimization as a continuous loop: during execution, our telemetry system measures hundreds of performance indicators (e.g., memory bandwidth, utilization, queue depths). All of that data is fed into a runtime optimizer that determines whether the current hardware configuration remains optimal for the workload and its projected needs. If not, it can re-partition resources, re-order data paths, and adjust compute pipelines continuously, meaning the application doesn’t stop. This occurs within milliseconds, so the application maintains consistent peak efficiency as its computational profile changes.
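The loop described above (measure, decide, reconfigure, keep running) can be sketched in a few lines. This is a toy model under stated assumptions, not the actual Maverick-2 runtime: the `Telemetry` and `Config` fields, the thresholds, and the rebalancing policy are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Telemetry:
    """One telemetry sample (hypothetical fields)."""
    memory_bandwidth_util: float  # fraction of peak bandwidth, 0.0 to 1.0
    compute_util: float           # fraction of compute in use, 0.0 to 1.0
    queue_depth: int              # outstanding memory requests


@dataclass
class Config:
    """A hardware configuration (hypothetical resource split)."""
    compute_units: int
    memory_channels: int


def choose_config(sample, current):
    """Toy policy: move one resource toward whichever side is saturated."""
    memory_bound = (sample.memory_bandwidth_util > 0.9
                    and sample.compute_util < 0.5
                    and sample.queue_depth > 8)
    compute_bound = (sample.compute_util > 0.9
                     and sample.memory_bandwidth_util < 0.5)
    if memory_bound and current.compute_units > 1:
        return Config(current.compute_units - 1, current.memory_channels + 1)
    if compute_bound and current.memory_channels > 1:
        return Config(current.compute_units + 1, current.memory_channels - 1)
    return current  # the current configuration is still a good fit


def optimize(samples, config):
    """Feed each sample to the policy; record the configuration after each step."""
    history = [config]
    for sample in samples:
        config = choose_config(sample, config)
        history.append(config)
    return history


samples = [
    Telemetry(0.95, 0.30, 12),  # memory-bound phase
    Telemetry(0.60, 0.60, 3),   # balanced phase
    Telemetry(0.20, 0.95, 1),   # compute-bound phase
]
history = optimize(samples, Config(compute_units=8, memory_channels=8))
for step in history:
    print(step)
```

In this sketch the workload never pauses: each sample either leaves the configuration alone or nudges it by one unit, which mirrors the idea of adapting as the computational profile changes rather than stopping to retune.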

Are there specific types of workloads or edge cases where adaptive runtime performance tuning is less effective or introduces trade-offs in latency or power? 

Any architecture will have trade-offs. Maverick-2 excels at complex, irregular workloads with shifting computational and data-access patterns. For highly predictable, fixed-function workloads, a well-tuned GPU can be very efficient without the overhead of adaptation. In those cases, our adaptability still delivers solid performance, but the relative advantage may be smaller.

NextSilicon’s design aims to be versatile: competitive in straightforward cases and transformative in challenging ones.

Why did you decide to prioritize the HPC market early on, when most startups were rushing into AI? How has that shaped your product and business strategy?

HPC represents the frontier of computational complexity and problem-solving for massive datasets, irregular memory access, and unpredictable computational patterns. If you can build an architecture that thrives there, for example by running exabyte-scale simulations in climate modeling or particle physics, it’ll excel in AI too.

By focusing on HPC first, we proved Maverick-2 on the most demanding workloads in climate modeling, physics, and life sciences. That gave us credibility, real-world performance data, and a mature product before moving deeper into AI markets. Now, we’re positioned to serve both, without having compromised our architecture to capitalize on short-term trends or needs.

Now that Maverick‑2 is in production and deployed across dozens of customers, can you share examples of how it’s being used? Any specific results or benchmarks from flagship deployments?

A flagship example is our deployment at Sandia National Labs, where Maverick-2 is powering their Spectra supercomputer as part of the Vanguard-II program. We’re seeing impressive out-of-the-box performance results without requiring code modifications. We’re also working with ODISSEE (Online Data Intensive Solutions for Science in the Exabytes Era), which brings together leading research institutions to handle exabyte-scale data from CERN’s High-Luminosity Large Hadron Collider and the Square Kilometre Array Observatory. The role Maverick-2 will play is solving the challenge of processing petabytes of raw experimental data in a fraction of the time and energy previously required. The end goal is to enable faster physics analyses and astronomical discoveries.

You’ve raised over $300 million, with major rounds announced in 2021 and more recently. Can you share how that funding has accelerated your product development and market reach? 

The funding allowed us to do three things. First, we could push the architecture further, making not just incremental improvements but fundamental advances that made Maverick-2 production-ready. Second, we scaled our manufacturing and supply chain to meet growing demand, a non-trivial challenge for novel silicon. Third, we expanded our software ecosystem to enable customers to integrate and deploy faster.

It also enabled strategic partnerships with supercomputing centers and cloud providers, allowing us to streamline our concept-to-deployment process and expand much faster than traditional hardware startups.

In a landscape that includes Cerebras, SambaNova, and Nvidia, how do you see NextSilicon positioning itself? What’s your go-to-market approach as a challenger?

We see NextSilicon as a technology company rather than just another chip company. We optimize every workload, provide adaptability, and don’t lock customers into proprietary programming or hardware. Our customers can bring their existing code and achieve immediate acceleration without lengthy porting cycles or vendor lock-in to a single ecosystem, such as CUDA. This matters especially as AI workloads evolve beyond pure training. Reasoning models, extended inference, and large context windows require fundamentally different compute patterns: more dynamic memory access, variable-length computation, and adaptive resource allocation. These aren’t problems you solve with more of the same fixed architecture.

Our go-to-market strategy focuses on solving the hardest problems first: working with research institutions, national labs, and enterprises where performance, energy efficiency, and flexibility are critical. From there, we scale into broader AI and data-intensive markets. The industry is dominated by fixed architectures. We are offering something different: adaptability built in from the start. AI is shifting from pure scale to intelligence. Bigger models are enabling smarter reasoning, from short prompts to long-context understanding. In this transition, adaptable architectures won’t just be novel; they’ll be essential.

Thank you for the great interview. Readers who wish to learn more should visit NextSilicon.

Antoine is a visionary leader and founding partner of Unite.AI, driven by an unwavering passion for shaping and promoting the future of AI and robotics. A serial entrepreneur, he believes that AI will be as disruptive to society as electricity, and is often caught raving about the potential of disruptive technologies and AGI.

As a futurist, he is dedicated to exploring how these innovations will shape our world. In addition, he is the founder of Securities.io, a platform focused on investing in cutting-edge technologies that are redefining the future and reshaping entire sectors.