Partnerships
OpenAI Taps Cerebras for $10 Billion in Low-Latency Compute

OpenAI announced a multi-year agreement with chip startup Cerebras Systems that will deliver 750 megawatts of dedicated AI compute to the ChatGPT maker, in what both companies describe as the largest high-speed inference deployment ever attempted.
The deal, valued at over $10 billion according to sources familiar with the terms, marks OpenAI’s most significant infrastructure bet outside its primary relationship with Microsoft. Cerebras will build and host the systems in phases through 2028, with the first capacity coming online this year.
The partnership targets a specific problem: speed. While OpenAI has scaled ChatGPT to 800 million weekly users, the company faces compute constraints that slow response times—particularly for demanding workloads like code generation, agentic tasks, and real-time voice interaction.
“Cerebras adds a dedicated low-latency inference solution to our platform,” said Sachin Katti, who leads OpenAI’s compute strategy. “That means faster responses, more natural interactions, and a stronger foundation to scale real-time AI to many more people.”
Why Wafer-Scale Silicon Matters
Cerebras’s pitch centers on its wafer-scale processors: chips the size of dinner plates that eliminate the communication delays inherent in systems stitched together from many smaller GPUs. The company claims its architecture runs inference up to 15 times faster than GPU-based alternatives, with models like GPT-OSS-120B generating roughly 3,000 tokens per second.
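A back-of-envelope calculation makes the claim concrete. The sketch below uses only the figures cited above (3,000 tokens per second, and a GPU baseline implied by the "up to 15x" claim); the 500-token response length is an illustrative assumption, not a number from either company.

```python
# Rough latency arithmetic from the throughput claims quoted above.
# The 500-token response length is an illustrative assumption.
RESPONSE_TOKENS = 500

cerebras_tps = 3_000          # claimed rate for GPT-OSS-120B
gpu_tps = cerebras_tps / 15   # baseline implied by the "up to 15x" claim

print(f"Cerebras: {RESPONSE_TOKENS / cerebras_tps:.2f} s per response")
print(f"GPU:      {RESPONSE_TOKENS / gpu_tps:.2f} s per response")
# Cerebras: 0.17 s per response
# GPU:      2.50 s per response
```

At those rates, a typical answer arrives faster than a reader can begin parsing it, which is the "real time" threshold the companies are invoking.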
For OpenAI, that speed translates directly to user experience. When AI responds in real time, without the latency that makes conversations feel artificial, users engage more deeply and accomplish more. The company tested Cerebras’s silicon with its open-weight models before Thanksgiving, and technical conversations between the teams quickly progressed to a signed term sheet, according to Cerebras CEO Andrew Feldman.
“Just as broadband transformed the internet, real-time inference will transform AI,” Feldman said. “This enables entirely new ways to build and interact with AI models.”
The comparison isn’t hyperbole. Early dial-up internet supported email and basic browsing; broadband enabled streaming video, voice calls, and eventually the smartphone app economy. OpenAI appears to be betting that sufficiently fast inference will similarly unlock applications that current latency makes impractical—particularly for AI agents that must chain multiple operations together without human patience wearing thin.
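Sequential agent workflows sharpen the point, because per-call latency multiplies with every chained step. Extending the illustrative numbers above (the 20-step agent run and 500 tokens per step are assumptions for illustration, not figures from the announcement):

```python
# Latency compounds linearly across sequential agent calls.
# Step count and tokens per step are illustrative assumptions.
STEPS = 20
TOKENS_PER_STEP = 500

for name, tps in [("Cerebras (claimed)", 3_000), ("GPU baseline (implied)", 200)]:
    total_s = STEPS * TOKENS_PER_STEP / tps
    print(f"{name}: {total_s:.1f} s for a {STEPS}-step run")
# Cerebras (claimed): 3.3 s for a 20-step run
# GPU baseline (implied): 50.0 s for a 20-step run
```

Fifty seconds of dead time per task is the kind of friction that keeps agentic products from feeling usable; a few seconds is not.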
The Infrastructure Arms Race Intensifies
The Cerebras deal comes as AI infrastructure valuations have exploded: Databricks recently raised at a $134 billion valuation, and Cerebras itself is reportedly in talks for fresh funding at $22 billion. The compute demands of frontier AI models show no signs of plateauing, and companies are scrambling to lock in capacity before competitors do.
For Cerebras, the OpenAI partnership solves a business concentration problem. The United Arab Emirates’ G42 accounted for 87% of Cerebras’s revenue in the first half of 2024—a customer concentration that made investors nervous. Adding OpenAI as a major customer ahead of a potential IPO significantly de-risks the business.
For OpenAI, the deal diversifies its AI infrastructure beyond Microsoft’s Azure cloud. While Microsoft remains OpenAI’s primary compute provider, the Cerebras partnership gives OpenAI dedicated low-latency capacity optimized specifically for inference—a different workload than the training runs Microsoft’s infrastructure handles.
The timing also matters. OpenAI recently released GPT-5.2 amid intensifying competition from Google’s Gemini. As models grow more capable, the companies deploying them are discovering that raw intelligence isn’t enough—users also expect near-instantaneous responses. A brilliant AI that takes ten seconds to answer feels broken; the same AI responding in under a second feels magical.
Sam Altman, OpenAI’s CEO, is already an investor in Cerebras, and OpenAI once considered acquiring the company outright. This deal suggests the relationship is evolving into something more strategic: a partnership where both companies’ fates become intertwined in the race to make AI feel truly conversational.