Futurist Series

AI Is Solving Erdős Problems. What Comes Next?

Published May 25, 2026

Daniel Martin

Only a few months ago, the question felt mostly philosophical: if artificial intelligence can help solve open math problems, what happens to the idea of human genius?

That question is no longer theoretical. Two recent developments involving OpenAI and Google DeepMind suggest that AI is moving from mathematical assistant to mathematical participant. Not in the sense of replacing mathematicians wholesale, but in the more precise and more important sense of generating, searching, checking, and sometimes discovering arguments that can survive expert scrutiny.

The first development came from OpenAI, which announced that an internal general-purpose reasoning model had made a breakthrough on the planar unit distance problem, a famous question posed by Paul Erdős in 1946. The problem asks for the maximum number of pairs, among n points in the plane, that can be exactly one unit apart. For decades, the prevailing belief was that constructions based on square grids were essentially optimal. OpenAI’s model found a new family of constructions that disproved that belief.

The second came from Google DeepMind researchers, who published a paper titled “Advancing Mathematics Research with AI-Driven Formal Proof Search.” The paper evaluates their system, AlphaProof Nexus, on open research-level problems and reports that its strongest agent autonomously resolved 9 of 353 open Erdős problems. The agent also proved 44 of 492 open conjectures from the On-Line Encyclopedia of Integer Sequences (OEIS).

Together, these results mark a shift. The important story is not that AI has suddenly solved mathematics. It has not. The important story is that AI systems are beginning to operate inside the research loop itself.

Why Erdős Problems Are a Serious Test for AI

Paul Erdős was one of the most prolific mathematicians in history, and the problems associated with his work occupy a special place in mathematics. Many are easy to state, difficult to solve, and connected to deep areas such as combinatorics, number theory, graph theory, and discrete geometry.

That makes them unusually useful as a benchmark for AI reasoning. They are not school exercises. They are also not always giant, theory-shattering conjectures like the Riemann hypothesis. Instead, many Erdős problems sit in the middle ground where progress depends on finding the right connection, the right construction, or the right overlooked lemma.

This is exactly where AI may be most useful early on. Modern reasoning systems are not just calculators. They can explore many possible proof paths, compare partial strategies, retrieve distant ideas from adjacent fields, and test whether an argument can be made rigorous.

The OpenAI result is striking because the model did not merely polish a known route. It found an unexpected bridge between discrete geometry and algebraic number theory. That is the kind of conceptual jump that mathematicians usually associate with genuine creativity.

OpenAI and the Unit Distance Breakthrough

The planar unit distance problem is simple to describe. Place n points in the plane. Count how many pairs of points are exactly one unit apart. The goal is to understand how large that count can be as n grows.
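To make the counting concrete, here is a minimal Python sketch that counts unit-distance pairs by brute force. The function names `count_unit_pairs` and `square_grid` are illustrative assumptions, not part of either lab's work, and the sketch compares squared distances so the arithmetic stays exact for integer coordinates.

```python
from itertools import combinations


def count_unit_pairs(points):
    """Count unordered pairs of points exactly one unit apart.

    Assumes integer (or otherwise exact) coordinates, so the squared
    distance can be compared to 1 without a floating-point tolerance.
    """
    count = 0
    for (x1, y1), (x2, y2) in combinations(points, 2):
        if (x1 - x2) ** 2 + (y1 - y2) ** 2 == 1:
            count += 1
    return count


def square_grid(k):
    """Return the k*k points of a k-by-k integer grid with unit spacing."""
    return [(x, y) for x in range(k) for y in range(k)]


if __name__ == "__main__":
    for k in (2, 3, 10):
        pts = square_grid(k)
        # Prints the number of points n and the number of unit-distance pairs.
        print(len(pts), count_unit_pairs(pts))
```

A k-by-k grid with unit spacing has 2k(k-1) unit pairs, which grows only linearly in n = k². The classical grid-based constructions do better by rescaling the grid so that a distance occurring many times among lattice points becomes the unit distance.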

For nearly 80 years, mathematicians suspected that the best constructions would not dramatically outperform square grid-based arrangements. OpenAI’s model challenged that assumption by producing an infinite family of examples that improves on the expected bound by a polynomial factor.
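For readers who want the claim in symbols, the block below states the classical bounds as they are usually written, followed by what a polynomial improvement would mean in the standard sense. The notation u(n) and the constants c, C, and δ are conventions assumed here. The article does not give the size of the improvement, so δ is left symbolic.

```latex
% u(n): maximum number of unit-distance pairs among n points in the plane.
% Classical bounds, valid for all sufficiently large n with constants c, C > 0:
\[
  n^{1 + c/\log\log n} \;\le\; u(n) \;\le\; C\, n^{4/3}
\]
% A polynomial improvement over grid-type constructions would read:
\[
  u(n) \;\ge\; n^{1+\delta}
  \quad \text{for some fixed } \delta > 0 \text{ and infinitely many } n .
\]
```

Because c / log log n tends to zero as n grows, any fixed δ > 0 eventually beats the grid-type lower bound, which is why such a result would count as a qualitative change and not a constant-factor gain.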

That matters for two reasons. First, it changes the mathematical picture. It suggests that number-theoretic constructions may have more to contribute to discrete geometry than many researchers assumed. Second, it changes the AI picture. The model involved was described as a general-purpose reasoning model, not a system built only for this specific problem.

In other words, the system appears to have transferred reasoning power into an unfamiliar research setting. It did not simply memorize a known proof. It generated a result that external mathematicians treated as a major contribution.

Google DeepMind and Formal Proof Search

Google DeepMind’s paper addresses a different but equally important question: how can AI-generated mathematics be made reliable?

Language models can produce elegant-looking arguments that contain subtle mistakes. In normal prose, those mistakes may be hard to detect. In mathematics, one wrong step can invalidate the entire proof. This is why formal proof systems such as Lean matter. Lean does not care whether an argument sounds persuasive. Every logical step must pass the checker.
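As a small illustration of what "must check" means, here is a toy Lean 4 example. It is not taken from the DeepMind paper, and the theorem names are made up for this sketch; it only shows that Lean accepts a proof when the types line up and rejects a plausible-looking one when they do not.

```lean
-- A statement and a proof term. Lean accepts it only because the type of
-- `Nat.add_comm a b` matches the stated claim exactly.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- Swapping in the wrong lemma is rejected, however reasonable it looks.
-- Uncommenting the next two lines produces a type mismatch error,
-- because `Nat.mul_comm a b` proves `a * b = b * a`, not the stated claim.
-- theorem bad_example (a b : Nat) : a + b = b + a :=
--   Nat.mul_comm a b
```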

AlphaProof Nexus uses that constraint as part of the workflow. AI agents generate proof attempts in Lean, receive feedback from the compiler, revise their approach, and continue searching. The stronger version coordinates subagents and uses more advanced proof tools to focus the search.
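The generate, check, and revise loop described above can be sketched in a few lines of Python. This is a toy under stated assumptions, not DeepMind's implementation: every name here (`Feedback`, `search`, `make_bisecting_proposer`, `make_square_checker`) is hypothetical, the "proof" is just an integer, and the "compiler" is a simple arithmetic test standing in for Lean.

```python
from dataclasses import dataclass


@dataclass
class Feedback:
    ok: bool       # True only when the checker accepts the candidate
    message: str   # Diagnostic the proposer can use to revise


def search(propose, check, max_attempts=50):
    """Generic propose-check-revise loop.

    `propose` receives the previous feedback (None on the first call) and
    returns the next candidate, or None to give up. `check` plays the role
    of the proof checker: it alone decides whether a candidate is accepted.
    Returns (accepted_candidate_or_None, number_of_candidates_checked).
    """
    feedback = None
    for attempt in range(1, max_attempts + 1):
        candidate = propose(feedback)
        if candidate is None:
            return None, attempt - 1
        feedback = check(candidate)
        if feedback.ok:
            return candidate, attempt
    return None, max_attempts


def make_bisecting_proposer(lo, hi):
    """Toy proposer: narrows an integer range using the checker's messages."""
    state = {"lo": lo, "hi": hi, "last": None}

    def propose(feedback):
        if feedback is not None:
            if feedback.message == "too small":
                state["lo"] = state["last"] + 1
            elif feedback.message == "too large":
                state["hi"] = state["last"] - 1
        if state["lo"] > state["hi"]:
            return None  # search space exhausted
        state["last"] = (state["lo"] + state["hi"]) // 2
        return state["last"]

    return propose


def make_square_checker(target):
    """Toy checker: accepts x only if x * x equals the target exactly."""
    def check(candidate):
        square = candidate * candidate
        if square == target:
            return Feedback(True, "accepted")
        return Feedback(False, "too small" if square < target else "too large")

    return check


if __name__ == "__main__":
    result, attempts = search(make_bisecting_proposer(0, 1369),
                              make_square_checker(1369))
    print(result, attempts)
```

The design point is the separation of roles: the proposer can be as creative or as unreliable as it likes, because nothing counts as a result until the checker accepts it.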

Development	AI Method	Why It Matters
OpenAI unit distance result	General-purpose reasoning model	Showed AI can produce an original construction for a prominent open problem
Google DeepMind AlphaProof Nexus	LLM-guided Lean proof search	Showed AI can formally resolve multiple open Erdős problems
Formal proof verification	Compiler-checked logic	Reduces the risk of convincing but invalid mathematical output

The DeepMind result is especially important because it connects AI reasoning to verification. The system does not need to be trusted in the same way a natural language chatbot must be trusted. Its proof either compiles or it does not.

What This Means for Mathematical Research

The current lesson is not that mathematicians are obsolete. It is that the bottleneck in mathematics may be changing.

Historically, a researcher needed to do nearly everything: formulate the problem, survey the literature, test ideas, build the proof, check every step, and communicate the result. AI now threatens to redistribute that labor. Some parts of the workflow may become faster, cheaper, and more automated.

AI can explore many proof routes before a human commits time to one.
Formal systems can verify steps that would otherwise require slow expert checking.
Researchers can use AI to search across distant mathematical subfields.

This could produce a new research style. Instead of asking an AI for an answer, mathematicians may increasingly supervise fleets of proof agents. The human role becomes less like a calculator and more like a research director: choosing the right problems, interpreting the results, detecting conceptual significance, and deciding which paths deserve deeper attention.

The Limits Are Still Real

It is important not to overstate the moment. Google DeepMind’s system resolved 9 of 353 attempted Erdős problems, roughly 2.5 percent. That is impressive, but it also means 344 of them remained unsolved. OpenAI’s unit distance result is a milestone, but it does not imply that every famous conjecture is now within easy reach.

AI systems still struggle when a problem requires a new conceptual framework, when the relevant mathematics is poorly formalized, or when a proof depends on long chains of insight that cannot be easily decomposed into searchable steps. Formal proof libraries are also uneven. Areas with mature Lean coverage are more accessible to AI agents than areas where the foundational material still needs to be encoded.

The systems remain dependent on human problem selection and interpretation.
Formalization can be difficult when definitions are ambiguous or underdeveloped.
AI-generated proof attempts can still hide hard work inside unproven helper claims.
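The last point has a concrete form in Lean 4, where the placeholder `sorry` lets a statement be used before it is proved. The sketch below is a toy with made-up theorem names, not an example from either lab; it shows how a result can look finished while resting on an unproven helper, and how Lean exposes the gap.

```lean
-- Hypothetical helper: the statement is asserted, but `sorry` stands in
-- for the proof. Lean warns that the declaration uses `sorry`.
theorem helper (n : Nat) : n ≤ n * n := by
  sorry

-- The main result looks complete, yet all of its content rests on `helper`.
theorem main_claim (n : Nat) : n ≤ n * n :=
  helper n

-- Lean reports the gap: the output lists `sorryAx` among the axioms used.
#print axioms main_claim
```

Checking the axioms a final theorem depends on is one simple guard against accepting a proof whose hard part has been deferred.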

These limits are not failures. They clarify where the next advances must happen. Better theorem-proving agents, richer formal libraries, stronger verification workflows, and more effective human-AI interfaces will all matter.

What Comes Next After AI Solves Erdős Problems?

The most likely near-term future is not one dramatic moment where AI solves all of mathematics. It is a steady expansion of AI-assisted research into domains with clean problem statements, strong formal libraries, and large bodies of fragmented prior work.

Combinatorics, graph theory, number theory, optimization, and discrete geometry are natural early targets. These fields often contain problems where the question is concise, but the solution depends on stitching together ideas from distant places. AI is well suited to that kind of search.

Over time, the same pattern could extend beyond mathematics. If a model can hold a difficult argument together, test intermediate claims, and connect ideas across fields, those capabilities matter in physics, biology, materials science, cryptography, and AI research itself.

The deeper consequence is cultural. Mathematics has always valued proof, but it has also valued taste: the ability to know which question matters, which abstraction is worth inventing, and which result changes the shape of a field. As proof search becomes more automated, taste may become more important, not less.

Conclusion: Genius Moves Up the Stack

The follow-up to the earlier question is now clearer. If AI can solve open math problems, human genius does not disappear. It moves up the stack.

The scarce skill will not be the ability to grind through every technical step alone. It will be the ability to ask the right questions, frame the right abstractions, judge the meaning of machine-discovered results, and guide AI systems toward problems that matter.

OpenAI’s unit distance breakthrough and Google DeepMind’s formal proof search results do not close the book on human mathematical creativity. They open a new chapter in which mathematicians may work with systems that can explore the terrain faster than any individual mind.

The future of mathematics may not belong to humans or machines alone. It may belong to the researchers who learn how to make both think together.