An OpenAI model solved a famous math problem that stumped humans for 80 years
June 1, 2026 Development Source: Ars Technica
That might not last. AI systems have been improving at math so rapidly that it’s unclear what role, if any, human mathematicians will play a decade from now.
Paul Erdős was one of the most prolific mathematicians in history. He wrote over 1,500 papers in his lifetime, the most ever. One of his greatest talents was coming up with problems that are simple to state but have deep roots.
In 1946, he introduced the unit distance problem. Imagine you have some points in a 2D plane and you measure the distance between each pair of points:
Credit: Kai Williams / Understanding AI
OpenAI’s diagram is based on choosing c² = 65, which can be satisfied by either 1² + 8² = 65 or 4² + 7² = 65. This means that if the grid spacing is 1/√65, each point far enough from the grid’s edge will be exactly one unit away from 16 other points, located at grid offsets (1,8), (4,7), (7,4), (8,1), (-1,8), (-4,7), and so forth. Larger values for c², if they’re chosen carefully, enable more whole-number diagonals and hence more unit-distance pairs.
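The count of 16 can be checked by brute force. The sketch below lists every whole-number offset (a, b) with a² + b² = 65. The helper name `lattice_offsets` is hypothetical, chosen for this illustration, and is not taken from OpenAI's work.

```python
from math import isqrt

def lattice_offsets(c2):
    """Return every integer vector (a, b) with a*a + b*b == c2.

    Hypothetical helper for illustration only.
    """
    offsets = []
    r = isqrt(c2)
    for a in range(-r, r + 1):
        b2 = c2 - a * a          # what b*b would have to equal
        b = isqrt(b2)
        if b * b == b2:          # b2 is a perfect square
            offsets.append((a, b))
            if b != 0:           # avoid listing (a, 0) twice
                offsets.append((a, -b))
    return offsets

offsets = lattice_offsets(65)
print(len(offsets))      # 16
print(sorted(offsets))
```

Each offset is a step, measured in grid cells, from a point to one of its neighbors. With a spacing of 1/√65 every such step has length exactly one.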
However, if c² is too large compared to the number of points in the grid, then many of the potential one-unit-away neighbors will be outside the grid.
In short, we want to choose a c² that’s large enough but not too large. Using insights from number theory, including Jacobi’s two-square theorem, Erdős was able to show that an optimally sized circle will enable the number of unit-distance pairs to grow faster than the number of points, but only barely.
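To make the trade-off concrete, here is a minimal sketch that counts unit-distance pairs in a square grid for a given c². It assumes a `side` × `side` grid with spacing 1/√c²; the function names are hypothetical, and this is a brute-force count rather than Erdős's actual argument.

```python
from math import isqrt

def lattice_offsets(c2):
    """Return every integer vector (a, b) with a*a + b*b == c2."""
    offsets = []
    r = isqrt(c2)
    for a in range(-r, r + 1):
        b2 = c2 - a * a
        b = isqrt(b2)
        if b * b == b2:
            offsets.append((a, b))
            if b != 0:
                offsets.append((a, -b))
    return offsets

def count_unit_pairs(side, c2):
    """Count unordered unit-distance pairs in a side x side grid
    whose spacing is 1/sqrt(c2). Hypothetical helper for illustration.
    """
    ordered = 0
    for a, b in lattice_offsets(c2):
        # Number of grid points p for which p + (a, b) is still in the grid.
        ordered += max(0, side - abs(a)) * max(0, side - abs(b))
    return ordered // 2  # each pair was counted once from each endpoint

print(count_unit_pairs(36, 1))     # 2520
print(count_unit_pairs(36, 65))    # 7632
print(count_unit_pairs(36, 1105))  # 1920
```

On this assumed 36 × 36 grid, c² = 65 beats the plain choice c² = 1, but c² = 1105 does worse even though it has 32 offsets instead of 16, because most of its long steps land outside the grid. The c² = 65 result also matches the 7,632 figure the article quotes for a 1,296-point grid.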
The question became “can you do better?” To find an upper bound, Erdős used an argument from a quite different area of mathematics called graph theory to show that you could only have so many unit distances. But his upper bound grows much, much faster than the best lower bound he was able to construct.
Erdős’s conjecture was that the actual optimum was much closer to the lower bound than the upper one. He predicted, but couldn’t prove, that the maximum number of unit-distance pairs grows just barely faster than the number of points.
To be more precise, Erdős conjectured that the maximum number of unit distances among n points would be n^(1+o(1)). In other words, for any 𝜖 > 0, the maximum number of unit distances would be less than n^(1+𝜖) once n is sufficiently large. That could end up growing a little faster than his lower-bound construction, which was n^(1 + C/(log log n)) for some constant C, but within the same general ballpark.
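For readers who prefer symbols, the two growth rates can be written side by side. The notation u(n) for the maximum number of unit-distance pairs among n points is an assumption introduced here; it does not appear in the article. The right-hand inequality is the conjecture, not a proven bound.

```latex
% u(n): maximum number of unit-distance pairs among n points in the plane
% (notation assumed for this sketch)
\[
  \underbrace{n^{\,1 + C/\log\log n}}_{\text{grid construction (proved)}}
  \;\le\; u(n) \;\le\;
  \underbrace{n^{\,1 + o(1)}}_{\text{conjectured}}
\]
% What the o(1) in the exponent means:
\[
  \forall \varepsilon > 0 \;\; \exists N \;\; \forall n \ge N : \quad
  u(n) < n^{1 + \varepsilon}
\]
% The grid construction is compatible with the conjecture because
% its extra exponent shrinks to zero:
\[
  \frac{C}{\log\log n} \longrightarrow 0 \quad \text{as } n \to \infty
\]
```

Because log log n grows extremely slowly, the extra exponent in the lower bound fades only for very large n, which is why the two expressions sit in the same ballpark without being identical.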
Proving his guess became known as the unit distance problem. For the next 80 years, it looked like Erdős was right.
Erdős’s conjecture assumed that, at least for a large number of points, a square grid could yield about as many unit-distance pairs as organizing the points in other ways. OpenAI’s AI proved this wrong by demonstrating that there was another, more complex way to organize n points that allowed more pairs to be exactly one unit apart.
Precisely because the new pattern of points is more complicated, it’s tricky to explain it concisely. But you can think of it as a clever modification of Erdős’s grid.
The AI constructed a grid in a high-dimensional space and then projected this more complex structure into two dimensions. And instead of using a whole-number grid with points like (1,3) or (-3,6), the AI construction used something called algebraic integers to build this more complicated grid. It turns out that this kind of higher-dimensional grid has richer structure, which allows the AI to pack more unit distances into the same number of points.
It’s hard to illustrate this alternative arrangement of points because it only becomes advantageous with a very large number of points. But here’s a simpler arrangement of points that was constructed in a similar way.
It has 1,345 points and only produces 5,916 unit distances, fewer than the 7,632 unit distances that a square 1,296-point grid produces using the Erdős technique. But I think it gives a sense of how a pattern that isn’t a grid could produce more unit distances than a square grid.
So AI companies have been working to develop LLM systems that can directly output a correct solution to any math problem. OpenAI’s result is a substantial step in that direction. But it also fits the pattern of previous AI-assisted mathematics.
For one thing, other companies have also worked to solve Erdős problems. Because Erdős posed hundreds of problems over his career—and because mathematician Thomas Bloom has organized an effort to compile all of them at www.erdosproblems.com—AI companies have used them as a testing ground to evaluate AI systems. In January, Cambridge undergraduate Kevin Barreto worked with a friend to ask GPT-5.2 and Harmonic’s Aristotle to produce the first autonomous solution of an Erdős problem. On May 22, two days after OpenAI’s announcement, Google announced that its AI system had solved nine open Erdős problems, including two that had been open for over 50 years.
To be clear, the problem that OpenAI solved is more impressive than any of the other work I just mentioned. But OpenAI’s solution is more in line with past AI efforts than the headline result might suggest.
One reason the unit distance problem was unsolved for 80 years, despite being so well known, is that most people thought Erdős’s conjecture was true. But the mathematical tools we have are nowhere close to being able to prove Erdős’s bound. So mathematicians expected that any proof of the conjecture would involve major new ideas or approaches.
Instead, as we’ve seen, the AI disproved the conjecture by making an extension of Erdős’s initial construction. It was a clever and nonobvious solution, but it also bore some similarity to the kind of optimization work done by a system like AlphaEvolve.
This dynamic is reflected in some of the mathematicians’ responses. Mathematician Tim Gowers wrote that when he first heard about the AI’s result, he thought it had proved the theorem. “I spent the evening adjusting my world view: If the AI could come up with a proof like that, then maybe it would be all over for mathematicians very soon.”
But the next morning, Gowers and other external reviewers received an email about the result, and he realized that the LLM “had disproved the conjecture rather than proving it, which came as a big relief.”
To be clear, what the AI system did is still impressive. “It’s always tempting to look at a completed proof and declare it obvious after the fact,” Tsimerman said later in his remark. But as I noted previously, it also played to the strengths of AI systems.
In the short to medium term, this points to a world where AI models complement humans but do not replace them. AI systems will tackle lists of problems curated by human mathematicians or aid humans in finding relevant approaches from seemingly unrelated mathematical fields. But they won’t immediately displace the human role in choosing which questions to ask or developing wholly new techniques.
Even this result was very much a human-AI collaboration. While the AI system found the proof on its own, human mathematicians verified the result. Other humans came up with better-written proofs that extended the AI’s initial ideas, like Will Sawin finding an explicit lower bound as I mentioned above.
It’s unclear how long this complementarity will last, however. Gowers spent the rest of his comment exploring whether the relief he felt on hearing that AI had disproved the conjecture was justified. He more or less concluded that it was, but in a footnote, he wrote that he would guess “that AI will soon reach a high level at other activities such as building theories, formulating definitions and asking interesting questions.”
In the past year, we’ve gone from AI systems that hadn’t yet beaten high school mathematics competitions to ones that can advance mathematics in interesting ways. It seems likely that AI systems will continue to become more autonomous when working on mathematical problems.
At the same time, we haven’t fully explored what current models can achieve in math. Soon after OpenAI’s announcement, University of Michigan postdoc Xiao Ma found that GPT-5.5 was also able to prove Erdős wrong if given a small hint. If a generally available model could disprove this famous conjecture and no one noticed, what other discoveries could happen today that no one has thought to try?
Kai Williams is a reporter for Understanding AI, a Substack newsletter founded by Ars Technica alum Timothy B. Lee. His work is supported by a Tarbell Fellowship. Subscribe to Understanding AI to get more from Tim and Kai.