GPT-5 Pro Disproves 80-Year-Old Erdős Conjecture via Formal Verification

Marcus Webb

Senior Backend Analyst

The Pitch

OpenAI has released GPT-5 Pro, which autonomously disproved the 80-year-old Erdős planar unit distance conjecture using techniques drawn from algebraic number theory. Verified in the Lean proof language, the result marks a shift from conversational mimicry to formal mathematical discovery.

Under the Hood

The model disproved the conjectured $n^{1+o(1)}$ upper bound for the planar unit distance problem by identifying sets of $n$ points that determine more than $n^{1+\delta}$ unit distances for some fixed $\delta > 0$ (source: OpenAI Index, May 20, 2026). Verification was completed using the Lean formal proof language and a peer-review panel including Noga Alon and Melanie Wood (source: Dataconomy/KuCoin).
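To make the quantity in question concrete, here is a minimal brute-force sketch that counts how many pairs in a finite point set sit at one fixed distance. It is purely illustrative and has nothing to do with how the reported proof was found. The function names `count_pairs_at_distance` and `square_grid` are hypothetical, and the code assumes integer coordinates so that squared distances can be compared exactly.

```python
from itertools import combinations


def count_pairs_at_distance(points, squared_length=1):
    """Count unordered pairs of points whose squared distance equals squared_length.

    Points are (x, y) tuples of integers, so the comparison is exact.
    """
    count = 0
    for (x1, y1), (x2, y2) in combinations(points, 2):
        if (x1 - x2) ** 2 + (y1 - y2) ** 2 == squared_length:
            count += 1
    return count


def square_grid(k):
    """Return the k x k integer grid as a list of (x, y) tuples."""
    return [(x, y) for x in range(k) for y in range(k)]


if __name__ == "__main__":
    for k in (2, 4, 8):
        pts = square_grid(k)
        # Rescaling the plane turns any single fixed distance into "unit" distance,
        # so both columns below are unit-distance counts for a rescaled grid.
        print(
            f"n={len(pts):3d}",
            f"pairs at distance 1: {count_pairs_at_distance(pts, 1)}",
            f"pairs at distance sqrt(5): {count_pairs_at_distance(pts, 5)}",
        )
```

On a grid, distance 1 yields a count that grows only linearly in $n$. Picking a distance that can be realized by more lattice vectors yields more pairs, and that is the idea behind the classical grid-based lower bound. The conjecture concerned whether any configuration could beat $n^{1+o(1)}$.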

On benchmarks, GPT-5 Pro reached 94.6% on AIME 2025 and 88.4% on GPQA Diamond (source: Stark Insider). These scores are high, but the system remains a black box: OpenAI has not disclosed the total compute budget or the number of failed attempts required to generate the successful proof (source: Reddit r/MachineLearning).

Current pricing for GPT-5 sits at $1.25 per 1M input tokens and $10 per 1M output tokens as of May 2026 (source: Vellum.ai). Critics argue the system acts as a "Stockfish for math," discovering counterexamples through massive search rather than fundamental conceptual understanding (source: HN Comment 4).
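For budgeting, the per-token rates quoted above translate into a simple calculation. The sketch below assumes those GPT-5 rates apply unchanged (the article does not give separate GPT-5 Pro pricing), and the token counts in the example are hypothetical.

```python
# Rates as quoted in the article for GPT-5 (USD per 1M tokens, May 2026).
INPUT_USD_PER_MILLION = 1.25
OUTPUT_USD_PER_MILLION = 10.00


def estimate_cost_usd(input_tokens, output_tokens):
    """Estimate the cost in USD of one request from its token counts."""
    total = (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 200k input tokens and 50k output tokens.
    cost = estimate_cost_usd(200_000, 50_000)
    print(f"Estimated cost: ${cost:.2f}")
```

Because output tokens cost eight times as much as input tokens at these rates, long generated proofs or reasoning traces dominate the bill well before prompt size does.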

The public documentation has significant technical gaps. The exact model architecture is unknown (it is speculated to be a "Strawberry/o3" successor), and the specific "non-trivial tweaks" mentioned by postdocs in the Companion Remarks remain undisclosed (source: HN Comment 1).

OpenAI is also contending with a credibility gap following the April 2026 resignation of VP Kevin Weil. Weil departed after it surfaced that the company had falsely claimed GPT-5 solved 10 Erdős problems in late 2025 (source: AutoGPT/Dataconomy). This history of misrepresentation necessitates a cautious approach to their latest claims.

Marcus's Take

Do not mistake a brute-force search success for general reasoning capabilities in your production environment. While the Erdős proof is a genuine academic milestone, the high output costs and lack of transparency regarding failed attempts suggest GPT-5 is a high-compute specialist. Use it for complex verification where accuracy is non-negotiable, but stick to Claude 4.5 Opus or Claude 4 Sonnet for standard backend orchestration.

Ship clean code,
Marcus.

Marcus Webb - Senior Backend Analyst at UsedBy.ai
