Sunday, January 18, 2026

OpenAI Model Wins Gold at IMO

OpenAI researchers have announced a major advance in general reasoning: their experimental language model achieved gold medal-level performance at the 2025 International Mathematical Olympiad (IMO). The model solved 5 out of 6 problems, under the same timed conditions as human competitors, earning 35 out of 42 points—enough for first place at the world’s top high school mathematics competition.

This breakthrough sets the model apart from existing AI systems. Recent evaluations from MathArena.ai show that leading AI models like Gemini 2.5 Pro, Grok-4, and OpenAI’s own o3 scored significantly lower—none even reached bronze medal levels, and the best competitor earned only 13 points.

General Reasoning Breakthrough

Unlike specialized math systems, the model demonstrates general reasoning. Researcher Alexander Wei described it as “a general-purpose reasoning LLM that incorporates experimental techniques.” The AI worked without internet access or computational aids, generating full natural-language proofs during two 4.5-hour sessions—mirroring human exam conditions.

“We achieved this milestone by forging advances in general-purpose reinforcement learning and scaling test-time compute,” Wei noted on social media. Former IMO medalists independently graded the AI’s solutions, reaching full consensus on its medal-worthy score.

OpenAI scientist Noam Brown highlighted the model’s extended reasoning capacity: “It thinks for hours, unlike previous AIs operating on seconds or minutes. Slightly surpassing top human performance represents a huge leap.”

General Reasoning in Research Model Only

Despite this demonstration of general reasoning, OpenAI clarified that the model is strictly experimental and not available to the public. Wei stated, “We don’t plan to release anything with this level of math capability for several months.” The announcement follows confirmation of the imminent, but separate, release of GPT-5, which was developed by a different team.

The announcement’s timing is notable: recent benchmarks reveal even state-of-the-art AI models providing solutions “filled with logical errors, incomplete arguments, and even made-up theorems,” as reported by The Decoder.
