
Google's Deep Think crushes reasoning benchmarks

February 17, 2026

Google just released a major update to its Gemini 3 Deep Think reasoning mode, posting dominant scores across math, coding, and science, and introduced an Olympiad-level math research agent powered by the upgrade.

Deep Think hit 84.6% on ARC-AGI-2, obliterating Opus 4.6 (68.8%) and GPT-5.2 (52.9%), and set a new high of 48.4% on Humanity's Last Exam.

It also reached gold-medal marks on the 2025 Physics and Chemistry Olympiads and scored a 3,455 Elo on Codeforces, nearly 1,000 points above Opus 4.6.

Google also unveiled Aletheia, a math agent that autonomously solves open problems, verifies proofs, and hits new highs across domain benchmarks.

The Deep Think upgrade is live for Google AI Ultra subscribers in the Gemini app, with API access open to researchers via an early access program.