A deep dive into o3's test-time compute
Extended computation, community reactions, AGI, and job replacement.
Sometimes the most interesting stories in AI start with someone being spectacularly wrong. In this case, it was Terence Tao - widely considered one of the world's greatest mathematicians - who predicted that certain mathematical problems would "resist AIs for several years at least."
Then OpenAI released o3.
OpenAI's latest reasoning model, o3, just achieved something remarkable: 25.2% accuracy on the FrontierMath benchmark, a collection of research-level math problems so complex they can take expert mathematicians days to solve. For context, just two months ago, the best AI systems could solve only about 2% of these problems.
This wasn't supposed to happen. Not yet. Not this fast.
But o3 isn't just another incremental improvement in AI capabilities.
Understanding test-time compute
Here's where things get interesting - and a bit controversial.