This episode explores OpenAI's advances in AI reasoning, focusing on scaling compute and reinforcement learning (RL). Dan Roberts, formerly of Sequoia and now at OpenAI, discusses the evolution of AI models from GPT-4 to o3, highlighting the growing importance of test-time compute, where models improve the longer they are allowed to think. Roberts uses the analogy of Einstein's discovery of general relativity to illustrate the potential for AI to make significant contributions to human knowledge. He emphasizes a shift in focus toward reinforcement learning, envisioning a future in which RL compute dominates pre-training compute, inverting the current paradigm. Roberts outlines OpenAI's plan to scale compute by investing in infrastructure and by developing a science of scaling to guide that expansion. He also touches on the limitations of current models, which, despite their capabilities, still lack the depth of understanding needed for groundbreaking discoveries, and he argues that scaling up is the key to unlocking further progress. The discussion concludes with a prediction that, driven by exponential growth in task-solving abilities, AI models could achieve breakthroughs on the order of discovering general relativity within the next nine years.