In this episode of the Latent Space podcast, Alessio and Swyx are joined by Will Brown from Prime Intellect to discuss the newly released Claude 4. The conversation covers Claude 4's emphasis on coding and agentic capabilities, and how it downplays reasoning relative to previous releases. They speculate on how Claude's extended thinking differs from older models, touching on model routing and the role of reinforcement learning. The discussion then turns to the controversy around Claude's safety testing, including the model's potential to report users for harmful requests, and the broader implications for AI safety and tool use. They also explore the challenges of reward hacking, the utility of thinking budgets, and the role of academia in AI evaluations. The episode concludes with a discussion of multi-turn RL and model-based rewards, and a preview of Will Brown's upcoming talk at the AI Engineer World's Fair.