In this episode of the Latent Space Podcast, Alessio and Swyx host Nathan Lambert of AI2 to discuss recent advances and open challenges in AI, focusing on reasoning models, reinforcement learning, and tool use. They cover the Tulu project, RLVR (Reinforcement Learning with Verifiable Rewards), and the importance of data and training methodology. The conversation compares hybrid reasoning models with reasoning-only models and examines the role of search in AI and the concept of overoptimization. They also touch on the potential of open models, the difficulty of building effective evaluations, and future directions for AI research, including character training and model routing.