In this episode of The MAD Podcast, Matt Turck interviews Jerry Tworek, VP of Research at OpenAI, about the evolution of AI reasoning models. They discuss what reasoning means in AI, particularly how models like ChatGPT arrive at answers, and the importance of inference time and "chain of thought." Tworek explains the progression from o1 to o3 and GPT-5, highlighting advances in AI's ability to solve puzzles and leverage contextual information. The conversation also covers Tworek's path into AI, OpenAI's research culture, the balance between long-term research and rapid releases, and the respective roles of pre-training and reinforcement learning (RL) in developing AI. They delve into the specifics of RL, including the use of human feedback and the challenges of scaling RL to real-world applications, and touch on AI alignment and the future of AGI.