How GPT-5 Thinks — OpenAI VP of Research Jerry Tworek | The MAD Podcast with Matt Turck

In this episode of The MAD Podcast, Matt Turck interviews Jerry Tworek, VP of Research at OpenAI, about the evolution of AI reasoning models. They discuss what reasoning means in AI, in particular how models like ChatGPT arrive at answers, and the importance of time and "chain of thought." Tworek explains the progression from o1 to o3 and GPT-5, highlighting advances in the models' ability to solve puzzles and use contextual information. The conversation also covers Tworek's journey into AI, OpenAI's research culture, the balance between research and rapid releases, and the roles of pre-training and reinforcement learning (RL) in developing AI. They go into the specifics of RL, including the use of human feedback and the challenges of scaling RL for real-world applications, and touch on AI alignment and the future of AGI.

Outlines

Part 1: Introduction and Background

00:00 Introduction to Reasoning in AI Models

07:21 The Evolution of Reasoning Models and Jerry Tworek's Early Life

17:20 Transition to AI and Collaborative Culture at OpenAI

Part 2: Technical Deep Dive

29:27 Pace of Releases and Reinforcement Learning Basics

37:41 Reinforcement Learning Terminology and Evolution

47:52 Unsupervised Learning, GRPO Release, and Scaling RL

59:19 Alignment, Math Prowess, and Generalization of RL

Part 3: Future Outlook

1:09:14 The Path to AGI and Future Directions

1:15:20 Conclusion