John Schulman (OpenAI Cofounder) — Reasoning, RLHF, & plan for 2027 AGI | Dwarkesh Podcast

This episode of the Dwarkesh Podcast covers a wide range of topics in AI model development: pre-training and post-training, the prospect of models completing tasks over longer time horizons, generalization and affordances, and the need for coordination and caution as AGI approaches. It also discusses setting deployment limits and monitoring systems, fine-tuning and in-context learning, the development of ChatGPT, the progress and potential limitations of language models, scaling laws and sample efficiency, keeping humans in the loop in AI-run companies, replicability challenges in the social sciences, improving chatbot personality, preference models and whether a moat exists in the field, and the potential of AI assistants.

Outlines

00:00 Pre-training and Post-training in AI Models

06:25 Long Horizon Tasks and Model Intelligence

11:28 Generalization and Affordances in AI Models

16:12 Preparing for the Potential Arrival of AGI: Coordination and Caution

19:52 Ensuring Coordination and Deployment Limits for AI Systems

24:37 Ensuring the Safety of Deployed AI Systems: Monitoring and Early Detection

29:56 The Role of RLHF in Reasoning and the Importance of Nucleus Premium Genetic Testing

36:58 The Role of Fine Tuning and Context Learning in Model Training

42:11 Development of ChatGPT and Challenges with Instruction Following Models

47:02 Fine-tuning and Post-training in Language Models

53:33 Hitting a Potential Data Wall and the Generalization of Pre-training Models

58:22 Understanding the Scaling Law and Sample Efficiency of Bigger Models

1:03:47 The Role of AIs in Running Firms and the Importance of Human Oversight

1:07:25 Challenges and Trade-offs in Keeping Humans in the Loop with AI

1:15:02 Replicability Challenges in ML Literature and Improving Chatbot Personality

1:20:58 Variation in Training Process and Improving Writing in Chatbots

1:25:58 The Complexity of Preference Models and the Existence of a Moat

1:31:45 The Future of AI Assistants: Moving Towards Proactive Collaboration