Is RL + LLMs enough for AGI? — Sholto Douglas & Trenton Bricken | Dwarkesh Podcast

In this episode, Dwarkesh talks with Sholto Douglas and Trenton Bricken about recent advances in AI and where the field is heading, focusing on reinforcement learning (RL) and its applications in software engineering, computer use, and scientific research. They discuss the importance of feedback loops, the challenges of aligning AI with human values, and the potential for AI to automate white-collar jobs. The conversation covers the limitations of current models, the role of compute and data, the significance of mechanistic interpretability, and the ethical considerations raised by increasingly capable AI systems, including potential economic disruption and the need for proactive policy-making. The guests also make near-term predictions about the capabilities of AI agents and offer advice for those looking to enter the field.

Outline

Part 1: Introduction and Foundations

Part 2: Techniques and Model Properties

Part 3: Safety and Alignment

Part 4: Evaluation and Interpretability

Part 5: Future Outlook and Research

Part 1: Introduction and Foundations

00:00 Introduction and Recent Advancements in RL and Language Models

03:31 Current Limitations and Future Predictions of AI Agents

11:05 The Role of Feedback Loops and Verifiable Rewards in AI Progress

Part 2: Techniques and Model Properties

18:02 The Importance of Scaffolding and Prompt Engineering

22:30 Model Size, Sample Efficiency, and Generalization

30:10 Feedback Loops, Model Training, and the Challenges of Real-World Application

Part 3: Safety and Alignment

37:48 Model Alignment, Emergent Misalignment, and the Challenges of Ensuring Safe AI

45:01 Defining the Endgame of Superintelligence and the Challenges of Aligning AI with Human Values

Part 4: Evaluation and Interpretability

50:33 Benchmarks, Metrics, and the Challenges of Evaluating AI Capabilities

58:59 Mechanistic Interpretability, Circuits, and the Future of AI

1:07:16 The Future of Work, AI Agents, and the Implications for Society

1:16:04 Neuralese, Model Communication, and the Challenges of Interpretability

Part 5: Future Outlook and Research

1:23:45 Inference Compute Bottlenecks and the Implications for Future AI Development

1:30:19 DeepSeek, Model Efficiency, and the Role of Research Taste

1:37:48 Future Predictions, Advice for Aspiring AI Researchers, and the Implications for National Policy

2:10:25 Data Collection, Model Training, and the Challenges of Generalization

2:17:30 The Future of AI Research and the Importance of Interdisciplinary Collaboration