Andrej Karpathy — “We’re summoning ghosts, not building animals” | Dwarkesh Patel

In this interview, Andrej Karpathy discusses the progress and future of AI agents, arguing that this is the "decade of agents" rather than the year of agents because significant work remains in areas such as continual learning, multimodality, and computer use. He reflects on the history of AI, including the missteps of early reinforcement learning and the importance of representation learning with LLMs. Karpathy also contrasts building AI with the evolution of animal intelligence, describes pre-training as "crappy evolution," and argues that knowledge should be removed from AI models to get at their cognitive core. He then examines the limitations of current RL methods, the potential of process-based supervision, and the challenge of model collapse, and shares his views on future AI architectures, the value of coding models, and the importance of education in empowering humanity in an increasingly automated world.

Outline

Part 1: AI Evolution and Agent Development

Part 2: Learning, Memory, and LLM Architecture

Part 3: Coding with AI and Reinforcement Learning

Part 4: AGI, Job Replaceability, and the Role of Coding

Part 5: Intelligence, Culture, and Self-Driving

Part 6: Education and Human Nature

Part 7: Closing

Part 1: AI Evolution and Agent Development

00:00 Introduction to the Decade of Agents and the Evolution of AI

04:45 The Missteps and Progress in AI Agent Development

07:51 Animal vs. Ghost: Contrasting Approaches to AGI and the Role of Evolution

Part 2: Learning, Memory, and LLM Architecture

12:28 Pre-training as Crappy Evolution and the Cognitive Core

17:28 Information Assimilation, Working Memory, and Brain Analogies in LLMs

22:04 Continual Learning, Distillation, and the Future of Neural Network Architectures

Part 3: Coding with AI and Reinforcement Learning

27:31 NanoChat and the Nuances of Coding with AI Assistance

32:32 Cognitive Deficits in Coding Models and the Autonomy Slider

39:37 The Limitations of Reinforcement Learning and the Need for Review Processes

45:37 The Challenges of Process-Based Supervision and the Search for Better Algorithms

50:25 Synthetic Data Generation, Model Collapse, and the Importance of Entropy

55:12 The Cognitive Core and the Quality of Training Data

Part 4: AGI, Job Replaceability, and the Role of Coding

1:02:24 Frontier Model Size and Mercury Advertisement

1:07:13 Measuring Progress Towards AGI and the Replaceability of Jobs

1:11:15 The Bottleneck of Human Expertise and the Dominance of Coding in AI Deployment

1:15:02 The Unique Fit of Coding for LLMs and the Gradual Loss of Control

1:20:20 Loss of Understanding vs. Loss of Control and the Intelligence Explosion

1:23:48 The Hyper-Exponential Trend and the Nature of True AGI

1:27:30 The Discrete Jump and the Industrial Revolution

Part 5: Intelligence, Culture, and Self-Driving

1:31:13 Veo Advertisement and the Evolution of Intelligence

1:35:31 Animal Intelligence and the Cultural Scaffold

1:40:35 The Lack of Culture in LLMs and the Need for Multi-Agent Systems

1:44:02 Self-Driving Cars and the March of Nines

1:47:14 Safety Guarantees and the Scalable Approach

1:52:24 The Economics of Deployment and the Importance of Being Grounded in Reality

Part 6: Education and Human Nature

1:57:08 Eureka and the Starfleet Academy

2:01:16 Building Ramps to Knowledge and the Role of AI

2:07:30 The Timelessness of Human Nature and the Importance of Flourishing

2:12:34 The Culture World and the Power of Learning

2:15:26 Advice for Educators and the Importance of First Order Terms

2:20:04 The Curse of Knowledge and the Power of Explanation

Part 7: Closing

2:25:37 Closing Remarks