YouTube, 19 Jun 2026

The data black hole at the center of AI

Dwarkesh Patel

Intelligence is fundamentally defined by sample efficiency: the ability to operate competently with minimal data. Current AI progress relies on scaling data and compute rather than on improved efficiency, with models requiring trillions of training tokens compared with the roughly 200 million tokens a human encounters in a lifetime. Scaling laws point toward ever larger models, but scaling alone cannot bridge the million-fold efficiency gap between human cognition and artificial systems. Despite this inefficiency, the industry prioritizes automating white-collar tasks and AI research, because training costs can be amortized across billions of sessions, which justifies the high resource consumption. Ultimately, the path toward human-level intelligence runs through automating AI research itself: large language models may overcome their current limits in learning efficiency by applying their specific capabilities to the remaining research bottlenecks.
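
To make the arithmetic behind two of these claims concrete, here is a minimal Python sketch. Only the 200 million lifetime-token figure and the phrase "billions of sessions" come from the summary; the 10 trillion training tokens, the training cost, and the session count are illustrative assumptions, and the function names are hypothetical. The sketch compares raw token counts only and does not attempt to reproduce the million-fold figure, whose derivation the summary does not give.

```python
# Illustrative arithmetic only; see the assumptions noted on each value.

HUMAN_LIFETIME_TOKENS = 200_000_000  # figure quoted in the summary


def token_ratio(model_tokens: int, human_tokens: int = HUMAN_LIFETIME_TOKENS) -> float:
    """Return how many times more tokens a model sees than a human does."""
    return model_tokens / human_tokens


def amortized_cost(training_cost: float, sessions: int) -> float:
    """Spread a one-time training cost evenly over a number of sessions."""
    return training_cost / sessions


if __name__ == "__main__":
    # ASSUMPTION: 10 trillion tokens stands in for "trillions of tokens".
    assumed_model_tokens = 10 * 10**12
    print(f"token ratio: {token_ratio(assumed_model_tokens):,.0f}x")

    # ASSUMPTION: a 1e9 (currency unit) training run served over 2e9 sessions.
    assumed_training_cost = 1e9
    assumed_sessions = 2 * 10**9
    print(f"cost per session: {amortized_cost(assumed_training_cost, assumed_sessions):.2f}")
```

Under these assumed inputs, the one-time training cost shrinks to a fraction of a currency unit per session, which is the amortization argument the summary attributes to the industry.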
