YouTube, 09 Jun 2026

Yann LeCun: World Models: Enabling the next AI revolution

Computer Vision and Geometry Group, ETH Zurich

Achieving human-level intelligence requires moving beyond current large language models toward grounded "world models" that can understand continuous, high-dimensional, and noisy environments. Intelligence is defined not by the accumulation of declarative knowledge or specific skills, but by the ability to adapt and solve new problems with minimal training. Current generative AI approaches, which focus on pixel-level prediction, fail to capture the underlying structure of the physical world. The Joint Embedding Predictive Architecture (JEPA) instead uses energy-based models and information maximization to learn abstract representations. This approach enables hierarchical planning and common-sense reasoning, allowing systems to predict outcomes and carry out complex tasks safely. By shifting focus from text-based generation to these grounded, predictive architectures, AI can overcome the limitations of current machine learning and move toward more robust, adaptive physical intelligence.
