YouTube, 19 Sept 2025
2h 3m

The Real Reason Huge AI Models Actually Work [Prof. Andrew Wilson]

Machine Learning Street Talk

The discussion challenges conventional wisdom in machine learning, particularly around the bias-variance trade-off and the role of model complexity. Andrew Wilson, a professor at NYU, argues that a bias-variance trade-off is not necessary, advocating instead for expressive models with soft inductive biases that adapt to both small and large datasets. He shares insights on deep learning's relative universality and its effectiveness in representation learning, and highlights how scale contributes to good generalization through a simplicity bias. The conversation examines misconceptions about generalization, such as the belief that the choice of model should depend on how many data points are available, and explores the mysteries behind the simplicity bias at scale, including loss landscapes and compressibility. Wilson also discusses the potential for AI to discover new scientific theories.

Outlines

Part 1: Foundations and Misconceptions

Part 2: Principles of Model Construction

Part 3: Overfitting and Generalization

Part 4: Bayesian Perspectives

Part 5: Complexity and Information Theory

Part 6: Intelligence and Compression

Part 7: Marginalization and Uncertainty

Part 8: Advanced Dynamics and Optimization

Part 9: Future Outlook and Scaling
