YouTube, 15 May 2025
1h 5m

Stanford CS336 Language Modeling from Scratch | Spring 2025 | Lecture 9: Scaling laws 1

Stanford Online

This lecture focuses on scaling laws in machine learning, particularly for large language models (LLMs). It begins by framing the challenge of building the best open-source LLM with limited resources, which calls for innovating rather than just copying existing models. It then covers the history and background of scaling laws, highlighting that they are well grounded and tracing their evolution from theoretical machine learning to empirical applications. The discussion moves through data scaling, model scaling, and the interplay between data, model size, and compute, including the Chinchilla scaling laws. It also addresses practical engineering decisions: hyperparameter tuning, architecture selection, resource allocation, and the trade-off between model size and dataset size.

Outline

Part 1: Introduction and Foundations

Part 2: Data Scaling

Part 3: Model Scaling

Part 4: Joint Scaling and Optimization

Part 5: Applications and Conclusion
