YouTube · 27 Aug 2025
48m

Lecture 7 – Large Foundation Models (MIT How to AI Almost Anything, Spring 2025)

Paul Liang

Paul Liang delivers a condensed lecture on large language models (LLMs). He begins with recurrent neural networks (RNNs) and transformers, covering their advantages and disadvantages and the shift towards linear attention. He then explains how LLMs are pre-trained with next-token prediction on vast datasets, describes the three architecture types (encoder-only, encoder-decoder, decoder-only), and discusses why instruction fine-tuning and preference tuning matter for aligning model responses with human expectations. The lecture also explores recent trends in efficient training, such as low-rank adaptation and mixture of experts, quantization techniques for model compression, and practical tips for fine-tuning LLMs, including data preparation and model selection. The session concludes with a Q&A segment and a brief overview of future research directions, such as reasoning in LLMs and multimodal LLMs.
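As a rough illustration of the next-token prediction objective mentioned in the summary, the sketch below builds (prefix, next token) training pairs from a token sequence and computes the cross-entropy loss for one prediction. It is not taken from the lecture: the function names `next_token_examples` and `cross_entropy`, the token ids, and the logits are all hypothetical, and a real model would produce the logits from the prefix.

```python
import math

def next_token_examples(token_ids):
    """Turn one sequence into (prefix, next_token) training pairs."""
    return [(token_ids[:t], token_ids[t]) for t in range(1, len(token_ids))]

def cross_entropy(logits, target):
    """Negative log-probability of `target` under softmax(logits)."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]

# Hypothetical token ids for a four-token sequence.
tokens = [5, 2, 7, 1]
for prefix, target in next_token_examples(tokens):
    print(prefix, "->", target)

# Hypothetical logits over a three-token vocabulary; the target is id 2.
loss = cross_entropy([0.1, 0.2, 3.0], 2)
print(round(loss, 4))
```

Pre-training averages this loss over every position in every sequence of the corpus, so a sequence of length n contributes n - 1 prediction targets.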

Outline

Part 1: Introduction and Background

Part 2: Training and Tuning LLMs

Part 3: Compression and Practical Application

Part 4: Conclusion
