YouTube · 27 Aug 2024
1h 44m

Stanford CS229 I Machine Learning I Building Large Language Models (LLMs)

Stanford Online

This lecture focuses on the practical aspects of building large language models (LLMs). The speaker begins with an overview of the key components (architecture, training, data, evaluation, and systems), then covers pre-training (classical language modeling) and post-training (developing AI assistants). Specific attention is given to tokenization, evaluation (perplexity and benchmarks such as HELM and MMLU), and data challenges. The speaker also discusses scaling laws, which show how increased compute correlates with improved performance and how that relationship informs resource allocation. Finally, the lecture covers supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF), highlighting the use of LLMs to make data collection more efficient and the challenges of evaluating open-ended chatbot responses.
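The summary names perplexity as an evaluation metric without defining it. As a rough illustration, the sketch below uses the standard textbook definition (the exponential of the average negative log-likelihood per token). The function name `perplexity` and the toy probabilities are assumptions made for this example; they are not taken from the lecture.

```python
import math


def perplexity(token_probs):
    """Return exp(average negative log-likelihood) over the given tokens.

    token_probs: probabilities the model assigned to each observed token.
    (Hypothetical helper for illustration; not from the lecture.)
    """
    if not token_probs:
        raise ValueError("need at least one token probability")
    avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_nll)


# A model that spreads its probability evenly over 4 choices at every step
# is "as confused as" a uniform guess among 4 tokens.
uniform_guess = [0.25] * 8
print(perplexity(uniform_guess))
```

Lower values mean the model assigned higher probability to the observed text; a model that predicts every token with probability 1 reaches the minimum perplexity of 1.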

Outlines

Part 1: LLM Fundamentals

Part 2: Data and Scaling

Part 3: Post-Training Alignment

Part 4: System Optimization and Summary
