CS294-196 (Agentic AI MOOC) - Lecture 1 {Yann Dubois} | Berkeley RDI Center on Decentralization & AI

Training large language models involves a three-stage pipeline: pre-training, post-training, and reasoning reinforcement learning. Pre-training teaches the model to predict the next token over massive datasets, often exceeding 10 trillion tokens, to build foundational world knowledge. Post-training, which includes supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF), aligns the model with user intent and specific task requirements. Reasoning models are further optimized on objective tasks such as math and coding by using verifiers and reinforcement learning algorithms such as GRPO. Progress in the field relies heavily on scaling laws, under which more compute and more high-quality data consistently drive performance gains. Infrastructure optimization, including techniques such as tensor parallelism and fused kernels, remains critical for managing the computational demands and memory constraints of training frontier models.
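To make the verifier-plus-GRPO idea in the summary concrete, here is a minimal sketch of the group-relative advantage that GRPO is generally described as using: several answers are sampled for one prompt, each is scored by a verifier, and each reward is normalized against the group's mean and standard deviation. The function names, the exact-match verifier, the sample answers, and the use of the population standard deviation are illustrative assumptions, not details taken from the lecture.

```python
from statistics import mean, pstdev


def verify(answer: str, reference: str) -> float:
    """Hypothetical verifier: 1.0 if the answer matches the reference, else 0.0."""
    return 1.0 if answer.strip() == reference.strip() else 0.0


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its own group (assumed GRPO-style scoring).

    Uses the population standard deviation; eps avoids division by zero
    when every sample in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four hypothetical sampled answers to one math prompt whose reference answer is "42".
samples = ["42", "41", " 42 ", "7"]
rewards = [verify(s, "42") for s in samples]
advantages = group_relative_advantages(rewards)

for sample, reward, advantage in zip(samples, rewards, advantages):
    print(f"{sample!r}: reward={reward}, advantage={advantage:+.3f}")
```

Correct answers receive a positive advantage and incorrect ones a negative advantage, so the policy update favors the samples the verifier accepted. If every sample in a group earns the same reward, all advantages are zero and that group contributes no learning signal.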

Outlines

00:02 The Three Stages of LLM Training Pipelines

08:00 Five Pillars of LLM Development

12:15 Pre-training Methodologies and Data Curation

40:00 Scaling Laws and Compute Economics

53:30 Post-training: SFT and Reinforcement Learning

1:18:00 Evaluation Frameworks for LLM Performance

1:29:00 Systems Infrastructure and GPU Optimization