YouTube, 08 Apr 2025
1h 18m

Stanford CS336 Language Modeling from Scratch | Spring 2025 | Overview and Tokenization

Stanford Online

This is the introductory lecture for CS336, "Language Modeling from Scratch," co-taught by Percy Liang and Tatsunori Hashimoto. The lecture lays out the course's motivation: as layers of abstraction have accumulated, researchers have become disconnected from the technology underlying language models. The course aims to restore that understanding by having students build language models from scratch, with a focus on mechanics, mindset, and intuitions. It also addresses the industrialization of language models, which has put frontier models out of reach for academic work. The course covers five units: basics (tokenizer, model architecture, training), systems (kernels, parallelism, inference), scaling laws, data curation, and alignment (supervised fine-tuning, learning from feedback). The goal is to teach students to build the most efficient model possible under limited compute and data.

Outlines

Part 1: Course Introduction

Part 2: System Optimization and Scaling Laws

Part 3: Data and Alignment

Part 4: Tokenization and Course Summary
