Stanford CS336 Language Modeling from Scratch | Spring 2025 | Overview and Tokenization
Stanford Online
This is the introductory lecture for CS336, "Language Modeling from Scratch," co-taught by Percy Liang and Tatsunori Hashimoto. The lecture outlines the course's motivation: as language models have become increasingly abstracted, researchers have grown disconnected from the underlying technology. The course aims to restore a foundational understanding by having students build language models from scratch, with a focus on mechanics, mindset, and intuitions. It also addresses the challenge posed by the industrialization of language models, which has put frontier models out of reach for academic work. The course covers five units: basics (tokenizer, model architecture, training), systems (kernels, parallelism, inference), scaling laws, data curation, and alignment (supervised fine-tuning, learning from feedback). The goal is to enable students to build models as efficiently as possible given limited compute and data.
Part 1: Course Introduction
Part 2: System Optimization and Scaling Laws
Part 3: Data and Alignment
Part 4: Tokenization and Course Summary