YouTube, 24 Apr 2025
1h 18m

Stanford CS336 Language Modeling from Scratch | Spring 2025 | Lecture 1: Overview and Tokenization

Stanford Online

This is the introductory lecture for CS336, "Language Modeling from Scratch," co-taught by Percy Liang and Tatsunori Hashimoto. It introduces the course staff, including the TAs, and states the course's goal: a foundational understanding of language models, gained by building them from scratch. The lecture argues that researchers are becoming increasingly disconnected from the underlying technology because they rely on proprietary models. It also discusses the challenges of scale, the importance of efficiency, and the three kinds of knowledge the course aims to impart: mechanics, mindset, and intuitions. The course covers tokenization, model architecture, training, systems optimization, scaling laws, data curation, and alignment, with a focus on maximizing efficiency under hardware and data constraints. The lecture concludes with a detailed overview of tokenization, comparing character-based, byte-based, and word-based tokenization with byte-pair encoding (BPE).
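The four tokenization approaches named above can be compared in a few lines of Python. This is a minimal sketch, not the lecture's own code: every function name here (`char_tokenize`, `byte_tokenize`, `word_tokenize`, `train_bpe`, `bpe_encode`, `bpe_decode`) is a hypothetical helper, the word splitter uses a simple regex chosen for illustration, and the BPE trainer is the basic byte-level variant that repeatedly merges the most frequent adjacent pair.

```python
import re
from collections import Counter


def char_tokenize(text: str) -> list[int]:
    """Character-based: one token per Unicode code point."""
    return [ord(ch) for ch in text]


def byte_tokenize(text: str) -> list[int]:
    """Byte-based: one token per UTF-8 byte, so the vocabulary has 256 entries."""
    return list(text.encode("utf-8"))


def word_tokenize(text: str) -> list[str]:
    """Word-based (illustrative regex): runs of word characters, or single symbols."""
    return re.findall(r"\w+|\S", text)


def merge(ids: list[int], pair: tuple[int, int], new_id: int) -> list[int]:
    """Replace every non-overlapping occurrence of `pair` in `ids` with `new_id`."""
    out, i = [], 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out


def train_bpe(text: str, num_merges: int) -> dict[tuple[int, int], int]:
    """Learn up to `num_merges` merges, starting from raw UTF-8 bytes."""
    ids = byte_tokenize(text)
    merges: dict[tuple[int, int], int] = {}
    for step in range(num_merges):
        counts = Counter(zip(ids, ids[1:]))
        if not counts:
            break  # fewer than two tokens left, nothing to merge
        pair = max(counts, key=counts.get)  # most frequent adjacent pair
        new_id = 256 + step  # new token ids start after the 256 byte values
        merges[pair] = new_id
        ids = merge(ids, pair, new_id)
    return merges


def bpe_encode(text: str, merges: dict[tuple[int, int], int]) -> list[int]:
    """Apply the learned merges in the order they were learned."""
    ids = byte_tokenize(text)
    for pair, new_id in merges.items():
        ids = merge(ids, pair, new_id)
    return ids


def bpe_decode(ids: list[int], merges: dict[tuple[int, int], int]) -> str:
    """Expand each token id back to its bytes and decode as UTF-8."""
    vocab = {i: bytes([i]) for i in range(256)}
    for (a, b), new_id in merges.items():
        vocab[new_id] = vocab[a] + vocab[b]
    return b"".join(vocab[i] for i in ids).decode("utf-8")


if __name__ == "__main__":
    sample = "the cat in the hat"
    learned = train_bpe(sample, num_merges=3)
    encoded = bpe_encode(sample, learned)
    print(len(byte_tokenize(sample)), "bytes ->", len(encoded), "BPE tokens")
    print(bpe_decode(encoded, learned))
```

The trade-off the sketch makes visible is sequence length against vocabulary size: byte tokens need only 256 ids but produce long sequences, word tokens are short but need an open-ended vocabulary, and BPE sits in between by growing the vocabulary one merge at a time.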

Outline

Part 1: Course Introduction and Motivation

Part 2: Core Course Components

Part 3: Tokenization Deep Dive
