YouTube, 10 Apr 2025
1h 19m

Stanford CS336 Language Modeling from Scratch | Spring 2025 | Lec. 2: Pytorch, Resource Accounting

Stanford Online

This lecture covers building language models from scratch in PyTorch, with a focus on using resources (memory and compute) efficiently. It introduces the PyTorch primitives involved: tensors, models, optimizers, and the training loop. Memory accounting covers the floating-point representations (float32, float16, bfloat16, FP8) and their memory implications. Compute accounting emphasizes GPU usage and data movement. The lecture then turns to tensor operations, einops, and the computational cost of these operations, particularly matrix multiplications. It also touches on parameter initialization, building a simple model, data loading, optimizers (AdaGrad), and the memory requirements of optimizer state. It concludes with training loops, checkpointing, and mixed-precision training, highlighting the trade-offs among precision, accuracy, stability, and computational cost.
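
The kind of resource accounting described above can be sketched in a few lines of plain Python. This is an illustrative sketch, not code from the lecture: the function names (`tensor_bytes`, `matmul_flops`, `training_state_bytes`) are hypothetical, the matrix-multiply count uses the conventional approximation of two floating-point operations per multiply-add, and the training estimate assumes parameters, gradients, and a single AdaGrad accumulator all stored in the same dtype, with activations ignored.

```python
# Bytes per element for the floating-point formats named in the summary.
BYTES_PER_ELEMENT = {"float32": 4, "float16": 2, "bfloat16": 2, "fp8": 1}


def tensor_bytes(num_elements: int, dtype: str) -> int:
    """Memory taken by a tensor's values: element count times bytes per element."""
    return num_elements * BYTES_PER_ELEMENT[dtype]


def matmul_flops(m: int, k: int, n: int) -> int:
    """Approximate FLOPs for an (m x k) @ (k x n) matrix multiplication.

    Each of the m * n outputs needs about k multiplies and k adds,
    so the conventional estimate is 2 * m * k * n.
    """
    return 2 * m * k * n


def training_state_bytes(num_params: int, dtype: str = "float32") -> int:
    """Assumed accounting: parameters + gradients + one AdaGrad accumulator
    per parameter, all in the same dtype. Activations are not counted."""
    copies = 3  # parameters, gradients, optimizer state
    return copies * tensor_bytes(num_params, dtype)


if __name__ == "__main__":
    n = 4096 * 4096
    print(tensor_bytes(n, "float32") / 1024**2, "MiB in float32")
    print(tensor_bytes(n, "bfloat16") / 1024**2, "MiB in bfloat16")
    print(matmul_flops(4096, 4096, 4096), "FLOPs for a 4096^3 matmul")
```

Under these assumptions, halving the bytes per element halves the memory for the same tensor, while the matrix-multiply cost depends only on the three dimensions, not on the dtype.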

Outlines

Part 1: Introduction and Memory

Part 2: Tensor Operations and Cost

Part 3: Model Building and Training
