YouTube, 31 Oct 2023
55m

Training LLMs at Scale - Deepak Narayanan | Stanford MLSys #83

Stanford MLSys Seminars

This episode covers the challenges of training large language models at scale and the forms of parallelism used to address them. Deepak Narayanan, a senior applied research scientist at NVIDIA, explains why efficient training requires weighing the different parallelism dimensions carefully and applying domain-specific optimizations. He walks through the benefits and complexities of data parallelism, tensor model parallelism, and pipeline model parallelism, then discusses how tensor and pipeline model parallelism interact and how communication patterns affect training speed. The episode closes by stressing the importance of optimizing throughput in distributed matrix multiplication and hints at future MLSys Seminars discussions on inference.
