This podcast episode explores the challenges of training and inference in deep neural networks, focusing on the impact of hardware heterogeneity, parallelization strategies, and resource allocation. It introduces Gavel, a new heterogeneity-aware, preemption-based cluster scheduler, and discusses the concept of effective throughput as a basis for optimizing resource allocation. The conversation also covers the cost savings of using preemptible instances in cloud services and the potential of federated learning on edge devices. The implications of model parallelism and the importance of better tooling for distributed training are highlighted. The ultimate goal is to democratize these optimization techniques and improve the efficiency of model training.
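
As a rough illustration of the effective-throughput idea mentioned above (a sketch, not something spelled out in the episode): a job's effective throughput can be thought of as the time-weighted average of its raw throughput across the accelerator types it is scheduled on. The accelerator names, throughput numbers, and allocation fractions below are hypothetical.

```python
def effective_throughput(raw_throughput: dict[str, float],
                         allocation: dict[str, float]) -> float:
    """Compute a job's effective throughput.

    raw_throughput: samples/sec the job achieves on each accelerator type.
    allocation: fraction of wall-clock time the scheduler gives the job on
                each type (fractions sum to at most 1.0).
    """
    return sum(allocation.get(acc, 0.0) * tput
               for acc, tput in raw_throughput.items())

# Hypothetical example: a job that runs at 1000 samples/s on a V100 and
# 400 samples/s on a K80, given 50% of its time on the V100 and 25% on the K80.
print(effective_throughput({"V100": 1000.0, "K80": 400.0},
                           {"V100": 0.5, "K80": 0.25}))  # -> 600.0 samples/s
```

Maximizing a fairness- or cost-aware objective over these effective throughputs, rather than over raw per-accelerator throughputs, is what lets a heterogeneity-aware scheduler decide which jobs should get time on which hardware.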