YouTube, 21 Sept 2024
1h 20m

Stanford CS149 I Parallel Computing I 2023 I Lecture 10 - Efficiently Evaluating DNNs on GPUs

Stanford Online

This lecture looks at how to evaluate deep neural networks (DNNs) efficiently on GPUs, with attention to the balance between accuracy and efficiency. It opens with the challenges of rendering transparent circles in parallel, then moves on to the fundamentals of DNN architecture and the techniques used to optimize it: efficient matrix multiplication, specialized hardware, and approaches to speeding up transformer networks. The speaker stresses the importance of understanding the workload and exploiting the hardware's capabilities, and introduces strategies for improving performance and memory efficiency in current neural network applications.
