In this episode of the NVIDIA AI Podcast, host Noah Kravitz interviews Ian Buck, NVIDIA's VP of Hyperscale and High-Performance Computing, about mixture-of-experts (MoE), an AI architecture that improves efficiency and reduces costs. Buck explains how MoE works, comparing it to a human brain that activates only the neurons needed for a given task: the model is divided into smaller, specialized "experts," and only a few of them run for each input. The conversation covers the "DeepSeek moment" that popularized MoE, the symbiotic relationship between AI hardware and models, and NVIDIA's role in co-designing systems such as the GB200 with NVLink to speed up communication between experts, lowering the cost per token and advancing AI capabilities across a range of applications. Buck points listeners to GTC for more information.
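For listeners who want to see the routing idea Buck describes in code, below is a minimal sketch of top-k expert routing. It is not NVIDIA's or DeepSeek's implementation; the class names, sizes, and `top_k` parameter are illustrative assumptions, written in PyTorch-style Python.

```python
# Minimal mixture-of-experts sketch (illustrative only): a router picks a few
# experts per token, so most of the model's parameters stay idle per input.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyExpert(nn.Module):
    """One small feed-forward 'expert'."""
    def __init__(self, d_model, d_hidden):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d_model, d_hidden),
            nn.GELU(),
            nn.Linear(d_hidden, d_model),
        )

    def forward(self, x):
        return self.net(x)

class TinyMoE(nn.Module):
    """Routes each token to top_k experts instead of running the full model."""
    def __init__(self, d_model=64, d_hidden=256, num_experts=8, top_k=2):
        super().__init__()
        self.experts = nn.ModuleList(
            TinyExpert(d_model, d_hidden) for _ in range(num_experts)
        )
        self.router = nn.Linear(d_model, num_experts)  # gating network
        self.top_k = top_k

    def forward(self, x):                      # x: (tokens, d_model)
        scores = self.router(x)                # (tokens, num_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)   # normalize over chosen experts
        out = torch.zeros_like(x)
        # Only the selected experts run for each token; the rest stay idle,
        # which is where MoE's efficiency and cost savings come from.
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

if __name__ == "__main__":
    moe = TinyMoE()
    tokens = torch.randn(16, 64)
    print(moe(tokens).shape)  # torch.Size([16, 64])
```

In production systems the experts are sharded across GPUs, so the per-token routing shown above turns into cross-device traffic, which is why interconnects like NVLink in the GB200 come up in the episode.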