
Shayle Kann interviews Dr. Ben Lee about the future of AI compute, focusing on how inference workloads will be distributed across cloud, edge, and on-device locations. The discussion covers the energy implications of shifting inference from centralized data centers to edge computing or directly onto devices such as phones, and the trade-offs among performance, latency, privacy, and energy efficiency at each compute location. Dr. Lee predicts that a significant portion of inference compute could eventually move to the edge, with a smaller fraction running on devices, while training remains primarily in large data centers. The conversation also addresses the potential for higher overall energy consumption, since smaller edge data centers tend to be less efficient than hyperscale facilities.