
The Latent Space Podcast explores the evolving landscape of AI development, developer experience, and the future of inference. Nader Khalil and Kyle Kranen from NVIDIA discuss the acquisition of Brev, a developer tool for GPU access, and NVIDIA's broader developer-experience strategy, emphasizing the importance of understanding the end user. They introduce Dynamo, a data-center-scale inference engine designed to accelerate inference using techniques such as disaggregation, and touch on "SOL" (Speed of Light), a method for creating urgency and understanding the theoretical limits of project timelines. The conversation also covers the shift toward hardware-model co-design, the challenges of long-context models, and the potential of agents to streamline workflows, as well as security concerns.