03 Sep 2024
1h 5m

Efficiency is Coming: 3000x Faster, Cheaper, Better AI Inference from Hardware Improvements, Quantization, and Synthetic Data Distillation

Latent Space: The AI Engineer Podcast

This podcast episode follows Nyla Worker's journey from astrophysics to AI, focusing on her work on improving AI efficiency and on the intersection of synthetic data, LLMs, and 3D content creation. The conversation covers why inference efficiency matters, the trade-off between model accuracy and optimization techniques, and the challenges of reaching AGI, with model distillation and NPC simulation among the approaches discussed.
