How Zyphra went all-in on AMD + Why Devs feel faster with AI but are slower — with Quentin Anthony | Latent Space: The AI Engineer Podcast

In this episode of the Latent Space Podcast, Alessio interviews Quentin Anthony, Head of Model Training at Zyphra and advisor at EleutherAI, about Zyphra's work on foundation models for edge deployment and its recent move to AMD training clusters. Quentin shares insights on optimizing kernels for AMD GPUs, the role of open source in AMD development, and the use of coding agents. They discuss the METR study on AI's impact on software engineering productivity, Quentin's coding workflow with AI, and the challenges of evaluating AI-generated kernels. The conversation also covers Zyphra's model development roadmap, the potential of ASICs for inference, edge deployment strategies, and the future of open source AI with EleutherAI.

Outline

Part 1: Zyphra, AMD, and Open Source

Part 2: Kernel Development and Hardware Considerations

Part 3: Edge Deployment and Future Roadmap

Part 4: AI Coding Productivity and Workflow

Part 5: Team Building, Interviewing, and Open Source

Part 1: Zyphra, AMD, and Open Source

00:05 Introduction to Zyphra's Edge-Focused Foundation Models and AMD Transition

04:03 Overcoming AMD Software Challenges and the Role of Open Source

07:05 AMD Hardware Generations and Kernel Optimization Strategies

Part 2: Kernel Development and Hardware Considerations

10:13 Coding Agents, Kernel Verification, and the Kernel Writer Workflow

15:36 Kernel Implementation, Alternative Hardware, and Training for Inference Efficiency

19:06 Custom ASICs for Inference and Model Architecture Considerations

Part 3: Edge Deployment and Future Roadmap

22:34 On-Device Inference, Model Sizes, and Edge Deployments

26:34 Edge Deployment Use Cases and Zyphra's Future Roadmap

Part 4: AI Coding Productivity and Workflow

29:40 METR Study on Software Engineering Productivity with AI

33:00 AI Coding Setup and Workflow Changes

36:50 Model Jumps, GPT-5, and Productivity with AI

40:46 Data Quality, Time Boxing, and Context Engineering

Part 5: Team Building, Interviewing, and Open Source

45:52 High Focus vs. Low Focus Tasks and Team Building

49:13 Interviewing Engineers in the Age of AI

52:36 EleutherAI's Focus and Open Source Collaboration

57:03 Open Source Project Requests and Final Thoughts