YouTube · 20 Mar 2026

If You Don’t Understand AI Evals, Don’t Build AI

Aakash Gupta

The podcast explores the critical role of evals in building effective AI products, highlighting that their quality, usage, and improvement drive the success of AI initiatives. Ankur Goyal, founder and CEO of Braintrust, emphasizes that even "vibe checks" are a form of evaluation, particularly useful in the early stages of product development. As AI products scale, more structured evals become essential due to the unpredictable nature of LLMs. Goyal argues that investing in evals creates a durable competitive advantage, more so than focusing solely on the latest models or agents. He also notes the increasing importance of product managers in defining evals, viewing them as the modern equivalent of PRDs, and shares a live demonstration of creating an eval from scratch using Linear's MCP server.
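To make the step from a "vibe check" to a structured eval concrete, here is a minimal sketch in plain Python. It is not Braintrust's API and not the eval built in the episode's demo. The issue-triage task, the `stub_task` function (a stand-in for a real LLM call), the `Case` type, and the four sample cases are all hypothetical, chosen only to show the three parts a structured eval typically has: a dataset, a task, and a scorer.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Case:
    """One eval example: an input and the label we expect back."""
    input: str
    expected: str


def exact_match(output: str, expected: str) -> float:
    """Scorer: 1.0 if the output matches the expected label, else 0.0."""
    return 1.0 if output.strip().lower() == expected.strip().lower() else 0.0


def run_eval(
    task: Callable[[str], str],
    cases: List[Case],
    scorer: Callable[[str, str], float],
) -> float:
    """Run the task on every case and return the mean score."""
    scores = [scorer(task(c.input), c.expected) for c in cases]
    return sum(scores) / len(scores) if scores else 0.0


def stub_task(text: str) -> str:
    """Hypothetical stand-in for an LLM call that labels an issue title."""
    return "bug" if "crash" in text.lower() else "feature"


# Hypothetical dataset; a real one would come from actual product data.
cases = [
    Case("App crashes on login", "bug"),
    Case("Add dark mode", "feature"),
    Case("Crash when exporting CSV", "bug"),
    Case("Typo on the settings page", "bug"),
]

score = run_eval(stub_task, cases, exact_match)
print(f"accuracy: {score:.2f}")
```

The last case is one the stub gets wrong, which is the point: a fixed dataset and scorer surface a failure that an informal spot check could miss, and the same score can be recomputed after every prompt or model change.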

Outlines

Part 1: Foundations of AI Evals

Part 2: Strategic Value and the PM Role

Part 3: Braintrust Growth and Market Trends

Part 4: Technical Framework and Live Implementation

Part 5: Iteration and Optimization Strategies

Part 6: Production, Trust, and Resources
