YouTube · 06 Aug 2025
15m

Evals Are Not Unit Tests — Ido Pesok, Vercel v0


AI Engineer

Ido Pesok, an engineer at Vercel working on v0, introduces evals at the application layer and distinguishes them from model-layer evals. He uses the example of a fruit letter counter app to illustrate how unreliable LLMs can be and why building reliable AI applications matters. Pesok emphasizes understanding the "court", that is, the boundaries of your application's data, collecting relevant user prompts, and avoiding datasets that are out of bounds or concentrated in one area. He advises putting constants in the data and variables in the task, keeping scores simple so they are easy to debug, and adding evals to CI to track improvements and regressions. Overall, he advocates treating evals as a core component for improving app reliability and quality.
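The advice about constants, variables, simple scores, and CI can be made concrete with a small sketch. The code below is not taken from the talk: every name in it (`Case`, `stub_model`, `broken_model`, `make_task`, `run_eval`, `ci_gate`) is hypothetical, the model is a deterministic stand-in rather than a real LLM call, and the 0.9 threshold is an arbitrary assumption. It only shows one way the pieces might fit together for a fruit letter counter.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Case:
    prompt: str    # constant: a user prompt kept in the dataset
    expected: int  # constant: the known-correct answer


# Constants go in the data: these do not change between eval runs.
DATA: List[Case] = [
    Case("How many r's are in strawberry?", 3),
    Case("How many a's are in banana?", 3),
    Case("How many p's are in apple?", 2),
]


def stub_model(system_prompt: str, user_prompt: str) -> str:
    """Hypothetical stand-in for an LLM call; deterministic so it runs offline."""
    words = user_prompt.rstrip("?").split()
    letter = words[2][0]  # "r's" -> "r"
    fruit = words[-1]     # last word is the fruit
    return str(fruit.count(letter))


def broken_model(system_prompt: str, user_prompt: str) -> str:
    """Hypothetical regression: always answers 2, whatever the prompt."""
    return "2"


# Variables go in the task: the model and system prompt are what you iterate on.
def make_task(model: Callable[[str, str], str], system_prompt: str) -> Callable[[str], str]:
    def task(prompt: str) -> str:
        return model(system_prompt, prompt)
    return task


def exact_match(output: str, expected: int) -> int:
    """Simple pass/fail score, so a failing case is easy to inspect."""
    return 1 if output.strip() == str(expected) else 0


def run_eval(task: Callable[[str], str], data: List[Case] = DATA) -> float:
    """Return the fraction of cases that pass."""
    scores = [exact_match(task(case.prompt), case.expected) for case in data]
    return sum(scores) / len(scores)


def ci_gate(pass_rate: float, threshold: float = 0.9) -> bool:
    """Assumed CI rule: flag a regression when the pass rate drops below threshold."""
    return pass_rate >= threshold


if __name__ == "__main__":
    for name, model in [("stub", stub_model), ("broken", broken_model)]:
        rate = run_eval(make_task(model, "Count letters carefully."))
        print(f"{name}: pass rate {rate:.2f}, CI ok: {ci_gate(rate)}")
```

Because the dataset stays fixed while only the task changes, two runs are directly comparable, which is what makes the pass rate usable as a regression signal in CI.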
