22 May 2026
1h 19m

How to Run Evals in Claude Code with Aparna Dhinakaran, Founder and CPO of Arize

The Growth Podcast

Product management in the AI era centers on cultivating "product taste" by building iterative loops that turn user feedback into concrete agent improvements. Modern AI PMs must bridge the technical gap by using terminal-based tools like Claude Code to build, instrument, and evaluate agents directly. Observability through tracing is essential to this process, because traces provide the granular data needed to identify performance failures and refine evaluation metrics. Rather than relying on static roadmaps, successful teams implement self-improving cycles in which agents analyze their own traces to prioritize bugs and feature requests. Aparna Dhinakaran, CPO and co-founder of Arize AI, emphasizes that the most effective PMs treat these data-driven feedback loops as a foundational layer, which lets them ship high-impact solutions quickly while maintaining rigorous standards for accuracy and alignment.
