In this episode of the podcast, Lenny interviews Hamel Husain and Shreya Shankar about evals, a method for systematically measuring and improving AI applications. They discuss why evals are becoming increasingly important for product builders, explain what evals are, address common misconceptions, and walk through the process of developing an effective eval, sharing best practices along the way. The conversation covers error analysis, open coding, axial codes, LLM-as-judge, and the importance of data analysis. They also weigh in on the debate over evals versus A/B testing and offer tips for getting started with evals.