
Aman Khan, an AI product manager at Arize, delivers a presentation on shipping AI that works, centered on an evaluation framework for product managers. The talk covers why evals (evaluations) matter, how to build an AI trip planner as a multi-agent system, and how to evaluate the resulting prototype. Aman discusses his background, the changing expectations of AI product managers, and the need for reliable AI systems. The presentation includes a live demo of building and evaluating the AI trip planner using Arize, emphasizing data-driven development and prompt engineering. The session concludes with an extensive Q&A covering topics such as building evaluation teams, improving eval prompts, using different evaluation models, and integrating human feedback into the evaluation process.