23 Apr 2025

Ensure AI Agents Work: Evaluation Frameworks for Scaling Success — Aparna Dhinkaran, CEO Arize

AI Engineer

Aparna Dhinkaran, one of the founders of Arize, discusses why evaluating AI agents and assistants matters, especially as they move into production and into multimodal applications such as voice. She breaks an agent down into three components (router, skills, and memory) and explains how each works and how it can be evaluated. Using examples that include the Priceline PennyBot and her own company's copilot, she argues for evaluations at every level of the agent's operation, including the audio component in voice applications, to check accuracy, efficiency, and the correct execution of skills.
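
As a rough illustration of what a router-level evaluation can look like, the sketch below scores a router against a small labeled set of queries. Everything in it is an assumption made for illustration and is not taken from the talk: the keyword-based `toy_router`, the skill names, and the test queries are hypothetical, and a production router would typically be an LLM call rather than string matching.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class RouterCase:
    """One labeled example: a user query and the skill the router should pick."""
    query: str
    expected_skill: str


def toy_router(query: str) -> str:
    """Hypothetical keyword router standing in for a real (often LLM-based) router."""
    q = query.lower()
    if "book" in q or "hotel" in q:
        return "booking"
    if "refund" in q:
        return "support"
    return "search"


def router_accuracy(router: Callable[[str], str], cases: List[RouterCase]) -> float:
    """Fraction of cases where the router chose the expected skill."""
    correct = sum(router(c.query) == c.expected_skill for c in cases)
    return correct / len(cases)


# Hypothetical labeled cases; the last one is deliberately hard for a keyword router.
cases = [
    RouterCase("Book me a hotel in Paris", "booking"),
    RouterCase("I want a refund for my flight", "support"),
    RouterCase("What are good beaches near Lisbon?", "search"),
    RouterCase("Can I change my reservation?", "booking"),
]

accuracy = router_accuracy(toy_router, cases)
misrouted = [c.query for c in cases if toy_router(c.query) != c.expected_skill]
print(f"router accuracy: {accuracy:.2f}")
print("misrouted:", misrouted)
```

The same pattern extends to the other levels the summary mentions: a skill evaluation would compare a skill's output against an expected result, and a convergence check would count how many steps the agent takes to finish a task.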

Outlines

00:16 Introduction to Evaluating AI Agents: Components and Architecture

05:45 Evaluating Routers, Skills, and Convergence in AI Agents

11:17 Evaluating Voice Applications and Real-World Agent Evaluation