Jeff Huber and Jason Liu present a two-part session on analyzing data to improve AI systems. Jeff covers "fast evals" (small sets of query-document pairs) as a quick, inexpensive way to measure retrieval system performance, and suggests using LLMs to generate realistic synthetic queries. He shares a case study with Weights & Biases comparing how different embedding models perform on both ground-truth and synthetically generated queries. Jason then explains how to analyze the outputs of AI systems, such as chatbot conversations, to identify user needs and guide product development. He introduces Cura, a library for summarizing and clustering conversations to extract useful metadata, enabling data-driven decisions about tool development and prompt engineering. The goal is to understand user behavior, segment user types, and make impact-weighted decisions that improve AI application performance and shape product roadmaps.
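The "fast evals" idea can be sketched in a few lines: given (query, relevant document) pairs, rank the corpus for each query and measure how often the relevant document lands in the top k. Everything below is illustrative, not code from the session: the toy `embed` and `similarity` functions stand in for a real embedding model and cosine similarity, and the document texts are invented.

```python
# Hedged sketch of a fast retrieval eval over (query, relevant_doc_id) pairs.
# `embed` and `similarity` are toy stand-ins for an embedding model + cosine sim.

def embed(text: str) -> set[str]:
    # Toy "embedding": a bag of lowercase tokens.
    return set(text.lower().split())

def similarity(a: set[str], b: set[str]) -> float:
    # Jaccard overlap as a stand-in for cosine similarity.
    return len(a & b) / len(a | b) if a | b else 0.0

def recall_at_k(pairs, docs, k=3):
    """pairs: list of (query, relevant_doc_id); docs: dict of doc_id -> text."""
    doc_vecs = {doc_id: embed(text) for doc_id, text in docs.items()}
    hits = 0
    for query, relevant_id in pairs:
        q = embed(query)
        ranked = sorted(doc_vecs, key=lambda d: similarity(q, doc_vecs[d]), reverse=True)
        if relevant_id in ranked[:k]:
            hits += 1
    return hits / len(pairs)

# Invented example corpus and ground-truth pairs (the pairs could instead be
# synthetic queries generated by an LLM from each document, as Jeff suggests).
docs = {
    "d1": "how to log metrics during training runs",
    "d2": "configure sweeps for hyperparameter search",
    "d3": "export model artifacts to a registry",
}
pairs = [
    ("logging training metrics", "d1"),
    ("hyperparameter sweep setup", "d2"),
]
print(recall_at_k(pairs, docs, k=1))  # → 1.0
```

Because the pairs are cheap to score, the same harness can compare several embedding models side by side by swapping out `embed`.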