In this episode of AWS Show & Tell, Trevor Spires and Anil Nadiminti host Jesse Menders and Wally to discuss model evaluations and RAG evaluations for generative AI on AWS. The speakers explain why model evaluation matters for balancing quality, latency, and cost, and for aligning outputs with company style, brand voice, and responsible AI practices. They introduce Amazon Bedrock's Model Evaluation features, including programmatic evaluation, human evaluation, and LLM-as-a-judge, and demonstrate how to use these tools via the console, including custom metrics. The conversation then turns to RAG evaluation, emphasizing the challenges of retrieving relevant context and producing correct, complete, and grounded answers, and the speakers show how to run RAG evaluations with Amazon Bedrock.
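The episode walks through these evaluations in the Bedrock console, but the same jobs can also be started programmatically. Below is a minimal sketch of launching an automated (programmatic) model evaluation job with boto3; the job name, IAM role ARN, S3 URIs, model identifier, and metric names are all placeholders, and the exact request structure should be confirmed against the current Bedrock API reference.

```python
# Minimal sketch: starting an automated model evaluation job in Amazon Bedrock via boto3.
# All ARNs, S3 URIs, and the model identifier are placeholders for illustration only.
import boto3

bedrock = boto3.client("bedrock", region_name="us-east-1")

response = bedrock.create_evaluation_job(
    jobName="qa-accuracy-eval",  # hypothetical job name
    roleArn="arn:aws:iam::123456789012:role/BedrockEvalRole",  # placeholder IAM role
    evaluationConfig={
        "automated": {
            "datasetMetricConfigs": [
                {
                    "taskType": "QuestionAndAnswer",
                    "dataset": {
                        "name": "qa-dataset",
                        "datasetLocation": {"s3Uri": "s3://my-bucket/eval/qa.jsonl"},
                    },
                    # Built-in metrics for the Q&A task type; custom metrics are also supported.
                    "metricNames": ["Builtin.Accuracy", "Builtin.Robustness"],
                }
            ]
        }
    },
    inferenceConfig={
        "models": [
            {"bedrockModel": {"modelIdentifier": "anthropic.claude-3-haiku-20240307-v1:0"}}
        ]
    },
    outputDataConfig={"s3Uri": "s3://my-bucket/eval/results/"},
)

# The job runs asynchronously; results land in the output S3 location when it completes.
print("Evaluation job ARN:", response["jobArn"])
```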