In this episode of AWS Show & Tell, Trevor Spires and Anil Nadiminti host Jesse Menders and Wally to discuss model evaluations and RAG evaluations for generative AI on AWS. The speakers explain why model evaluation matters for balancing quality, latency, and cost, and for aligning outputs with company style, brand voice, and responsible AI practices. They introduce Amazon Bedrock's Model Evaluation features, including programmatic evaluation, human evaluation, and LLM-as-a-judge, and demonstrate how to use these tools via the console, including custom metrics. The conversation then turns to RAG evaluation, emphasizing the challenges of retrieving relevant context and producing correct, complete, and grounded answers, and the speakers show how to run RAG evaluations with Amazon Bedrock.
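The episode walks through these evaluations in the Bedrock console, but the same jobs can also be started programmatically. Below is a minimal sketch of launching an automated (programmatic) model evaluation job with boto3; the job name, IAM role ARN, S3 URIs, model identifier, and metric names are all placeholders, and the exact request structure should be confirmed against the current Bedrock API reference.

```python
# Minimal sketch: starting an automated model evaluation job in Amazon Bedrock via boto3.
# All ARNs, S3 URIs, and the model identifier are placeholders for illustration only.
import boto3

bedrock = boto3.client("bedrock", region_name="us-east-1")

response = bedrock.create_evaluation_job(
    jobName="qa-accuracy-eval",  # hypothetical job name
    roleArn="arn:aws:iam::123456789012:role/BedrockEvalRole",  # placeholder IAM role
    evaluationConfig={
        "automated": {
            "datasetMetricConfigs": [
                {
                    "taskType": "QuestionAndAnswer",
                    "dataset": {
                        "name": "qa-dataset",
                        "datasetLocation": {"s3Uri": "s3://my-bucket/eval/qa.jsonl"},
                    },
                    # Built-in metrics for the Q&A task type; custom metrics are also supported.
                    "metricNames": ["Builtin.Accuracy", "Builtin.Robustness"],
                }
            ]
        }
    },
    inferenceConfig={
        "models": [
            {"bedrockModel": {"modelIdentifier": "anthropic.claude-3-haiku-20240307-v1:0"}}
        ]
    },
    outputDataConfig={"s3Uri": "s3://my-bucket/eval/results/"},
)

# The job runs asynchronously; results land in the output S3 location when it completes.
print("Evaluation job ARN:", response["jobArn"])
```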