In the final episode of the "AI Testing and Evaluation" podcast series, Kathleen Sullivan interviews Amanda Craig Deckard, Senior Director of Public Policy in Microsoft's Office of Responsible AI, to reflect on key insights from the series. They discuss the importance and challenges of AI testing for building trust, managing risk, and enabling innovation across organizations. Amanda highlights three critical takeaways: how testing is actually used, the balance between pre- and post-deployment testing and monitoring, and the rigidity versus adaptability of testing regimes. The conversation traces how testing approaches have evolved in domains such as pharmaceuticals and cybersecurity, emphasizing the need for both pre- and post-deployment strategies and the importance of system-level evaluation. They also address the need to advance rigor, standardization, and interpretability in AI testing, advocating for collaboration across the value chain, public-private partnerships, and further exploration of transparency and information sharing in risk evaluation.