In this podcast interview, Beth Barnes, founder and CEO of METR (Model Evaluation and Threat Research), discusses the weaknesses of current AI model evaluations, particularly hidden chains of thought and the potential for models to deceive evaluators. Beth advocates for more transparency and oversight in AI development, emphasizing the importance of pre-training evaluations and the need to assess models' capabilities before deployment to prevent misuse or theft. She also shares METR's research on measuring AI capabilities over time using human task benchmarks, which reveals exponential growth in AI autonomy. Beth expresses concern about the rapid pace of AI development and the potential for recursively self-improving AI, and urges policymakers and the public to take the risks seriously and consider the ethical implications.
Outline
Part 1: AI Evaluation Limitations
Part 2: Measuring AI Progress
Part 3: AI Capabilities and Research
Part 4: Awareness and Transparency
Part 5: Regulation and Oversight
Part 6: Shifting Strategies and Mitigations
Part 7: Historical Parallels and International Cooperation
Part 8: METR's Role and Future Directions