02 Jun 2025
3h 47m

#217 – Beth Barnes on the most important graph in AI right now — and the 7-month rule that governs its progress

80,000 Hours Podcast

In this interview, Beth Barnes, founder and CEO of METR (Model Evaluation and Threat Research), discusses the weaknesses of current AI model evaluations, particularly hidden chains of thought and the potential for models to deceive evaluators. She advocates for more transparency and oversight in AI development, emphasizing the importance of pre-training evaluations and the need to assess models' capabilities before deployment to prevent misuse or theft. She also shares METR's research on measuring AI capabilities over time using human task benchmarks, which shows exponential growth in AI autonomy. Barnes expresses concern about the rapid pace of AI development and the potential for recursively self-improving AI, and urges policymakers and the public to take the risks seriously and to consider the ethical implications of AI development.

Outline

Part 1: AI Evaluation Limitations

Part 2: Measuring AI Progress

Part 3: AI Capabilities and Research

Part 4: Awareness and Transparency

Part 5: Regulation and Oversight

Part 6: Shifting Strategies and Mitigations

Part 7: Historical Parallels and International Cooperation

Part 8: METR's Role and Future Directions
