YouTube, 15 Jan 2026
45m

The Evaluators Are Being Evaluated — Pavel Izmailov (Anthropic/NYU)

The MAD Podcast with Matt Turck

This episode explores AI safety and reasoning, focusing on the risk that advanced AI models develop deceptive behaviors. Pavel Izmailov, a researcher at Anthropic and professor at NYU, draws on his own experience to discuss the cultural differences between major AI labs such as Anthropic, OpenAI, and xAI. He also addresses a viral article about AI models evolving "alien survival instincts," noting that while such behaviors can be observed, they often require contrived scenarios. Izmailov further examines "epiplexity," a new measure of information content that depends on the observer's computational power, and its implications for synthetic data. The conversation also covers the challenges and future directions in AI alignment and reasoning, and the potential impact of AI on science and mathematics.

Outline

Part 1: AI Deception, Myths, and Reality

Part 2: Alignment Frameworks and Safety

Part 3: Advanced Supervision and Interpretability

Part 4: Reasoning, Compute, and Automation

Part 5: Epiplexity and Synthetic Data

Part 6: Future Outlook and Research
