In this podcast episode, David Hershey of Anthropic discusses Claude Plays Pokémon, a project in which Anthropic's Claude language model plays Pokémon Red. The interview covers the project's origins, its technical implementation (including tools such as a Navigator that compensates for Claude's vision limitations), and the challenges of running a large language model on long-running tasks. Hershey discusses the cost (thousands of dollars in tokens) and what the experiment revealed about Claude's capabilities and limitations, noting that performance improved significantly with newer models. The conversation also touches on potential future applications and on using in-game milestones to evaluate the model's progress. Taken together, the project demonstrates a novel way to benchmark large language models.
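
To make the idea of milestone-based evaluation concrete, here is a minimal Python sketch. It is not the project's actual harness: the milestone list (the eight Kanto gym badges), the `RunResult` type, and the function names are all assumptions chosen for illustration, and the summary does not say which milestones the project tracks.

```python
from dataclasses import dataclass, field
from typing import Optional

# Assumption: milestones are the eight gym badges of Pokémon Red.
# The episode summary does not specify the milestones actually used.
MILESTONES = [
    "Boulder Badge", "Cascade Badge", "Thunder Badge", "Rainbow Badge",
    "Soul Badge", "Marsh Badge", "Volcano Badge", "Earth Badge",
]


@dataclass
class RunResult:
    """Hypothetical record of one playthrough by one model."""
    model: str
    # Milestone name -> number of actions taken when it was first reached.
    reached: dict = field(default_factory=dict)


def milestones_reached(run: RunResult) -> int:
    """Count how many known milestones the run reached (unknown names are ignored)."""
    return sum(1 for name in MILESTONES if name in run.reached)


def actions_to(run: RunResult, milestone: str) -> Optional[int]:
    """Return the action count at which a milestone was reached, or None if it was not."""
    return run.reached.get(milestone)


def rank_runs(runs: list) -> list:
    """Order runs from most to fewest milestones reached."""
    return sorted(runs, key=milestones_reached, reverse=True)
```

Scoring by milestones rather than by a single pass/fail outcome gives a graded signal for a long-running task: two models that both fail to finish the game can still be compared by how far each got and how many actions it needed.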