Dwarkesh interviews Richard Sutton, a founding father of reinforcement learning, to discuss the differences between the RL perspective and the LLM way of thinking about AI. Sutton argues that LLMs mimic people and lack a true understanding of the world, while RL is about understanding and interacting with the world through experience and reward. They debate whether imitation learning provides a good prior for AI and whether LLMs have genuine goals. Sutton emphasizes the importance of continual learning from experience and criticizes the reliance on human knowledge in LLMs. They explore the concept of a general reward function for AI, the role of transition models, and the challenges of transfer learning. Sutton shares his perspective on the evolution of AI, the surprising effectiveness of neural networks in language tasks, and the dominance of simple, basic principles like learning and search. The conversation touches on AI succession, the potential for digital intelligences to help each other, and the importance of cybersecurity in a future of digital spawning and reforming.