Dwarkesh reflects on a previous interview with Richard Sutton, focusing on Sutton's "Bitter Lesson" essay and its implications for AI development. Dwarkesh interprets Sutton's argument as a call for AI techniques that leverage compute effectively, and as a critique of current LLMs for inefficient training methods, reliance on human data, and producing models that predict human responses rather than developing true world models. Dwarkesh pushes back on the sharp distinction Sutton draws between LLMs and true intelligence, arguing that imitation learning and RL are complementary and that models of humans can serve as a stepping stone toward true world models. He suggests that continual learning could be integrated into LLMs, and that while current LLMs have limitations, they are already undergoing RL on ground truth and are paving the way for future AI systems built on Sutton's principles.