This podcast episode explores the challenges and potential of computer vision, particularly in autonomous driving. It discusses researchers' tendency to underestimate the difficulty of computer vision problems and highlights the need for a realistic assessment of the challenges involved. The episode examines vision-based approaches such as Tesla's Autopilot system and the potential for solving driving as a purely vision-based problem. It also covers learning in artificial intelligence and the potential for neural networks to accumulate knowledge. Other topics include the style of computing, the importance of knowledge and schema, the role of interactivity and simulation environments, the interplay between bottom-up and top-down information, the value of segmentation, and the three main components of computer vision: recognition, reconstruction, and reorganization. The episode further examines the connection between child development and computer vision, the relationship between language and vision, the limitations of the Turing test as a measure of intelligence, the trade-off between interpretability and performance in neural networks, and the potential risks and consequences of AI. It features insights from renowned computer vision researcher Jitendra Malik and emphasizes the importance of scientific research and mentorship in the field.