Gemini Robotics – AI for the Physical World, with Keerthana Gopalakrishnan and Ted Xiao of Google DeepMind
"The Cognitive Revolution" | AI Builders, Researchers, and Live Player Analysis
In this episode of The Cognitive Revolution, Nathan Labenz interviews Keerthana Gopalakrishnan and Ted Xiao of Google DeepMind about their work on Gemini Robotics. They discuss recent advances in robotics, compare the field's current state to the GPT-3/3.5 era of language models, and highlight the Gemini Robotics project, which aims to improve robots' reasoning and action capabilities. The conversation covers the architecture of the Gemini Robotics models, including the Embodied Reasoning model and the Vision Language Action model, and touches on the challenges of data collection, safety, and deployment in real-world environments. They also explore the potential of imitation learning, the role of foundation models, and the interplay between hardware and software in robotics.
Part 1: Introduction and Current State
Part 2: Robotics and Language Models
Part 3: Model Architecture and Functionality
Part 4: Failure Modes and Safety
Part 5: Data and Hardware
Part 6: Future Trajectory and Applications
Part 7: Embodiments and Hardware
Part 8: Conclusion