In this episode of The Cognitive Revolution, Nathan Labenz interviews Keerthana Gopalakrishnan and Ted Xiao of Google DeepMind about their work on Gemini Robotics. They discuss recent advances in robotics, comparing the field's current state to the GPT-3/3.5 era of language models, and highlight the Gemini Robotics project, which aims to improve robots' reasoning and action capabilities. The conversation covers the architecture of the Gemini Robotics models, including the Embodied Reasoning model and the Vision Language Action model, and touches on the challenges of data collection, safety, and deployment in real-world environments. They also explore the potential of imitation learning, the role of foundation models, and the interplay between hardware and software in robotics.