Training General Robots for Any Task: Physical Intelligence’s Karol Hausman and Tobi Springenberg

Physical Intelligence is building robotic foundation models that can enable any robot to perform any task. Karol and Tobi explain that robotics has been bottlenecked by intelligence, not hardware, and that the classical approach of breaking robotics down into perception, planning, and control was fundamentally flawed. Their newest model, Pi-Star 0.6, uses reinforcement learning to learn from experience. It achieves robust real-world performance, such as robots making coffee for 13 hours straight, and generalizes across tasks ranging from surgical robots to drone flying. The model architecture is analogous to that of vision language models: it is pre-trained on robotics data and internet data, with an added action model to drive the robot.

Outlines

Part 1: Mission, Context, and the Intelligence Bottleneck

Part 2: Technical Architecture and End-to-End Learning

Part 3: Data Strategy and Reinforcement Learning

Part 4: Pi-Star 0.6 Results and Reliability

Part 5: Scaling, Generalization, and Future Deployment

Part 6: Commercialization and Grand Vision

Sequoia Capital

Part 1: Mission, Context, and the Intelligence Bottleneck

00:48 Physical Intelligence's Mission: Building Foundation Models for Robotics

02:59 Focusing on Intelligence: Why Physical Intelligence Builds Foundation Models

04:44 Hardware vs. Intelligence: Addressing the Bottleneck in Robotics

06:29 Capability, Generalization, and Performance: The Three Factors of Robotic Intelligence

Part 2: Technical Architecture and End-to-End Learning

11:34 Performance Level and Technical Architecture

12:55 Pixels to Actions: The End-to-End Neural Net for Robotics

16:21 From Perception to Actions: The Evolution of Robotics and End-to-End Learning

20:13 Reasoning in Robotics: Planning Actions and Decomposing Tasks

Part 3: Data Strategy and Reinforcement Learning

23:52 Data Nuances: Quality, Diversity, and the Plateau Effect in Robotics

25:36 Pi-Star 0.6: Reinforcement Learning from Experience

27:19 Generalization vs. Performance: Hill Climbing on Reward Signals

29:33 Real-World RL: Modeling the World vs. Modeling the Robot

Part 4: Pi-Star 0.6 Results and Reliability

32:02 Pi-Star 0.6 Results: Reliability and Throughput Improvements

34:51 Customer Deployment Reliability: Reinforcement Learning for Real-World Tasks

36:49 Learning from Experience: A New Capability for Robotics

39:20 Human Corrections: Improving Model Performance

Part 5: Scaling, Generalization, and Future Deployment

41:47 Generalization from Pre-Training: Onboarding New Tasks

44:59 Value Functions: Predicting Success and Failure

46:19 Bootstrapping Deployment: Data and the Long Term

48:15 World Modeling: Counterfactuals and Credit Assignment

Part 6: Commercialization and Grand Vision

49:24 Customer Deployments: Commercialization

51:41 The Grand Vision: Physical Intelligence

53:09 General Purpose Solutions: Physical Intelligence

55:18 Impressive Results: Video Models and General Intelligence

57:54 Learning from Data: The Algorithm