Stanford CS224R Deep Reinforcement Learning | Spring 2025 | Lecture 2: Imitation Learning | Stanford Online

This lecture covers imitation learning, a method for training policies to mimic expert demonstrations. It explains how to represent policy distributions with neural networks and why expressive distributions are needed to capture the multimodality often present in expert data, covering Gaussian mixture models, discretized actions with autoregressive models, and diffusion models. It then addresses compounding errors and covariate shift, and introduces strategies for collecting corrective data online, such as the DAgger algorithm, to improve policy robustness.
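
To make the multimodality point concrete, here is a minimal sketch, not taken from the lecture, assuming a toy setting where the expert steers either left (about -1) or right (about +1) from the same state. All function names and the toy data are hypothetical. It contrasts a single mean prediction, which averages the two modes, with a discretized (categorical) action distribution, one of the expressive options mentioned above.

```python
import numpy as np

def make_bimodal_actions(n=1000, seed=0):
    """Toy expert data (an assumption): actions cluster near -1 and +1."""
    rng = np.random.default_rng(seed)
    signs = rng.choice([-1.0, 1.0], size=n)
    return signs + 0.05 * rng.standard_normal(n)

def fit_mean_policy(actions):
    """Least-squares fit of a single action: the sample mean."""
    return float(np.mean(actions))

def fit_discretized_policy(actions, n_bins=21, low=-1.5, high=1.5):
    """Fit a categorical distribution over action bins by counting."""
    edges = np.linspace(low, high, n_bins + 1)
    counts, _ = np.histogram(actions, bins=edges)
    probs = counts / counts.sum()
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, probs

def sample_discretized(centers, probs, n, seed=1):
    """Draw actions (bin centers) from the fitted categorical distribution."""
    rng = np.random.default_rng(seed)
    return rng.choice(centers, size=n, p=probs)

if __name__ == "__main__":
    actions = make_bimodal_actions()
    print("mean policy action:", fit_mean_policy(actions))
    centers, probs = fit_discretized_policy(actions)
    print("sampled actions:", sample_discretized(centers, probs, 5))
```

In this toy setup the mean policy outputs an action near zero, which the expert never takes, while samples from the discretized distribution stay close to one of the two expert modes.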

Outline

Part 1: Fundamentals, Supervised Learning

Part 2: Expressive Distributions, Generative Models

Part 3: Error Correction, Online Methods

Part 1: Fundamentals, Supervised Learning

00:05 Introduction to Imitation Learning

04:15 Version Zero of Imitation Learning and Potential Issues

Part 2: Expressive Distributions, Generative Models

14:22 Learning Distributions with Neural Networks

23:14 Mixture of Gaussians and Autoregressive Models for Action Prediction

35:04 Training Autoregressive Models and Expressive Power

45:02 Diffusion Models, Expressivity, and Offline Imitation Learning

Part 3: Error Correction, Online Methods

53:35 Challenges in Imitation Learning: Compounding Errors and Corrective Data

1:00:47 Dataset Aggregation (DAgger) and Online Interventions
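
As a rough illustration of the dataset aggregation idea named in the last chapter, the sketch below runs a simplified DAgger-style loop on a toy 1-D system. It is not the lecture's code: the expert, the dynamics, the polynomial features, and every function name are assumptions made for illustration, and this simplified variant always rolls out the learned policy rather than mixing it with the expert.

```python
import numpy as np

def expert_action(x):
    """Hypothetical expert: push the 1-D state back toward zero."""
    return -np.tanh(2.0 * x)

def features(x):
    """Simple polynomial features [x, x^3] (an illustrative choice)."""
    x = np.asarray(x, dtype=float)
    return np.stack([x, x ** 3], axis=-1)

def fit_policy(states, actions):
    """Supervised step: least-squares regression from states to expert actions."""
    w, *_ = np.linalg.lstsq(features(states), np.asarray(actions), rcond=None)
    return w

def policy_action(w, x):
    """Learned policy, clipped to the assumed actuator range [-1, 1]."""
    return float(np.clip(features(x) @ w, -1.0, 1.0))

def rollout(act_fn, x0, horizon, rng, noise=0.1):
    """Run a controller in the toy dynamics and record the visited states."""
    states, x = [], x0
    for _ in range(horizon):
        states.append(x)
        x = x + 0.5 * act_fn(x) + noise * rng.standard_normal()
    return np.array(states)

def dagger(n_iters=5, horizon=50, seed=0):
    rng = np.random.default_rng(seed)
    # Start from expert demonstrations, as in plain behavior cloning.
    states = rollout(expert_action, 1.0, horizon, rng)
    actions = expert_action(states)
    w = fit_policy(states, actions)
    for _ in range(n_iters):
        # 1. Roll out the current learned policy.
        visited = rollout(lambda x: policy_action(w, x), 1.0, horizon, rng)
        # 2. Ask the expert to label the states the learner actually visited.
        # 3. Aggregate the new labels with all earlier data.
        states = np.concatenate([states, visited])
        actions = np.concatenate([actions, expert_action(visited)])
        # 4. Retrain on the aggregated dataset.
        w = fit_policy(states, actions)
    return w, states, actions

if __name__ == "__main__":
    w, states, actions = dagger()
    print("policy weights:", w, "dataset size:", len(states))
```

The key design point is that the training states come from the learner's own rollouts while the labels always come from the expert, so the aggregated dataset covers the states the learned policy actually reaches.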