The podcast discusses model-based reinforcement learning: how to use learned models, including for synthetic data generation, and how to decide when a model-based approach is appropriate. On the planning side, it covers choosing actions with gradient-based or sampling-based optimization, updating the model with newly collected data, and replanning to account for model errors. It also explores using a learned model to train a policy: the model acts as a learned simulator that augments the collected data with synthetic data, and the policy is updated on both real and generated data. The podcast further covers multi-task imitation learning and reinforcement learning, including conditioning on the task, goal-reaching tasks, and an approach called hindsight relabeling. The multi-task discussion focuses on learning a generalist policy conditioned on the task, amortizing complexity across tasks, leveraging structure shared between tasks, and identifying tasks with task identifiers.
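The summary names hindsight relabeling without describing it. In one common formulation for goal-reaching tasks (assumed here, and not necessarily the variant presented in the podcast), a trajectory that missed its intended goal is stored again as if the state it actually reached had been the goal, so the sparse reward becomes informative. The sketch below illustrates that idea; `Transition`, `sparse_reward`, and `relabel_with_final_state` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple

State = Tuple[int, int]  # assumption: states are 2-D grid positions


@dataclass(frozen=True)
class Transition:
    state: State
    action: str
    next_state: State
    goal: State
    reward: float


def sparse_reward(next_state: State, goal: State) -> float:
    """Assumed reward: 1.0 only when the goal is reached, else 0.0."""
    return 1.0 if next_state == goal else 0.0


def relabel_with_final_state(trajectory: List[Transition]) -> List[Transition]:
    """Return a copy of the trajectory whose goal is the state it actually reached.

    Rewards are recomputed against the new goal; the original list is untouched.
    """
    if not trajectory:
        return []
    achieved = trajectory[-1].next_state
    return [
        replace(t, goal=achieved, reward=sparse_reward(t.next_state, achieved))
        for t in trajectory
    ]


# A short trajectory that aimed for (3, 3) but ended at (1, 1).
trajectory = [
    Transition((0, 0), "right", (1, 0), goal=(3, 3), reward=0.0),
    Transition((1, 0), "up", (1, 1), goal=(3, 3), reward=0.0),
]
relabeled = relabel_with_final_state(trajectory)
```

Both the original and the relabeled transitions would typically go into the same replay buffer, so a goal-conditioned policy sees at least one successful example per trajectory.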