In this episode of The TWIML AI Podcast, host Sam Charrington interviews Ashley Edwards, a member of technical staff at RunwayML and former researcher at Google DeepMind, about her work in video generation, specifically Genie, a generative model for interactive environments. Ashley discusses her background in reinforcement learning from videos and imitation learning, explaining how Genie learns actions in an unsupervised way to create an unlimited source of training environments for generalist agents. She details the three major components of Genie (the latent action model, the dynamics model, and the video tokenizer) and how they work together to let users interact with text-generated images, sketches, and real-world photos as if they were playable games. The conversation explores the technical aspects of Genie, including its use of spatiotemporal transformers, and touches on the broader implications of the technology for education, creative tools, and interactive media.