Odyssey's CTO Jeff Hawke joins the podcast to discuss world models, a new category of AI models that predict potential futures as continuous streams of interactive pixels. He details Odyssey2 Pro, a frontier model trained on large-scale public video data, and distinguishes it from LLMs and generative video models. Hawke explains that world models learn how the world evolves from visual observations, enabling use cases in gaming, retail, live events, and robotics. He notes current limitations, such as the difficulty of keeping generated video streams stable over extended periods, and highlights the computational intensity of these models, which requires powerful GPUs. Hawke differentiates world models from spatial intelligence and from proxy world models like Sora, emphasizing world models' potential for real-time inference and content generation. He also touches on the open-source landscape and the future integration of world models into foundation models.
Outline
Part 1: Introduction to World Models
Part 2: Applications and Use Cases
Part 3: Definitions and Categorization
Part 4: Data and Robotics
Part 5: Development and Scalability
Part 6: Technical Challenges and Research
Part 7: Advanced Techniques and Ecosystem
Part 8: Infrastructure and Conclusion