In this episode of the Latent Space podcast, we cover highlights from the 2024 International Conference on Machine Learning (ICML), with a special focus on generative video models. We explore presentations on OpenAI's Sora, Google DeepMind's Genie, and VideoPoet, examining their strengths and weaknesses in producing high-quality, controllable video. The conversation also touches on recent advances in diffusion models and their applications to other modalities, including audio and speech, and on the challenges of evaluating generative models. To wrap up, we discuss the intersection of large language models and computer vision, the importance of data and efficient training techniques in robotics and reinforcement learning, and the need for automated environment shaping.