In this episode of the Eye on AI podcast, Fei-Fei Li discusses spatial intelligence and world models, emphasizing their importance in advancing AI beyond language-based learning. She highlights her startup, World Labs, and its product Marble, which generates 3D spaces from the model's internal representations. Li also touches on the necessity of multimodal learning, in which AI incorporates various sensory inputs to understand and interact with the world more effectively. The conversation explores the Real-Time Frame Model (RTFM) and the concept of a universal task function, drawing parallels with next-token prediction in language models. She envisions future AI systems that combine statistical learning with physics engines to achieve a deeper understanding of the physical world, and she also addresses the potential for AI to contribute to scientific discovery.