YouTube, 27 Aug 2025
54m

Lecture 2 – Data, Structure, Information (MIT How to AI Almost Anything, Spring 2025)

Paul Liang

In this lecture on data, structure, and learning, Paul Liang discusses the role of data in machine learning and AI. He surveys common forms of data (visual, language, auditory, sensing, set, and graph data) using real-world datasets and labels, then generalizes from them to data properties and modeling architectures. The lecture opens with logistics: Piazza enrollment, project preferences, and schedule changes due to Presidents' Day. Liang introduces sensory modalities, the abstractions built on top of raw data, and different ways to represent data, such as bag of words for text and spectrograms for audio. He then explains the main learning paradigms (supervised, unsupervised, and reinforcement learning) along with interactive paradigms such as curriculum learning and human-in-the-loop learning. He closes by stressing that data collection, cleaning, visualization, and the choice of evaluation metrics should come before model selection, which helps avoid overfitting.
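
As a concrete illustration of one representation mentioned in the summary, here is a minimal bag-of-words sketch. It is not taken from the lecture; the function names `bag_of_words` and `to_vector`, the simple regex tokenizer, and the two example sentences are all assumptions made for illustration.

```python
import re
from collections import Counter

def bag_of_words(text):
    """Count word occurrences, ignoring word order and letter case."""
    # Hypothetical tokenizer: lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)

def to_vector(counts, vocabulary):
    """Turn word counts into a fixed-length vector over a shared vocabulary."""
    return [counts[word] for word in vocabulary]

# Illustrative documents (not from the lecture).
docs = ["The cat sat on the mat", "The dog sat"]
counts = [bag_of_words(d) for d in docs]

# The shared vocabulary is the sorted union of all words seen.
vocabulary = sorted(set().union(*counts))
vectors = [to_vector(c, vocabulary) for c in counts]

print(vocabulary)
for v in vectors:
    print(v)
```

Each document becomes a vector of word counts, so word order is discarded and only frequency information is kept, which is the trade-off that gives the representation its name.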

Outline

Part 1: Course Introduction and Data Modalities

Part 2: Data Properties and Learning Paradigms

Part 3: Modeling and Practical Considerations