In this lecture, Paul Liang covers common model architectures in deep learning within a unified framework. He begins with logistics, including project proposal submissions and reading assignments on data and learning, specifically the "bitter lesson" and the "grokking" and "double descent" phenomena.

The lecture then lays out a unified paradigm for viewing different architectures, emphasizing the spectrum from domain-specific to general-purpose models. Liang discusses the key factors that make a good model: capturing semantic information, operating at the right granularity, data usage, resource constraints, and usability. He then turns to multimodal-specific methods, recapping modality profiles and the two main steps in deep learning models: learning representations and combining them.

Using sets and point clouds as examples, he explains data invariances and equivariances, and then surveys common architectures through that lens: temporal and sequence models, transformers, spatial models (CNNs and vision transformers), and graph networks. The lecture concludes with a summary of how to model data effectively, emphasizing data collection, cleaning, normalization, visualization, and evaluation, along with the importance of understanding the invariances and equivariances of one's data.
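Since the summary hinges on invariances and equivariances, a tiny numerical sketch may help make the two properties concrete. The snippet below is an illustration, not code from the lecture; names like `set_encoder` and `circ_conv` are hypothetical, and the weights are random placeholders. It checks permutation invariance for a DeepSets-style set encoder and translation equivariance for circular convolution:

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Permutation INVARIANCE: set / point-cloud models (DeepSets-style).
# A per-element encoder followed by a symmetric pooling (sum) yields an
# output that does not change when the set's elements are reordered.
def set_encoder(points, W):
    feats = np.maximum(points @ W, 0.0)  # per-point linear layer + ReLU
    return feats.sum(axis=0)             # symmetric pooling over the set

points = rng.normal(size=(5, 3))         # a "point cloud" of 5 points in R^3
W = rng.normal(size=(3, 8))              # random placeholder weights
shuffled = points[rng.permutation(5)]    # same set, rows reordered
assert np.allclose(set_encoder(points, W), set_encoder(shuffled, W))

# --- Translation EQUIVARIANCE: convolutional (spatial/temporal) models.
# Circular convolution commutes with cyclic shifts: shifting the input
# shifts the output by the same amount.
def circ_conv(signal, kernel):
    n = len(signal)
    return np.real(np.fft.ifft(np.fft.fft(signal) * np.fft.fft(kernel, n)))

signal = rng.normal(size=16)
kernel = rng.normal(size=3)
assert np.allclose(circ_conv(np.roll(signal, 2), kernel),
                   np.roll(circ_conv(signal, kernel), 2))
```

The same properties motivate the architectures the lecture surveys: attention without positional encodings is permutation-equivariant, and graph networks are built to respect node-relabeling symmetry.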