09 Feb 2025
42m

MLG 033 Transformers


Machine Learning Guide

This Machine Learning Guide podcast episode explains transformer neural networks, focusing on the attention mechanism. The speaker begins by contrasting context-free (traditional) and context-aware neural networks, using examples from housing markets and shipment dispatch. He then details how the attention mechanism lets parts of a network "talk" to each other: embeddings are compared using dot products, so each element is processed in the context of the others. The episode concludes by explaining the differences between self-attention and cross-attention, and introduces concepts such as positional encodings and masking. Listeners gain a foundational understanding of transformers and their advantages over recurrent neural networks, particularly parallelization, which improves computational efficiency.
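The dot-product comparison described above can be sketched in a few lines of plain Python. This is a minimal illustration, not code from the episode: the function names (`softmax`, `dot`, `attention`) and the toy embeddings are hypothetical, and the sketch assumes the standard scaled dot-product form while leaving out the learned query/key/value projections, multiple heads, and positional encodings that a real transformer would use.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1 (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attention(queries, keys, values, causal=False):
    """Scaled dot-product attention over lists of vectors.

    Assumption: queries, keys, and values are used as given; the learned
    projection matrices of a real transformer are omitted for brevity.
    """
    d = len(keys[0])
    outputs, all_weights = [], []
    for i, q in enumerate(queries):
        # Compare this query with every key via a scaled dot product.
        scores = [dot(q, k) / math.sqrt(d) for k in keys]
        if causal:
            # Masking: position i may not look at later positions j > i.
            scores = [s if j <= i else float("-inf")
                      for j, s in enumerate(scores)]
        weights = softmax(scores)
        # Each output is a weighted average of the value vectors.
        out = [sum(w * v[c] for w, v in zip(weights, values))
               for c in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Toy embeddings for three tokens (made-up numbers for illustration).
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Self-attention: queries, keys, and values all come from the same sequence.
self_out, self_weights = attention(embeddings, embeddings, embeddings)

# Masked self-attention: each token attends only to itself and earlier tokens.
masked_out, masked_weights = attention(embeddings, embeddings, embeddings,
                                       causal=True)
```

Cross-attention would use the same function with queries taken from one sequence and keys and values taken from another. Because each query's row of scores is computed independently of the others, all rows can be evaluated in parallel, which is the efficiency advantage over recurrent networks that the summary mentions.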

Outline

Part 1: Introduction to Transformers

Part 2: RNN Limitations & Transformer Architecture

Part 3: Attention Mechanism Deep Dive
