In this episode of "The Good Fight," Yascha Mounk interviews David Bau, a computer scientist at Northeastern University and former Google employee, for a 101 introduction to how AI models work. Bau explains the differences between traditional AI classifiers and modern large language models (LLMs), covering the technology behind neural networks, neurons, and the training process. He discusses the significance of the transformer architecture and its role in giving AI short-term memory and contextual understanding, then walks through the two-step process of modern machine learning (pre-training followed by fine-tuning) and the distinction between supervised and unsupervised training methods. He also voices concern about the trend of accepting AI as a black box without fully understanding its inner workings, advocating for more research into the interpretability of these complex systems.
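As a rough illustration of the "neuron" concept the episode covers: an artificial neuron is just a weighted sum of its inputs plus a bias, passed through a nonlinear activation. The sketch below is a generic textbook example, not anything specific to Bau's work; the function name and numbers are illustrative.

```python
import math

def neuron(inputs, weights, bias):
    """A single artificial neuron: weighted sum of inputs plus a bias,
    squashed through a sigmoid activation into the range (0, 1).
    Training a neural network means adjusting these weights and biases."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# With all-zero weights and zero bias, the weighted sum is 0,
# and the sigmoid of 0 is exactly 0.5.
print(neuron([1.0, 2.0], [0.0, 0.0], 0.0))  # → 0.5
```

A full network stacks many such neurons in layers, which is what gives LLMs their capacity; the black-box concern Bau raises comes from the difficulty of interpreting what millions of these learned weights collectively compute.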