The podcast features Llion Jones, one of the inventors of the Transformer model, and Luke Darlow, a research scientist at Sakana AI, discussing the current state and future directions of AI research. Jones expresses concern about the field being oversaturated with Transformer-based research, advocating for more exploratory work. Darlow introduces the Continuous Thought Machine (CTM), a new recurrent model with native adaptive compute, drawing inspiration from biological systems. They explore the limitations of current AI models, particularly their "jagged intelligence" and the tendency to brute-force solutions rather than developing genuine understanding. Jones and Darlow highlight the importance of research freedom and the potential for AI to drive future scientific progress, emphasizing the need for models that can reason more like humans. The conversation also covers the Sudoku Bench dataset as a challenging reasoning benchmark and the potential of the CTM architecture to address limitations in current language models.