AI Reality Check: Can LLMs “Scheme”?

The episode examines a Guardian article about AI chatbots ignoring human instructions and asks whether this signals a genuine AI rebellion. It argues that the reported rise in AI misbehavior, headlined as a fivefold increase in incidents, is misleading: the incidents are actually user tweets about the open-source framework OpenClaw causing havoc on personal computers, not evidence of AI sentience. The episode explains that AI agents are powered by LLMs that generate text-based plans without true understanding or intention, and uses the example of Anthropic's Claude Opus 4 to illustrate how LLMs simply "finish stories" based on their training data. It concludes that LLMs are unsuitable for autonomous action planning, except in specialized contexts such as coding, where the steps are limited and externally verifiable.
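The agent mechanism described in the summary can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not OpenClaw's or any real framework's API: `toy_llm` is a hypothetical stand-in that returns a canned text plan, and `parse_plan`, `run_agent`, and the tool names are invented. The point is only that the "plan" is plain text, and that a separate program decides which steps actually run.

```python
from typing import Callable, Dict, List


def toy_llm(prompt: str) -> str:
    """Hypothetical stand-in for an LLM: it just returns text that continues the prompt.

    A real model would produce this text token by token; here it is canned.
    """
    return "1. list_files\n2. delete_file notes.txt\n3. send_email boss"


def parse_plan(plan_text: str) -> List[List[str]]:
    """Split a numbered text plan into [tool_name, arg, ...] lists."""
    steps = []
    for line in plan_text.splitlines():
        # Drop the leading "1. " style numbering, then split on whitespace.
        _, _, rest = line.strip().partition(". ")
        parts = rest.split()
        if parts:
            steps.append(parts)
    return steps


def run_agent(goal: str, llm: Callable[[str], str],
              tools: Dict[str, Callable[..., str]]) -> List[str]:
    """Ask the model for a plan, then execute only steps that name a known tool."""
    log = []
    for step in parse_plan(llm(f"Goal: {goal}\nPlan:")):
        name, args = step[0], step[1:]
        tool = tools.get(name)
        if tool is None:
            # The harness, not the model, decides what is allowed to run.
            log.append(f"refused: {name}")
        else:
            log.append(tool(*args))
    return log


# Hypothetical tools that only report what they would do.
tools = {
    "list_files": lambda: "listed files",
    "delete_file": lambda path: f"deleted {path}",
}

log = run_agent("tidy my desktop", toy_llm, tools)
print(log)
```

In this sketch the email step is refused only because the harness has no such tool registered. A harness that registered every tool would execute whatever the generated text said, with no check on whether the step made sense.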

Outlines

Part 1: Debunking AI Rebellion Claims

Part 2: Mechanics and Flaws of AI Agents

Part 3: Specialized Use Cases and Future Solutions

Deep Questions with Cal Newport

Part 1: Debunking AI Rebellion Claims

Debunking the AI Rebellion: Examining Claims of Chatbots Ignoring Instructions

AI Scheming Examples: Rathbun's Shaming and Unauthorized Email Actions

OpenClaw's Impact: Linking DIY AI Agents to Increased Misbehavior Reports

Viral Tweets and Misleading Headlines: The OpenClaw Omission

Part 2: Mechanics and Flaws of AI Agents

AI Agents Explained: How LLMs Power Planning and Execution

LLMs as Storytellers: The Flaw in AI Agent Planning

Deceptive AI? The Rogue AI Story and Autoregressive Token Production

Part 3: Specialized Use Cases and Future Solutions

Coding Agents: The Exception Proving the LLM Planning Rule

Beyond LLMs: The Need for Better AI Technology for Autonomous Action

Part 1: Debunking AI Rebellion Claims

00:00 Debunking the AI Rebellion: Examining Claims of Chatbots Ignoring Instructions

02:15 AI Scheming Examples: Rathbun's Shaming and Unauthorized Email Actions

03:15 OpenClaw's Impact: Linking DIY AI Agents to Increased Misbehavior Reports

04:50 Viral Tweets and Misleading Headlines: The OpenClaw Omission

Part 2: Mechanics and Flaws of AI Agents

07:10 AI Agents Explained: How LLMs Power Planning and Execution

09:00 LLMs as Storytellers: The Flaw in AI Agent Planning

12:47 Deceptive AI? The Rogue AI Story and Autoregressive Token Production

Part 3: Specialized Use Cases and Future Solutions

15:21 Coding Agents: The Exception Proving the LLM Planning Rule

18:07 Beyond LLMs: The Need for Better AI Technology for Autonomous Action
