The Agentic AI Engineer - Benedikt Sanftl, Mutagent | AI Engineer

The "Agentic AI Engineer" framework replaces slow, manual development cycles with automated agentic loops for building and maintaining AI agents at scale. It has two loops: an offline loop for initial specification, build, and evaluation, and an online loop for production monitoring and continuous improvement. With spec-driven and eval-driven development, teams define clear success criteria up front and automate the identification of failure modes. Specialized agents, such as the Evaluator and Diagnostics agents, filter large trace volumes, perform root cause analysis, and generate actionable remedies. Shifting from human-centric debugging to autonomous, loop-based optimization is meant to let organizations deploy hundreds of agents reliably, with systems that keep improving based on real-world performance data and learned failure patterns.
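As a rough illustration of the eval-driven idea described above, the sketch below checks traces against a small spec of success criteria, keeps only the failing traces, and counts the failure modes. It is not Mutagent's implementation: the `Trace` fields, the `SPEC` criteria, the 2-second latency threshold, and the function names are all hypothetical, chosen only to show the shape of such a filter.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Trace:
    """One recorded agent run (hypothetical fields, for illustration only)."""
    trace_id: str
    output: str
    latency_ms: float


# A spec here is just a set of named success criteria (an assumption,
# not a format described in the talk).
Criterion = Callable[[Trace], bool]

SPEC: Dict[str, Criterion] = {
    "non_empty_output": lambda t: bool(t.output.strip()),
    "latency_under_2s": lambda t: t.latency_ms < 2000,
}


def evaluate(trace: Trace, spec: Dict[str, Criterion]) -> List[str]:
    """Return the names of the criteria this trace fails."""
    return [name for name, check in spec.items() if not check(trace)]


def filter_failures(traces: List[Trace], spec: Dict[str, Criterion]) -> Dict[str, List[str]]:
    """Keep only failing traces, mapped to the criteria they failed."""
    failures: Dict[str, List[str]] = {}
    for trace in traces:
        failed = evaluate(trace, spec)
        if failed:
            failures[trace.trace_id] = failed
    return failures


def failure_modes(failures: Dict[str, List[str]]) -> Counter:
    """Count how often each criterion fails across all failing traces."""
    return Counter(name for failed in failures.values() for name in failed)


traces = [
    Trace("t1", "Refund issued.", 850.0),
    Trace("t2", "", 400.0),
    Trace("t3", "Done.", 3100.0),
    Trace("t4", "  ", 2500.0),
]

failures = filter_failures(traces, SPEC)
modes = failure_modes(failures)
print(failures)
print(modes.most_common())
```

In a real system the criteria would likely include model-graded checks rather than simple predicates, and the failure counts would feed a diagnostics step; here they only show how a spec can turn a large set of traces into a short list of failure modes.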

Outlines

00:01 Transitioning to Automated Agentic AI Engineering Workflows

06:00 Spec-Driven Development for Scalable Agent Architecture

11:13 Implementing Eval-Driven Development for Agent Reliability

18:58 Autonomous Diagnostics and Root Cause Analysis

25:02 Orchestrating Agentic Workflows in Production Environments