YouTube

This episode explores the challenges and opportunities of developing and deploying AI agents in enterprise settings. Against the backdrop of rapid advances in large language models (LLMs), the discussion highlights how hard it is to build reliable, safe AI agents that can handle real-world tasks such as booking flights or automating customer service. The panelists emphasize robust evaluation frameworks that cover not only accuracy but also factors such as convergence, router efficiency, and the handling of multimodal inputs (e.g., voice and text). For instance, one speaker details the challenges of building agents at a major credit bureau, stressing the need for rigorous security and compliance measures. Another speaker shows how an AI coding agent helped build itself, illustrating the potential of agents to automate toil and accelerate software development. Emerging industry patterns reflected in the discussion include the growing importance of data curation, the shift from general-purpose agents toward specialized ones, and the need for collaborative human-AI workflows. For businesses, this means moving beyond simple chatbot implementations toward more sophisticated agent systems that deliver tangible business value and address the context paradox, in which seemingly simple tasks prove surprisingly complex for AI.

Outlines

Part 1: Summit Introduction and AI Agent Challenges

Part 2: AI Strategy, Safety, and Coding Agents

Part 3: GenAI Implementation and Integration

Part 4: Evaluation and Enterprise Best Practices

Part 5: Workflow Automation and Agent Evaluation

Part 6: Infrastructure and Implementation Strategies

Part 7: Hiring, Platform Building, and RAG

AI Engineer Summit 2025 - AI Leadership (Day 1)

AI Engineer

Part 1: Summit Introduction and AI Agent Challenges

11:50 AI Engineering Summit 2025 Introduction and Welcome

17:29 The State of the AI Frontier and Challenges in Building AI Agents

Part 2: AI Strategy, Safety, and Coding Agents

34:57 How to Spectacularly Fail at Your AI Strategy

51:59 Building Safe and Reliable AI Agents

1:10:00 AI Coding Agents That Build Themselves

Part 3: GenAI Implementation and Integration

2:06:38 Navigating the Challenges of Implementing GenAI in Large Organizations

2:28:39 Integrating AI Coding Agents into Large-Scale Software Development

Part 4: Evaluation and Enterprise Best Practices

2:49:27 Building Trust in Enterprise AI through Robust Evaluation

3:01:26 Building and Scaling AI Use Cases with OpenAI: Enterprise Best Practices and Agent Development

4:24:24 Frontier Feud: A Family Feud-Style AI Trivia Game

Part 5: Workflow Automation and Agent Evaluation

4:50:37 Missing Pieces for Workflow Automation with AI in Enterprises

5:06:16 Evaluating AI Agents and Assistants for Production Readiness

5:27:59 AI Agents for DevOps: Building the DevOps Engineer Who Never Sleeps

Part 6: Infrastructure and Implementation Strategies

5:44:39 Building Self-Managed AI Networks for Training and Inference

6:07:35 Implementing AI and Best Practices for Enterprise Success with Anthropic's Claude

Part 7: Hiring, Platform Building, and RAG

7:04:40 Strategies for Effective AI Hiring in a Competitive Market

7:26:15 Building LinkedIn's GenAI Platform: A Journey from Simple Features to Multi-Agent Systems

7:43:59 Retrieval Augmented Generation (RAG) Agents in Production: Lessons Learned