YouTube · 24 Sept 2025
1h 22m

Stanford CS229 I Machine Learning I Building Agents That Do the Work of Human Software Engineers

Stanford Online

The podcast explores the construction and utilization of multi-agent systems for software engineering, emphasizing a balance between simplicity and complexity. It begins with basic LLM calls and progresses to multi-agent systems, assessing the problem-solving capabilities at each stage, using a payment system failure as a case study. The discussion covers active critic systems, tool use, and agentic systems, highlighting the importance of planning loops, autonomous execution engines, and memories. Multi-agent systems address limitations like breadth versus depth, planning fragility, and tool overload. The podcast also tackles challenges like multi-agent context, statefulness, and asynchronous activities, advocating for memory and post-training strategies to facilitate learning. It concludes by addressing open questions on tail-patching models and prompting techniques.

Outlines

Part 1: Context, Challenges

Part 2: System Demo, Architecture

Part 3: Agent Autonomy, Limitations

Part 4: Multi-Agent Design, Orchestration

Part 5: Model Selection, Learning

Part 6: Future Outlook, Best Practices
