Software Engineering in the Age of Coding Agents: Testing, Evals, and Shipping Safely at Scale
MLOps.community
This episode explores how software engineering is evolving with AI, focusing on agentic systems in security. It highlights the hybrid nature of these systems: because LLMs are stochastic, building them blends traditional software engineering with data science practices. The conversation covers the challenges of prompt engineering, including guardrails, testing methodologies, and managing context within LLMs. It also addresses the importance of domain knowledge, architectures that combine LLMs with traditional machine learning, and the role of UX design in building user trust. Drawing on experience building agentic systems for security, the guest stresses the need for feedback loops, real-world data, and human oversight.
Part 1: Value, Context, and the Shift to Agentic Systems
Part 2: Optimization, Context, and Hybrid Architectures
Part 3: UX, Debugging, and Prompt Engineering
Part 4: Evaluation, Failure Modes, and Frameworks