How I deleted 95% of my agent skills and got better results — Nick Nisi, WorkOS | AI Engineer

Building AI systems that ship requires moving beyond simple prompting toward robust, verifiable engineering harnesses. By using state machines to manage agent workflows (implementation, verification, and retrospective analysis), developers can check that agents actually completed their tasks rather than accepting hallucinated results. Cryptographic verification of test outputs and UI-based evidence, such as Playwright recordings, replaces blind trust with objective proof. For public-facing tools like the WorkOS CLI, success relies on identifying specific "gotchas" rather than providing exhaustive documentation, because smaller, focused skill sets often outperform bloated, token-heavy instructions. Treating agent failures as bugs in the harness, rather than as one-off errors, allows for continuous improvement. Rigorous evaluation is essential to measure performance and to keep agents productive tools rather than sources of noise or inefficiency in the development pipeline.
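To make the state-machine and verification ideas concrete, here is a minimal sketch in Python. It is not the harness described in the episode: the phase names, the `Workflow` class, and the use of HMAC-SHA256 to sign test output are all assumptions chosen for illustration. The idea is that the harness, not the agent, runs the tests and signs the raw output, so a claimed result can be checked rather than trusted.

```python
import hashlib
import hmac
from enum import Enum


class Phase(Enum):
    # Hypothetical phase names mirroring the summary's three stages.
    IMPLEMENT = "implement"
    VERIFY = "verify"
    RETRO = "retro"
    DONE = "done"


# Allowed transitions: failed verification sends work back to implementation.
TRANSITIONS = {
    Phase.IMPLEMENT: {Phase.VERIFY},
    Phase.VERIFY: {Phase.IMPLEMENT, Phase.RETRO},
    Phase.RETRO: {Phase.DONE},
    Phase.DONE: set(),
}


def sign(test_output: str, key: bytes) -> str:
    """Harness side: sign the raw test output the harness captured itself."""
    return hmac.new(key, test_output.encode(), hashlib.sha256).hexdigest()


def is_authentic(test_output: str, signature: str, key: bytes) -> bool:
    """Check that reported output matches what the harness signed."""
    return hmac.compare_digest(sign(test_output, key), signature)


class Workflow:
    def __init__(self, key: bytes):
        self.key = key  # assumed to be held by the harness, never by the agent
        self.phase = Phase.IMPLEMENT

    def advance(self, target: Phase) -> None:
        if target not in TRANSITIONS[self.phase]:
            raise ValueError(
                f"illegal transition {self.phase.name} -> {target.name}"
            )
        self.phase = target

    def submit_evidence(self, test_output: str, signature: str, passed: bool) -> None:
        """During VERIFY, accept only signed, passing evidence; otherwise go back."""
        if self.phase is not Phase.VERIFY:
            raise ValueError("evidence is only accepted during VERIFY")
        ok = passed and is_authentic(test_output, signature, self.key)
        self.advance(Phase.RETRO if ok else Phase.IMPLEMENT)
```

In this sketch a forged or missing signature returns the workflow to `IMPLEMENT`, so the failure surfaces as a harness-level event that can be inspected later rather than as a silent false success.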

Outlines

00:14 Scaling Agentic Workflows and Building Robust Internal Harnesses

06:00 Optimizing Public-Facing AI Tools through Evals and Focused Skills

10:40 Strategic Principles for Developing Reliable AI-Native Systems