The discussion centers on the interplay between AI development and traditional systems engineering, asking whether the "bitter lesson" of simply scaling compute will continue to dominate or whether engineering rigor will become essential. Ankur Goyal, founder and CEO of Braintrust, draws on his experience to explain why well-engineered testing and feedback loops (evals) remain necessary for managing AI model complexity, even as models improve. Goyal and Martin Casado explore the dynamics between closed and open-source AI models, noting that Chinese models show high token usage but lower dollar spend. They also discuss the balance between model quality and engineering efficiency, suggesting that enterprises may face a limit in their capacity to absorb ever-improving AI capabilities. The conversation also touches on the surprising finding that SQL outperforms Bash in agent benchmarks, highlighting the value of computer science fundamentals in AI development.