08 Jan 2026
1h 18m

Artificial Analysis: Independent LLM Evals as a Service, with George Cameron and Micah Hill-Smith

Latent Space: The AI Engineer Podcast

Artificial Analysis founders Micah Hill-Smith and George Cameron discuss the evolution and future of AI benchmarking with Swyx. They trace the company's path from a side project to a provider of independent AI model analysis and stress the importance of objective metrics. They cover the business model, which includes enterprise subscriptions and private benchmarking, and the tech stack behind the public benchmarks. The conversation explores the nuances of AI model evaluation, including cost considerations, the challenges of parsing model responses, and the need to control for variance in benchmark results. They also introduce new metrics such as the Omniscience Index, which measures hallucination, and discuss how the cost of AI intelligence is falling even as overall spending rises because of new use cases.

Outlines

Part 1: Origins, Mission, and Business Model

Part 2: The Science of Independent Benchmarking

Part 3: The Intelligence Index and Market Landscape

Part 4: Knowledge, Hallucination, and Physics

Part 5: Model Architecture and Parameters

Part 6: Agentic Performance and Real-World Tasks

Part 7: Openness and Transparency

Part 8: Economics and Token Efficiency

Part 9: Future Outlook and Community
