
Artificial Analysis founders Micah Hill-Smith and George Cameron discuss the evolution and future of AI benchmarking with Swyx. They trace the company's growth from a side project into a provider of independent AI model analysis, emphasizing their commitment to unbiased metrics. They address the challenges of evaluating AI models, including cost, prompt engineering, and data contamination, and introduce new metrics such as the Omniscience Index, which is aimed at hallucination. They explore the balance between openness and intelligence in AI models, and the trend of falling per-model costs alongside rising overall AI spending. They also preview upcoming additions to their Intelligence Index, including agentic performance and new evaluation datasets.