29 Aug 2024
1h 10m

Why you should write your own LLM benchmarks — with Nicholas Carlini, Google DeepMind


Latent Space: The AI Engineer Podcast

This podcast episode features Nicholas Carlini, a research scientist at Google DeepMind, who discusses adversarial machine learning security and the practical use of large language models (LLMs). Carlini pairs a playful, exploratory approach with rigorous analysis, examining both what AI systems can do and where they break. He advocates a grounded outlook on LLMs: use them as practical tools while staying vigilant about their limitations, particularly security vulnerabilities. Throughout the discussion, Carlini stresses the need for tailored evaluation benchmarks, built around the tasks you actually care about, and for probing AI's dark side in order to navigate and improve the safety of emerging technologies.
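The episode's central recommendation, writing benchmarks from your own use cases, can be made concrete with a small sketch. The harness below is not taken from the episode: the test prompts, checkers, and `dummy_model` stand-in are all hypothetical, and a real setup would swap in an actual LLM client.

```python
"""A minimal sketch of a personal LLM benchmark harness.

Each test pairs a prompt with a checker that decides whether the
model's answer is acceptable. The tests here are hypothetical
placeholders; in practice you would draw them from your own daily tasks.
"""
from typing import Callable

# A test is a (prompt, checker) pair.
Test = tuple[str, Callable[[str], bool]]

# Hypothetical test cases; replace with prompts you actually rely on.
TESTS: list[Test] = [
    ("Write a Python expression that reverses a string s.",
     lambda out: "s[::-1]" in out),
    ("What is 17 * 23? Answer with just the number.",
     lambda out: "391" in out),
]

def run_benchmark(model: Callable[[str], str]) -> float:
    """Run every test against `model` and return the pass rate."""
    passed = 0
    for prompt, check in TESTS:
        ok = check(model(prompt))
        passed += ok
        print(f"{'PASS' if ok else 'FAIL'}: {prompt}")
    return passed / len(TESTS)

if __name__ == "__main__":
    # Stand-in model for demonstration; swap in a real LLM client here.
    def dummy_model(prompt: str) -> str:
        return "s[::-1]" if "reverses" in prompt else "391"

    print(f"pass rate: {run_benchmark(dummy_model):.0%}")
```

The design point is that each test encodes a task you personally depend on, so the pass rate tracks the model's usefulness to you rather than its standing on a public leaderboard.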