15 Jan 2024
6m

[Linkpost] “Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training” by evhub, Carson Denison, Meg, Monte M, David Duvenaud, Nicholas Schiefer, Ethan Perez

LessWrong (30+ Karma)
