“Sabotage Evaluations for Frontier Models” by David Duvenaud, evhub, Joe Benton, Misha Wagner, Eric Christiansen, Ethan Perez, Buck, HoldenKarnofsky, Sam Bowman | LessWrong (30+ Karma) | Podwise