17 Mar 2025
16m
“Notable utility-monster-like failure modes on Biologically and Economically aligned AI safety benchmarks for LLMs with simplified observation format” by Roland Pihlakas, Sruthi Kuriakose
LessWrong (30+ Karma)

