LessWrong (30+ Karma) - “Frontier Models are Capable of In-context Scheming” by Marius Hobbhahn, AlexMeinke, Bronson Schoen