LessWrong (30+ Karma) - “Opus 4.6 Reasoning Doesn’t Verbalize Alignment Faking, but Behavior Persists” by Daan Henselmans, Arno Libert, LennardZ