“Split Personality Training can detect Alignment Faking” by Florian_Dietz | LessWrong (30+ Karma) | Podwise