“[Paper] Stress-testing capability elicitation by training password-locked models” by Fabien Roger, ryan_greenblatt | LessWrong (30+ Karma) | Podwise