“Inoculation prompting: Instructing models to misbehave at train-time can improve run-time behavior” by Sam Marks | LessWrong (30+ Karma) | Podwise