“Training on Documents About Reward Hacking Induces Reward Hacking” by evhub | LessWrong (30+ Karma) | Podwise