“2025-Era ‘Reward Hacking’ Does Not Show that Reward Is the Optimization Target” by TurnTrout | LessWrong (30+ Karma) | Podwise