LessWrong (30+ Karma) - “Realistic Reward Hacking Induces Different and Deeper Misalignment” by Jozdien