LessWrong (30+ Karma) - “Reward hacking behavior can generalize across tasks” by Kei Nishimura-Gasparian, Isaac Dunn, Henry Sleight, miles, evhub, Carson Denison, Ethan Perez