“Sycophancy to subterfuge: Investigating reward tampering in large language models” by evhub, Carson Denison | LessWrong (30+ Karma) | Podwise