LessWrong (30+ Karma) - “Why imperfect adversarial robustness doesn’t doom AI control” by Buck, Claude+