LessWrong (30+ Karma) - “Cross-Layer Transcoders are incentivized to learn Unfaithful Circuits” by Georg Lange, RGRGRG, Kat Dearstyne, Kamal Maher