LessWrong (30+ Karma) - “Steering Language Models with Weight Arithmetic” by Fabien Roger, constanzafierro