LessWrong (30+ Karma) - “Concept Poisoning: Probing LLMs without probes” by Jan Betley, jorio, dylan_f, Owain_Evans