“Discovering Backdoor Triggers” by andrq, Tim Hua, Sam Marks, Arthur Conmy, Neel Nanda | LessWrong (30+ Karma) | Podwise