“[Paper] AI Sandbagging: Language Models can Strategically Underperform on Evaluations” by Teun van der Weij, Felix Hofstätter, Ollie J, Sam F. Brown, Francis Rhys Ward | LessWrong (30+ Karma) | Podwise