LW - New paper shows truthfulness & instruction-following don't generalize by default by joshc | The Nonlinear Library | Podwise