“Vestigial reasoning in RL” by Caleb Biddulph | LessWrong (30+ Karma) | Podwise