“instruction tuning and autoregressive distribution shift” by nostalgebraist | LessWrong (30+ Karma) | Podwise