LessWrong (30+ Karma) - “Tensor-Transformer Variants are Surprisingly Performant” by Logan Riggs