“Decomposing the QK circuit with Bilinear Sparse Dictionary Learning” by keith_wynroe, Lee Sharkey | LessWrong (30+ Karma) | Podwise