Xiaol.x - Beyond Homogeneous Attention: Memory-Efficient LLMs via Fourier-Approximated KV Cache