ARWKV: Pretrain is not what we need, an RNN-Attention-Based Language Model Born from Transformer | Xiaol.x | Podwise