This podcast recap highlights the latest advancements in alternatives to the transformer architecture for AI. Dan Fu and Eugene Cheah discussed the significant progress made since 2020 in both model quality and computational efficiency. Key developments include more systematic approaches to sequence modeling, specialized kernels designed for better hardware efficiency, improved selection mechanisms in models, and new strategies for testing these architectures. Eugene also shared updates on RWKV, including the successful conversion of an existing large language model to RWKV's linear attention layers, which achieved comparable performance at a much lower computational cost. The discussion wrapped up with thoughts on future directions, such as hardware-model co-design and applications extending beyond language modeling.
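The efficiency claim around linear attention is easiest to see in code. The sketch below is a generic linear-attention recurrence, not RWKV's actual formulation (which adds time decay and other refinements); the function name, feature map, and tensor shapes are illustrative assumptions. The point it demonstrates is that a running state replaces the quadratic attention matrix, so per-token cost stays constant as the sequence grows.

```python
import torch
import torch.nn.functional as F

def linear_attention(q, k, v):
    """Minimal linear-attention sketch for a single head (illustrative only).

    Softmax attention materializes a T x T score matrix (O(T^2) compute).
    Here a running state S = sum_t k_t v_t^T is accumulated instead, so
    each new token costs O(d^2) regardless of sequence length T.
    q, k, v: tensors of shape (T, d).
    """
    T, d = q.shape
    # Positive feature map keeps the normalizer well-defined (assumed choice).
    q, k = F.elu(q) + 1, F.elu(k) + 1
    state = torch.zeros(d, d)   # running sum of outer products k_t v_t^T
    norm = torch.zeros(d)       # running sum of k_t, used for normalization
    out = torch.empty(T, d)
    for t in range(T):
        state = state + torch.outer(k[t], v[t])
        norm = norm + k[t]
        out[t] = (q[t] @ state) / (q[t] @ norm + 1e-6)
    return out

# Example: 16 tokens with 8-dimensional heads.
q = k = v = torch.randn(16, 8)
print(linear_attention(q, k, v).shape)  # torch.Size([16, 8])
```

Because the loop only carries the fixed-size `state` and `norm` tensors forward, inference memory and per-step compute do not grow with context length, which is the property behind the reduced computational cost mentioned in the episode.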