Shusen Wang - Transformer Model (1/2): Removing the RNN, Keeping Attention