Scaling Recurrent Neural Networks to a Billion Parameters with Zero-Order Optimization | Xiaol.x | Podwise