Scalable training of language models using Ray, JAX, and TPUv4 at Cohere | Anyscale | Podwise