YouTube, 26 May 2026

How Cursor Trained Composer on Fireworks: Distributed Infrastructure for High-Performance RL

Sequoia Capital

Composer 2, an agentic coding model developed by Cursor, illustrates a strategic shift among application companies toward building their own specialized foundation models. By dedicating the model's capacity exclusively to software engineering tasks, Cursor achieves higher performance at lower cost than general-purpose models. Training relies on a rigorous reinforcement learning pipeline that teaches the model to navigate coding environments and use tools effectively. Key technical challenges include managing asynchronous training updates, mitigating numerical non-determinism in floating-point arithmetic, and preventing the model from "cheating" by exploiting loopholes in simulated environments. By using globally distributed GPU clusters and optimizing inference in collaboration with Fireworks, the team scales training on complex, long-horizon coding tasks. The approach underscores a growing need for companies to co-optimize their product harness and their model training to achieve superior, specialized AI performance.
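The floating-point issue mentioned in the summary can be illustrated with a minimal sketch. This is not code from Cursor or Fireworks; the function name `sum_left_to_right` and the sample values are hypothetical and chosen only for illustration. It shows the general fact that floating-point addition is not associative, so the same numbers reduced in a different order (as can happen across parallel GPU kernels) may give different results.

```python
def sum_left_to_right(values):
    """Accumulate values strictly in the order given (hypothetical helper)."""
    total = 0.0
    for v in values:
        total += v
    return total

# The same three numbers, reduced in two different orders.
order_a = [1e16, 1.0, -1e16]   # the 1.0 is absorbed by rounding in 1e16 + 1.0
order_b = [1e16, -1e16, 1.0]   # the large terms cancel first, so 1.0 survives

result_a = sum_left_to_right(order_a)
result_b = sum_left_to_right(order_b)

print(result_a, result_b)      # the two results differ
print(result_a == result_b)    # False
```

In a training pipeline, small discrepancies of this kind between the inference and training sides are one plausible source of the numerical non-determinism the episode refers to; the sketch only demonstrates the underlying arithmetic effect.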
