MiniMax M2 is an open-weight mixture-of-experts model that activates about 10 billion parameters per token (of roughly 230 billion in total) and is engineered specifically for coding and agentic workplace tasks. It was developed by a team in which researchers and expert developers work side by side, and it is trained with scaled environments and "expert developer" reward models to improve reliability in real-world programming workflows such as bug fixing and repository refactoring. A core innovation is interleaved thinking: within a single turn, the model can alternate between reasoning and tool calling up to a hundred times, which lets it adapt to noisy environments and unexpected tool errors. To generalize across different agent scaffolds, the training data pipeline perturbs the operational space, including system prompts and chat templates. Because of its small active-parameter footprint and low cost, M2 also makes multi-agent setups practical: several model instances can work in parallel on complex research and reporting tasks with minimal human intervention.
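The interleaved-thinking loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not MiniMax's implementation: `Step`, `run_turn`, `stub_policy`, and `flaky_read` are hypothetical names, and the scripted `stub_policy` stands in for the model. The sketch shows only the control flow: reason, call a tool, feed the result (or the error) back into the context, and repeat until the model answers or a budget of 100 steps runs out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

MAX_STEPS = 100  # "up to a hundred" reason/act cycles per turn, per the summary

History = List[Tuple[str, str]]  # (role, text) pairs kept in the turn's context


@dataclass
class Step:
    """One model output (hypothetical shape): a thought, then a tool call or an answer."""
    thought: str
    tool: Optional[str] = None  # None means the model is ready to answer
    arg: str = ""
    answer: str = ""


def run_turn(policy: Callable[[History], Step],
             tools: Dict[str, Callable[[str], str]],
             task: str,
             max_steps: int = MAX_STEPS) -> Tuple[str, History]:
    """Alternate reasoning and tool calls within a single turn."""
    history: History = [("user", task)]
    for _ in range(max_steps):
        step = policy(history)
        history.append(("think", step.thought))  # reasoning stays in context
        if step.tool is None:
            history.append(("answer", step.answer))
            return step.answer, history
        try:
            result = tools[step.tool](step.arg)
        except Exception as exc:
            # Tool failures are returned as observations, not raised,
            # so the next reasoning step can react to them.
            result = f"ERROR: {exc}"
        history.append(("tool", result))
    return "", history  # step budget exhausted without an answer


def flaky_read(path: str) -> str:
    """Toy tool (assumption): fails on one path to simulate a noisy environment."""
    if path == "src/app.py":
        raise FileNotFoundError(path)
    return "def main(): ..."


def stub_policy(history: History) -> Step:
    """Scripted stand-in for the model: retries with another path after an error."""
    role, text = history[-1]
    if role == "user":
        return Step("Start by reading the file.", tool="read", arg="src/app.py")
    if role == "tool" and text.startswith("ERROR"):
        return Step("That path failed; try the alternative.", tool="read", arg="app.py")
    return Step("I have the file contents.", answer=f"read ok: {text}")


if __name__ == "__main__":
    answer, history = run_turn(stub_policy, {"read": flaky_read}, "fix the bug")
    for role, text in history:
        print(f"{role:>6}: {text}")
```

The design point is that each tool result, including a failure, is appended to the context before the next reasoning step, which is what lets the loop recover from the simulated file error instead of aborting the turn.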