MiniMax M2, M2.1 & M2.5 — MoE models for coding and agents
MiniMax M2, M2.1, and M2.5 on DeepMask. Expert MoE models for full-stack coding, mobile development, and EU-hosted agentic workflows with interleaved thinking.
Use this file to discover all available pages before exploring further.
MiniMax’s M2 model family brings a distinct agentic philosophy to DeepMask: all three models are built around Interleaved Thinking, maintaining coherent state across multi-turn tool interactions without logic drift. MiniMax M2 focuses on full-stack development and office automation. M2.1 extends this to mobile app development and 3D visualization. MiniMax M2.5 (Infercom) adds EU hosting and a 1M-token context window optimized for long-running autonomous agents.
MiniMax M2 is an expert-level Mixture-of-Experts model built from the ground up for agentic workloads. It introduces Interleaved Thinking, natively using internal planning steps to keep its reasoning separate from its final output. Trained via a Forge RL framework across 200,000+ complex environments, it is highly optimized for agentic loops — tasks where the model must search, act, and reason repeatedly to solve a problem.
MiniMax M2 provides native support for generating and editing high-fidelity Office documents (Word, PowerPoint, Excel) — a capability not found in most other models on DeepMask.
Autonomous office assistants — Build complex financial models in Excel or strategy decks in PowerPoint from natural language instructions.
Full-stack web development — Write 1,000+ line TypeScript files with an 80%+ first-run pass rate.
Strategy consulting — Synthesize massive market datasets into professional presentations automatically.
Agent scaffolding — Build reliable multi-step agentic systems that loop across search, code execution, and document generation.
Use MiniMax M2 when your workflow involves repeated search-act-reason cycles, especially tasks that produce Office documents or require long-horizon coherence across many tool calls.
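The search-act-reason cycle described above can be sketched as a minimal harness. Everything below is illustrative: the tool names, message shapes, and the stub standing in for a MiniMax M2 call are assumptions, not the DeepMask API, so the control flow is visible without a live endpoint.

```python
# Minimal search-act-reason agent loop (illustrative sketch).
# stub_model stands in for a MiniMax M2 completion; with Interleaved
# Thinking, the real model keeps its plan coherent across these turns.

def search_tool(query: str) -> str:
    # Stub: a real agent would call a search API here.
    return f"results for {query!r}"

TOOLS = {"search": search_tool}  # hypothetical tool registry

def stub_model(messages: list[dict]) -> dict:
    # First turn: the model decides to act (call a tool).
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "search", "args": {"query": "market overview"}}
    # After seeing tool output: the model reasons and answers.
    return {"answer": "final synthesized answer"}

def run_agent(task: str, max_steps: int = 5) -> str:
    messages = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        reply = stub_model(messages)
        if "answer" in reply:                 # reason -> done
            return reply["answer"]
        result = TOOLS[reply["tool"]](**reply["args"])  # act
        messages.append({"role": "tool", "content": result})
    raise RuntimeError("agent did not converge")

print(run_agent("Summarize the dataset"))
```

Swapping the stub for a real client call is the only change needed to run this loop against a hosted model; the loop structure itself is what M2's training across repeated search-act-reason environments targets.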
MiniMax M2.1 is a specialized model designed to close the mobile development gap in AI. While most models focus on Python and web, M2.1 is fine-tuned for Swift (iOS), Kotlin (Android), and 3D visualization (Three.js). It is the premier model for “Vibe Coding” — describing an app’s aesthetic and interaction logic and having the AI build the entire functional package, including backend logic.
MiniMax M2.1 is the strongest model on DeepMask for native iOS and Android app development. Its training focus on Swift and Kotlin sets it apart from general-purpose coding models.
Rapid app prototyping — Turn a two-paragraph idea into a downloadable iOS or Android mockup.
Game development tools — Create browser-based 3D simulations and mini-games with physics-aware logic.
Enterprise office automation — Develop custom internal tools for complex Excel and CRM data management.
Full-stack vibe coding — Describe an app’s look and feel in natural language and receive complete, functional source code.
MiniMax M2.1 is the best choice on DeepMask for mobile app development and 3D web experiences. If your project targets iOS, Android, or requires Three.js, this model will significantly outperform general-purpose alternatives.
MiniMax M2.5 (Infercom) is a 229B-parameter Mixture-of-Experts model built on a Hybrid Attention architecture (a 7:1 ratio of Lightning to SoftMax attention) that scales linearly with context length. The Infercom variant is EU-hosted and specifically optimized for sub-second responses in messaging-based autonomous agents and high-traffic production systems.
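A 7:1 ratio like this is commonly realized by interleaving one full softmax-attention layer after every seven linear-attention layers. The exact placement in M2.5 is not documented here, so the sketch below only illustrates the arithmetic of such a layout, not the model's actual schedule.

```python
# Illustrative layer layout for a 7:1 hybrid attention stack.
# The placement rule (every 8th layer is softmax) is an assumption;
# the real M2.5 schedule may differ.

def hybrid_layout(n_layers: int, ratio: int = 7) -> list[str]:
    """Mark every (ratio+1)-th layer as full softmax attention;
    the rest use linear 'lightning' attention."""
    return [
        "softmax" if (i + 1) % (ratio + 1) == 0 else "lightning"
        for i in range(n_layers)
    ]

layout = hybrid_layout(16)
print(layout.count("lightning"), layout.count("softmax"))  # 14 2
```

The design intuition is that the sparse softmax layers restore global token mixing while the linear layers keep per-token cost constant, which is what makes the 1M-token window economical.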
MiniMax M2.5 (Infercom) is EU-hosted via Infercom, providing European data residency for organizations with compliance requirements. Image input is not supported on this variant.
24/7 messaging agents — Run high-traffic customer support and sales bots where cost-per-token is a critical business factor.
Full-stack vibe coding — Prototype and iterate on code generation tasks with a 1M-token context for large codebases.
Persistent memory systems — Build long-running AI assistants that remember context across extended sessions.
Efficient RAG — Power retrieval-augmented generation pipelines at scale with EU data residency.
Use MiniMax M2.5 (Infercom) for production agentic systems that need EU hosting, a massive context window, and high throughput at reasonable cost. Its 1M-token context and linear scaling make it well-suited for persistent, long-running assistants.
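As a concrete sketch, a chat-style request to the Infercom-hosted variant might look like the payload below. The model id and field names are assumptions in the style of common chat-completions APIs, not confirmed DeepMask values; the pattern shown is inlining retrieved documents directly, relying on the 1M-token window rather than aggressive truncation.

```python
# Hypothetical request payload for MiniMax M2.5 (Infercom).
# Model id and field names are illustrative assumptions, not
# documented DeepMask API values.

def build_request(system: str, user: str, context_docs: list[str]) -> dict:
    # Long-context RAG pattern: concatenate retrieved documents into
    # the prompt instead of trimming them to fit a small window.
    context = "\n\n".join(context_docs)
    return {
        "model": "minimax-m2.5-infercom",  # hypothetical model id
        "messages": [
            # Text-only: this variant does not accept image input.
            {"role": "system", "content": system},
            {"role": "user", "content": f"{context}\n\n{user}"},
        ],
        "max_tokens": 2048,
    }

req = build_request(
    "You are a support agent.",
    "Summarize the customer's open tickets.",
    ["ticket #1 ...", "ticket #2 ..."],
)
print(req["model"], len(req["messages"]))
```

Because the variant is text-only, the messages carry no image parts; everything the agent needs must arrive as text in the context window.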