MoonshotAI’s Kimi K2 family targets agentic AI workloads. Both models are built on a 1-trillion-parameter Mixture-of-Experts architecture and natively support Agent Swarm Mode, which coordinates up to 100 parallel sub-agents to decompose and execute complex tasks. Kimi K2 (DeepMask) is optimized for deep reasoning and long-context retrieval at 2M tokens, while Kimi K2.5 adds a focus on visual-to-code generation and full-stack prototyping from UI screenshots and video.
About
Kimi K2 (DeepMask) is MoonshotAI’s large Mixture-of-Experts model. It is the first model to natively support Agent Swarm Mode, in which the main model coordinates up to 100 specialized sub-agents working in parallel. Of its 1 trillion total parameters, it activates only 32B per request, so about 3.2% of the weights are used for any given request. It supports a 2M-token context window and uses MoonViT for native multimodal processing across text, images, and video.
Kimi K2 (DeepMask) is available directly on DeepMask with the full 2M-token context window. Multi-Head Latent Attention enables 256K-token support on standard hardware.
Key Capabilities
Agent Swarm Mode
Decomposes a complex task into up to 100 sub-tasks and executes them in parallel.
Native Multimodal
Processes text, images, and video natively through MoonViT.
2M Token Context
Handles up to 2 million tokens, enabling analysis of massive documents and long-running agent sessions.
Stable Execution
Maintains coherence across 300+ sequential tool calls without logic drift.
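
The fan-out pattern behind a swarm (split a task, run the pieces concurrently, cap the number in flight) can be sketched in plain Python. This is only an illustration of the concurrency pattern, not DeepMask's or MoonshotAI's implementation: `run_subagent`, `run_swarm`, and `swarm` are hypothetical names, and the sub-agent is a local stub rather than a model call. The cap of 100 is taken from the description above.

```python
import asyncio

MAX_SUBAGENTS = 100  # cap taken from the capability description above


async def run_subagent(subtask: str) -> str:
    """Hypothetical stand-in for one sub-agent; a real one would call a model."""
    await asyncio.sleep(0)  # yield control, simulating asynchronous work
    return f"done: {subtask}"


async def run_swarm(subtasks: list[str], limit: int = MAX_SUBAGENTS) -> list[str]:
    """Run sub-tasks concurrently, with at most `limit` in flight at once."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(subtask: str) -> str:
        async with semaphore:
            return await run_subagent(subtask)

    # asyncio.gather returns results in the same order as its inputs.
    return await asyncio.gather(*(guarded(s) for s in subtasks))


def swarm(subtasks: list[str]) -> list[str]:
    """Synchronous entry point for the sketch."""
    return asyncio.run(run_swarm(subtasks))
```

Calling `swarm(["summarize A", "summarize B"])` returns one result per sub-task in input order. The semaphore is what enforces the limit, so passing more than 100 sub-tasks queues the extras instead of failing.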
Use Cases
- Massive research synthesis — Search hundreds of web sources simultaneously to compile comprehensive reports.
- Vision-to-code — Upload a UI walkthrough video and have Kimi K2 rebuild the entire website structure.
- Batch data processing — Analyze thousands of legal or medical records in a single swarm session.
- Multi-step agent workflows — Run long-horizon tasks across tools, documents, and APIs without losing track of the goal.
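
For the batch-processing use case, it helps to check how many records fit in one context window before submitting them. The sketch below packs records greedily under a token budget. The helper names are hypothetical, and the 4-characters-per-token ratio is a rough assumption rather than a property of the Kimi K2 tokenizer; a real tokenizer should be used for exact counts.

```python
MAX_CONTEXT_TOKENS = 2_000_000  # context window stated on this page
CHARS_PER_TOKEN = 4  # rough assumption, not the model's real tokenizer


def estimate_tokens(text: str) -> int:
    """Estimate the token count from character length (ceiling division)."""
    return -(-len(text) // CHARS_PER_TOKEN)


def pack_batches(records: list[str], budget: int = MAX_CONTEXT_TOKENS) -> list[list[str]]:
    """Greedily group records so each batch stays within `budget` tokens."""
    batches: list[list[str]] = []
    current: list[str] = []
    used = 0
    for record in records:
        cost = estimate_tokens(record)
        if cost > budget:
            raise ValueError("a single record exceeds the token budget")
        if current and used + cost > budget:
            # The current batch is full; start a new one.
            batches.append(current)
            current, used = [], 0
        current.append(record)
        used += cost
    if current:
        batches.append(current)
    return batches
```

In practice the budget should be set below the full window to leave room for the prompt and the model's output.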
Specifications
| Specification | Value |
|---|---|
| Model Provider | MoonshotAI |
| Main Use Cases | Visual Debugging, Agent Swarms |
| Reasoning Effort | High |
| GPQA Diamond | 87.6% |
| Max Context | 2.0M Tokens |
| Latency (TTFT) | 0.31s |
| Throughput | 111 Tokens/sec |
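
The latency and throughput figures can be combined into a rough response-time estimate. The sketch below assumes a simple model: total time is time-to-first-token plus output length divided by a constant throughput. Real timings vary with load, prompt length, and output length, and `estimate_response_seconds` is a hypothetical helper, not part of any DeepMask API.

```python
TTFT_SECONDS = 0.31  # time to first token, from the table above
THROUGHPUT_TOKENS_PER_SEC = 111.0  # from the table above


def estimate_response_seconds(
    output_tokens: int,
    ttft: float = TTFT_SECONDS,
    throughput: float = THROUGHPUT_TOKENS_PER_SEC,
) -> float:
    """Estimate wall-clock time for a response of `output_tokens` tokens."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return ttft + output_tokens / throughput
```

Under these assumptions, a 1,000-token response takes about 9.3 seconds (0.31 + 1000 / 111).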