MoonshotAI’s Kimi K2 family targets agentic workloads. Both models are built on a 1-trillion-parameter Mixture-of-Experts architecture and natively support Agent Swarm Mode, which coordinates up to 100 parallel sub-agents to decompose and execute complex tasks. Kimi K2 (DeepMask) is optimized for deep reasoning and long-context retrieval at 2M tokens, while Kimi K2.5 adds a focus on visual-to-code generation and full-stack prototyping from UI screenshots and video.

About

Kimi K2 (DeepMask) is MoonshotAI’s large-scale Mixture-of-Experts model. It is the first model to natively support Agent Swarm Mode, in which the main model coordinates up to 100 specialized sub-agents working in parallel. Of its 1 trillion total parameters, it activates only 32B (about 3.2%) per token, which keeps per-token compute well below that of a dense model of the same size. It uses MoonViT for native multimodal processing across text, images, and video.

Kimi K2 (DeepMask) is available directly on DeepMask with a 2M-token context window. Multi-Head Latent Attention enables 256K-token support on standard hardware.

Key Capabilities

Agent Swarm Mode

Decomposes a complex task into up to 100 parallel sub-tasks and executes them simultaneously.
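
The fan-out pattern described here can be illustrated with a minimal sketch. It is not the DeepMask or MoonshotAI API: `run_sub_agent`, `run_swarm`, and `swarm` are hypothetical names, and the sub-agent is simulated locally instead of calling a model. The only figure taken from this page is the cap of 100 parallel sub-agents.

```python
import asyncio

MAX_SUB_AGENTS = 100  # parallel sub-agent cap stated on this page


async def run_sub_agent(task: str) -> str:
    """Hypothetical stand-in for one sub-agent; a real one would call a model."""
    await asyncio.sleep(0)  # yield control, simulating an I/O-bound call
    return f"done: {task}"


async def run_swarm(sub_tasks: list[str]) -> list[str]:
    """Run sub-tasks concurrently, with at most MAX_SUB_AGENTS in flight."""
    limit = asyncio.Semaphore(MAX_SUB_AGENTS)

    async def guarded(task: str) -> str:
        async with limit:
            return await run_sub_agent(task)

    # asyncio.gather returns results in the same order as its inputs
    return await asyncio.gather(*(guarded(t) for t in sub_tasks))


def swarm(sub_tasks: list[str]) -> list[str]:
    """Synchronous entry point for the sketch."""
    return asyncio.run(run_swarm(sub_tasks))
```

Calling `swarm([f"task-{i}" for i in range(250)])` returns 250 results in input order. The semaphore is one simple way to respect a concurrency cap when a task decomposes into more than 100 pieces; how the model itself schedules sub-agents is not described on this page.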

Native Multimodal

Processes text, images, and video with equal fluency through MoonshotAI’s multimodal architecture.

2M Token Context

Handles up to 2 million tokens, enabling analysis of massive documents and long-running agent sessions.
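
As a rough illustration of budgeting against the 2M-token window, the sketch below estimates whether a batch of documents fits. The 4-characters-per-token ratio is an assumed rule of thumb for English text, not the model's real tokenizer, and `estimate_tokens`, `fits_in_context`, and the `reserve_for_output` default are hypothetical.

```python
MAX_CONTEXT_TOKENS = 2_000_000  # context window stated on this page
CHARS_PER_TOKEN = 4             # assumption: rough heuristic, not the real tokenizer


def estimate_tokens(text: str) -> int:
    """Rough token estimate from character count (ceiling division)."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_context(documents: list[str], reserve_for_output: int = 8_000) -> bool:
    """True if the estimated prompt size leaves room for the reserved output tokens."""
    prompt_tokens = sum(estimate_tokens(d) for d in documents)
    return prompt_tokens + reserve_for_output <= MAX_CONTEXT_TOKENS
```

For an exact count, replace `estimate_tokens` with the tokenizer that matches the deployed model.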

Stable Execution

Maintains coherence across 300+ sequential tool calls without logic drift.

Use Cases

  • Massive research synthesis — Search hundreds of web sources simultaneously to compile comprehensive reports.
  • Vision-to-code — Upload a UI walkthrough video and have Kimi K2 rebuild the entire website structure.
  • Batch data processing — Analyze thousands of legal or medical records in a single swarm session.
  • Multi-step agent workflows — Run long-horizon tasks across tools, documents, and APIs without losing track of the goal.

Kimi K2 (DeepMask) is the strongest choice on DeepMask when you need to run complex, long-horizon tasks that span multiple tools and document types. Its 0.31s TTFT and 111 tokens/sec throughput make it fast enough for interactive use.
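
To put the latency figures in concrete terms, the sketch below estimates end-to-end response time as TTFT plus generation time. It assumes throughput stays constant at the quoted rate and ignores queuing and network overhead; `estimated_response_seconds` is a hypothetical helper.

```python
TTFT_SECONDS = 0.31      # time to first token, from this page
TOKENS_PER_SECOND = 111  # throughput, from this page


def estimated_response_seconds(output_tokens: int) -> float:
    """TTFT plus the time to stream the output at a constant rate."""
    return TTFT_SECONDS + output_tokens / TOKENS_PER_SECOND


print(round(estimated_response_seconds(500), 2))  # 4.81
```

Under these assumptions a 500-token reply takes just under 5 seconds.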

Specifications

| Specification | Value |
| --- | --- |
| Model Provider | MoonshotAI |
| Main Use Cases | Visual Debugging, Agent Swarms |
| Reasoning Effort | High |
| GPQA Diamond | 87.6% |
| Max Context | 2.0M tokens |
| Latency (TTFT) | 0.31s |
| Throughput | 111 tokens/sec |