Kimi K2 & K2.5 - DeepMask

MoonshotAI’s Kimi K2 family is built for agentic workloads. Both models use a 1-trillion-parameter Mixture-of-Experts architecture and natively support Agent Swarm Mode, which coordinates up to 100 parallel sub-agents to decompose and execute complex tasks. Kimi K2 (DeepMask) is optimized for deep reasoning and long-context retrieval at 2M tokens, while Kimi K2.5 focuses on visual-to-code generation and full-stack prototyping from UI screenshots and video.

Kimi K2 (DeepMask)

About

Kimi K2 (DeepMask) is MoonshotAI’s large Mixture-of-Experts model. It is the first model to natively support Agent Swarm Mode, in which the main model coordinates up to 100 specialized sub-agents working in parallel. Of its 1 trillion total parameters, it activates only 32B per request, which keeps per-request compute well below that of a dense model of the same size. It supports a 2M-token context window and uses MoonViT for native multimodal processing across text, images, and video.
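To make the sparsity figure concrete, the short calculation below works out what share of the parameters is active per request. It uses only the two numbers quoted above; the function and constant names are illustrative.

```python
TOTAL_PARAMS = 1_000_000_000_000  # 1 trillion total parameters (from the text)
ACTIVE_PARAMS = 32_000_000_000    # 32B parameters activated per request (from the text)


def active_fraction(active: int, total: int) -> float:
    """Share of the model's parameters that take part in a single request."""
    return active / total


fraction = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
print(f"Active per request: {fraction:.1%}")
```

This says nothing about memory: all experts still have to be stored, so the saving applies to compute per token rather than to model size.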

Kimi K2 (DeepMask) is available directly on DeepMask with a 2M-token context window. On standard hardware, Multi-Head Latent Attention supports contexts of up to 256K tokens.

Key Capabilities

Agent Swarm Mode

Decomposes a complex task into up to 100 parallel sub-tasks and executes them simultaneously.
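The fan-out pattern described here can be sketched with Python's `asyncio`. This is a minimal illustration of bounded parallel execution, not DeepMask's or MoonshotAI's implementation: `run_sub_agent` is a hypothetical stand-in for a real sub-agent call, and only the cap of 100 comes from the text.

```python
import asyncio

MAX_SUB_AGENTS = 100  # upper bound on parallel sub-agents stated in the text


async def run_sub_agent(task: str) -> str:
    """Hypothetical stand-in for one sub-agent; a real one would call the model."""
    await asyncio.sleep(0)  # yield control to simulate asynchronous work
    return f"done: {task}"


async def run_swarm(sub_tasks: list[str], limit: int = MAX_SUB_AGENTS) -> list[str]:
    """Run all sub-tasks concurrently, with at most `limit` in flight at once."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(task: str) -> str:
        async with semaphore:
            return await run_sub_agent(task)

    # gather returns results in the same order as the input sub-tasks
    return await asyncio.gather(*(guarded(t) for t in sub_tasks))


if __name__ == "__main__":
    tasks = [f"source-{i}" for i in range(250)]
    results = asyncio.run(run_swarm(tasks))
    print(len(results), results[0])
```

The semaphore is what enforces the cap: with 250 sub-tasks queued, no more than 100 run at any moment, and the rest start as slots free up.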

Native Multimodal

Processes text, images, and video with equal fluency through MoonshotAI’s multimodal architecture.

2M Token Context

Handles up to 2 million tokens, enabling analysis of massive documents and long-running agent sessions.
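A rough way to check whether a document set fits in this window is to estimate its token count from its character count. The sketch below assumes about 4 characters per token, a common rule of thumb for English text rather than a property of Kimi K2's tokenizer, so treat the result as an estimate only.

```python
import math

CONTEXT_K2 = 2_000_000  # Kimi K2 (DeepMask) context window in tokens (from the text)
CHARS_PER_TOKEN = 4     # ASSUMPTION: rough heuristic for English text, not a real tokenizer


def estimate_tokens(text_chars: int, chars_per_token: int = CHARS_PER_TOKEN) -> int:
    """Estimate the token count from a character count, rounding up."""
    return math.ceil(text_chars / chars_per_token)


def fits(text_chars: int, context_tokens: int, reserve_for_output: int = 0) -> bool:
    """True if the estimated input plus reserved output tokens fit in the window."""
    return estimate_tokens(text_chars) + reserve_for_output <= context_tokens


if __name__ == "__main__":
    corpus_chars = 6_000_000  # hypothetical corpus size
    print(estimate_tokens(corpus_chars), fits(corpus_chars, CONTEXT_K2))
```

Under this heuristic, a 6-million-character corpus comes to roughly 1.5M tokens and fits, while a 9-million-character corpus does not. For an exact count, use the model's own tokenizer.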

Stable Execution

Maintains coherence across 300+ sequential tool calls without logic drift.

Use Cases

Massive research synthesis — Search hundreds of web sources simultaneously to compile comprehensive reports.
Vision-to-code — Upload a UI walkthrough video and have Kimi K2 rebuild the entire website structure.
Batch data processing — Analyze thousands of legal or medical records in a single swarm session.
Multi-step agent workflows — Run long-horizon tasks across tools, documents, and APIs without losing track of the goal.

Kimi K2 (DeepMask) is the strongest choice on DeepMask when you need to run complex, long-horizon tasks that span multiple tools and document types. Its 0.31s TTFT and 111 tokens/sec throughput make it fast enough for interactive use.

Specifications

Specification	Value
Model Provider	MoonshotAI
Main Use Cases	Visual Debugging, Agent Swarms
Reasoning Effort	High
GPQA Diamond	87.6%
Max Context	2.0M Tokens
Latency (TTFT)	0.31s
Throughput	111 Tokens/sec

Kimi K2.5

About

Kimi K2.5 by MoonshotAI is a 1-trillion parameter MoE model built for the agentic era. It shifts from single-prompt interactions to a self-directed paradigm where it decomposes a project into parallel sub-tasks. Trained on 15 trillion mixed visual and text tokens, it is the top choice for visual-to-code generation, complex UI automation, and full-stack prototyping directly from screenshots or Figma links.

Kimi K2.5 is an open-source model. Its training on 15 trillion mixed visual and text tokens makes it particularly strong on tasks that combine visual and code reasoning.

Key Capabilities

Agent Swarm Orchestration

Decomposes one goal into a coordinated team of sub-agents working in parallel across search, code, and terminal tools.

Coding with Vision

Generates functional code directly from a UI screenshot or design link.

Visual Web Navigation

Browses the live web visually — clicking buttons and filling forms like a human user.

Multimodal Reasoning

Integrates document analysis with real-time image search to solve data-heavy, cross-modal problems.

Use Cases

Visual asset refinement — Autonomously find, edit, and place assets into a web layout.
Complex tool-augmented search — Manage 100+ steps across search and terminal tools to verify scientific data.
Full-stack prototyping — Build functional web and mobile apps from zero to one, including backend logic.
UI automation — Navigate live interfaces and perform multi-step workflows like a human operator.

Use Kimi K2.5 when your task starts with a visual input — a screenshot, video walkthrough, or Figma design — and ends with running code or a deployed prototype. It is the strongest model on DeepMask for visual-to-code generation.

Specifications

Specification	Value
Model Provider	MoonshotAI
Main Use Cases	Visual Debugging, Coding, Agent Swarms
Reasoning Effort	High
GPQA Diamond	87.6%
Max Context	262K Tokens
Latency (TTFT)	0.52s
Throughput	138 Tokens/sec
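The latency and throughput figures in the two specification tables can be combined into a rough estimate of total response time. The sketch below uses a simple model, time to first token plus output tokens divided by throughput, and assumes throughput stays constant and that network overhead is negligible. Only the four numbers come from the tables; the rest is illustrative.

```python
# Figures taken from the two specification tables
K2 = {"ttft_s": 0.31, "tokens_per_s": 111}
K2_5 = {"ttft_s": 0.52, "tokens_per_s": 138}


def response_seconds(model: dict, output_tokens: int) -> float:
    """Estimated wall-clock time to stream `output_tokens` tokens."""
    return model["ttft_s"] + output_tokens / model["tokens_per_s"]


def break_even_tokens(low_latency: dict, high_throughput: dict) -> float:
    """Output length at which both models take the same estimated time."""
    ttft_gap = high_throughput["ttft_s"] - low_latency["ttft_s"]
    per_token_gap = 1 / low_latency["tokens_per_s"] - 1 / high_throughput["tokens_per_s"]
    return ttft_gap / per_token_gap


if __name__ == "__main__":
    for n in (50, 1000):
        print(n, round(response_seconds(K2, n), 2), round(response_seconds(K2_5, n), 2))
    print("break-even:", round(break_even_tokens(K2, K2_5)))
```

Under these assumptions the break-even point is about 119 output tokens: Kimi K2 (DeepMask) finishes first on shorter replies because of its lower TTFT, while Kimi K2.5 finishes first on longer ones because of its higher throughput.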
