Kimi K2 and Kimi K2.5 on DeepMask: 1-trillion-parameter MoE models with Agent Swarm Mode, multimodal reasoning, and a 2M-token context window for complex agentic tasks.
MoonshotAI’s Kimi K2 family is built for agentic workloads. Both models use a 1-trillion-parameter Mixture-of-Experts architecture and natively support Agent Swarm Mode, in which the model coordinates up to 100 parallel sub-agents to decompose and execute complex tasks. Kimi K2 (DeepMask) is optimized for deep reasoning and long-context retrieval at 2M tokens, while Kimi K2.5 adds a focus on visual-to-code generation and full-stack prototyping from UI screenshots and video.
Kimi K2 (DeepMask) is MoonshotAI’s 1-trillion-parameter Mixture-of-Experts model. It is the first model to natively support Agent Swarm Mode, in which the main model coordinates up to 100 specialized sub-agents working in parallel. Of its 1 trillion total parameters, it activates only 32B per token (about 3.2% of the total), which keeps per-token compute far below that of a dense model of the same size. It supports a 2M-token context window and uses MoonViT for native multimodal processing across text, images, and video.
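The coordination pattern described above, one main model fanning work out to a capped number of parallel sub-agents, can be sketched in a few lines. This is only an illustration of the fan-out/fan-in idea, not how Agent Swarm Mode is implemented: `run_sub_agent` and `coordinate` are hypothetical names, and the sub-agent here is a stub that makes no model call.

```python
import asyncio

MAX_SUB_AGENTS = 100  # cap taken from the description above


async def run_sub_agent(task: str) -> str:
    """Hypothetical stand-in for one sub-agent; a real one would call a model."""
    await asyncio.sleep(0)  # yield control, simulating asynchronous work
    return f"done: {task}"


async def coordinate(tasks: list[str], limit: int = MAX_SUB_AGENTS) -> list[str]:
    """Fan tasks out to sub-agents, never running more than `limit` at once."""
    gate = asyncio.Semaphore(limit)

    async def guarded(task: str) -> str:
        async with gate:  # wait for a free sub-agent slot
            return await run_sub_agent(task)

    # gather preserves input order, so results line up with tasks
    return await asyncio.gather(*(guarded(t) for t in tasks))


if __name__ == "__main__":
    results = asyncio.run(coordinate([f"subtask {i}" for i in range(250)]))
    print(len(results), results[0])
```

The semaphore is the design choice worth noting: all 250 tasks are submitted at once, but at most 100 are in flight at any moment, and the rest start as slots free up.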
Kimi K2 (DeepMask) is available directly on DeepMask with a 2M-token context window. On standard hardware, Multi-Head Latent Attention supports contexts of up to 256K tokens.
Massive research synthesis — Search hundreds of web sources simultaneously to compile comprehensive reports.
Vision-to-code — Upload a UI walkthrough video and have Kimi K2 rebuild the entire website structure.
Batch data processing — Analyze thousands of legal or medical records in a single swarm session.
Multi-step agent workflows — Run long-horizon tasks across tools, documents, and APIs without losing track of the goal.
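For the batch-processing use case above, a large record set has to be divided among sub-agents somehow. How the model partitions work internally is not described here, so the following is just one plausible client-side sketch: `shard` is a hypothetical helper that splits records into at most 100 contiguous, near-equal shards, one per sub-agent.

```python
def shard(records: list[str], max_shards: int = 100) -> list[list[str]]:
    """Split records into at most `max_shards` contiguous, near-equal shards."""
    if max_shards < 1:
        raise ValueError("max_shards must be at least 1")
    n_shards = min(max_shards, len(records))
    if n_shards == 0:
        return []
    # The first `extra` shards receive one additional record each.
    base, extra = divmod(len(records), n_shards)
    shards, start = [], 0
    for i in range(n_shards):
        size = base + (1 if i < extra else 0)
        shards.append(records[start:start + size])
        start += size
    return shards


if __name__ == "__main__":
    shards = shard([f"record-{i}" for i in range(2050)])
    print(len(shards), len(shards[0]), len(shards[-1]))
```

With 2,050 records and the default cap, this yields 100 shards whose sizes differ by at most one record.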
Kimi K2 (DeepMask) is the strongest choice on DeepMask when you need to run complex, long-horizon tasks that span multiple tools and document types. Its 0.31 s time to first token (TTFT) and 111 tokens/sec throughput make it fast enough for interactive use.
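To see what those two figures mean for a single response, a rough estimate is TTFT plus output length divided by throughput. The sketch below assumes throughput stays constant for the whole response and ignores network and queueing delays, so treat the results as ballpark numbers; `estimated_latency` is a hypothetical helper name.

```python
TTFT_S = 0.31          # time to first token in seconds, from the text above
THROUGHPUT_TPS = 111   # output tokens per second, from the text above


def estimated_latency(output_tokens: int) -> float:
    """Approximate seconds to stream a full response of `output_tokens` tokens."""
    return TTFT_S + output_tokens / THROUGHPUT_TPS


if __name__ == "__main__":
    for n in (100, 500, 2000):
        print(f"{n} tokens: ~{estimated_latency(n):.1f} s")
```

Under these assumptions a 500-token reply takes about 4.8 seconds end to end, and the first token appears after roughly a third of a second.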
Kimi K2.5 by MoonshotAI is a 1-trillion-parameter MoE model built for agentic work. Rather than answering one prompt at a time, it directs itself: it decomposes a project into sub-tasks and runs them in parallel. Trained on 15 trillion mixed visual and text tokens, it is the top choice for visual-to-code generation, complex UI automation, and full-stack prototyping directly from screenshots or Figma links.
Kimi K2.5 is open source. Its multimodal training makes it particularly strong on tasks that combine visual and code reasoning.
Visual asset refinement — Autonomously find, edit, and place assets into a web layout.
Complex tool-augmented search — Manage 100+ steps across search and terminal tools to verify scientific data.
Full-stack prototyping — Build functional web and mobile apps from zero to one, including backend logic.
UI automation — Navigate live interfaces and perform multi-step workflows like a human operator.
Use Kimi K2.5 when your task starts with a visual input — a screenshot, video walkthrough, or Figma design — and ends with running code or a deployed prototype. It is the strongest model on DeepMask for visual-to-code generation.