GPT-OSS 120B — EU-hosted open-weight reasoning model

GPT-OSS 120B is OpenAI’s 2026 open-source contribution to the frontier model ecosystem, available in DeepMask through two EU-hosted infrastructure providers: StackIT and Infercom. It delivers GPT-4-tier intelligence under an Apache 2.0 license, with full chain-of-thought transparency and adjustable reasoning effort — making it the model of choice for organizations that need frontier-class AI without black-box opacity or data leaving European infrastructure.

About GPT-OSS 120B

Built on a Mixture-of-Experts (MoE) architecture, GPT-OSS 120B uses sparse activation to stay fast and efficient — activating only a fraction of its parameters per request. The Infercom variant is optimized for high-throughput deployments, reaching up to 544 tokens/sec. The StackIT variant is tuned for sovereign enterprise deployments with a focus on transparent reasoning and strict schema enforcement. Neither variant supports image inputs.

Both the StackIT and Infercom variants of GPT-OSS 120B are hosted entirely within the European Union, making them suitable for use cases governed by GDPR and sector-specific data residency requirements.

Key Capabilities

Transparent Chain-of-Thought

Full visibility into internal reasoning steps — critical for legal, medical, and compliance use cases where “black box” AI is unacceptable.

Adjustable Reasoning Effort

Switch between Low (fast), Medium (balanced), and High (deep analytical thinking) per request to control cost and latency.

JSON Mode Precision

Native strict schema enforcement ensures near-perfect reliability for API-driven agents and structured output pipelines.

High-Speed Throughput

The Infercom variant exceeds 500 tokens/sec on optimized stacks — one of the fastest models in its weight class.

Best For

GPT-OSS 120B is ideal when you need frontier-level reasoning on-premises or within EU-hosted infrastructure. It is the right choice for legal and clinical workflows where reasoning transparency is mandatory, for privacy-sensitive production environments in finance and healthcare, and for high-volume agentic pipelines that need both speed and analytical depth. It does not support image inputs or tool use in the DeepMask interface — for those capabilities, see GPT-4o or the GPT-5 series.

For legal and compliance workflows, use High reasoning effort to maximize analytical depth. For high-volume document classification or extraction pipelines, Medium effort typically provides the best cost-per-quality tradeoff.

Use Cases

Clinical summarization — Processing patient histories locally under HIPAA- or GDPR-equivalent data residency requirements.
Legal research — Analyzing sensitive litigation documents without any cloud exposure outside the EU.
Local coding assistants — Running a high-intelligence coding model entirely on private, EU-resident infrastructure.
STEM and technical research — Graduate-level science and mathematics reasoning with verifiable reasoning steps.

Specifications

Specification	StackIT	Infercom
Provider	OpenAI (open-source)	OpenAI (open-source)
Hosting	EU (StackIT)	EU (Infercom)
Context Window	131K tokens	131K tokens
Reasoning	High	Adaptive (Low, Medium, High)
GPQA Diamond	80.9%	80.9%
Latency (TTFT)	0.27s	0.37s
Throughput	262 tokens/sec	313–544 tokens/sec
Image support	No	No
Key use cases	Agentic security, sovereign DevOps	High-speed agents, API orchestration, coding

Try GPT-OSS 120B in DeepMask →

Model Guide

OpenAI

Anthropic

Google & Others

GPT-OSS 120B — EU-hosted open-weight reasoning model

About GPT-OSS 120B

Key Capabilities

Transparent Chain-of-Thought

Adjustable Reasoning Effort

JSON Mode Precision

High-Speed Throughput

Best For

Use Cases

Specifications

Model Guide

OpenAI

Anthropic

Google & Others

Documentation Index

​About GPT-OSS 120B

​Key Capabilities

Transparent Chain-of-Thought

Adjustable Reasoning Effort

JSON Mode Precision

High-Speed Throughput

​Best For

​Use Cases

​Specifications

About GPT-OSS 120B

Key Capabilities

Best For

Use Cases

Specifications