Total Parameters
21B
Context Length
128K
Modality
Text
Architecture
Mixture of Experts (MoE)
License
Apache 2.0
Release Date
5 Aug 2025
Knowledge Cutoff
Jun 2024
Active Parameters
3.6B
Number of Experts
32
Active Experts
4
Attention Structure
Grouped-Query Attention (GQA)
Hidden Dimension Size
2880
Number of Layers
24
Attention Heads
64
Key-Value Heads
8
Activation Function
SwiGLU
Normalization
RMS Normalization
Position Embedding
Rotary Position Embedding (RoPE)
GPT-OSS 20B is a text-based language model developed by OpenAI, specifically engineered to deliver high-performance reasoning on consumer-grade hardware. As part of the GPT-OSS family, this model balances computational efficiency with complex task execution, utilizing a sparse architecture to maintain a low memory footprint. It is designed to function as a flexible component in local and enterprise environments, where data privacy and low-latency response times are critical requirements.
The model utilizes a Mixture-of-Experts (MoE) transformer architecture consisting of 24 layers. While the total parameter count is 21 billion, the system only activates 3.6 billion parameters per token during the forward pass. This sparsity is achieved through a routing mechanism that selects four active experts from a pool of 32 for each token. The architecture incorporates several modern optimizations, including SwiGLU activation functions, Root Mean Square (RMS) normalization, and Grouped-Query Attention (GQA) with eight key-value heads to optimize memory throughput. It also supports a native context window of 128,000 tokens using Rotary Positional Embeddings (RoPE).
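As a rough illustration of the sparse routing described above, the sketch below shows top-4 selection over 32 experts per token. The router and expert modules are simplified placeholders sized from the figures on this page, not the model's actual implementation.

```python
import torch
import torch.nn.functional as F

NUM_EXPERTS = 32   # experts per MoE layer (from the model card)
TOP_K = 4          # experts activated per token
HIDDEN = 2880      # hidden dimension (from the model card)

def moe_forward(x, router, experts):
    """Route each token to its top-4 experts and mix their outputs.

    x:       (tokens, HIDDEN) activations
    router:  nn.Linear(HIDDEN, NUM_EXPERTS) producing routing logits
    experts: list of NUM_EXPERTS feed-forward modules (placeholders here)
    """
    logits = router(x)                                  # (tokens, 32)
    weights, idx = torch.topk(logits, TOP_K, dim=-1)    # keep 4 experts per token
    weights = F.softmax(weights, dim=-1)                # normalize over the selected experts
    out = torch.zeros_like(x)
    for k in range(TOP_K):
        for e in range(NUM_EXPERTS):
            mask = idx[:, k] == e                       # tokens whose k-th choice is expert e
            if mask.any():
                out[mask] += weights[mask, k, None] * experts[e](x[mask])
    return out
```

Only the four selected experts run per token, which is why the per-token compute tracks the 3.6B active parameters rather than the 21B total.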
Functionally, GPT-OSS 20B is optimized for agentic workflows and complex reasoning tasks. It supports features such as native tool use, function calling, and a configurable reasoning effort system that allows developers to adjust the model's processing depth based on the specific latency needs of the application. The model is trained using a specialized response format to facilitate consistent structured outputs and long-form chain-of-thought reasoning, making it suitable for scientific analysis, code generation, and specialized technical assistance on local devices.
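A minimal sketch of exercising the configurable reasoning effort from a local deployment, assuming an OpenAI-compatible server (for example via vLLM or Ollama); the base URL, served model name, and the `reasoning_effort` field are assumptions about the serving stack, not an official API specification.

```python
# Hypothetical local call; parameter names vary by serving stack.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="gpt-oss-20b",                      # name as exposed by the local server (assumed)
    messages=[
        {"role": "system", "content": "You are a concise technical assistant."},
        {"role": "user", "content": "Summarize the trade-offs of MoE models."},
    ],
    extra_body={"reasoning_effort": "low"},   # hypothetical knob; lower effort trades depth for latency
)
print(response.choices[0].message.content)
```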
| Category | Benchmark | Score | Rank |
|---|---|---|---|
| Summarization | ProLLM Summarization | 0.86 | 6 |
| General Knowledge | MMLU | 0.85 | 11 |
| Web Development | WebDev Arena | 1317 | 38 |
Overall Rank
#70
Coding Rank
#51
Total Score
67 / 100
GPT-OSS 20B exhibits a bifurcated transparency profile, offering industry-leading clarity on architecture, licensing, and hardware requirements while remaining almost entirely opaque regarding its training data and compute resources. The model's technical documentation for inference and quantization is exemplary, yet the lack of dataset provenance and training metrics prevents a full understanding of its developmental lifecycle. Its commitment to a permissive Apache 2.0 license and open weights marks a significant shift in transparency for the provider, though evaluation reproducibility remains hampered by undisclosed prompting strategies.
Architectural Provenance
The model architecture is extensively documented in the official model card and technical reports. It is a Mixture-of-Experts (MoE) Transformer with 24 layers, utilizing 32 experts with 4 active per token. Specific technical optimizations are disclosed, including SwiGLU activation functions (with clamping and residual connections), Grouped-Query Attention (GQA) with 8 key-value heads, and Rotary Positional Embeddings (RoPE). The model supports a native context window of 128k tokens. While the training methodology is described as a mix of reinforcement learning and advanced pre-training, the exact 'from scratch' vs. 'fine-tuned' lineage of the base weights is slightly obscured by references to 'internal frontier systems,' preventing a perfect score.
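To make the memory benefit of the disclosed GQA configuration concrete, the sketch below compares KV-cache size for 8 key-value heads against a fully multi-head layout matching the 64 query heads; the per-head width and 16-bit cache storage are illustrative assumptions, not documented values.

```python
def kv_cache_bytes(n_kv_heads, head_dim, n_layers, seq_len, bytes_per_elem=2):
    """Rough KV-cache size: keys + values for every layer and position."""
    return 2 * n_layers * seq_len * n_kv_heads * head_dim * bytes_per_elem

LAYERS, SEQ = 24, 128_000   # layer count and context length from the model card
HEAD_DIM = 64               # assumed per-head width, for illustration only

gqa = kv_cache_bytes(n_kv_heads=8,  head_dim=HEAD_DIM, n_layers=LAYERS, seq_len=SEQ)
mha = kv_cache_bytes(n_kv_heads=64, head_dim=HEAD_DIM, n_layers=LAYERS, seq_len=SEQ)

print(f"GQA (8 KV heads):  {gqa / 1e9:.1f} GB")   # ~6 GB at full context under these assumptions
print(f"MHA (64 KV heads): {mha / 1e9:.1f} GB")   # 8x larger cache for the same sequence
```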
Dataset Composition
Transparency regarding the training data is minimal. Official documentation states the model was trained on a 'mostly English, text-only dataset' with a focus on STEM, coding, and general knowledge. However, there is no public breakdown of data sources, no specific proportions (e.g., web vs. books vs. code), and no detailed disclosure of filtering or cleaning methodologies. Third-party reports explicitly list training data collection and labeling as 'undisclosed.'
Tokenizer Integrity
The model uses the 'o200k_harmony' tokenizer, which is a BPE-style tokenizer with a vocabulary size of approximately 200,000 tokens. This tokenizer is publicly available via the 'tiktoken' library and is documented as a superset of the tokenizer used in GPT-4o. It is well-integrated into the 'openai-harmony' package and reference implementations, allowing for full verification of tokenization behavior and language support alignment.
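Tokenization behavior can be verified directly, assuming the installed tiktoken release registers the o200k_harmony encoding (older versions only ship o200k_base, the GPT-4o tokenizer it is described as a superset of):

```python
import tiktoken

# Load the documented BPE encoding; fall back to o200k_base if this
# tiktoken version does not yet include o200k_harmony.
try:
    enc = tiktoken.get_encoding("o200k_harmony")
except ValueError:
    enc = tiktoken.get_encoding("o200k_base")

tokens = enc.encode("GPT-OSS 20B activates 3.6B parameters per token.")
print(len(tokens), tokens[:8])   # token count and a few token IDs
print(enc.n_vocab)               # vocabulary size, roughly 200k entries
```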
Parameter Density
OpenAI provides precise figures for both total and active parameters. The model has 21.1 billion total parameters, with 3.6 billion active parameters per token. The MoE structure is clearly defined (32 experts, top-4 routing). Additionally, the use of MXFP4 quantization for MoE weights is explicitly documented, including how weights are packed and scaled, which provides high transparency into the model's density and memory efficiency.
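As a back-of-the-envelope check on these figures, the sketch below estimates weight memory when the MoE expert weights are stored in 4-bit MXFP4 and the remaining parameters stay in 16-bit precision; the expert/non-expert split and the neglected scale-factor overhead are rough assumptions, not official numbers.

```python
TOTAL_PARAMS = 21.1e9      # total parameters (disclosed)
EXPERT_FRACTION = 0.9      # assumed share of weights living in the MoE experts

expert_params = TOTAL_PARAMS * EXPERT_FRACTION
dense_params  = TOTAL_PARAMS - expert_params

mxfp4_bytes = expert_params * 0.5   # ~4 bits per expert weight (block scales ignored)
bf16_bytes  = dense_params * 2      # 16-bit for attention, embeddings, router

total_gb = (mxfp4_bytes + bf16_bytes) / 1e9
print(f"~{total_gb:.1f} GB of weights")   # ~14 GB, consistent with the 16 GB VRAM target below
```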
Training Compute
There is almost no verifiable information regarding the compute resources used for training. While the hardware requirements for inference are well-documented, the training duration, GPU/TPU hours, hardware specifications used during training, and the resulting carbon footprint are entirely absent from public documentation. Claims of 'advanced pre-training' serve as marketing language rather than technical disclosure.
Benchmark Reproducibility
While OpenAI provides results for several benchmarks (MMLU, HumanEval, Tau-Bench, HealthBench), the evaluation methodology lacks full transparency. Evaluation code is not fully public, and exact prompts or few-shot examples used for official scores are not consistently disclosed. Third-party audits (e.g., arXiv:2508.17525) have noted discrepancies where the 20B model outperforms the 120B variant, suggesting non-monotonic scaling or evaluation inconsistencies that are not addressed in official docs. A -2 penalty was applied due to industry-wide concerns regarding training data overlap with common coding benchmarks like SWE-bench.
Identity Consistency
The model demonstrates strong identity consistency, correctly identifying itself as part of the GPT-OSS family in standard deployments. It is designed to use the 'Harmony' response format, which includes specific roles (System, Developer, User, Assistant, Tool) that help maintain its persona and operational boundaries. There are no documented cases of the model claiming to be a competitor's product or denying its AI nature.
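The role hierarchy can be illustrated with a plain message list; the actual rendering into Harmony's special tokens is handled by the openai-harmony package, and the structure below is a simplified stand-in rather than that package's API.

```python
# Simplified illustration of the Harmony role hierarchy
# (system > developer > user > assistant > tool). Field names and the
# rendering are illustrative, not the real openai-harmony format.
conversation = [
    {"role": "system",    "content": "You are GPT-OSS 20B."},
    {"role": "developer", "content": "Answer only with JSON."},
    {"role": "user",      "content": "What is the capital of France?"},
    {"role": "assistant", "content": '{"answer": "Paris"}'},
]

def render(messages):
    """Naive text rendering; the real format uses dedicated special tokens per role."""
    return "\n".join(f"<{m['role']}> {m['content']}" for m in messages)

print(render(conversation))
```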
License Clarity
The model is released under the Apache 2.0 license, which is a standard, highly permissive open-source license. This allows for commercial use, modification, and distribution without the 'copyleft' restrictions found in other licenses. The terms are clear, publicly accessible on GitHub and Hugging Face, and do not conflict with the model's stated 'open-weight' status.
Hardware Footprint
Hardware requirements are exceptionally well-documented. OpenAI and third-party partners (NVIDIA, Unsloth) provide specific VRAM targets: 16GB for the 20B model using native MXFP4 quantization. Performance metrics for different hardware (Mac, H100, consumer GPUs) and token-per-second estimates are widely available. The impact of the 'reasoning effort' setting on latency is also clearly explained.
Versioning Drift
The model supports 'snapshots' to lock in specific versions, and a basic versioning system is in place. However, there is no comprehensive public changelog or detailed history of weight updates since the initial release. While the 'openai-harmony' package is versioned, the underlying model weights lack the granular semantic versioning required for a higher score, making it difficult to track silent drift or minor optimizations.