Total Parameters: 80B
Active Parameters: 3B
Context Length: 66K
Modality: Text
Reasoning: Yes
Architecture: Mixture of Experts (MoE)
License: Apache-2.0
Release Date: 1 Feb 2026
Knowledge Cutoff: -
Total Expert Parameters: -
Number of Experts: -
Active Experts: -
Attention Structure: Multi-Head Attention
Hidden Dimension Size: -
Number of Layers: -
Attention Heads: -
Key-Value Heads: -
Activation Function: -
Normalization: -
Position Embedding: Absolute Position Embedding
Qwen3 Next 80B A3B is a high-performance reasoning model from Alibaba. It uses a Mixture-of-Experts (MoE) architecture with 80 billion total parameters, of which about 3 billion are activated per token (the "A3B" in the name). The model is optimized for complex reasoning tasks and provides a 66,000-token context window.
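For illustration, the sketch below shows how a generic top-k MoE layer activates only a small subset of its experts per token, which is why the active parameter count (3B) is much smaller than the total (80B). The hidden size, expert count, and top-k value are placeholders rather than Qwen3 Next's actual configuration (which this page does not list), and the routing is a simplified generic scheme, not the model's exact design.

```python
import torch
import torch.nn.functional as F

# Generic top-k MoE layer. All sizes are placeholders, not Qwen3 Next's real config.
hidden_size, num_experts, top_k = 1024, 16, 2

router = torch.nn.Linear(hidden_size, num_experts, bias=False)
experts = torch.nn.ModuleList(
    torch.nn.Sequential(
        torch.nn.Linear(hidden_size, 4 * hidden_size),
        torch.nn.SiLU(),
        torch.nn.Linear(4 * hidden_size, hidden_size),
    )
    for _ in range(num_experts)
)

def moe_forward(x: torch.Tensor) -> torch.Tensor:
    """x: (num_tokens, hidden_size). Each token is processed by only top_k experts."""
    scores = F.softmax(router(x), dim=-1)            # routing probability per expert
    weights, indices = scores.topk(top_k, dim=-1)    # keep only the top_k experts per token
    weights = weights / weights.sum(dim=-1, keepdim=True)
    out = torch.zeros_like(x)
    for t in range(x.shape[0]):
        for w, e in zip(weights[t], indices[t]):
            out[t] += w * experts[int(e)](x[t])      # only top_k of num_experts run per token
    return out

tokens = torch.randn(4, hidden_size)
print(moe_forward(tokens).shape)  # torch.Size([4, 1024])
```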
The Alibaba Qwen 3 model family spans dense and Mixture-of-Experts (MoE) architectures, with parameter counts from 0.6B to 235B. Key innovations include a hybrid reasoning system that offers 'thinking' and 'non-thinking' modes for adaptive processing, and support for long context windows aimed at efficiency and scalability.
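As a usage illustration of the family's hybrid reasoning system, the sketch below toggles thinking mode through the chat template in Hugging Face transformers. It uses a smaller family member (Qwen/Qwen3-8B) as the example checkpoint, and the enable_thinking flag follows the Qwen3 model cards; whether a given Qwen3 Next variant exposes the same switch should be verified against its own model card.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen3-8B"  # example family member; swap in the checkpoint you actually use
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "Solve 17 * 24 step by step."}]
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,  # False selects the faster non-thinking mode
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```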
No evaluation benchmarks are available for Qwen3 Next 80B A3B yet, so no overall or coding rank has been assigned.
VRAM requirements depend on the chosen weight quantization method and the context size (e.g. a 1,024-token context).
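As a rough sketch of how such an estimate is computed: weight memory is total parameters × bits-per-weight ÷ 8 (all 80B weights must be resident even though only about 3B are active per token), plus a KV cache that grows linearly with context length. The layer, KV-head, and head-dimension values below are placeholders, since the spec above does not list them.

```python
# Rough VRAM estimate: quantized weights + KV cache. Architecture numbers below are
# placeholders; the real Qwen3 Next 80B A3B values are not published on this page.
def estimate_vram_gb(
    total_params=80e9,      # total parameters (80B); all must fit in memory for MoE
    bits_per_weight=4,      # e.g. 16 (fp16), 8 (int8), 4 (int4)
    context_tokens=1024,
    num_layers=48,          # placeholder
    num_kv_heads=8,         # placeholder
    head_dim=128,           # placeholder
    kv_bytes_per_value=2,   # fp16 KV cache
    overhead=1.10,          # ~10% for activations, buffers, fragmentation
):
    weight_bytes = total_params * bits_per_weight / 8
    # KV cache: 2 (K and V) * layers * kv_heads * head_dim * context * bytes per value
    kv_cache_bytes = 2 * num_layers * num_kv_heads * head_dim * context_tokens * kv_bytes_per_value
    return (weight_bytes + kv_cache_bytes) * overhead / 1024**3

for bits in (16, 8, 4):
    print(f"{bits}-bit weights, 1K context: ~{estimate_vram_gb(bits_per_weight=bits):.0f} GiB")
```

Under these assumptions, 4-bit weights land around 40 GiB at a 1,024-token context, so at short contexts the quantization choice dominates the estimate while the KV cache contributes comparatively little.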