Total Parameters: 80B
Active Parameters: 3B
Context Length: 66K
Modality: Reasoning
Architecture: Mixture of Experts (MoE)
License: Apache-2.0
Release Date: 1 Feb 2026
Training Data Cutoff: -
Total Expert Parameters: -
Number of Experts: -
Active Experts: -
Attention Structure: Multi-Head Attention
Hidden Dimension Size: -
Number of Layers: -
Attention Heads: -
Key-Value Heads: -
Activation Function: -
Normalization: -
Position Embedding: Absolute Position Embedding
VRAM Requirements by Quantization Method and Context Size
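No sizing table survives here, so the following is a minimal back-of-the-envelope sketch in Python. The 80B total-parameter count and the 66K context window come from the spec above; the quantization widths are standard conventions, and because the spec leaves the layer count, KV heads, and head dimension blank, the KV-cache constants below are hypothetical placeholders rather than Qwen3 Next's real configuration.

```python
# Rough VRAM estimate for an 80B-parameter checkpoint under common
# quantization widths, plus a KV-cache term that grows with context length.

TOTAL_PARAMS = 80e9  # total parameters, from the spec sheet above
BYTES_PER_PARAM = {"FP16/BF16": 2.0, "INT8": 1.0, "INT4": 0.5}

# PLACEHOLDER architecture values for the KV-cache term only -- the spec
# above lists layers, KV heads, and head dimension as unknown ("-").
ASSUMED_LAYERS = 48
ASSUMED_KV_HEADS = 8
ASSUMED_HEAD_DIM = 128

def kv_cache_gib(context_tokens: int, kv_bytes: float = 2.0) -> float:
    """KV cache size: 2 (K and V) * layers * kv_heads * head_dim * bytes * tokens."""
    per_token = 2 * ASSUMED_LAYERS * ASSUMED_KV_HEADS * ASSUMED_HEAD_DIM * kv_bytes
    return per_token * context_tokens / 2**30

for name, width in BYTES_PER_PARAM.items():
    weights_gib = TOTAL_PARAMS * width / 2**30
    for ctx in (8_192, 32_768, 66_000):
        total = weights_gib + kv_cache_gib(ctx)
        print(f"{name:>9} | ctx {ctx:>6} | ~{total:,.0f} GiB")
```

At INT4 the weights alone come to roughly 37 GiB, which is why 4-bit quantization is the usual route for fitting 80B-class checkpoints onto a single high-memory GPU; note that MoE weight memory scales with total parameters, not the 3B active per token.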
Qwen3 Next 80B A3B is a high-performance reasoning model from Alibaba. It uses a Mixture-of-Experts (MoE) architecture with 80 billion total parameters, of which about 3 billion are activated per token (the "A3B" in the name). The model is optimized for complex reasoning tasks, offers a 66,000-token context window, and reports strong benchmark performance.
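Since the expert count and top-k routing width are blank in the spec above, the sketch below is a generic top-k MoE layer in PyTorch with arbitrary sizes. It is meant only to show how routing lets a model carry a large total parameter count while activating a small fraction per token; it is not Qwen's actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Generic top-k MoE layer: only k experts run per token, so the active
    parameters per token are a small fraction of the total (e.g. 3B of 80B)."""

    def __init__(self, d_model=1024, d_ff=4096, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):  # x: (tokens, d_model)
        weights = F.softmax(self.router(x), dim=-1)          # routing scores
        top_w, top_idx = weights.topk(self.top_k, dim=-1)    # pick k experts/token
        top_w = top_w / top_w.sum(dim=-1, keepdim=True)      # renormalize weights
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = top_idx[:, slot] == e                 # tokens routed to e
                if mask.any():
                    out[mask] += top_w[mask, slot, None] * expert(x[mask])
        return out

moe = TopKMoE()
tokens = torch.randn(16, 1024)
print(moe(tokens).shape)  # torch.Size([16, 1024])
```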
The Alibaba Qwen 3 model family spans dense and Mixture-of-Experts (MoE) architectures, with parameter counts from 0.6B to 235B. Its key innovations are a hybrid reasoning system that offers 'thinking' and 'non-thinking' modes for adaptive processing, and support for long context windows, improving both efficiency and scalability. A usage sketch for the mode switch follows.
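The hybrid 'thinking'/'non-thinking' switch is exposed through the Qwen3 chat template on Hugging Face via an enable_thinking flag in apply_chat_template. Below is a minimal sketch; the checkpoint id is an example family member that supports the hybrid mode, not necessarily this 80B A3B model.

```python
# Toggling Qwen3's hybrid reasoning modes via the Hugging Face chat template.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen3-8B"  # example Qwen3 checkpoint, chosen for illustration
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "How many primes are below 30?"}]

# enable_thinking=True makes the model emit a <think>...</think> reasoning
# block before its answer; False skips it for faster, direct responses.
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(output[0][inputs.input_ids.shape[-1]:],
                       skip_special_tokens=True))
```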
No evaluation benchmarks are available for Qwen3 Next 80B A3B.