Qwen3.5-122B-A10B: Specifications and GPU VRAM Requirements

Qwen3.5-122B-A10B

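Open source, open weights

Total parameters: 122B
Context length: 262,144 tokens
Modality: Multimodal
Architecture: Mixture of Experts (MoE)
License: Apache 2.0
Release date: 24 Feb 2026
Training data cutoff: not listed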

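Technical specifications

Active parameters: 10.0B
Number of experts: 256
Active experts: not listed
Attention structure: Grouped-Query Attention
Hidden dimension size: 3072
Number of layers: not listed
Attention heads: not listed
Key-value heads: not listed
Activation function: SwiGLU
Normalization: RMS Normalization
Position embedding: RoPE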
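
In a sparse Mixture-of-Experts layer, a router scores every expert for each token and only the best-scoring few are run. The sketch below illustrates one common formulation (top-k selection with a softmax over the selected scores). It is not confirmed to be this model's router, and `top_k_route` and the value of `K` are hypothetical, since the number of active experts is not listed above.

```python
import math
import random


def top_k_route(router_logits, k):
    """Return [(expert_index, weight), ...] for the k highest-scoring experts.

    Weights are a softmax over the selected logits only, so they sum to 1.
    """
    if not 0 < k <= len(router_logits):
        raise ValueError("k must be between 1 and the number of experts")
    # Indices of the k largest logits, highest first.
    top = sorted(
        range(len(router_logits)),
        key=lambda i: router_logits[i],
        reverse=True,
    )[:k]
    # Numerically stable softmax restricted to the chosen experts.
    peak = router_logits[top[0]]
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


if __name__ == "__main__":
    random.seed(0)
    NUM_EXPERTS = 256  # from the spec listing
    K = 8              # hypothetical: the active-expert count is not listed
    logits = [random.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
    for expert, weight in top_k_route(logits, K):
        print(f"expert {expert:3d}  weight {weight:.3f}")
```

Only the selected experts' feed-forward blocks would run for that token, which is how the active parameter count stays far below the total.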

Qwen3.5-122B-A10B

Qwen3.5-122B-A10B is Alibaba Cloud's mid-tier multimodal foundation model, released in February 2026. It has 122B total parameters, of which 10B are activated per token through a Mixture-of-Experts architecture with 256 experts, so it pairs a large total capacity with a much smaller per-token compute cost. Reported scores are 86.1% on MMLU-Pro, 85.5% on GPQA Diamond, 72.4% on SWE-bench Verified, and 41.6% on Terminal-Bench 2.0. The model offers unified vision-language capabilities and a 262K-token native context (extensible to 1M), and it targets reasoning, coding, agentic workflows, and multilingual tasks.
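
The parameter counts above imply how sparse the model is per token. A minimal sketch of that arithmetic follows, using only figures quoted on this page. The variable and function names are illustrative.

```python
# Figures quoted on this page.
TOTAL_PARAMS = 122e9      # total parameters
ACTIVE_PARAMS = 10e9      # parameters activated per token
NATIVE_CONTEXT = 262_144  # native context length in tokens (listed as 262.144K)


def active_fraction(active: float, total: float) -> float:
    """Share of the weights that take part in the forward pass for one token."""
    return active / total


if __name__ == "__main__":
    share = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
    print(f"active share per token: {share:.1%}")
    # 262,144 is exactly 2**18, i.e. 256 * 1024 tokens.
    print(f"context equals 256 * 1024 tokens: {NATIVE_CONTEXT == 256 * 1024}")
```

Roughly one twelfth of the weights are used for any given token, which is the source of the "A10B" suffix in the model name.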

About Qwen 3.5

Qwen 3.5 is Alibaba Cloud's latest-generation foundation model family, released in February 2026. It integrates advances in multimodal learning (a unified vision-language foundation), an efficient hybrid architecture (Gated Delta Networks combined with sparse Mixture-of-Experts), scalable reinforcement learning across million-agent environments, and linguistic coverage spanning 201 languages. The models are available with open weights under the Apache 2.0 license.

Evaluation benchmarks

No evaluation benchmarks are available for Qwen3.5-122B-A10B.

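GPU requirements

Required VRAM depends on the quantization method chosen for the model weights and on the context size, which ranges from 1,024 tokens up to 256k.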
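
As a rough guide to how quantization drives the VRAM figure, weight memory can be approximated as parameter count times bits per weight. The sketch below assumes nominal bit widths (the `BITS_PER_WEIGHT` table and the function names are illustrative, not the calculator's method) and ignores quantization overhead, KV cache, and activations. It uses the 122B total because, in a typical MoE deployment, all experts stay resident in memory even though only about 10B parameters are active per token.

```python
TOTAL_PARAMS = 122e9  # total parameters, from the model description

# Assumed nominal bit widths. Real quantized checkpoints also store scales
# and zero-points, and often keep some layers at higher precision.
BITS_PER_WEIGHT = {
    "fp16/bf16": 16,
    "int8": 8,
    "int4": 4,
}


def weight_memory_gib(num_params: float, bits_per_weight: float) -> float:
    """Memory for the weights alone, in GiB (1 GiB = 2**30 bytes)."""
    return num_params * bits_per_weight / 8 / 2**30


def estimate_table(num_params: float = TOTAL_PARAMS) -> dict:
    """Weight memory in GiB for each assumed format."""
    return {
        name: weight_memory_gib(num_params, bits)
        for name, bits in BITS_PER_WEIGHT.items()
    }


if __name__ == "__main__":
    for name, gib in estimate_table().items():
        print(f"{name:>10}: ~{gib:.0f} GiB for weights alone")
```

Context-dependent memory comes on top of these figures and is not modeled here, since the layer and key-value head counts needed for a KV-cache estimate are not listed on this page.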

Resources

Official documentation and weight downloads
