Total Parameters
397B
Context Length
262,144 tokens
Modality
Multimodal
Architecture
Mixture of Experts (MoE)
License
Apache 2.0
Release Date
24 Feb 2026
Training Data Cutoff
-
Active Parameters
17.0B
Number of Experts
512
Active Experts
11
Attention Structure
Grouped-Query Attention
Hidden Dimension Size
4096
Number of Layers
60
Attention Heads
32
Key-Value Heads
2
Activation Function
SwiGLU
Normalization
RMS Normalization
Position Embedding
RoPE
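The grouped-query attention figures above (32 attention heads sharing only 2 key-value heads across 60 layers) imply a far smaller KV cache than full multi-head attention would need. A rough back-of-the-envelope sketch, assuming head_dim = 4096 / 32 = 128 and bf16 cache entries (both assumptions, not stated in the spec):

```python
# Sketch: KV-cache size implied by the spec above.
# Assumptions (not in the spec): head_dim = hidden_size / n_heads = 128,
# bf16 (2-byte) cache entries; real serving setups may differ.

N_LAYERS = 60
N_Q_HEADS = 32
N_KV_HEADS = 2                   # grouped-query attention: 32 query heads, 2 KV heads
HEAD_DIM = 4096 // N_Q_HEADS     # 128 (assumed)
BYTES = 2                        # bf16

def kv_cache_bytes_per_token(n_kv_heads: int) -> int:
    # K and V each store n_kv_heads * HEAD_DIM values per layer
    return N_LAYERS * 2 * n_kv_heads * HEAD_DIM * BYTES

gqa = kv_cache_bytes_per_token(N_KV_HEADS)   # 61,440 B = 60 KiB per token
mha = kv_cache_bytes_per_token(N_Q_HEADS)    # 16x larger with full MHA

ctx = 262_144
print(f"GQA KV cache at full context: {gqa * ctx / 2**30:.0f} GiB")  # 15 GiB
print(f"Full-MHA equivalent:         {mha * ctx / 2**30:.0f} GiB")   # 240 GiB
```

Under these assumptions, filling the full 262,144-token context costs about 15 GiB of KV cache with GQA, versus roughly 240 GiB if all 32 heads kept their own keys and values.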
Qwen3.5-397B-A17B is Alibaba Cloud's largest and most capable multimodal foundation model, released February 2026. With 397B total parameters and 17B activated through a Mixture-of-Experts architecture (512 experts), it achieves state-of-the-art scores on MMLU-Pro (87.8%), GPQA Diamond (88.4%), SWE-bench Verified (80.0%), and Terminal-Bench 2.0 (54.0%). It features unified vision-language capabilities, extended context up to 1M tokens, and excels in coding agents, general agents, multimodal reasoning, and multilingual understanding across 201 languages.
Qwen 3.5 is Alibaba Cloud's latest-generation foundation model family, released February 2026. It represents a significant leap forward, integrating breakthroughs in multimodal learning (unified vision-language foundation), efficient hybrid architecture (Gated Delta Networks with sparse Mixture-of-Experts), scalable reinforcement learning across million-agent environments, and global linguistic coverage spanning 201 languages. Available under Apache 2.0 license with open weights.
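The sparse Mixture-of-Experts design described above activates only 11 of 512 experts per token. A minimal toy sketch of top-k softmax routing, the standard mechanism behind such sparsity (illustrative only; the real Qwen3.5 router and dimensions are not published in this card):

```python
import math
import random

# Toy top-k MoE router: each token scores all 512 experts, keeps the 11
# highest-scoring ones, and renormalizes their softmax weights. The other
# 501 experts stay idle for that token, which is what keeps the active
# parameter count far below the total parameter count.
N_EXPERTS, TOP_K = 512, 11

def route(logits):
    """Return (selected expert indices, normalized mixture weights)."""
    top = sorted(range(N_EXPERTS), key=lambda i: logits[i], reverse=True)[:TOP_K]
    m = max(logits[i] for i in top)              # subtract max for stability
    exps = [math.exp(logits[i] - m) for i in top]
    z = sum(exps)
    return top, [e / z for e in exps]

random.seed(0)
logits = [random.gauss(0, 1) for _ in range(N_EXPERTS)]  # stand-in router scores
experts, weights = route(logits)
print(len(experts), round(sum(weights), 6))  # 11 1.0
```

Each token's output is then the weighted sum of the 11 selected experts' outputs; across a batch, different tokens hit different experts, so all 512 experts see training signal while per-token compute stays near the 17B active-parameter budget.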
No evaluation benchmarks are available for Qwen3.5-397B-A17B.