趋近智
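Parameters: 4B
Context length: 262,144 tokens
Modality: Multimodal
Architecture: Dense
License: Apache 2.0
Release date: 24 Feb 2026
Training data cutoff: not specified
Attention structure: Grouped-Query Attention
Hidden dimension size: 2560
Number of layers: 32
Attention heads: 16
Key-value heads: 4
Activation function: SwiGLU
Normalization: RMS Normalization
Position embedding: RoPE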
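To make the grouped-query attention figures concrete, the short Python sketch below maps each of the 16 attention (query) heads to one of the 4 key-value heads listed in the specification. It assumes contiguous grouping, which is a common convention but is not confirmed by this page; the function name `kv_head_for_query_head` is hypothetical.

```python
# Values taken from the specification list above.
NUM_ATTENTION_HEADS = 16
NUM_KV_HEADS = 4


def kv_head_for_query_head(q_head: int) -> int:
    """Return the KV head shared by a query head.

    Assumption: heads are grouped contiguously (heads 0-3 share KV head 0,
    heads 4-7 share KV head 1, and so on).
    """
    group_size = NUM_ATTENTION_HEADS // NUM_KV_HEADS  # 4 query heads per KV head
    return q_head // group_size


mapping = [kv_head_for_query_head(h) for h in range(NUM_ATTENTION_HEADS)]
print(mapping)
```

Under this grouping, every key-value head serves four query heads, so the attention layers store keys and values for 4 heads rather than 16.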
Qwen3.5-4B is Alibaba Cloud's compact multimodal foundation model with 4B parameters, released in February 2026. It uses a hybrid architecture that combines Gated Delta Networks and Gated Attention in an 8×(3×DeltaNet→FFN→1×Attention→FFN) pattern. Reported scores include 79.1% on MMLU-Pro, 76.2% on GPQA Diamond, and 74%/77% on the HMMT benchmarks, along with strong vision-language results. The model offers unified vision-language capabilities, a native 262K-token context (extensible to 1M), and multi-token prediction training. It targets reasoning, coding, multimodal understanding, and multilingual tasks across 201 languages.
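The 8×(3×DeltaNet→FFN→1×Attention→FFN) pattern can be unrolled into a flat layer list. The sketch below is illustrative only: it assumes that each token mixer together with its FFN counts as one layer, and the names `build_layer_pattern`, `gated_deltanet`, and `gated_attention` are hypothetical labels rather than identifiers from any released code.

```python
def build_layer_pattern(num_blocks: int = 8, deltanet_per_block: int = 3) -> list[str]:
    """Unroll the repeating block into one token-mixer label per layer.

    Assumption: every mixer (DeltaNet or attention) is followed by an FFN,
    and the mixer + FFN pair is counted as a single layer.
    """
    layers: list[str] = []
    for _ in range(num_blocks):
        # Three Gated DeltaNet layers, then one Gated Attention layer.
        layers.extend(["gated_deltanet"] * deltanet_per_block)
        layers.append("gated_attention")
    return layers


pattern = build_layer_pattern()
print(len(pattern))                      # total layers
print(pattern.count("gated_deltanet"))   # linear-attention style layers
print(pattern.count("gated_attention"))  # full attention layers
```

Counted this way, the pattern yields 32 layers, which agrees with the layer count in the specification: 24 use Gated DeltaNet and 8 use Gated Attention.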
Qwen 3.5 is Alibaba Cloud's latest-generation foundation model family, released in February 2026. It combines a unified vision-language foundation, an efficient hybrid architecture (Gated Delta Networks with sparse Mixture-of-Experts), reinforcement learning scaled across million-agent environments, and coverage of 201 languages. The weights are open and released under the Apache 2.0 license.
No evaluation benchmarks are available for Qwen3.5-4B.