Qwen3.5-2B: Specifications and GPU VRAM Requirements

Qwen3.5-2B

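Open source (open weights)
Parameters: 2B
Context length: 262.144K tokens
Modality: Multimodal
Architecture: Dense
License: Apache 2.0
Release date: 24 Feb 2026
Training data cutoff: (not specified)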

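Technical specifications

Attention structure: Grouped-Query Attention
Hidden dimension size: 2048
Number of layers: (not specified)
Attention heads: (not specified)
Key-value heads: (not specified)
Activation function: SwiGLU
Normalization: RMS Normalization
Position embedding: RoPE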

Qwen3.5-2B

Qwen3.5-2B is Alibaba Cloud's small-scale multimodal foundation model with 2B parameters, released in February 2026. It uses a hybrid architecture that combines Gated Delta Networks and Gated Attention in a 6×(3×DeltaNet→FFN→1×Attention→FFN) pattern. In thinking mode, it achieves 74.0% on MMLU-Pro, 65.8% on GPQA Diamond, and 51.6% on GPQA. It offers unified vision-language capabilities, a 262k-token native context, multi-token prediction training, and both thinking and non-thinking modes. It is intended for prototyping, fine-tuning, and research across 201 languages.
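The layer pattern in the description can be unrolled into an explicit list of blocks. The following is a minimal sketch, assuming the notation means six repeats of three (Gated DeltaNet → FFN) blocks followed by one (Gated Attention → FFN) block. The function name and the string labels are hypothetical and are not taken from any Qwen code.

```python
def build_layer_pattern(repeats=6, deltanet_per_group=3, attention_per_group=1):
    """Unroll the 6 x (3 x DeltaNet->FFN, 1 x Attention->FFN) layout into a flat list.

    Assumption: each "DeltaNet->FFN" and "Attention->FFN" pair counts as one block.
    """
    layers = []
    for _ in range(repeats):
        # Linear-attention style blocks come first in every group.
        layers += ["deltanet+ffn"] * deltanet_per_group
        # Each group ends with full (gated) attention blocks.
        layers += ["attention+ffn"] * attention_per_group
    return layers


pattern = build_layer_pattern()
print(len(pattern), "blocks in total")
print(pattern.count("deltanet+ffn"), "DeltaNet blocks,",
      pattern.count("attention+ffn"), "attention blocks")
print(pattern[:4])
```

Under this reading, only one block in four uses full attention, which is the part of the model whose key-value cache grows with context length.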

About Qwen 3.5

Qwen 3.5 is Alibaba Cloud's latest-generation foundation model family, released in February 2026. It integrates advances in multimodal learning (a unified vision-language foundation), an efficient hybrid architecture (Gated Delta Networks with sparse Mixture-of-Experts), scalable reinforcement learning across million-agent environments, and linguistic coverage spanning 201 languages. The models are available with open weights under the Apache 2.0 license.

Other Qwen 3.5 models

Evaluation benchmarks

No evaluation benchmarks are available for Qwen3.5-2B.

Rankings

Coding ranking

GPU requirements


Quantization

Select the quantization method for the model weights

Context size: 1,024 tokens


Required VRAM:
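The page's interactive VRAM figure is not preserved here. As a rough guide, the memory needed for the weights alone can be estimated from the parameter count and the bytes stored per parameter. The sketch below assumes a nominal 2 billion parameters (taken from the model name) and typical storage sizes for fp16, int8, and int4. It ignores the KV cache, activations, quantization metadata, and framework overhead, so real usage will be higher. The function and dictionary names are hypothetical.

```python
# Approximate bytes stored per parameter for common weight formats (assumed values).
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gib(num_params, precision):
    """Return the approximate size of the model weights in GiB."""
    return num_params * BYTES_PER_PARAM[precision] / 1024**3


NOMINAL_PARAMS = 2e9  # assumption: "2B" read as exactly 2 billion parameters

for precision in BYTES_PER_PARAM:
    size = weight_memory_gib(NOMINAL_PARAMS, precision)
    print(f"{precision}: about {size:.2f} GiB for weights only")
```

Long contexts add memory on top of this estimate, because the cache for the attention blocks grows with the number of tokens.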

Resources

Official documentation
Download weights
