Qwen3.5-0.8B

Parameters

800M

Context length

262,144 tokens

Modality

Multimodal

Architecture

Dense

License

Apache 2.0

Release date

24 Feb 2026

Training data cutoff

-

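Technical specifications

Attention structure

Grouped-Query Attention

Hidden dimension size

1024

Number of layers

24

Attention heads

8

Key-value heads

2

Activation function

SwiGLU

Normalization

RMS Normalization

Position embedding

RoPE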
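To illustrate what the grouped-query attention figures above imply, the following minimal sketch computes how many query heads share each key-value head and how much key-value cache one attention layer stores per token, compared with a hypothetical variant that gives every query head its own key-value head. The head dimension is an assumption (hidden size divided by the number of attention heads); the page does not list it, and the actual model may use a different value. The 2-byte element size assumes 16-bit values.

```python
# Values taken from the specification list above.
HIDDEN_SIZE = 1024
NUM_QUERY_HEADS = 8
NUM_KV_HEADS = 2

# ASSUMPTION: head_dim = hidden_size / num_query_heads. The page does not
# state the head dimension, so treat this as illustrative only.
HEAD_DIM = HIDDEN_SIZE // NUM_QUERY_HEADS


def kv_bytes_per_token_per_layer(num_kv_heads, head_dim, bytes_per_value=2):
    """Bytes of key-value cache one attention layer stores for one token.

    The leading factor 2 accounts for storing both a key and a value vector
    per key-value head. bytes_per_value=2 assumes 16-bit values.
    """
    return 2 * num_kv_heads * head_dim * bytes_per_value


# Each key-value head is shared by this many query heads.
queries_per_kv_head = NUM_QUERY_HEADS // NUM_KV_HEADS

gqa_bytes = kv_bytes_per_token_per_layer(NUM_KV_HEADS, HEAD_DIM)
# Hypothetical comparison: one key-value head per query head.
mha_bytes = kv_bytes_per_token_per_layer(NUM_QUERY_HEADS, HEAD_DIM)

print(f"query heads per KV head: {queries_per_kv_head}")
print(f"GQA cache per token per layer: {gqa_bytes} bytes")
print(f"one-KV-head-per-query-head cache: {mha_bytes} bytes")
```

Under these assumptions, sharing 2 key-value heads among 8 query heads cuts the per-layer cache to a quarter of the one-to-one layout.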

Qwen3.5-0.8B

Qwen3.5-0.8B is Alibaba Cloud's ultra-compact multimodal foundation model with 0.8B parameters, released in February 2026. It uses a hybrid architecture that combines Gated Delta Networks and Gated Attention in a 6×(3×DeltaNet→FFN→1×Attention→FFN) pattern. In thinking mode, it scores 66.5% on MMLU-Pro, 51.6% on GPQA Diamond, and 11.9% on GPQA. The model offers unified vision-language capabilities, a 262k-token native context, and multi-token prediction training, and it supports both thinking and non-thinking modes. It is designed for prototyping, fine-tuning, and research across 201 languages.
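The layer pattern in the description can be expanded into an explicit per-layer list. The sketch below is one reading of that notation, under the assumption that each DeltaNet or Attention mixer together with its following FFN counts as one layer; the function name and the layer labels are hypothetical, not identifiers from the model's code. Under this reading the pattern yields the 24 layers listed in the specifications.

```python
def build_layer_layout(num_blocks=6, deltanet_per_block=3, attention_per_block=1):
    """Expand the 6 x (3 x DeltaNet -> 1 x Attention) pattern into a flat list.

    ASSUMPTION: every mixer (DeltaNet or Attention) is paired with one FFN
    and the pair is counted as a single layer.
    """
    layout = []
    for _ in range(num_blocks):
        layout.extend(["gated_deltanet"] * deltanet_per_block)
        layout.extend(["gated_attention"] * attention_per_block)
    return layout


layout = build_layer_layout()

num_layers = len(layout)
num_attention_layers = layout.count("gated_attention")
num_deltanet_layers = layout.count("gated_deltanet")

print(f"total layers: {num_layers}")
print(f"gated attention layers: {num_attention_layers}")
print(f"gated DeltaNet layers: {num_deltanet_layers}")
print("first block:", layout[:4])
```

If this reading is right, only every fourth layer uses attention, which is the part of the stack whose key-value cache grows with context length.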

About Qwen 3.5

Qwen 3.5 is Alibaba Cloud's latest-generation foundation model family, released in February 2026. It integrates advances in multimodal learning (a unified vision-language foundation), efficient hybrid architecture (Gated Delta Networks with sparse Mixture-of-Experts), scalable reinforcement learning across million-agent environments, and linguistic coverage spanning 201 languages. The models are available with open weights under the Apache 2.0 license.


Evaluation benchmarks

No evaluation benchmarks are available for Qwen3.5-0.8B.

Rankings

Rank

-

Coding rank

-

GPU requirements

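Interactive calculator: select a quantization method for the model weights and a context size (1k to 256k tokens, default 1,024 tokens) to see the required VRAM and recommended GPUs.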
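As a rough stand-in for the calculator, the following sketch estimates the memory needed for the model weights alone at a few common precisions. It assumes the nominal 800M parameter count from this page and generic bytes-per-parameter figures; the precision names are illustrative and are not the calculator's actual options. The estimate excludes the key-value cache, activations, and framework overhead, so real VRAM use will be higher and grows with context size.

```python
NUM_PARAMETERS = 800_000_000  # nominal count from this page

# ASSUMPTION: generic storage cost per parameter for each precision.
# Real quantization formats add some overhead (scales, zero points).
BYTES_PER_PARAMETER = {
    "fp16": 2.0,
    "int8": 1.0,
    "int4": 0.5,
}


def weight_memory_gb(num_parameters, bytes_per_parameter):
    """Weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_parameters * bytes_per_parameter / 1e9


estimates = {
    name: weight_memory_gb(NUM_PARAMETERS, size)
    for name, size in BYTES_PER_PARAMETER.items()
}

for name, gigabytes in estimates.items():
    print(f"{name}: about {gigabytes:.1f} GB for weights only")
```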