Qwen3.5-35B-A3B: Specifications and GPU VRAM Requirements

Qwen3.5-35B-A3B

Open Source

Open Weights

Total Parameters

35B

Context Length

262,144 tokens

Modality

Multimodal

Architecture

Mixture of Experts (MoE)

License

Apache 2.0

Release Date

24 Feb 2026

Knowledge Cutoff

Technical Specifications

Activated Parameters

3.0B

Number of Experts

256

Active Experts

Attention Structure

Grouped-Query Attention

Hidden Dimension Size

2048

Number of Layers

Attention Heads

Key-Value Heads

Activation Function

SwiGLU

Normalization

RMS Normalization

Position Embedding

RoPE

Qwen3.5-35B-A3B

Qwen3.5-35B-A3B is Alibaba Cloud's efficient multimodal foundation model, released in February 2026. It has 35B total parameters, of which about 3B are activated per token through a Mixture-of-Experts architecture with 256 experts, so per-token compute is roughly that of a much smaller dense model. Reported benchmark scores are 85.3% on MMLU-Pro, 84.2% on GPQA Diamond, 69.2% on SWE-bench Verified, and 40.5% on Terminal-Bench 2.0. Qwen3.5-Flash is the hosted API version. The model offers unified vision-language capabilities and a 262k-token native context (extensible to 1M), and targets multimodal reasoning, coding, and multilingual tasks.
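
To make the "35B total, 3B activated" figures concrete, the short sketch below derives the fraction of weights used per token and a rough per-token compute estimate. The function names are hypothetical, and the figure of about 2 FLOPs per active parameter per token is a common rule of thumb assumed here, not something stated on this page.

```python
TOTAL_PARAMS = 35e9   # total parameters, from the description above
ACTIVE_PARAMS = 3e9   # parameters activated per token, from the description above


def active_fraction(active: float, total: float) -> float:
    """Share of the model's weights used to process a single token."""
    return active / total


def forward_flops_per_token(active: float) -> float:
    """Rule-of-thumb estimate (assumption): ~2 FLOPs per active parameter."""
    return 2.0 * active


if __name__ == "__main__":
    frac = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
    print(f"active fraction: {frac:.1%}")
    print(f"forward FLOPs per token: {forward_flops_per_token(ACTIVE_PARAMS):.1e}")
```

The active fraction only reduces compute. Because any expert may be selected for a given token, all 35B parameters typically still need to be resident in memory, which is why VRAM is sized from the total count.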

About Qwen 3.5

Qwen 3.5 is Alibaba Cloud's latest-generation foundation model family, released in February 2026. It combines a unified vision-language foundation, an efficient hybrid architecture (Gated Delta Networks with sparse Mixture-of-Experts), reinforcement learning scaled across million-agent environments, and coverage of 201 languages. The models are available with open weights under the Apache 2.0 license.

Evaluation Benchmarks

No evaluation benchmarks are available for Qwen3.5-35B-A3B.

Rankings

Overall Rank

Coding Rank

GPU Requirements

Quantization

Choose the quantization method for model weights

Context Size: 1,024 tokens (selectable up to 256k)

VRAM Required:
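
The two calculator inputs, quantization and context size, map onto a simple back-of-the-envelope estimate. The sketch below is an approximation under stated assumptions, not the calculator's actual method. Weight memory is taken as total parameters times bits per weight, ignoring quantization metadata and runtime overhead. The KV-cache helper takes the layer count, KV heads, and head dimension as arguments because this page does not list them, and in a hybrid architecture only the layers that use standard attention should be counted. All function and parameter names are hypothetical.

```python
GIB = 1024 ** 3  # bytes per GiB


def weight_memory_gib(total_params: float, bits_per_weight: float) -> float:
    """Memory for the weights alone at a given quantization width."""
    return total_params * bits_per_weight / 8 / GIB


def kv_cache_gib(n_attn_layers: int, n_kv_heads: int, head_dim: int,
                 context_tokens: int, bytes_per_value: int = 2) -> float:
    """KV cache size for the layers that keep a per-token key/value cache."""
    # Factor of 2: one tensor for keys and one for values per layer.
    values = 2 * n_attn_layers * n_kv_heads * head_dim * context_tokens
    return values * bytes_per_value / GIB


if __name__ == "__main__":
    total_params = 35e9  # total parameters, from the description on this page
    for label, bits in [("FP16/BF16", 16), ("8-bit", 8), ("4-bit", 4)]:
        gib = weight_memory_gib(total_params, bits)
        print(f"{label:>9}: {gib:.1f} GiB for weights")
```

Under these assumptions the weights alone come to roughly 65 GiB at 16 bits, 33 GiB at 8 bits, and 16 GiB at 4 bits. The KV cache grows linearly with context size and is added on top, along with activation and framework overhead.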

Recommended GPUs

Resources

Official Documentation

Download Weights