Qwen3.5-122B-A10B

Open Source

Open Weights

Total Parameters

122B

Active Parameters

10B

Context Length

262K

Modality

Multimodal

Architecture

Mixture of Experts (MoE)

License

Apache 2.0

Release Date

24 Feb 2026

Knowledge Cutoff

System Requirements

VRAM requirements for different quantization methods and context sizes

Context Length	VRAM Required	Consumer (RTX 4090, 24GB VRAM)	Datacenter (NVIDIA A100, 80GB VRAM)	Apple Silicon (Apple M3 Max, 128GB VRAM)
1,024 tokens	257.81 GB	13x	4x	3x
262,144 tokens	284.76 GB	15x	4x	3x
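
The device counts above can be sanity-checked with simple division. The sketch below is only a lower bound: it assumes memory pools perfectly across devices and ignores any per-device overhead, which may be why the page lists more RTX 4090 cards than plain division gives. The function names are illustrative, not part of any tool.

```python
import math

# Figures copied from the requirements listed above (GB of VRAM per context length).
REQUIRED_GB = {1_024: 257.81, 262_144: 284.76}
# Per-device memory as listed on the page.
DEVICE_GB = {"RTX 4090": 24, "NVIDIA A100": 80, "Apple M3 Max": 128}


def min_devices(required_gb, device_gb):
    """Smallest device count whose combined memory covers required_gb."""
    return math.ceil(required_gb / device_gb)


def gb_per_extra_token(requirements):
    """Average memory growth per additional context token between the two data points."""
    (short_ctx, short_gb), (long_ctx, long_gb) = sorted(requirements.items())
    return (long_gb - short_gb) / (long_ctx - short_ctx)


for ctx, gb in REQUIRED_GB.items():
    counts = {name: min_devices(gb, size) for name, size in DEVICE_GB.items()}
    print(f"{ctx:>7} tokens: {counts}")

# Convert GB per token to KB per token (1 GB = 1e6 KB).
print(f"{gb_per_extra_token(REQUIRED_GB) * 1e6:.1f} KB per extra token")
```

By this estimate the A100 and M3 Max counts match the page, while the RTX 4090 lower bounds (11 and 12 cards) fall below the listed 13 and 15.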

Architecture Diagram

Evaluation Benchmarks

Rank

#46

Category	Benchmark	Score	Rank
General Text	Text Arena	1417	46
Web Development	WebDev Arena	1365	52

Rankings

Overall Rank

#46

Coding Rank

#61

About Qwen3.5-122B-A10B

Qwen3.5-122B-A10B is Alibaba Cloud's mid-tier multimodal foundation model, released in February 2026. It has 122B total parameters, of which 10B are activated per token through a Mixture-of-Experts architecture with 256 experts, trading some capacity per token for lower compute cost. It reports scores of 86.1% on MMLU-Pro, 85.5% on GPQA Diamond, 72.4% on SWE-bench Verified, and 41.6% on Terminal-Bench 2.0. It offers unified vision-language capabilities and a 262k native context (extensible to 1M), and targets reasoning, coding, agentic workflows, and multilingual tasks.

Technical Specifications

Attention

Attention Structure

Grouped-Query Attention

Attention Heads

Key-Value Heads

Attention Head Dimension

256

Position Embedding

RoPE

RoPE Theta

10,000,000

Sliding Window Attention

Sliding Window Size

Sliding Window Ratio

Linear Attention

Yes

Linear Attention Ratio

75.0%

Normalization

RMS Normalization

Activation Function

SwiGLU

Dimensions

Hidden Dimension Size

3,072

Number of Layers

48

FFN Intermediate Size (Dense)

1,024

Multi-Token Prediction Heads

Tokenizer

Vocabulary Size

248,320

Mixture of Experts

Total Expert Parameters

10.0B

Number of Experts

256

Active Experts

8

Shared Experts

1

FFN Intermediate Size (per Expert)

1,024

Dense Layers Before MoE
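
To make the RoPE entries above concrete, the following sketch derives the rotation frequencies from the listed head dimension (256) and RoPE theta (10,000,000). It assumes the standard RoPE formulation applied across the full head dimension; whether this model rotates only part of each head is not stated on this page, so treat the constants' use here as an assumption.

```python
import math

HEAD_DIM = 256             # "Attention Head Dimension" from the specifications
ROPE_THETA = 10_000_000.0  # "RoPE Theta" from the specifications


def rope_inv_freq(head_dim=HEAD_DIM, theta=ROPE_THETA):
    """Standard RoPE inverse frequencies: theta ** (-2i / d) for each dimension pair i."""
    return [theta ** (-2 * i / head_dim) for i in range(head_dim // 2)]


def rotate_pair(x0, x1, position, inv_freq):
    """Rotate one (x0, x1) dimension pair by the angle position * inv_freq."""
    angle = position * inv_freq
    c, s = math.cos(angle), math.sin(angle)
    return x0 * c - x1 * s, x0 * s + x1 * c


inv_freq = rope_inv_freq()
# Number of positions the slowest-rotating pair needs to complete one full turn.
longest_wavelength = 2 * math.pi / inv_freq[-1]
print(len(inv_freq), inv_freq[0], longest_wavelength)
```

A larger theta stretches the slowest wavelengths, which is the usual motivation for raising it in long-context models.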

Model Integrity

Total Score

C+

60 / 100

Upstream

18.5 / 30

Model

20.0 / 40

Downstream

21.0 / 30

Qwen3.5-122B-A10B Model Integrity Report

Total Score

60 / 100

C+

Audit Note

Qwen3.5-122B-A10B exhibits a bifurcated transparency profile, offering high clarity on its complex hybrid architecture and parameter density while remaining almost entirely opaque regarding its training data and compute resources. While the model is highly accessible through open weights and a permissive license, its internal identity consistency is compromised by traces of other models. Users can rely on detailed community-verified hardware requirements, but must contend with a lack of formal documentation on data provenance and training methodology.

Upstream

18.5 / 30

Architectural Provenance

8.0 / 10

The model's architecture is extensively documented in official Hugging Face and NVIDIA model cards. It uses a hybrid transformer architecture with 48 layers that combines 'Gated DeltaNet' (linear attention) with sparse Mixture-of-Experts (MoE). Specific structural details are provided, including a 3:1 ratio of DeltaNet to standard attention layers, a hidden dimension of 3072, and a 256-expert MoE setup. While the high-level methodology is clear, no peer-reviewed technical paper detailing the pre-training curriculum or architectural ablations is publicly linked yet, though a 'coming soon' documentation placeholder exists on GitHub.
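
As an illustration of what a 3:1 split means over 48 layers, the sketch below builds one possible layer layout. The interleaving it uses (three linear-attention layers followed by one standard attention layer, repeated) is an assumption for illustration; the text above gives only the ratio, not the exact ordering.

```python
NUM_LAYERS = 48       # layer count stated in the text
LINEAR_PER_FULL = 3   # 3:1 ratio of DeltaNet to standard attention


def layer_pattern(num_layers=NUM_LAYERS, linear_per_full=LINEAR_PER_FULL):
    """Return a per-layer list of attention kinds under the assumed interleaving."""
    period = linear_per_full + 1
    return [
        "full_attention" if (i + 1) % period == 0 else "linear_attention"
        for i in range(num_layers)
    ]


pattern = layer_pattern()
print(pattern[:8])
print(pattern.count("linear_attention"), pattern.count("full_attention"))
```

Under this layout 36 of the 48 layers use linear attention, which agrees with the 75.0% linear attention ratio listed in the specifications.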

Dataset Composition

2.0 / 10

Information regarding the training data is extremely limited. Official documentation on NVIDIA and Hugging Face lists the training dataset, collection methodology, and labeling as 'Undisclosed'. While the model claims support for 201 languages and multimodal inputs (text, image, video), there is no public breakdown of the data proportions (e.g., web vs. code) or specific sources used for the early-fusion multimodal training.

Tokenizer Integrity

8.5 / 10

The tokenizer is publicly accessible via the Hugging Face repository and integrated into standard libraries like Transformers and vLLM. It uses Byte Pair Encoding (BPE) with a stated vocabulary size of 248,320 tokens (padded). Documentation confirms support for 201 languages and provides specific token-to-character ratios for English and Chinese. Verification by third-party users in local deployment (llama.cpp) confirms the tokenizer's functional integrity.
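
For a sense of scale, the vocabulary size can be combined with the hidden dimension from the specifications (3,072) to estimate the size of a token embedding table. This is a rough sketch that assumes a single vocabulary-by-hidden matrix stored in BF16; whether the model ties or duplicates this matrix for its output layer is not stated here.

```python
VOCAB_SIZE = 248_320        # padded vocabulary size stated above
HIDDEN_SIZE = 3_072         # hidden dimension from the specifications
TOTAL_PARAMS = 122e9        # total parameter count stated on this page
BYTES_PER_BF16 = 2


def embedding_params(vocab_size, hidden_size):
    """Parameter count of one vocab_size x hidden_size embedding matrix."""
    return vocab_size * hidden_size


params = embedding_params(VOCAB_SIZE, HIDDEN_SIZE)
size_gb = params * BYTES_PER_BF16 / 1e9
share = params / TOTAL_PARAMS
print(f"{params:,} parameters, {size_gb:.2f} GB in BF16, {share:.2%} of the total")
```

Even with a vocabulary this large, one embedding matrix accounts for well under 1% of the 122B total.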

Model

20.0 / 40

Parameter Density

9.0 / 10

The model provides exemplary transparency regarding its parameter density. It explicitly states a total of 122B parameters with 10B active parameters per token. The MoE structure is detailed as having 256 experts per layer, with 8 routed experts and 1 shared expert activated per token. This prevents the common 'parameter inflation' marketing trap by clearly distinguishing between total and active weights.
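
The difference between total and active weights comes from routing: each token is sent to only a few experts. The sketch below shows generic top-k routing over 256 experts with 8 selected per token. It is not the model's actual router; the softmax over the selected logits and the random logits are assumptions made for illustration, and the shared expert is simply noted as always active.

```python
import math
import random

NUM_EXPERTS = 256   # experts per MoE layer (from the text)
TOP_K = 8           # routed experts activated per token (from the text)
NUM_SHARED = 1      # shared expert, always active (from the text)


def route(router_logits, top_k=TOP_K):
    """Return (expert_index, weight) pairs for the top_k highest-scoring experts.

    Weights are a softmax over the selected logits only (an assumed normalisation).
    """
    chosen = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)[:top_k]
    peak = max(router_logits[i] for i in chosen)
    exps = [math.exp(router_logits[i] - peak) for i in chosen]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(chosen, exps)]


rng = random.Random(0)  # stand-in for a learned router's output
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
selected = route(logits)

# Share of all weights touched per token, using the page's 10B / 122B figures.
active_param_fraction = 10e9 / 122e9
print(selected)
print(f"{active_param_fraction:.1%} of parameters active per token")
```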

Training Compute

1.0 / 10

There is virtually no public information regarding the compute resources used to train the model. No GPU/TPU hours, hardware cluster specifications, training duration, or carbon footprint data have been disclosed by Alibaba Cloud. The only hardware mentions relate to inference requirements, not training provenance.

Benchmark Reproducibility

6.0 / 10

The model provides scores for several standard benchmarks (MMLU-Pro, GPQA Diamond, SWE-bench) and some newer ones (Terminal-Bench 2.0). While evaluation results are detailed in the model card, the specific evaluation code and exact prompts used for these internal results are not fully public. However, the model is available for third-party testing on platforms like Artificial Analysis and OpenRouter, allowing for independent verification of performance claims.

Identity Consistency

4.0 / 10

While the model generally identifies as part of the Qwen family in standard instruction tasks, there are documented instances of identity confusion in its internal reasoning traces. Users have reported the model claiming to be 'Gemini' or an API-based service even when running locally. This suggests significant identity drift likely stemming from the training or distillation process, which undermines its self-identification reliability.

Downstream

21.0 / 30

License Clarity

9.0 / 10

The model is clearly released under the Apache 2.0 license, which is a standard, permissive open-source license. This is explicitly stated across all primary distribution channels (Hugging Face, GitHub, Kaggle, and NVIDIA). The terms for commercial and non-commercial use are well-defined by the license itself, providing high legal clarity for developers.

Hardware Footprint

8.0 / 10

Hardware requirements are well-documented by both the provider and the community. Official specs recommend 8 GPUs for full context (262k) in BF16, while community documentation provides detailed VRAM breakdowns for various quantization levels (Q4_K_M, NVFP4). For example, a Q4_K_M quant is verified to run on ~73GB-80GB of VRAM. The scaling of memory with context length is also noted, with native support up to 262k tokens.
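
The quantization figures can be related to the parameter count with a little arithmetic. The sketch below converts a bits-per-weight value into weight memory, and inverts the ~73GB-80GB range into the bits per weight it would imply. It assumes all 122B parameters are stored at a uniform precision and treats the reported VRAM as weights only, so the implied values are upper bounds rather than a description of the Q4_K_M format.

```python
TOTAL_PARAMS = 122e9  # total parameter count stated on this page


def weight_gb(bits_per_weight, params=TOTAL_PARAMS):
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params * bits_per_weight / 8 / 1e9


def implied_bits_per_weight(size_gb, params=TOTAL_PARAMS):
    """Bits per weight if size_gb were spent entirely on weights."""
    return size_gb * 1e9 * 8 / params


bf16_gb = weight_gb(16)
low = implied_bits_per_weight(73)
high = implied_bits_per_weight(80)
print(f"BF16 weights: {bf16_gb:.0f} GB")
print(f"73-80 GB implies at most {low:.2f} to {high:.2f} bits per weight")
```

The BF16 figure of roughly 244 GB for weights alone is consistent with the multi-GPU recommendation for unquantized inference.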

Versioning Drift

4.0 / 10

The model uses a clear naming convention (Qwen3.5-122B-A10B), but there is no formal, public changelog or version history for weight updates. While the release date is clear, there is no established mechanism for tracking silent updates or performance drift over time. The 'coming soon' status of official documentation further limits the transparency of its versioning lifecycle.
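
The category and total scores in this report follow from the ten criterion scores by addition, as the sketch below checks. The data structure is illustrative only; the scores are copied from the report, and the assumption is that the displayed total is the rounded sum.

```python
# Criterion scores copied from the report above, grouped by category.
SCORES = {
    "Upstream": {
        "Architectural Provenance": 8.0,
        "Dataset Composition": 2.0,
        "Tokenizer Integrity": 8.5,
    },
    "Model": {
        "Parameter Density": 9.0,
        "Training Compute": 1.0,
        "Benchmark Reproducibility": 6.0,
        "Identity Consistency": 4.0,
    },
    "Downstream": {
        "License Clarity": 9.0,
        "Hardware Footprint": 8.0,
        "Versioning Drift": 4.0,
    },
}


def category_totals(scores):
    """Sum the criterion scores within each category."""
    return {category: sum(items.values()) for category, items in scores.items()}


totals = category_totals(SCORES)
overall = sum(totals.values())
print(totals, overall)
```

The categories sum to 18.5, 20.0, and 21.0, for an overall 59.5, which the page displays as 60 / 100.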

Resources

Official Documentation

Download Weights

About Qwen 3.5

Qwen 3.5 is Alibaba Cloud's latest-generation foundation model family, released in February 2026. It integrates advances in multimodal learning (a unified vision-language foundation), an efficient hybrid architecture (Gated Delta Networks with sparse Mixture-of-Experts), scalable reinforcement learning across million-agent environments, and linguistic coverage spanning 201 languages. It is available under the Apache 2.0 license with open weights.
