
Qwen3.5-27B

Parameters

27B

Context Length

262K

Modality

Multimodal

Architecture

Dense

License

Apache 2.0

Release Date

24 Feb 2026

Knowledge Cutoff

-

System Requirements

VRAM requirements for different quantization methods and context sizes

1,024 tokens

58.48 GB VRAM

Consumer

3x RTX 4090

24GB VRAM

Datacenter

1x NVIDIA A100

80GB VRAM

Apple Silicon

1x Apple M3 Max

128GB VRAM

262,144 tokens

130.36 GB VRAM

Consumer

7x RTX 4090

24GB VRAM

Datacenter

2x NVIDIA A100

80GB VRAM

Apple Silicon

2x Apple M3 Max

128GB VRAM
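The device counts above can be roughly reproduced with a ceiling division. This is a minimal sketch, not the page's actual method: the `usable_fraction` of 0.85 is a hypothetical headroom factor that happens to match the listed counts, and the function name is illustrative.

```python
import math

def gpus_needed(required_gb: float, gpu_gb: float, usable_fraction: float = 0.85) -> int:
    """Smallest number of identical devices whose usable memory covers required_gb.

    usable_fraction is an assumed headroom factor (not stated in the source).
    """
    usable_per_gpu = gpu_gb * usable_fraction
    return math.ceil(required_gb / usable_per_gpu)

# Figures taken from the listing above.
requirements_gb = {1_024: 58.48, 262_144: 130.36}
devices_gb = {"RTX 4090": 24, "NVIDIA A100": 80, "Apple M3 Max": 128}

for tokens, need in requirements_gb.items():
    for name, capacity in devices_gb.items():
        print(f"{tokens:>7} tokens: {gpus_needed(need, capacity)}x {name}")
```

Without the headroom factor, 130.36 GB would fit in six 24 GB cards rather than seven, so some per-device reserve appears to be built into the listed counts.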

Architecture Diagram

Input Tokens
Token Embedding (Position: RoPE · Hidden: 5.1k · Context: 262K · Vocab: 248.3k)
x 64 layers:
  RMSNorm (Pre-Attention)
  Grouped-Query Attention (24Q / 4KV heads · Head dim: 256)
  +
  RMSNorm (Pre-FFN)
  Feed-Forward Network (SwiGLU · Intermediate: 17.4k)
  +
Final RMSNorm
Output Logits

Evaluation Benchmarks

Rank

#53

Category | Benchmark | Score | Rank

General Text | Text Arena | 1409 | 50

Web Development | WebDev Arena | 1357 | 56

Rankings

Overall Rank

#53

Coding Rank

#65

About Qwen3.5-27B

Qwen3.5-27B is Alibaba Cloud's dense multimodal foundation model with 27B parameters, released in February 2026. Unlike the MoE variants, it uses a dense architecture combining Gated Delta Networks and feed-forward networks. Its reported benchmark scores are MMLU-Pro 86.1%, GPQA Diamond 85.5%, SWE-bench Verified 72.4%, and Terminal-Bench 2.0 41.6%. It offers unified vision-language capabilities and a 262K native context (extensible to 1M), with strengths in reasoning, coding, multimodal understanding, and multilingual tasks spanning 201 languages.

Technical Specifications

Attention

Attention Structure

Grouped-Query Attention

Attention Heads

24

Key-Value Heads

4

Attention Head Dimension

256

Position Embedding

RoPE

RoPE Theta

10,000,000

Sliding Window Attention

No

Sliding Window Size

-
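As an illustration of what the RoPE Theta value controls, the standard rotary-embedding frequency schedule can be sketched as follows. This assumes rotation is applied across the full 256-dimensional head, which the page does not state; the function name is illustrative.

```python
import math

def rope_inv_freq(head_dim: int, theta: float) -> list[float]:
    """Standard RoPE inverse frequencies: theta^(-2i/d) for each rotated pair i."""
    return [theta ** (-2 * i / head_dim) for i in range(head_dim // 2)]

# Values from the specification list above.
inv_freq = rope_inv_freq(head_dim=256, theta=10_000_000.0)

# The slowest-rotating pair sets the longest positional wavelength (in tokens).
longest_wavelength = 2 * math.pi / inv_freq[-1]
print(f"{len(inv_freq)} frequency pairs, longest wavelength ~{longest_wavelength:,.0f} tokens")
```

A larger theta stretches the slowest wavelengths, which is the usual motivation for raising it in long-context models.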

Normalization

RMS Normalization

Activation Function

SwiGLU

Dimensions

Hidden Dimension Size

5,120

Number of Layers

64

FFN Intermediate Size (Dense)

17,408

Multi-Token Prediction Heads

1

Tokenizer

Vocabulary Size

248,320
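The dimensions above allow a rough parameter accounting. The sketch below counts only the token embedding matrix and the SwiGLU feed-forward blocks, assuming three bias-free projections (gate, up, down) per layer; whether input and output embeddings are tied is not stated here, so only one embedding matrix is counted.

```python
HIDDEN = 5_120             # Hidden Dimension Size
FFN_INTERMEDIATE = 17_408  # FFN Intermediate Size (Dense)
LAYERS = 64                # Number of Layers
VOCAB = 248_320            # Vocabulary Size

def embedding_params(vocab: int, hidden: int) -> int:
    """One vocab x hidden embedding matrix."""
    return vocab * hidden

def swiglu_ffn_params(hidden: int, intermediate: int) -> int:
    """Gate, up, and down projections, assuming no bias terms."""
    return 3 * hidden * intermediate

emb = embedding_params(VOCAB, HIDDEN)
ffn_per_layer = swiglu_ffn_params(HIDDEN, FFN_INTERMEDIATE)
ffn_total = ffn_per_layer * LAYERS

print(f"embedding: {emb / 1e9:.2f}B")
print(f"FFN total: {ffn_total / 1e9:.2f}B")
```

Under these assumptions the feed-forward blocks account for well over half of the stated 27B parameters; attention, Gated DeltaNet, vision, and normalization weights make up the rest.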

Model Integrity

Total Score

B

69 / 100

Qwen3.5-27B Model Integrity Report

Audit Note

Qwen3.5-27B exhibits strong transparency in its architectural design and licensing, providing deep technical details on its hybrid attention mechanism and a permissive Apache 2.0 license. However, the model suffers from significant opacity regarding its training data composition and the total compute resources utilized during development. While hardware requirements and identity consistency are well-handled, the lack of a reproducible evaluation suite and granular dataset disclosure limits its overall transparency profile.

Upstream

20.0 / 30

Architectural Provenance

8.0 / 10

The model architecture is extensively documented in official Hugging Face model cards and technical blog posts. It utilizes a novel hybrid design consisting of 64 layers organized into 16 groups, where each group contains three Gated DeltaNet (linear attention) layers and one Gated Attention layer. Detailed specifications including hidden dimensions (5120), head dimensions (128 for GDN, 256 for Gated Attention), and intermediate FFN dimensions (17408) are publicly available. While the high-level methodology of 'early-fusion' multimodal training is described, the specific pre-training recipe and architectural modifications for the vision encoder integration are less detailed than the language backbone.
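The layer grouping described above can be sketched as a simple list. The ordering inside each group (three Gated DeltaNet layers followed by one Gated Attention layer) is an assumption, and the label strings are illustrative.

```python
def layer_layout(num_groups: int = 16, linear_per_group: int = 3) -> list[str]:
    """Build the per-layer type list for the hybrid stack.

    Assumes each group is ordered as linear-attention layers first,
    then one full gated-attention layer.
    """
    group = ["gated_deltanet"] * linear_per_group + ["gated_attention"]
    return group * num_groups

layout = layer_layout()
print(len(layout), layout.count("gated_deltanet"), layout.count("gated_attention"))
```

With 16 groups of four layers this gives 64 layers in total, 48 of them linear-attention and 16 full-attention.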

Dataset Composition

3.5 / 10

Information regarding the training data remains largely high-level and lacks granular transparency. Official sources mention 'trillions of multimodal tokens' and a 'multilingual data annotation system' labeling over 30T tokens across 201 languages. However, there is no specific breakdown of dataset proportions (e.g., % web, % code, % books) or a comprehensive list of data sources. While some evaluation datasets like HLE-Verified are open-sourced, the primary pre-training corpus composition and specific filtering/cleaning methodologies are not disclosed in detail, relying on vague descriptors like 'carefully curated'.

Tokenizer Integrity

8.5 / 10

The tokenizer is publicly accessible via the Hugging Face repository and is well-documented. It uses a Byte-level Byte Pair Encoding (BPE) approach with a vocabulary size of 248,320 (padded), significantly expanded from previous generations to support 201 languages. Vocabulary size and tokenization logic are verified through both official documentation and third-party implementations (e.g., .NET ports). The alignment between claimed language support and tokenizer efficiency is documented, though some internal token normalization details are proprietary.

Model

26.5 / 40

Parameter Density

9.0 / 10

As a dense model, parameter density is straightforward and clearly stated at 27.0B total parameters. Unlike the MoE variants in the Qwen 3.5 family, all parameters are active during inference. The architectural breakdown is highly detailed, specifying the exact number of layers (64), attention heads (24 Q, 4 KV), and the specific layout of Gated DeltaNet vs. Gated Attention blocks. This level of detail allows for precise calculation of computational requirements and memory overhead.
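As an example of the memory calculation these figures enable, the key-value cache size can be estimated from the KV head count and head dimension. This is a sketch under stated assumptions: 16-bit cache values, and a cache that grows with context only in full-attention layers. Both the all-64-layers case and the 16-layer hybrid case are shown, since the page's own VRAM figures do not say which they use.

```python
GIB = 1024 ** 3

def kv_cache_bytes(tokens: int, attn_layers: int, kv_heads: int = 4,
                   head_dim: int = 256, bytes_per_value: int = 2) -> int:
    """Keys plus values for every cached token across the given attention layers."""
    return 2 * attn_layers * kv_heads * head_dim * bytes_per_value * tokens

context = 262_144
all_layers = kv_cache_bytes(context, attn_layers=64)   # if every layer cached K/V
hybrid_only = kv_cache_bytes(context, attn_layers=16)  # if only Gated Attention layers do

print(f"64 layers: {all_layers / GIB:.0f} GiB, 16 layers: {hybrid_only / GIB:.0f} GiB")
```

The gap between the two cases shows why the split between linear-attention and full-attention layers matters for long-context memory planning.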

Training Compute

2.0 / 10

Transparency regarding training compute is extremely low. While the use of 'Next-Generation Training Infrastructure' and 'asynchronous RL frameworks' is mentioned in marketing materials, there is no public disclosure of the total GPU/TPU hours consumed, the specific hardware clusters used for the 27B variant's training, or the associated carbon footprint. The company cites efficiency gains but provides no verifiable metrics to back these claims, scoring poorly on environmental and resource transparency.

Benchmark Reproducibility

6.0 / 10

The model provides scores for standard benchmarks (MMLU-Pro: 86.1%, GPQA Diamond: 85.5%, SWE-bench Verified: 72.4%) which are verifiable through third-party leaderboards like Artificial Analysis and OpenRouter. However, the exact evaluation code, specific prompts, and few-shot examples used for official reporting are not fully public in a centralized repository. While some third-party audits exist, the lack of a comprehensive, reproducible evaluation suite from the provider prevents a higher score.

Identity Consistency

9.5 / 10

The model demonstrates high identity consistency, correctly identifying itself as Qwen 3.5 and maintaining version awareness across different deployment frameworks (vLLM, SGLang, Ollama). It is transparent about its multimodal capabilities and its position within the broader Qwen 3.5 ecosystem. There are no documented instances of the model claiming to be a competitor's product or misrepresenting its dense architecture as an MoE variant.

Downstream

22.5 / 30

License Clarity

10.0 / 10

The model is released under the Apache 2.0 license, which is a standard, highly permissive open-source license. The license terms are clearly stated in the Hugging Face repository and official announcements, explicitly allowing for commercial use, modification, and redistribution. There are no conflicting custom terms or restrictive 'acceptable use' policies that override the base license, providing exemplary legal transparency.

Hardware Footprint

7.5 / 10

Hardware requirements are well-documented by both the provider and the community. VRAM requirements for various quantization levels (Q4, Q8, FP16) are publicly available, with specific guidance for single-GPU deployment (e.g., ~16-18GB for Q4 GGUF). Memory scaling for the 262k context window is documented, and third-party tools like Unsloth provide detailed VRAM calculators. However, official documentation on the specific accuracy-performance tradeoffs of the new 'Gated DeltaNet' layers under heavy quantization is still emerging.
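A back-of-the-envelope weight-memory estimate for each precision can be sketched as below. The bit widths are nominal assumptions; practical Q4 GGUF formats store somewhat more than 4 bits per weight on average, which is consistent with the ~16-18GB figure quoted above, and runtime overhead such as the KV cache is excluded.

```python
def weight_memory_gb(params: float, bits_per_weight: float) -> float:
    """Memory for the weights alone, in decimal gigabytes."""
    return params * bits_per_weight / 8 / 1e9

PARAMS = 27e9  # stated total parameter count

for label, bits in [("FP16", 16), ("Q8 (nominal)", 8), ("Q4 (nominal)", 4)]:
    print(f"{label}: {weight_memory_gb(PARAMS, bits):.1f} GB")
```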

Versioning Drift

5.0 / 10

The model uses a clear naming convention (Qwen3.5-27B), but the changelog and version history are somewhat fragmented across blog posts and GitHub commits. While major releases are announced, minor weight updates or 'silent' optimizations (such as the March 5 GGUF update) are often communicated through third-party partners rather than a centralized, formal versioning system. There is no clear public roadmap or deprecation policy for previous versions.
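The section subtotals and the overall score follow from the individual category scores in this report. A minimal check, with the dictionary layout being purely illustrative:

```python
scores = {
    "Upstream": {"Architectural Provenance": 8.0, "Dataset Composition": 3.5,
                 "Tokenizer Integrity": 8.5},
    "Model": {"Parameter Density": 9.0, "Training Compute": 2.0,
              "Benchmark Reproducibility": 6.0, "Identity Consistency": 9.5},
    "Downstream": {"License Clarity": 10.0, "Hardware Footprint": 7.5,
                   "Versioning Drift": 5.0},
}

# Each section subtotal is the sum of its category scores.
subtotals = {section: sum(items.values()) for section, items in scores.items()}
total = sum(subtotals.values())

print(subtotals, total)
```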

About Qwen 3.5

Qwen 3.5 is Alibaba Cloud's latest-generation foundation model family, released in February 2026. It combines a unified vision-language foundation, an efficient hybrid architecture (Gated Delta Networks with sparse Mixture-of-Experts), scalable reinforcement learning across million-agent environments, and linguistic coverage spanning 201 languages. The family is available with open weights under the Apache 2.0 license.

