Qwen3-4B: Specifications and GPU VRAM Requirements

Qwen3-4B

Open weights

Parameters

4.0B

Context length

32,768 tokens

Modality

Text

Architecture

Dense

License

Apache 2.0

Release date

29 Apr 2025

Knowledge cutoff

Mar 2025

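Technical specifications

Attention structure

Grouped-Query Attention

Hidden dimension size

Number of layers

36

Attention heads

32

Key-value heads

8

Activation function

SwiGLU

Normalization

RMS Normalization

Position embedding

RoPE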

System requirements

VRAM requirements for different quantization methods and context sizes
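
As a rough illustration of how VRAM scales with quantization and context size, the sketch below adds the memory for the model weights to the memory for the key-value cache. The parameter count, layer count, and key-value head count come from the model description on this page; the per-head dimension of 128 and the nominal bits per weight for each format are assumptions, and runtime overhead (activations, framework buffers) is not included, so real usage will be higher.

```python
# Rough VRAM estimate: model weights + key-value cache.
# Activations and framework overhead are NOT included.

N_PARAMS = 4.0e9   # 4.0 billion parameters (from the model description)
N_LAYERS = 36      # from the model description
N_KV_HEADS = 8     # from the model description
HEAD_DIM = 128     # ASSUMPTION: per-head dimension, not stated on this page

# ASSUMPTION: nominal bits per weight; real quantization formats add
# some overhead for scales and zero points.
BITS_PER_WEIGHT = {"fp16": 16, "int8": 8, "int4": 4}

GIB = 1024 ** 3


def weights_gib(quant: str) -> float:
    """Memory taken by the weights alone, in GiB."""
    return N_PARAMS * BITS_PER_WEIGHT[quant] / 8 / GIB


def kv_cache_gib(context_tokens: int, bytes_per_value: int = 2) -> float:
    """Memory taken by the KV cache for one sequence, in GiB."""
    # Factor 2: one key tensor and one value tensor per layer.
    per_token = 2 * N_LAYERS * N_KV_HEADS * HEAD_DIM * bytes_per_value
    return per_token * context_tokens / GIB


def estimate_vram_gib(quant: str, context_tokens: int) -> float:
    """Weights plus KV cache, in GiB (a lower bound on real usage)."""
    return weights_gib(quant) + kv_cache_gib(context_tokens)


if __name__ == "__main__":
    for quant in BITS_PER_WEIGHT:
        for ctx in (1024, 16_384, 32_768):
            print(f"{quant:>5} @ {ctx:>6} tokens: "
                  f"{estimate_vram_gib(quant, ctx):.2f} GiB")
```

Under these assumptions the KV cache costs 144 KiB per token in 16-bit precision, so a full 32,768-token context adds 4.5 GiB on top of the weights.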

Qwen3-4B

Qwen3-4B is a large language model developed by Alibaba as part of the Qwen3 series. It targets both general-purpose conversation and specialized reasoning. A distinguishing feature of the Qwen3 series is dual-mode operation: the model can switch between a 'thinking mode' for complex, multi-step logical reasoning and a 'non-thinking mode' for efficient, direct responses. This lets a single model serve scenarios ranging from intricate problem-solving to fast-paced dialogue.
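
To make the two modes concrete, here is a minimal sketch of how an application might separate the reasoning trace from the final reply. It assumes that thinking-mode output wraps the reasoning in `<think>...</think>` tags; that tag format is not stated on this page, and `split_thinking` is a hypothetical helper, not part of any library.

```python
import re

# ASSUMPTION: thinking-mode output wraps its reasoning in <think>...</think>.
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_thinking(text: str) -> tuple[str, str]:
    """Return (reasoning, answer) from raw model output.

    If no <think> block is present (non-thinking mode), the reasoning
    part is an empty string and the whole text is the answer.
    """
    match = THINK_BLOCK.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    # Everything outside the first <think> block is treated as the answer.
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer


if __name__ == "__main__":
    raw = "<think>2 + 2 equals 4.</think>\nThe answer is 4."
    reasoning, answer = split_thinking(raw)
    print("reasoning:", reasoning)
    print("answer:", answer)
```

Keeping the two parts separate lets a chat interface show only the answer while logging the reasoning for debugging.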

Architecturally, Qwen3-4B is a dense transformer with 4.0 billion parameters and 36 layers. It uses Grouped-Query Attention (GQA) with 32 query heads and 8 key-value heads, which shrinks the key-value cache and so reduces memory use during inference while maintaining performance. Token positions are encoded with Rotary Position Embeddings (RoPE), and the model natively supports a context length of up to 32,768 tokens. This can be extended to 131,072 tokens through YaRN (Yet another RoPE extensioN) scaling. The feed-forward layers use the SwiGLU activation, and normalization is applied with RMSNorm, which contributes to stable training.
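
The jump from 32,768 to 131,072 tokens corresponds to a YaRN scaling factor of 4. The sketch below derives that factor and packs it into a settings dictionary. The key names (`rope_type`, `factor`, `original_max_position_embeddings`) follow a common Hugging Face-style `rope_scaling` layout and are an assumption here; `yarn_rope_scaling` is a hypothetical helper, so check the inference framework's documentation for the exact fields it expects.

```python
from typing import Optional

NATIVE_CONTEXT = 32_768     # native context length (from the text above)
EXTENDED_CONTEXT = 131_072  # extended context length (from the text above)


def yarn_rope_scaling(target_context: int,
                      native_context: int = NATIVE_CONTEXT) -> Optional[dict]:
    """Build a YaRN-style scaling entry for a desired context length.

    Returns None when the target fits in the native window, since no
    scaling is needed in that case.
    """
    if target_context <= native_context:
        return None
    return {
        # ASSUMPTION: key names follow a common rope_scaling config layout.
        "rope_type": "yarn",
        "factor": target_context / native_context,
        "original_max_position_embeddings": native_context,
    }


if __name__ == "__main__":
    print(yarn_rope_scaling(EXTENDED_CONTEXT))
```

Returning `None` for short contexts reflects a simple design choice: leave RoPE scaling off unless the workload actually needs more than the native window.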

Qwen3-4B is intended for a range of applications requiring sophisticated language understanding and generation. Its capabilities extend to areas such as mathematical problem-solving, code generation, creative writing, and multi-turn dialogue systems. The model's design facilitates its integration into agentic workflows, enabling precise interaction with external tools. Furthermore, Qwen3-4B demonstrates robust multilingual support, processing information across more than 100 languages and dialects. This combination of architectural design, reasoning flexibility, and broad language coverage positions it as a suitable candidate for a variety of academic and commercial deployments.

About Qwen 3

The Alibaba Qwen 3 model family comprises dense and Mixture-of-Experts (MoE) architectures, with parameter counts from 0.6B to 235B. Key innovations include a hybrid reasoning system, offering 'thinking' and 'non-thinking' modes for adaptive processing, and support for extensive context windows, enhancing efficiency and scalability.

Evaluation benchmarks

Rankings apply to local LLMs.

No evaluation benchmarks are available for Qwen3-4B.

