Gemma 4 E4B

Open source

Open weights

Parameters

Context length

128K

Modality

Multimodal

Architecture

Dense

License

Apache 2.0

Release date

2 Apr 2026

Training data cutoff

System requirements

VRAM requirements for different quantization methods and context sizes

1,024 tokens

18.39 GB VRAM

Consumer

1x RTX 4090

24GB VRAM

Data center

1x NVIDIA A100

80GB VRAM

Apple Silicon

1x Apple M3 Max

128GB VRAM

128,000 tokens

29.86 GB VRAM

Consumer

2x RTX 4090

24GB VRAM each

Data center

1x NVIDIA A100

80GB VRAM

Apple Silicon

1x Apple M3 Max

128GB VRAM
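The hardware suggestions above follow from comparing the estimated requirement with the total memory available. A minimal sketch of that comparison, assuming memory can be pooled evenly across identical GPUs and ignoring framework overhead (the helper name `fits` is hypothetical and not part of any library):

```python
def fits(required_gb: float, gpu_count: int, vram_per_gpu_gb: float) -> bool:
    """Return True if the pooled VRAM covers the estimated requirement."""
    return gpu_count * vram_per_gpu_gb >= required_gb


# Figures taken from the listing above.
REQ_1K = 18.39    # GB at 1,024 tokens
REQ_128K = 29.86  # GB at 128,000 tokens

print(fits(REQ_1K, 1, 24))    # 1x RTX 4090 at 1,024 tokens
print(fits(REQ_128K, 1, 24))  # 1x RTX 4090 at 128,000 tokens
print(fits(REQ_128K, 2, 24))  # 2x RTX 4090 at 128,000 tokens
print(fits(REQ_128K, 1, 80))  # 1x NVIDIA A100 at 128,000 tokens
```

Under these assumptions a single 24GB card covers the short-context figure but not the 128,000-token one, which is consistent with the listing moving to two RTX 4090s for the long-context case.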

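Architecture diagram

Benchmarks

No benchmarks are available for Gemma 4 E4B.

Rankings

Coding rank

About Gemma 4 E4B

Gemma 4 E4B is an edge-optimized model with 4.5 billion effective parameters (8 billion with per-layer embeddings), designed for mobile and edge deployment. It accepts multimodal input (text, images, and audio) and has a 128K context window. It offers stronger performance than E2B while remaining efficient to run on-device. It also provides a thinking mode and native function calling.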

Technical specifications

Attention

Attention structure

Grouped-Query Attention

Attention heads

Key-value heads

Attention head dimension

256

Position embedding

RoPE

RoPE Theta

10,000
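To make the two RoPE numbers concrete, here is a minimal sketch of the standard rotary position embedding formulation using the head dimension (256) and theta (10,000) listed above. It assumes the common interleaved pairing of dimensions and a single theta for all layers; the listing does not confirm either detail for Gemma 4 E4B, and the function names are hypothetical.

```python
import math


def rope_inv_freq(head_dim: int = 256, theta: float = 10_000.0) -> list:
    """Inverse frequencies theta^(-2i/d) for i = 0 .. d/2 - 1."""
    return [theta ** (-2 * i / head_dim) for i in range(head_dim // 2)]


def rope_rotate(vec: list, position: int, theta: float = 10_000.0) -> list:
    """Rotate consecutive pairs (x0, x1), (x2, x3), ... by position-dependent angles.

    Assumption: interleaved pairing. Some implementations pair the first and
    second halves of the vector instead.
    """
    out = []
    for i, freq in enumerate(rope_inv_freq(len(vec), theta)):
        angle = position * freq
        x, y = vec[2 * i], vec[2 * i + 1]
        out.append(x * math.cos(angle) - y * math.sin(angle))
        out.append(x * math.sin(angle) + y * math.cos(angle))
    return out


freqs = rope_inv_freq()
print(len(freqs))  # one frequency per pair of dimensions
```

Because each pair is only rotated, the vector norm is unchanged, and position 0 leaves the vector as it was.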

Sliding window attention

Yes

Sliding window size

512

Sliding window ratio

83.3%
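A ratio of 83.3% corresponds to five sliding-window layers for every six layers. The sketch below illustrates one way such a pattern and the window rule could look. It assumes the global layer comes last in each group of six and that the window counts the current token; the listing states neither, and the layer count of 12 used here is purely illustrative because the real value is not given. All names are hypothetical.

```python
from typing import Optional


def layer_types(num_layers: int, period: int = 6) -> list:
    """Every `period`-th layer is global; the others use a sliding window."""
    return ["global" if (i + 1) % period == 0 else "sliding"
            for i in range(num_layers)]


def can_attend(query_pos: int, key_pos: int, window: Optional[int] = 512) -> bool:
    """Causal attention rule; `window=None` means full (global) attention."""
    if key_pos > query_pos:
        return False  # never attend to future tokens
    if window is None:
        return True
    return query_pos - key_pos < window  # only the most recent `window` tokens


types = layer_types(12)  # 12 is an illustrative layer count, not the model's
ratio = types.count("sliding") / len(types)
print(f"{ratio:.1%}")  # 83.3%
```

In sliding-window layers the key-value cache only needs to cover the most recent 512 tokens, which is why this kind of layout tends to reduce memory use at long context lengths.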

Linear attention

Linear attention ratio

Normalization

RMS Normalization

Activation function

GELU
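For readers unfamiliar with the two operations named above, here is a minimal pure-Python sketch of RMS normalization and GELU in their textbook forms. The epsilon value, the plain multiplicative weight, and the choice between the exact and tanh-approximated GELU are assumptions; the listing does not say which variants Gemma 4 E4B uses.

```python
import math


def rms_norm(x: list, weight: list = None, eps: float = 1e-6) -> list:
    """Scale x so its root mean square is about 1, then apply a per-element weight."""
    mean_sq = sum(v * v for v in x) / len(x)
    scale = 1.0 / math.sqrt(mean_sq + eps)
    if weight is None:
        weight = [1.0] * len(x)  # assumption: identity weight for illustration
    return [v * scale * w for v, w in zip(x, weight)]


def gelu(x: float) -> float:
    """Exact GELU: x times the standard normal CDF of x."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_tanh(x: float) -> float:
    """Widely used tanh approximation of GELU."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


print(rms_norm([3.0, 4.0]))
print(gelu(1.0), gelu_tanh(1.0))
```

Unlike layer normalization, RMS normalization does not subtract the mean, so it needs only one reduction over the vector.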

Dimensions

Hidden dimension size

10,240

Number of layers

FFN intermediate size (dense layers)

10,240

Multi-token prediction heads

Tokenizer

Vocabulary size

262,144

Model completeness

Total score

68 / 100

Upstream

20.0 / 30

Model

24.5 / 40

Downstream

23.5 / 30

Resources

Official documentation

Download weights

Source code

About Gemma 4

Gemma 4 is Google DeepMind's most advanced open model family, built from Gemini 3 research and technology. The family includes both Dense and Mixture-of-Experts (MoE) architectures. The models are multimodal, handling text and images, plus audio on the smaller variants, with context windows of up to 256K tokens. Designed for frontier-level performance across reasoning, coding, and agentic workflows, Gemma 4 targets high intelligence-per-parameter on hardware ranging from mobile devices to enterprise servers. It is released under the Apache 2.0 license.