Parameters: 2B
Context Length: 8,192 tokens
Modality: Text
Architecture: Dense
License: Gemma Terms of Use
Release Date: 21 Feb 2024
Knowledge Cutoff: -
Attention Structure: Multi-Query Attention
Hidden Dimension Size: 2048
Number of Layers: 18
Attention Heads: 8
Key-Value Heads: 1
Activation Function: GeGLU
Normalization: RMS Normalization
Position Embedding: RoPE
Gemma 1 2B is a lightweight, state-of-the-art open language model developed by Google, stemming from the same research and technology that underpins the Gemini family of models. This model is designed as a text-to-text, decoder-only transformer, primarily available in English, with both pre-trained and instruction-tuned variants. Its architectural design focuses on efficiency, making it suitable for deployment in environments with limited computational resources, such as laptops, desktops, or personal cloud infrastructure.
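As a concrete starting point, the sketch below loads the model with the Hugging Face Transformers library. It assumes the published Hub checkpoints `google/gemma-2b` (pre-trained) and `google/gemma-2b-it` (instruction-tuned) and that the Gemma Terms of Use have been accepted for gated access; it is a minimal illustration, not an official quickstart.

```python
# Minimal loading/generation sketch (illustrative, not from the original page).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2b"  # pre-trained variant; "google/gemma-2b-it" for chat
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

inputs = tokenizer("Gemma is a lightweight open model that", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```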
Architecturally, Gemma 1 2B incorporates several advanced components. It utilizes Multi-Query Attention (MQA) with a single key-value head, a design choice that speeds up inference by sharing one key and value projection across all query heads. Positional encoding is handled through Rotary Positional Embeddings (RoPE). The model's non-linear activation function is GeGLU, a GELU-based variant of the Gated Linear Unit (GLU) that enhances expressive power. Normalization within the network is performed using RMSNorm. These elements contribute to the model's performance while maintaining a compact footprint.
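To make the attention layout concrete, here is a minimal PyTorch sketch of multi-query attention using the 2B model's dimensions (hidden size 2048, 8 query heads of size 256, one shared key-value head). It is an illustrative reimplementation rather than Gemma's actual code, and it omits RoPE and the KV cache for brevity.

```python
import torch
import torch.nn.functional as F
from torch import nn

class MultiQueryAttention(nn.Module):
    """Illustrative multi-query attention: all query heads share a single
    K/V head, shrinking the KV cache and speeding up decoding."""

    def __init__(self, hidden_size=2048, num_heads=8, head_dim=256):
        super().__init__()
        self.num_heads, self.head_dim = num_heads, head_dim
        self.q_proj = nn.Linear(hidden_size, num_heads * head_dim, bias=False)
        self.k_proj = nn.Linear(hidden_size, head_dim, bias=False)  # one KV head
        self.v_proj = nn.Linear(hidden_size, head_dim, bias=False)
        self.o_proj = nn.Linear(num_heads * head_dim, hidden_size, bias=False)

    def forward(self, x):
        b, t, _ = x.shape
        q = self.q_proj(x).view(b, t, self.num_heads, self.head_dim).transpose(1, 2)
        # Single K/V head, broadcast to every query head.
        k = self.k_proj(x).view(b, t, 1, self.head_dim).transpose(1, 2)
        v = self.v_proj(x).view(b, t, 1, self.head_dim).transpose(1, 2)
        k = k.expand(b, self.num_heads, t, self.head_dim)
        v = v.expand(b, self.num_heads, t, self.head_dim)
        out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        return self.o_proj(out.transpose(1, 2).reshape(b, t, -1))

x = torch.randn(1, 16, 2048)
print(MultiQueryAttention()(x).shape)  # torch.Size([1, 16, 2048])
```

Because only one key-value head is cached per layer, the KV cache is 8x smaller than with full multi-head attention at these dimensions, which is the main inference-time benefit of MQA.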
The 2B variant is well-suited for a variety of text generation applications, including question answering, summarization, and reasoning tasks. The instruction-tuned versions of Gemma 1 2B are specifically refined to follow instructions effectively and engage in multi-turn conversations, making them adaptable for interactive applications like chatbots. Its compact size allows it to run on consumer-grade hardware, democratizing access to advanced AI capabilities for developers and researchers.
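A sketch of the multi-turn usage described above, assuming the instruction-tuned `google/gemma-2b-it` checkpoint and its built-in chat template:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2b-it"  # instruction-tuned variant
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# The tokenizer's chat template wraps each turn in Gemma's turn markers.
chat = [{"role": "user", "content": "Summarize the benefits of multi-query attention."}]
input_ids = tokenizer.apply_chat_template(chat, add_generation_prompt=True, return_tensors="pt")
output = model.generate(input_ids, max_new_tokens=64)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```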
Gemma 1 is a family of lightweight, decoder-only transformer models from Google, available in 2B and 7B parameter sizes. Designed for a variety of text generation tasks, both sizes incorporate rotary positional embeddings, shared input/output embeddings, GeGLU activations, and RMSNorm. The 2B model uses multi-query attention, while the 7B model uses multi-head attention.
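For reference, the GeGLU feed-forward block mentioned above can be sketched as follows. This is an illustration rather than Gemma's actual implementation, and the intermediate width of 16384 is an assumption based on the published 2B configuration.

```python
import torch.nn.functional as F
from torch import nn

class GeGLUMLP(nn.Module):
    """Illustrative GeGLU feed-forward block: a GELU-activated gate
    elementwise-multiplies a parallel linear projection."""

    def __init__(self, hidden_size=2048, intermediate_size=16384):
        super().__init__()
        self.gate_proj = nn.Linear(hidden_size, intermediate_size, bias=False)
        self.up_proj = nn.Linear(hidden_size, intermediate_size, bias=False)
        self.down_proj = nn.Linear(intermediate_size, hidden_size, bias=False)

    def forward(self, x):
        # GeGLU(x) = GELU(x @ W_gate) * (x @ W_up), then project back down.
        return self.down_proj(F.gelu(self.gate_proj(x)) * self.up_proj(x))
```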
No evaluation benchmarks are available for Gemma 1 2B.