Falcon3-3B: Specifications and GPU VRAM Requirements

Falcon3-3B

Open source

Open weights

Parameters

Context length

32,768 tokens

Modality

Text

Architecture

Dense

License

TII Falcon-LLM License 2.0

Release date

17 Dec 2024

Training data cutoff

Technical specifications

Attention structure

Grouped-Query Attention

Hidden dimension size

1536

Layers

22

Attention heads

12

Key-value heads

4

Activation function

SwiGLU

Normalization

RMS Normalization

Position embedding

RoPE

System requirements

VRAM requirements for different quantization methods and context sizes

Falcon3-3B

Falcon3-3B is part of the Falcon 3 family of open foundation models developed by the Technology Innovation Institute (TII). It is designed to balance performance and efficiency so that it can run on a range of computing infrastructure, including smaller devices. Its development targets science, mathematics, and code generation. The Falcon 3 series includes base models for general-purpose generative tasks and instruct models for conversational applications.

Falcon3-3B is a transformer-based, causal, decoder-only model with 22 decoder blocks. Its attention uses Grouped Query Attention (GQA) with 12 query heads and 4 key-value heads, and a wider head dimension of 256. Sharing each key-value head across several query heads reduces the size of the key-value cache, which makes inference more efficient. The model uses SwiGLU as its activation function, RMSNorm for normalization, and Rotary Position Embeddings (RoPE) with a high base value to handle extended context. It also uses Flash Attention 2 to reduce memory use and improve speed.
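The effect of this head configuration on inference memory can be estimated with a short calculation. The sketch below is not taken from TII's code. It assumes the common convention that the key-value cache stores one key vector and one value vector per layer, per key-value head, per token, at 16-bit precision. The function name `kv_cache_bytes` and the 2-byte default are illustrative assumptions.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, context_tokens,
                   bytes_per_value=2):
    """Approximate KV-cache size in bytes for a single sequence.

    Assumes one key and one value vector (hence the factor 2) per layer,
    per key-value head, per token. bytes_per_value=2 corresponds to
    16-bit storage, which is an assumption, not a documented setting.
    """
    per_token = 2 * num_layers * num_kv_heads * head_dim * bytes_per_value
    return per_token * context_tokens


# Figures quoted in the text above.
LAYERS, QUERY_HEADS, KV_HEADS, HEAD_DIM = 22, 12, 4, 256
CONTEXT = 32_768

# Grouped-query attention: only the 4 key-value heads are cached.
gqa = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, CONTEXT)
# Hypothetical comparison: one key-value head per query head.
mha = kv_cache_bytes(LAYERS, QUERY_HEADS, HEAD_DIM, CONTEXT)

print(f"GQA cache: {gqa / 2**30:.2f} GiB")
print(f"MHA cache: {mha / 2**30:.2f} GiB")
```

Under these assumptions the cache at 32,768 tokens is 2.75 GiB with 4 key-value heads, against 8.25 GiB for a hypothetical layout with 12 key-value heads. Real runtimes may differ because of cache quantization, paging, or batching.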

The instruct variant of Falcon3-3B supports a context length of up to 32,768 tokens; the base version supports 8,192 tokens. The model is built for reasoning, language understanding, instruction following, and mathematical problem-solving. It has been trained on four languages: English, French, Spanish, and Portuguese. Quantized versions are available, including int4, int8, and 1.58-bit BitNet, which reduce memory use and make the model better suited to resource-constrained environments.
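To give a rough sense of what these quantization levels mean for memory, the following sketch estimates weight storage as parameters times bits per weight. It is a back-of-the-envelope estimate, not a published figure: the parameter count of 3 billion is an assumption inferred from the model name, and the per-format bit widths ignore overhead such as quantization scales, embeddings kept at higher precision, and runtime buffers. The names `BITS_PER_WEIGHT`, `weight_bytes`, and `NOMINAL_PARAMS` are illustrative.

```python
# Nominal bits per weight for each format (assumption: no per-group
# scales, zero points, or mixed-precision layers are counted).
BITS_PER_WEIGHT = {
    "fp16": 16.0,
    "int8": 8.0,
    "int4": 4.0,
    "bitnet-1.58": 1.58,
}


def weight_bytes(num_params, bits_per_weight):
    """Approximate bytes needed to store the weights alone."""
    return num_params * bits_per_weight / 8


# Assumption: "3B" in the model name is taken as 3e9 parameters.
NOMINAL_PARAMS = 3_000_000_000

estimates = {
    name: weight_bytes(NOMINAL_PARAMS, bits)
    for name, bits in BITS_PER_WEIGHT.items()
}

for name, size in estimates.items():
    print(f"{name:>12}: {size / 2**30:.2f} GiB")
```

These estimates cover weights only. Activations, the key-value cache, and framework overhead add to the VRAM actually required.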

About Falcon 3

The TII Falcon 3 family consists of open-source, decoder-only language models ranging from 1B to 10B parameters, designed for efficiency. Key features include an extended 32K-token context window, Grouped-Query Attention (GQA), and specialized versions for scientific and code-oriented applications. Some variants use Mamba-based architectures.

Evaluation benchmarks

Rankings apply to local LLMs.

No evaluation benchmarks are available for Falcon3-3B.
