
Falcon3-3B

Parameters: 3B
Context Length: 32,768 tokens
Modality: Text
Architecture: Dense
License: TII Falcon-LLM License 2.0
Release Date: 17 Dec 2024
Knowledge Cutoff: -

Technical Specifications

Attention Structure: Grouped-Query Attention

Hidden Dimension Size: 3072
Number of Layers: 22
Attention Heads: 12 (query)
Key-Value Heads: 4

Activation Function: SwiGLU
Normalization: RMS Normalization
Position Embeddings: RoPE


Falcon3-3B

The Falcon3-3B model is part of the Falcon 3 family of open foundation models developed by the Technology Innovation Institute (TII). The model balances performance and efficiency, enabling deployment on a range of computing infrastructures, including smaller devices. It is developed to advance capabilities in science, mathematics, and code generation. The Falcon 3 series includes both base models for general-purpose generative tasks and instruct models for conversational applications, with an emphasis on making advanced AI systems broadly accessible.
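As an illustration of running the instruct variant, the sketch below uses Hugging Face transformers. The repository id tiiuae/Falcon3-3B-Instruct follows TII's published naming on the Hub, but verify availability and the license terms before use; this is a minimal example, not an official recipe.

```python
# Minimal sketch: run Falcon3-3B-Instruct with Hugging Face transformers.
# Assumes the tiiuae/Falcon3-3B-Instruct repository on the Hub.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/Falcon3-3B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision keeps the 3B model compact
    device_map="auto",
)

messages = [{"role": "user", "content": "Explain grouped-query attention briefly."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```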

Architecturally, Falcon3-3B employs a transformer-based, causal, decoder-only design with 22 decoder blocks. For attention, the model uses Grouped Query Attention (GQA) with 12 query heads and 4 key-value heads, along with a wider head dimension of 256; sharing each key-value head across several query heads shrinks the key-value cache and speeds up inference. The model uses SwiGLU as its activation function and RMSNorm for normalization, and applies Rotary Position Embeddings (RoPE) with a high base value to handle extended context. It also leverages Flash Attention 2 for optimized memory use and speed.
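To make the GQA head layout concrete, here is a minimal sketch in plain PyTorch (illustrative only, not TII's implementation; RoPE, masking, and the linear projections are omitted). It shows how 4 key-value heads serve 12 query heads by repetition, so the cache only ever stores 4 heads:

```python
# Grouped-query attention with Falcon3-3B's head layout:
# 12 query heads, 4 KV heads, head dimension 256.
import torch
import torch.nn.functional as F

n_q_heads, n_kv_heads, head_dim = 12, 4, 256
batch, seq = 1, 8

q = torch.randn(batch, n_q_heads, seq, head_dim)
k = torch.randn(batch, n_kv_heads, seq, head_dim)
v = torch.randn(batch, n_kv_heads, seq, head_dim)

# Each KV head is shared by n_q_heads // n_kv_heads = 3 query heads:
# expand K and V from 4 heads to 12 by repeating along the head axis.
repeat = n_q_heads // n_kv_heads
k = k.repeat_interleave(repeat, dim=1)
v = v.repeat_interleave(repeat, dim=1)

scores = q @ k.transpose(-2, -1) / head_dim**0.5
out = F.softmax(scores, dim=-1) @ v
print(out.shape)  # torch.Size([1, 12, 8, 256])
```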

The Falcon3-3B model, particularly its instruct variant, supports a context length of up to 32,768 tokens, while the base version supports 8,192 tokens. It is engineered to perform on tasks such as reasoning, language understanding, instruction following, and mathematical problem-solving. The model has been trained to support four languages: English, French, Spanish, and Portuguese. Its design considerations include the availability of quantized versions, such as int4, int8, and 1.58-bit BitNet, which further enhance its efficiency and suitability for resource-constrained environments.
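As a sketch of one quantized path, the snippet below loads the model in 4-bit through transformers' bitsandbytes integration. This is a common route to int4-style inference, not necessarily the format of TII's prequantized releases, which may differ:

```python
# Sketch: 4-bit loading via transformers' bitsandbytes integration.
# Assumes the tiiuae/Falcon3-3B-Instruct repository; TII's own
# prequantized variants may use a different packaging.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,  # store 4-bit, compute in bf16
)
model = AutoModelForCausalLM.from_pretrained(
    "tiiuae/Falcon3-3B-Instruct",
    quantization_config=quant_config,
    device_map="auto",
)
print(f"{model.get_memory_footprint() / 1024**3:.2f} GiB")  # weight footprint
```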

About Falcon 3

The TII Falcon 3 model family comprises open-source, decoder-only language models (1B-10B parameters) designed for efficiency. Key innovations include an extended 32K token context window, Grouped-Query Attention (GQA), and specialized versions for scientific and code-oriented applications. Some variants integrate Mamba-based architectures.



Evaluation Benchmarks

Rankings apply to local LLMs. No evaluation benchmarks are available for Falcon3-3B.

Rank: -
Coding Rank: -

GPU Requirements

VRAM requirements depend on the quantization method applied to the model weights and on the context size (from 1K up to the model's 32K-token maximum).
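For a rough sense of how those two factors combine, the back-of-the-envelope sketch below estimates weight memory plus KV-cache memory. The architecture constants come from the description above; the ~3.23B parameter count is an assumption, and real usage adds framework overhead, activations, and fragmentation, so treat the outputs as lower bounds:

```python
# Back-of-the-envelope VRAM estimate for Falcon3-3B.
N_PARAMS = 3.23e9   # approximate parameter count (assumption)
N_LAYERS = 22       # decoder blocks
N_KV_HEADS = 4      # key-value heads (GQA)
HEAD_DIM = 256      # per-head dimension

def weight_memory_gib(bits_per_param: float) -> float:
    """Memory for model weights at a given quantization width."""
    return N_PARAMS * bits_per_param / 8 / 1024**3

def kv_cache_gib(context_tokens: int, bytes_per_value: int = 2) -> float:
    """KV cache: 2 (K and V) x layers x kv_heads x head_dim x tokens."""
    return (2 * N_LAYERS * N_KV_HEADS * HEAD_DIM
            * context_tokens * bytes_per_value) / 1024**3

if __name__ == "__main__":
    for bits, label in [(16, "fp16"), (8, "int8"), (4, "int4")]:
        for ctx in (1024, 16_384, 32_768):
            total = weight_memory_gib(bits) + kv_cache_gib(ctx)
            print(f"{label:>5} weights, {ctx:>6}-token context: ~{total:.1f} GiB")
```

At fp16 the weights alone come to roughly 6 GiB, while a full 32K-token fp16 KV cache adds about 2.75 GiB more; int4 weights bring the total under 5 GiB, which is why quantized variants suit smaller GPUs.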
