
Falcon3-1B

Parameters: 1B

Context Length: 8,192 tokens

Modality: Text

Architecture: Dense

License: TII Falcon-LLM License 2.0

Release Date: 17 Dec 2024

Knowledge Cutoff: -

Technical Specifications

Attention Structure: Grouped-Query Attention

Hidden Dimension Size: 2048

Number of Layers: 18

Attention Heads: 8

Key-Value Heads: 4

Activation Function: SwiGLU

Normalization: RMS Normalization

Positional Embeddings: RoPE


Falcon3-1B

Falcon3-1B is a member of the Falcon 3 family of decoder-only large language models developed by the Technology Innovation Institute (TII). The family emphasizes stronger capabilities in scientific, mathematical, and coding domains while keeping training efficient. The 1B variant is engineered to run well on lightweight hardware, including laptops, broadening access to advanced AI capabilities, and it supports multilingual use in English, French, Spanish, and Portuguese.
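A minimal usage sketch with Hugging Face transformers is shown below; it assumes the instruct variant is published on the Hub as tiiuae/Falcon3-1B-Instruct and that a recent transformers release is installed.

```python
# Minimal sketch: loading Falcon3-1B with Hugging Face transformers.
# Assumes the instruct variant is published as "tiiuae/Falcon3-1B-Instruct".
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/Falcon3-1B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # bf16 keeps the 1B weights around 2-3 GB
    device_map="auto",           # falls back to CPU when no GPU is present
)

# Chat-style prompt via the tokenizer's built-in chat template.
messages = [{"role": "user", "content": "Summarize grouped-query attention in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```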

Architecturally, Falcon3-1B is a Transformer-based causal decoder-only model with 18 decoder blocks. It uses Grouped Query Attention (GQA) with 8 query heads and 4 key-value heads, which reduces the memory consumed by the Key-Value (KV) cache during inference. Activations use SwiGLU, normalization uses RMSNorm, and positions are encoded with Rotary Position Embeddings (RoPE) for effective long-context understanding. The tokenizer has a large vocabulary of roughly 131K tokens, which improves compression and downstream performance, and the implementation supports Flash Attention 2 for higher computational throughput.
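As a rough illustration of why GQA reduces inference memory, the sketch below compares the KV-cache footprint of the configuration described above against a full multi-head baseline. The head dimension of 256 (hidden size 2048 / 8 heads) is inferred from the listed specs, not an official figure.

```python
# Back-of-envelope comparison of KV-cache size under GQA vs. full multi-head
# attention, using the configuration above (18 layers, 8 query heads, 4 KV
# heads). head_dim = 256 is an assumption (hidden size 2048 / 8 heads).

def kv_cache_bytes(num_layers, num_kv_heads, head_dim, context_len, bytes_per_elem=2):
    # Factor of 2 covers the separate key and value tensors; fp16/bf16 = 2 bytes.
    return 2 * num_layers * num_kv_heads * head_dim * context_len * bytes_per_elem

ctx = 8192
gqa = kv_cache_bytes(num_layers=18, num_kv_heads=4, head_dim=256, context_len=ctx)
mha = kv_cache_bytes(num_layers=18, num_kv_heads=8, head_dim=256, context_len=ctx)

print(f"GQA (4 KV heads): {gqa / 2**20:.0f} MiB at {ctx} tokens")  # ~576 MiB
print(f"MHA (8 KV heads): {mha / 2**20:.0f} MiB at {ctx} tokens")  # ~1152 MiB
```

Halving the number of KV heads halves the cache, which is what makes long contexts viable on small devices.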

Falcon3-1B targets a broad range of natural language processing tasks, including reasoning, language comprehension, instruction following, code generation, and mathematical problem solving. It suits generative and conversational AI applications, and its optimized variants, such as quantized versions, make it practical to deploy in resource-constrained environments.
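As a sketch of such a resource-constrained deployment, the snippet below loads the model with 4-bit NF4 quantization via bitsandbytes; the Hub id is the same assumption as above, and a CUDA-capable GPU is required.

```python
# Sketch: loading a 4-bit quantized variant for constrained hardware, using
# transformers + bitsandbytes (CUDA GPU required; assumed Hub id as above).
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",             # normal-float 4-bit quantization
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "tiiuae/Falcon3-1B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)
# At 4 bits the weights occupy well under 1 GB, leaving headroom for the
# KV cache even on small consumer GPUs.
```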

About Falcon 3

The TII Falcon 3 model family comprises open-source, decoder-only language models (1B-10B parameters) designed for efficiency. Key innovations include an extended 32K token context window, Grouped-Query Attention (GQA), and specialized versions for scientific and code-oriented applications. Some variants integrate Mamba-based architectures.



Evaluation Benchmarks

Rankings apply to local LLMs.

No evaluation benchmarks are available for Falcon3-1B.

Ranking: -

Coding Ranking: -

GPU Requirements

VRAM requirements depend on the weight quantization method and the context size (1K, 4K, or 8K tokens); see the estimator sketch below.
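In place of the page's interactive calculator, here is a rough estimator: weight memory is parameter count times bytes per weight, plus the GQA KV cache from the sketch above and a flat 20% activation/overhead margin. The 1.67B parameter count, head dimension, and overhead factor are illustrative assumptions, not measured values.

```python
# Rough VRAM estimator standing in for the interactive calculator. The
# parameter count (1.67e9), head_dim, and 20% overhead are assumptions.

PARAMS = 1.67e9
BYTES_PER_WEIGHT = {"fp16": 2.0, "int8": 1.0, "q4": 0.5}

def kv_cache_gib(context_len, num_layers=18, num_kv_heads=4, head_dim=256):
    # Key + value tensors at 2 bytes per element (fp16).
    return 2 * num_layers * num_kv_heads * head_dim * context_len * 2 / 2**30

def vram_gib(quant, context_len, overhead=1.2):
    weights = PARAMS * BYTES_PER_WEIGHT[quant] / 2**30
    return (weights + kv_cache_gib(context_len)) * overhead

for quant in BYTES_PER_WEIGHT:
    for ctx in (1024, 4096, 8192):
        print(f"{quant:>4} @ {ctx:>4} tokens: ~{vram_gib(quant, ctx):.1f} GiB")
```

Under these assumptions, the fp16 weights alone need about 3.1 GiB, while a 4-bit quantization brings the total to roughly 1.5 GiB even at the full 8K context, which is consistent with the model's laptop-class positioning.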