Falcon3-1B: Specifications and GPU VRAM Requirements

Falcon3-1B

Open source

Open weights

Parameters

Context length

8,192 tokens

Modality

Text

Architecture

Dense

License

TII Falcon-LLM License 2.0

Release date

17 Dec 2024

Training data cutoff

Technical specifications

Attention structure

Grouped-Query Attention

Hidden dimension size

768

Number of layers

18

Attention heads

8

Key-value heads

4

Activation function

SwiGLU

Normalization

RMS Normalization

Position embedding

RoPE

System requirements

VRAM requirements for different quantization methods and context sizes

Falcon3-1B

Falcon3-1B is a member of the Falcon 3 family of decoder-only large language models developed by the Technology Innovation Institute (TII). The family targets stronger scientific, mathematical, and coding capabilities while keeping training efficient. The 1B variant is designed to run on lightweight hardware, including laptops, which broadens access to the model. It supports multiple languages, including English, French, Spanish, and Portuguese.

Falcon3-1B is a Transformer-based, causal decoder-only model with 18 decoder blocks. It uses Grouped-Query Attention (GQA) with 8 query heads and 4 key-value heads, so each key-value head is shared by two query heads. This shrinks the Key-Value (KV) cache and reduces memory use during inference. The activation function is SwiGLU and normalization is RMSNorm. Positions are encoded with Rotary Position Embeddings (RoPE), which support long-context understanding. The tokenizer has a vocabulary of 131,000 tokens, which aids data compression and downstream performance. The architecture also uses Flash Attention 2 for higher computational throughput.
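The effect of GQA on the KV cache can be sketched with a small calculation. The layer count, head counts, and 8,192-token context come from this page; `HEAD_DIM = 256` and the 2-byte (16-bit) cache precision are assumptions made only for illustration, so substitute the values from the model's configuration before relying on the absolute numbers. The ratio between the two results does not depend on those assumptions.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, context_tokens, bytes_per_value=2):
    """Approximate KV-cache size for one sequence.

    Each layer stores one key vector and one value vector (hence the factor 2)
    per KV head and per token.
    """
    return 2 * num_layers * num_kv_heads * head_dim * context_tokens * bytes_per_value


NUM_LAYERS = 18    # decoder blocks, from the text
QUERY_HEADS = 8    # from the text
KV_HEADS = 4       # from the text
CONTEXT = 8192     # context length listed on this page
HEAD_DIM = 256     # ASSUMPTION: not stated on this page

# Cache size with GQA (4 KV heads) versus a hypothetical multi-head layout
# in which every query head keeps its own key/value pair (8 KV heads).
gqa_bytes = kv_cache_bytes(NUM_LAYERS, KV_HEADS, HEAD_DIM, CONTEXT)
mha_bytes = kv_cache_bytes(NUM_LAYERS, QUERY_HEADS, HEAD_DIM, CONTEXT)

if __name__ == "__main__":
    print(f"GQA cache: {gqa_bytes / 2**20:.0f} MiB")
    print(f"MHA cache: {mha_bytes / 2**20:.0f} MiB")
    print(f"GQA / MHA: {gqa_bytes / mha_bytes:.2f}")
```

With 4 key-value heads instead of 8, the cache is half the size of the equivalent multi-head layout, whatever head dimension is used.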

Falcon3-1B is designed for natural language processing tasks such as reasoning, language comprehension, instruction following, code generation, and mathematical problem-solving. It can be deployed in generative AI applications and conversational AI systems. Its small size, together with quantized variants, makes it practical in resource-constrained environments.

About Falcon 3

The TII Falcon 3 model family comprises open-source, decoder-only language models with 1B to 10B parameters, designed for efficiency. Key features include an extended 32K-token context window, Grouped-Query Attention (GQA), and specialized versions for scientific and code-oriented applications. Some variants integrate Mamba-based architectures.

Other Falcon 3 models

Evaluation benchmarks

Rankings apply to local LLMs.

No evaluation benchmarks are available for Falcon3-1B.

Rankings

Coding ranking

GPU requirements


Quantization


Context size: 1,024 tokens

VRAM required:
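As a rough illustration of how quantization and context size feed into a VRAM figure, the sketch below adds the memory for the weights to an optional KV-cache term and a flat overhead margin. It is not the calculator used by this page. The nominal parameter count of 1,000,000,000 (taken from the "1B" in the model name), the bit widths in `BITS_PER_WEIGHT`, and the 10% overhead are all assumptions; real quantized formats also store scales and other metadata that this sketch ignores.

```python
# ASSUMPTION: idealized bit widths; real formats add per-block scale metadata.
BITS_PER_WEIGHT = {"fp16": 16, "int8": 8, "int4": 4}


def weight_bytes(num_params, quant):
    """Memory needed to hold the weights alone at the given precision."""
    return num_params * BITS_PER_WEIGHT[quant] // 8


def estimate_vram_gib(num_params, quant, kv_bytes=0, overhead_fraction=0.10):
    """Weights plus KV cache, padded by a flat overhead margin (an assumption)."""
    total = (weight_bytes(num_params, quant) + kv_bytes) * (1 + overhead_fraction)
    return total / 2**30


NOMINAL_PARAMS = 1_000_000_000  # ASSUMPTION: nominal "1B", not an exact count

if __name__ == "__main__":
    for quant in BITS_PER_WEIGHT:
        gib = estimate_vram_gib(NOMINAL_PARAMS, quant)
        print(f"{quant}: about {gib:.2f} GiB before the KV cache")
```

Passing a non-zero `kv_bytes` shows how the estimate grows with context size, since the KV cache scales linearly with the number of tokens held in context.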

Resources

Official documentation

Release notes

Download weights
