Parameters: 11B
Context Length: 8,192 tokens
Modality: Text
Architecture: Dense
License: TII Falcon License 2.0
Release Date: 20 Jul 2024
Knowledge Cutoff: -
Attention Structure: Grouped Query Attention (GQA)
Hidden Dimension Size: 5632
Number of Layers: 40
Attention Heads: 44
Key-Value Heads: 8
Activation Function: -
Normalization: -
Position Embedding: RoPE
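As a sanity check on the figures above, the configuration shipped with the checkpoint can be inspected directly; where this page and the checkpoint disagree, the config is authoritative. A minimal sketch, assuming the Hugging Face model id tiiuae/falcon-11B and the field names used by the transformers Falcon configuration:

```python
# Sketch: inspect the published configuration to confirm the table above.
# Assumes the checkpoint id "tiiuae/falcon-11B"; field names follow the
# transformers Falcon config and may differ for other checkpoints.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("tiiuae/falcon-11B")

print("hidden size:      ", config.hidden_size)
print("number of layers: ", config.num_hidden_layers)
print("attention heads:  ", config.num_attention_heads)
# The key/value head count determines whether attention is multi-query (1)
# or grouped-query (>1).
print("key-value heads:  ", getattr(config, "num_kv_heads", None))
print("max positions:    ", getattr(config, "max_position_embeddings", None))
```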
VRAM requirements for different quantization methods and context sizes
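Exact figures depend on the runtime, the quantization format, and per-format overhead, but a rough estimate follows from the weight size plus the KV cache. A back-of-envelope sketch, assuming the architecture values listed in the table above (head dimension taken as 5632 / 44 = 128) and ignoring activations and framework overhead:

```python
# Rough VRAM estimate: model weights plus KV cache. Illustrative only;
# real usage adds activations, framework overhead, and quantization metadata.

def estimate_vram_gib(
    n_params: float = 11e9,         # parameter count (11B)
    bytes_per_weight: float = 2.0,  # 2.0 = fp16/bf16, ~0.55 = typical 4-bit
    n_layers: int = 40,             # from the table above
    n_kv_heads: int = 8,            # grouped-query attention
    head_dim: int = 128,            # assumed: hidden size / attention heads
    context_len: int = 8192,        # tokens held in the KV cache
    kv_bytes: float = 2.0,          # fp16 KV cache
) -> float:
    weights = n_params * bytes_per_weight
    # KV cache: keys + values, per layer, per KV head, per cached token.
    kv_cache = 2 * n_layers * n_kv_heads * head_dim * context_len * kv_bytes
    return (weights + kv_cache) / 1024**3

print(f"fp16 weights, 8K context : ~{estimate_vram_gib():.1f} GiB")
print(f"4-bit weights, 8K context: ~{estimate_vram_gib(bytes_per_weight=0.55):.1f} GiB")
```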
Falcon 2 11B is an 11 billion parameter large language model developed by the Technology Innovation Institute (TII). This causal decoder-only model is designed to serve as a foundational component for various natural language processing applications. Its development focuses on enhancing accessibility and inference efficiency, thereby encouraging broader adoption and the creation of specialized downstream applications. The model supports multilingual understanding and generation, making it suitable for diverse linguistic contexts.
Architecturally, Falcon 2 11B is built on the transformer framework as a causal decoder-only model trained with a next-token prediction objective. Its design is broadly adapted from the GPT-3 architecture, with key modifications that include rotary positional embeddings for improved sequence-length handling and FlashAttention-2 for optimized attention computation. A notable feature is the use of Grouped Query Attention (GQA) with 8 key-value heads, which balances efficiency and performance in the attention computation, and the decoder blocks use a parallel attention/MLP structure. Training followed a four-stage process that progressively extended the effective context window to 8,192 tokens, over a dataset exceeding 5 trillion tokens derived primarily from RefinedWeb, a high-quality filtered and deduplicated web corpus, augmented with curated data including code and conversational content.
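To illustrate the grouped-query idea, the sketch below shares a small set of key/value heads across groups of query heads. The dimensions are illustrative (chosen to divide evenly) rather than Falcon 2 11B's actual values, and this is not the model's own attention implementation:

```python
# Minimal grouped-query attention sketch (illustrative dimensions, not the
# exact Falcon 2 kernel). Query heads are split into groups that each share
# one key/value head, shrinking the KV cache by n_heads / n_kv_heads.
import torch
import torch.nn.functional as F

batch, seq_len = 2, 16
n_heads, n_kv_heads, head_dim = 32, 8, 64   # illustrative values only
group = n_heads // n_kv_heads               # query heads per KV head

q = torch.randn(batch, n_heads, seq_len, head_dim)
k = torch.randn(batch, n_kv_heads, seq_len, head_dim)
v = torch.randn(batch, n_kv_heads, seq_len, head_dim)

# Broadcast each KV head to the query heads in its group.
k = k.repeat_interleave(group, dim=1)        # (batch, n_heads, seq, head_dim)
v = v.repeat_interleave(group, dim=1)

scores = q @ k.transpose(-2, -1) / head_dim ** 0.5
causal = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool), diagonal=1)
scores = scores.masked_fill(causal, float("-inf"))
out = F.softmax(scores, dim=-1) @ v          # (batch, n_heads, seq, head_dim)
print(out.shape)
```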
Falcon 2 11B is multilingual, trained on data spanning English, German, Spanish, French, Italian, Dutch, Polish, Portuguese, Czech, Romanian, and Swedish, which makes it effective across diverse linguistic contexts. It serves as a base for tasks such as text generation, language translation, and summarization, and is intended as a versatile foundation model to be fine-tuned for specific domains and applications. Its optimized design supports faster inference, contributing to more efficient deployment across use cases.
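Although the model is intended primarily for fine-tuning, it can also be prompted directly in completion style. A minimal usage sketch, assuming the Hugging Face checkpoint id tiiuae/falcon-11B and hardware with enough memory for bf16 weights:

```python
# Minimal generation sketch with the transformers pipeline. Assumes the
# checkpoint id "tiiuae/falcon-11B"; adjust dtype/device for your hardware.
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="tiiuae/falcon-11B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# The base model is a completion model, not an instruction-tuned chat model,
# so prompts work best as text to be continued (here, in French).
prompt = "Falcon 2 11B est un modèle de langage qui"
output = generator(prompt, max_new_tokens=60, do_sample=True, temperature=0.7)
print(output[0]["generated_text"])
```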
The Falcon 2 model family by TII comprises the 11B language model and its Vision Language Model (VLM) counterpart. Both are open-source 11-billion-parameter models trained on over five trillion tokens with multilingual support; the VLM variant adds vision-to-language capabilities, enabling it to process visual inputs and produce textual outputs.