OLMo 3 7B Base

Parameters

7B

Context Length

65,536 tokens

Modality

Text

Architecture

Dense

License

Apache 2.0

Release Date

25 Oct 2025

Training Data Cutoff

Dec 2024

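Technical Specifications

Attention Structure

Multi-Head Attention

Hidden Dimension Size

4096

Number of Layers

32

Attention Heads

32

Key-Value Heads

32

Activation Function

SwiGLU

Normalization

-

Position Embedding

Rotary Position Embedding (RoPE)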

OLMo 3 7B Base is the 7-billion-parameter base model in the Allen Institute for AI's (Ai2) OLMo 3 family, which is built to support the scientific study of large language models. It is trained on 5.93 trillion tokens drawn from the Dolma 3 dataset. The OLMo 3 project is fully open: alongside the model weights, it publishes the training data, code, intermediate checkpoints, logs, and evaluation methodology, so that results can be reproduced and model behavior can be studied at each stage of development.

OLMo 3 7B Base is a dense, decoder-only transformer. Training is staged into distinct pretraining, mid-training, and long-context phases, covering both general language capability and long-input handling. The model has 32 layers and a hidden dimension of 4096, and it uses multi-head attention with 32 query heads and 32 key-value heads. Positions are encoded with Rotary Positional Embeddings (RoPE), with scaling applied to support a context length of 65,536 tokens.
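The figures above are enough for a rough memory estimate. The sketch below is a back-of-the-envelope calculation, not a measurement: it assumes a nominal 7e9 parameters, a head dimension equal to hidden size divided by head count (4096 / 32 = 128), and a key-value cache that stores the full context at every layer. The names `ModelShape`, `weight_bytes`, and `kv_cache_bytes` are hypothetical helpers introduced for illustration, and runtime overhead such as activations and framework buffers is ignored.

```python
from dataclasses import dataclass

GIB = 1024 ** 3


@dataclass(frozen=True)
class ModelShape:
    """Shape values taken from the spec card; n_params is the nominal '7B' (assumption)."""
    n_params: float = 7e9
    n_layers: int = 32
    hidden_size: int = 4096
    n_heads: int = 32
    n_kv_heads: int = 32

    @property
    def head_dim(self) -> int:
        # Assumption: per-head dimension is hidden_size / n_heads.
        return self.hidden_size // self.n_heads


def weight_bytes(shape: ModelShape, bits_per_weight: int) -> float:
    """Memory for the weights alone at a given quantization width."""
    return shape.n_params * bits_per_weight / 8


def kv_cache_bytes(shape: ModelShape, context_len: int,
                   bytes_per_elem: int = 2, batch: int = 1) -> int:
    """KV cache size, assuming every layer caches keys and values for every token."""
    per_token = 2 * shape.n_layers * shape.n_kv_heads * shape.head_dim * bytes_per_elem
    return per_token * context_len * batch


if __name__ == "__main__":
    shape = ModelShape()
    for ctx in (1024, 32768, 65536):
        kv_gib = kv_cache_bytes(shape, ctx) / GIB
        for bits in (16, 8, 4):
            w_gib = weight_bytes(shape, bits) / GIB
            print(f"ctx={ctx:>6}  weights@{bits:>2}-bit={w_gib:6.2f} GiB  "
                  f"kv@16-bit={kv_gib:6.2f} GiB  total={w_gib + kv_gib:6.2f} GiB")
```

Under these assumptions a 16-bit cache costs 0.5 MiB per token, so a full 65,536-token context needs 32 GiB for the cache alone, on top of the weights. Because the model has as many key-value heads as query heads, the cache is not reduced by head sharing; any windowed-attention or cache-quantization scheme in the serving stack would lower the figure.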

As a base model, OLMo 3 7B is intended primarily for pretraining research and as a starting point for fine-tuning on downstream tasks. It targets general capabilities; specialized behavior such as reasoning, tool use, and instruction following is added through further post-training. The Apache 2.0 license permits broad use, including commercial applications.

About OLMo 3

OLMo (Open Language Model) is a series of fully open language models designed to enable the science of language models. Released by the Allen Institute for AI (Ai2), OLMo 3 provides complete access to its training data (Dolma 3), code, checkpoints, logs, and evaluation methodology. The family includes Base models for pretraining research, Instruct variants for chat and tool use, and Think variants with chain-of-thought reasoning. All models are trained with a staged approach that includes pretraining, mid-training, and long-context phases.


Evaluation Benchmarks

Rankings apply to local LLMs.

No evaluation benchmarks are available for OLMo 3 7B Base.

Rank

-

Coding Rank

-
