OLMo 3 7B Instruct：规格和 GPU 显存要求

OLMo 3 7B Instruct

开源

开放权重

参数

上下文长度

65.536K

模态

Text

架构

Dense

许可证

Apache 2.0

发布日期

25 Oct 2025

训练数据截止日期

Dec 2024

技术规格

注意力结构

Multi-Head Attention

隐藏维度大小

4096

层数

注意力头

键值头

激活函数

SwigLU

归一化

RMS Normalization

位置嵌入

Absolute Position Embedding

OLMo 3 7B Instruct

OLMo 3 7B Instruct is a specialized large language model developed by the Allen Institute for AI (AI2), designed to advance the scientific study of language modeling through complete transparency. As a core component of the OLMo 3 family, this instruction-tuned variant is optimized for low-latency, multi-turn dialogue, complex instruction following, and function-calling capabilities. It serves as a highly accessible and efficient workhorse for both research and production environments, bridging the gap between open-weights and fully open-source initiatives.

Technically, the model utilizes a standard decoder-only Transformer architecture with 7 billion parameters. The training pipeline is notably rigorous, involving a staged progression that begins with pre-training on the Dolma 3 dataset, followed by mid-training on targeted data mixes and context extension to support a 65,536-token window. The post-training methodology for the Instruct variant integrates Supervised Fine-Tuning (SFT), Direct Preference Optimization (DPO), and Reinforcement Learning from Verifiable Rewards (RLVR) on the Dolci-Instruct datasets, focusing on accuracy and adherence to user intent.

Innovation in the OLMo 3 series lies not in exotic architecture but in its exhaustive transparency. AI2 provides unrestricted access to the training code, pre-training data recipes, intermediate checkpoints, and detailed training logs. This enables practitioners to audit the model's lineage, reproduce results, or continue pre-training from specific historical states. The 7B Instruct model is particularly well-suited for applications requiring a balance of reasoning capability and computational efficiency, such as conversational agents, local coding assistants, and educational tools.

关于 OLMo 3

OLMo (Open Language Model) is a series of fully open language models designed to enable the science of language models. Released by the Allen Institute for AI (Ai2), OLMo 3 provides complete access to training data (Dolma 3), code, checkpoints, logs, and evaluation methodologies. The family includes Base models for pretraining research, Instruct variants for chat and tool use, and Think variants with chain-of-thought reasoning capabilities. All models are trained with staged approach including pretraining, mid-training, and long-context phases.

其他 OLMo 3 模型

评估基准

没有可用的 OLMo 3 7B Instruct 评估基准。

排名

编程排名

模型透明度

总分

B+

86 / 100

上游

27.0 / 30

模型

34.5 / 40

下游

24.5 / 30

GPU 要求

完整计算器

量化

选择模型权重的量化方法

上下文大小：1024 个令牌

32k

64k

所需显存:

资源

官方文档发布说明阅读论文下载权重源代码

OLMo 3 7B Instruct

技术规格

OLMo 3 7B Instruct

关于 OLMo 3

其他 OLMo 3 模型

评估基准

排名

模型透明度

GPU 要求

所需显存:

推荐 GPU

资源