OLMo 3.1 32B Think: Specifications and GPU VRAM Requirements

OLMo 3.1 32B Think

Open source

Open weights

Parameters

32B

Context length

65,536 tokens

Modality

Text

Architecture

Dense

License

Apache 2.0

Release date

12 Dec 2025

Training data cutoff

Dec 2024

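Technical specifications

Attention structure

Grouped-Query Attention (GQA)

Hidden dimension size

5120

Layers

64

Attention heads

40

Key-value heads

8

Activation function

SwiGLU

Normalization

RMS Normalization

Position embedding

Rotary Position Embeddings (RoPE) with YaRN-style scaling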

OLMo 3.1 32B Think

OLMo 3.1 32B Think is an autoregressive language model developed by the Allen Institute for AI (Ai2) for complex, multi-step reasoning. As part of the OLMo 3.1 series, it continues the project's commitment to open science: the model weights, training code, and training data are all released. The model targets tasks that need extended chains of thought, particularly mathematics and programming, where specialized post-training leads it to write out detailed, checkable reasoning steps before giving a final answer.

Built on a decoder-only Transformer architecture, OLMo 3.1 32B Think has 64 layers with a hidden dimension of 5120. It uses Grouped-Query Attention (GQA) with 40 query heads and 8 key-value heads, which shrinks the key-value cache to one fifth of what full multi-head attention with 40 key-value heads would need and so lowers inference memory. The model uses SwiGLU activation functions and RMSNorm for stable training. For positional encoding, it uses Rotary Position Embeddings (RoPE) with YaRN-style scaling, supporting a context window of 65,536 tokens.
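To make the key-value cache saving concrete, here is a minimal sketch that compares the cache size at the full 65,536-token context for the 8 key-value heads described above against a hypothetical multi-head layout with 40 key-value heads. It assumes a head dimension of 5120 / 40 = 128 and 16-bit cache values; neither figure is stated on this page, and the function name is illustrative.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Bytes needed to cache keys and values for one sequence."""
    # The factor 2 covers the separate key and value tensors in every layer.
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_value


NUM_LAYERS = 64
HIDDEN_SIZE = 5120
NUM_QUERY_HEADS = 40
NUM_KV_HEADS = 8
# Assumption: each head has hidden_size / query_heads dimensions.
HEAD_DIM = HIDDEN_SIZE // NUM_QUERY_HEADS
CONTEXT = 65_536
GIB = 1024 ** 3

gqa = kv_cache_bytes(NUM_LAYERS, NUM_KV_HEADS, HEAD_DIM, CONTEXT)
# Hypothetical baseline: one key-value head per query head (plain MHA).
mha = kv_cache_bytes(NUM_LAYERS, NUM_QUERY_HEADS, HEAD_DIM, CONTEXT)

print(f"GQA (8 KV heads):  {gqa / GIB:.1f} GiB")
print(f"MHA (40 KV heads): {mha / GIB:.1f} GiB")
print(f"Reduction factor:  {mha / gqa:.0f}x")
```

Under these assumptions the cache for a single full-length sequence is 16 GiB with GQA versus 80 GiB with the multi-head baseline, a 5x reduction.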

Training proceeds in stages: pretraining on the 9.3-trillion-token Dolma 3 dataset, then mid-training on higher-quality reasoning data. The Think variant is further refined through supervised fine-tuning and Reinforcement Learning with Verifiable Rewards (RLVR) using the Dolci-Think-RL dataset. This reinforcement learning stage is intended to encourage sustained step-by-step reasoning, so that the model works through hard problems by exploring multiple lines of reasoning. The model is released under the Apache 2.0 license with full access to the training recipes and data provenance tools, which makes it a transparent foundation for researchers and developers building auditable AI systems.
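The idea behind a verifiable reward can be illustrated with a small sketch: the reward is computed by a deterministic check against a known answer rather than by a learned reward model. This is not Ai2's implementation. The `\boxed{...}` answer convention, the exact-match comparison, and both function names are assumptions made for illustration only.

```python
import re
from typing import Optional


def extract_final_answer(completion: str) -> Optional[str]:
    """Return the content of the last boxed answer in the completion, if any."""
    # Assumption: the model marks its final answer as \boxed{...}.
    matches = re.findall(r"\\boxed\{([^{}]*)\}", completion)
    return matches[-1].strip() if matches else None


def verifiable_reward(completion: str, reference: str) -> float:
    """Binary reward: 1.0 if the extracted answer matches the reference exactly."""
    answer = extract_final_answer(completion)
    if answer is None:
        return 0.0  # no parseable final answer, so nothing to verify
    return 1.0 if answer == reference.strip() else 0.0


good = "First, 6 * 7 = 42. So the answer is \\boxed{42}."
bad = "I think the answer is \\boxed{41}."
print(verifiable_reward(good, "42"))
print(verifiable_reward(bad, "42"))
```

For programming tasks the same pattern would typically replace the string comparison with running unit tests against the generated code.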

About OLMo 3

OLMo (Open Language Model) is a series of fully open language models designed to enable the science of language models. Released by the Allen Institute for AI (Ai2), OLMo 3 provides complete access to training data (Dolma 3), code, checkpoints, logs, and evaluation methodologies. The family includes Base models for pretraining research, Instruct variants for chat and tool use, and Think variants with chain-of-thought reasoning capabilities. All models are trained with a staged approach that includes pretraining, mid-training, and long-context phases.

Evaluation benchmarks

Rank

#60

Benchmark	Score	Rank
WebDev Arena (Web Development)	1285	#45

Coding rank

#61

Model transparency

Total score

B+

86 / 100

Upstream

27.0 / 30

Model

33.5 / 40

Downstream

25.0 / 30

GPU requirements

Required VRAM depends on the quantization method chosen for the model weights and on the context size, up to the 64k-token maximum.
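As a rough guide, VRAM can be estimated as weight memory plus key-value cache memory. The sketch below is a back-of-the-envelope calculation, not the page's calculator: it assumes a nominal 32 billion parameters, a 16-bit key-value cache, a head dimension of 128 (5120 / 40), and it ignores activations, framework overhead, and the per-block metadata of real quantization formats. The quantization labels and function names are illustrative.

```python
GIB = 1024 ** 3
PARAMS = 32e9  # nominal parameter count from the spec table

# Assumption: idealized bits per weight for each quantization level.
BITS_PER_WEIGHT = {"fp16": 16, "int8": 8, "int4": 4}


def weight_gib(bits_per_weight):
    """Memory for the model weights alone, in GiB."""
    return PARAMS * bits_per_weight / 8 / GIB


def kv_cache_gib(seq_len, layers=64, kv_heads=8, head_dim=128, bytes_per_value=2):
    """Memory for cached keys and values of one sequence, in GiB."""
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_value / GIB


def estimate_vram_gib(quant, seq_len):
    """Lower-bound estimate: weights plus KV cache, no runtime overhead."""
    return weight_gib(BITS_PER_WEIGHT[quant]) + kv_cache_gib(seq_len)


for quant in BITS_PER_WEIGHT:
    for ctx in (1024, 32_768, 65_536):
        print(f"{quant:>4} @ {ctx:>6} tokens: {estimate_vram_gib(quant, ctx):6.1f} GiB")
```

Real deployments need headroom above these figures, so treat the result as a floor when choosing a GPU.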

Resources

Official documentation, release notes, paper, model weights, source code
