
OLMo 3.1 32B Think

Parameters: 32B
Context Length: 65,536 tokens
Modality: Text
Architecture: Dense
License: Apache 2.0
Release Date: 12 Dec 2025
Training Data Cutoff: Dec 2024

Technical Specifications

Attention Structure: Grouped-Query Attention
Hidden Dimension Size: 5120
Layers: 64
Attention Heads: 40
Key-Value Heads: 8
Activation Function: SwiGLU
Normalization: RMS Normalization
Position Embedding: Rotary Position Embedding (RoPE)
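For reference, these hyperparameters can be collected into a Hugging Face-style configuration. The sketch below is illustrative only: the field names follow common transformers conventions, and values not listed above (the RMSNorm epsilon and the exact RoPE scaling keys) are assumptions rather than published settings.

```python
# Hypothetical HF-style config mirroring the specification table above.
# Field names follow common transformers conventions; the config of the
# actual released checkpoint may differ.
olmo_3_1_32b_think_config = {
    "hidden_size": 5120,
    "num_hidden_layers": 64,
    "num_attention_heads": 40,
    "num_key_value_heads": 8,        # 8 KV heads < 40 query heads => GQA
    "head_dim": 5120 // 40,          # 128
    "hidden_act": "silu",            # SwiGLU = SiLU-gated MLP
    "rms_norm_eps": 1e-6,            # assumed value
    "max_position_embeddings": 65536,
    "rope_scaling": {"rope_type": "yarn"},  # YaRN-style scaling (assumed keys)
}
```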

OLMo 3.1 32B Think

OLMo 3.1 32B Think is a large-scale autoregressive language model developed by the Allen Institute for AI, specifically engineered to excel in complex reasoning and multi-step logic. As part of the OLMo 3.1 series, this variant represents a significant evolution in the initiative's commitment to open science, providing an end-to-end transparent pipeline that includes model weights, training code, and the underlying data. The model is optimized for tasks requiring extended chains of thought, particularly in mathematics and programming, where it leverages specialized post-training to generate detailed, verifiable logical steps before arriving at a final solution.

Built on a decoder-only Transformer architecture, OLMo 3.1 32B Think utilizes 64 layers with a hidden dimension of 5120, incorporating architectural refinements to balance high performance with computational efficiency. It employs Grouped-Query Attention (GQA) with 40 query heads and 8 key-value heads, a configuration that significantly reduces the memory footprint of the key-value cache and enables efficient inference. The model utilizes SwiGLU activation functions and RMSNorm for stable training dynamics. For positional encoding, it implements Rotary Position Embeddings (RoPE) with YaRN-style scaling, supporting a substantial context window of 65,536 tokens.
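As a rough illustration of the KV-cache savings this GQA layout provides, the following back-of-the-envelope sketch (plain Python; all numbers come from the specification table above) compares the fp16 cache size at the full 65,536-token context against a hypothetical full multi-head variant with 40 key-value heads.

```python
# KV-cache size: K and V tensors across all layers, fp16 (2 bytes/element).
layers = 64
head_dim = 5120 // 40   # 128
bytes_fp16 = 2
context = 65_536

def kv_cache_gib(kv_heads: int) -> float:
    per_token = 2 * layers * kv_heads * head_dim * bytes_fp16  # K + V
    return per_token * context / 2**30

print(f"GQA  (8 KV heads):  {kv_cache_gib(8):.0f} GiB")   # ~16 GiB at full context
print(f"MHA (40 KV heads):  {kv_cache_gib(40):.0f} GiB")  # ~80 GiB, 5x larger
```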

The training regimen for this model involves a sophisticated multi-stage process starting with pretraining on the 9.3-trillion-token Dolma 3 dataset, followed by mid-training on higher-quality reasoning data. The Think variant is further refined through supervised fine-tuning and Reinforcement Learning from Verifiable Rewards (RLVR) using the Dolci-Think-RL dataset. This specialized reinforcement learning stage is designed to cultivate persistent internal reasoning, allowing the model to navigate intricate problems by exploring multiple logical paths. Because the model is released under the Apache 2.0 license with full access to the training recipes and data provenance tools, it serves as a transparent foundation for researchers and developers building auditable AI systems.
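A minimal inference sketch using the Hugging Face transformers library is shown below. The repository id is an assumption based on Ai2's naming conventions for earlier OLMo releases and may not match the actual checkpoint name; the generation settings are illustrative, not recommended defaults.

```python
# Minimal generation sketch with Hugging Face transformers.
# "allenai/Olmo-3.1-32B-Think" is a hypothetical repository id; check the
# official release for the actual checkpoint name.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/Olmo-3.1-32B-Think"  # hypothetical id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "What is the sum of the first 50 odd numbers?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Think variants emit an extended reasoning trace before the final answer,
# so leave generous headroom in max_new_tokens.
output = model.generate(inputs, max_new_tokens=2048, temperature=0.6, do_sample=True)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```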

About OLMo 3

OLMo (Open Language Model) is a series of fully open language models designed to enable the science of language models. Released by the Allen Institute for AI (Ai2), OLMo 3 provides complete access to training data (Dolma 3), code, checkpoints, logs, and evaluation methodologies. The family includes Base models for pretraining research, Instruct variants for chat and tool use, and Think variants with chain-of-thought reasoning capabilities. All models are trained with a staged approach that includes pretraining, mid-training, and long-context phases.



Evaluation Benchmarks

Overall Rank: #60
Coding Rank: #61

Benchmark Scores
WebDev Arena (Web Development): 1285 (rank #45)

Model Transparency
Overall Score: B+ (86 / 100)

GPU Requirements

Full calculator: select a quantization method for the model weights and a context size (1K, 32K, or 64K tokens; default 1,024) to estimate the required VRAM and a recommended GPU.
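Absent the interactive calculator, a back-of-the-envelope estimate can be computed by hand. The sketch below is a rough approximation under stated assumptions: the bytes-per-weight figures for each quantization level are nominal, the KV cache is kept in fp16, and the 10% overhead for activations and buffers is a guess rather than a measured value.

```python
# Rough VRAM estimate for OLMo 3.1 32B Think inference.
# Quantization sizes and the 10% runtime overhead are assumptions,
# not published figures.
N_PARAMS = 32e9
LAYERS, KV_HEADS, HEAD_DIM = 64, 8, 128
BYTES_PER_WEIGHT = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}

def estimate_vram_gib(quant: str, context_tokens: int, kv_bytes: int = 2) -> float:
    """Weights + fp16 KV cache, plus ~10% overhead for activations/buffers."""
    weights = N_PARAMS * BYTES_PER_WEIGHT[quant]
    kv_cache = 2 * LAYERS * KV_HEADS * HEAD_DIM * kv_bytes * context_tokens  # K + V
    return (weights + kv_cache) * 1.10 / 2**30

for quant in BYTES_PER_WEIGHT:
    for ctx in (1_024, 32_768, 65_536):
        print(f"{quant:>5} @ {ctx:>6} tokens: ~{estimate_vram_gib(quant, ctx):.0f} GiB")
```

Under these assumptions the fp16 weights alone occupy roughly 60 GiB, and the KV cache adds about 16 GiB at the full 65,536-token context, which is why aggressive weight quantization matters for single-GPU deployment of a 32B model.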