| Specification | Value |
|---|---|
| Parameters | 32B |
| Context Length | 65,536 tokens |
| Modality | Text |
| Architecture | Dense |
| License | Apache 2.0 |
| Release Date | 12 Dec 2025 |
| Knowledge Cutoff | Dec 2024 |
| Attention Structure | Grouped-Query Attention (GQA) |
| Hidden Dimension Size | 5120 |
| Number of Layers | 64 |
| Attention Heads | 40 |
| Key-Value Heads | 8 |
| Activation Function | SwiGLU |
| Normalization | RMS Normalization |
| Position Embedding | Rotary Position Embedding (RoPE) |
OLMo 3.1 32B Think is a large-scale autoregressive language model developed by the Allen Institute for AI, specifically engineered to excel in complex reasoning and multi-step logic. As part of the OLMo 3.1 series, this variant represents a significant evolution in the initiative's commitment to open science, providing an end-to-end transparent pipeline that includes model weights, training code, and the underlying data. The model is optimized for tasks requiring extended chains of thought, particularly in mathematics and programming, where it leverages specialized post-training to generate detailed, verifiable logical steps before arriving at a final solution.
Built on a decoder-only Transformer architecture, OLMo 3.1 32B Think utilizes 64 layers with a hidden dimension of 5120, incorporating architectural refinements to balance high performance with computational efficiency. It employs Grouped-Query Attention (GQA) with 40 query heads and 8 key-value heads, a configuration that significantly reduces the memory footprint of the key-value cache and enables efficient inference. The model utilizes SwiGLU activation functions and RMSNorm for stable training dynamics. For positional encoding, it implements Rotary Position Embeddings (RoPE) with YaRN-style scaling, supporting a substantial context window of 65,536 tokens.
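For a concrete sense of what this GQA configuration buys, the sketch below estimates the key-value cache footprint implied by the published shape parameters (64 layers, 40 query heads, 8 key-value heads, hidden size 5120, 65,536-token context). The fp16 byte width and the assumption that head_dim = hidden_size / query_heads = 128 are illustrative choices, not values stated in the card.

```python
# Rough KV-cache estimate for OLMo 3.1 32B Think under GQA vs. full MHA.
# Shape values come from the spec table above; fp16 precision and
# head_dim = hidden_size // num_query_heads are assumptions for illustration.

NUM_LAYERS = 64
HIDDEN_SIZE = 5120
NUM_QUERY_HEADS = 40
NUM_KV_HEADS = 8
CONTEXT_LENGTH = 65_536
BYTES_PER_VALUE = 2  # fp16/bf16

HEAD_DIM = HIDDEN_SIZE // NUM_QUERY_HEADS  # 128


def kv_cache_bytes(num_kv_heads: int, seq_len: int) -> int:
    """Bytes needed to cache keys and values for one sequence."""
    per_token_per_layer = 2 * num_kv_heads * HEAD_DIM * BYTES_PER_VALUE  # K and V
    return per_token_per_layer * NUM_LAYERS * seq_len


gqa = kv_cache_bytes(NUM_KV_HEADS, CONTEXT_LENGTH)
mha = kv_cache_bytes(NUM_QUERY_HEADS, CONTEXT_LENGTH)
print(f"GQA (8 KV heads):  {gqa / 2**30:.1f} GiB per full-length sequence")
print(f"MHA (40 KV heads): {mha / 2**30:.1f} GiB per full-length sequence")
print(f"Reduction factor:  {mha / gqa:.0f}x")
```

Under these assumptions, the cache for one 65,536-token sequence shrinks from roughly 80 GiB with 40 key-value heads to about 16 GiB with 8, a 5x reduction.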
The training regimen for this model involves a sophisticated multi-stage process starting with pretraining on the 9.3-trillion-token Dolma 3 dataset, followed by mid-training on higher-quality reasoning data. The Think variant is further refined through supervised fine-tuning and Reinforcement Learning from Verifiable Rewards (RLVR) using the Dolci-Think-RL dataset. This specialized reinforcement learning stage is designed to cultivate persistent internal reasoning, allowing the model to navigate intricate problems by exploring multiple logical paths. Because the model is released under the Apache 2.0 license with full access to the training recipes and data provenance tools, it serves as a transparent foundation for researchers and developers building auditable AI systems.
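The reasoning-tuned checkpoint can be exercised like any other causal language model. Below is a minimal sketch using Hugging Face transformers; the repository id, sampling settings, and prompt format are assumptions and should be checked against the official model card.

```python
# Minimal sketch of running the Think variant with Hugging Face transformers.
# The repository id "allenai/OLMo-3.1-32B-Think" is hypothetical; the exact
# chat/reasoning template may differ from what the tokenizer applies here.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/OLMo-3.1-32B-Think"  # assumed repo id, verify before use
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 32B parameters need roughly 64 GB in bf16
    device_map="auto",
)

messages = [
    {"role": "user", "content": "If 3x + 7 = 22, what is x? Show your reasoning."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Think variants emit an extended reasoning trace before the final answer,
# so leave generous headroom for generation length.
outputs = model.generate(
    inputs, max_new_tokens=2048, do_sample=True, temperature=0.6
)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```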
OLMo (Open Language Model) is a series of fully open language models designed to enable the science of language models. Released by the Allen Institute for AI (Ai2), OLMo 3 provides complete access to training data (Dolma 3), code, checkpoints, logs, and evaluation methodologies. The family includes Base models for pretraining research, Instruct variants for chat and tool use, and Think variants with chain-of-thought reasoning capabilities. All models are trained with a staged approach that includes pretraining, mid-training, and long-context phases.
Overall Rank: #60
Coding Rank: #61

| Benchmark | Score | Rank |
|---|---|---|
| WebDev Arena (Web Development) | 1285 | 45 |