
OLMo 3.1 32B Think

Parameters

32B

Context Length

65,536 tokens

Modality

Text

Architecture

Dense

License

Apache 2.0

Release Date

12 Dec 2025

Knowledge Cutoff

Dec 2024

Technical Specifications

Attention Structure

Grouped-Query Attention

Hidden Dimension Size

5120

Number of Layers

64

Attention Heads

40

Key-Value Heads

8

Activation Function

SwiGLU

Normalization

RMS Normalization

Position Embedding

Rotary Position Embedding (RoPE)

OLMo 3.1 32B Think

OLMo 3.1 32B Think is a large-scale autoregressive language model developed by the Allen Institute for AI, specifically engineered to excel in complex reasoning and multi-step logic. As part of the OLMo 3.1 series, this variant represents a significant evolution in the initiative's commitment to open science, providing an end-to-end transparent pipeline that includes model weights, training code, and the underlying data. The model is optimized for tasks requiring extended chains of thought, particularly in mathematics and programming, where it leverages specialized post-training to generate detailed, verifiable logical steps before arriving at a final solution.

Built on a decoder-only Transformer architecture, OLMo 3.1 32B Think utilizes 64 layers with a hidden dimension of 5120, incorporating architectural refinements to balance high performance with computational efficiency. It employs Grouped-Query Attention (GQA) with 40 query heads and 8 key-value heads, a configuration that significantly reduces the memory footprint of the key-value cache and enables efficient inference. The model utilizes SwiGLU activation functions and RMSNorm for stable training dynamics. For positional encoding, it implements Rotary Position Embeddings (RoPE) with YaRN-style scaling, supporting a substantial context window of 65,536 tokens.
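To make the efficiency claim concrete, the sketch below estimates the key-value cache footprint implied by these dimensions. The head size of 128 is inferred from the hidden dimension divided by the number of query heads, and an fp16 cache is assumed; both are assumptions rather than published figures.

```python
# Rough KV-cache sizing implied by the listed OLMo 3.1 32B Think dimensions.
# head_dim and the fp16 cache dtype are inferred assumptions, not published figures.

num_layers = 64
hidden_size = 5120
num_query_heads = 40
num_kv_heads = 8                            # grouped-query attention
head_dim = hidden_size // num_query_heads   # 128, inferred
bytes_per_value = 2                         # fp16 / bf16 cache entries

def kv_cache_bytes(context_tokens: int, kv_heads: int) -> int:
    """Bytes needed to cache keys and values for a single sequence."""
    return 2 * num_layers * kv_heads * head_dim * bytes_per_value * context_tokens

full_context = 65_536
gqa_cache = kv_cache_bytes(full_context, num_kv_heads)     # ~16 GiB
mha_cache = kv_cache_bytes(full_context, num_query_heads)  # ~80 GiB (hypothetical MHA baseline)

print(f"GQA KV cache at 64K context: {gqa_cache / 2**30:.1f} GiB")
print(f"MHA KV cache at 64K context: {mha_cache / 2**30:.1f} GiB")
print(f"Reduction from sharing KV heads: {mha_cache / gqa_cache:.0f}x")
```

Under these assumptions, caching keys and values for a full 65,536-token sequence takes roughly 16 GiB, compared with about 80 GiB for an otherwise identical model that keeps one key-value head per query head.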

The training regimen for this model involves a sophisticated multi-stage process starting with pretraining on the 9.3-trillion-token Dolma 3 dataset, followed by mid-training on higher-quality reasoning data. The Think variant is further refined through supervised fine-tuning and Reinforcement Learning from Verifiable Rewards (RLVR) using the Dolci-Think-RL dataset. This specialized reinforcement learning stage is designed to cultivate persistent internal reasoning, allowing the model to navigate intricate problems by exploring multiple logical paths. Because the model is released under the Apache 2.0 license with full access to the training recipes and data provenance tools, it serves as a transparent foundation for researchers and developers building auditable AI systems.
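For readers who want to try the model, the following is a minimal inference sketch using the Hugging Face transformers library. The repository ID shown is an assumption based on Ai2's usual naming; consult the official model card for the published checkpoint name and recommended sampling settings.

```python
# Minimal inference sketch with Hugging Face transformers.
# "allenai/OLMo-3.1-32B-Think" is an assumed repository ID; verify the published
# checkpoint name on Ai2's Hugging Face organization before running.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/OLMo-3.1-32B-Think"  # hypothetical identifier
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 32B parameters in bf16 occupy roughly 64 GB
    device_map="auto",
)

# Think variants emit an extended chain of thought before the final answer,
# so leave a generous generation budget.
messages = [{"role": "user", "content": "Prove that the sum of two odd integers is even."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=2048, do_sample=True, temperature=0.6)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```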

About OLMo 3

OLMo (Open Language Model) is a series of fully open language models designed to enable the science of language models. Released by the Allen Institute for AI (Ai2), OLMo 3 provides complete access to training data (Dolma 3), code, checkpoints, logs, and evaluation methodologies. The family includes Base models for pretraining research, Instruct variants for chat and tool use, and Think variants with chain-of-thought reasoning capabilities. All models are trained with a staged approach that includes pretraining, mid-training, and long-context phases.



Evaluation Benchmarks

Rank

#60

Benchmark

WebDev Arena (Web Development)

Score

1285

Rank

#45

Rankings

Overall Rank

#60

Coding Rank

#61

GPU Requirements

Interactive calculator: select a weight quantization and a context size (1K–64K tokens) to estimate the VRAM required and see recommended GPUs.
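In place of the calculator output, a back-of-the-envelope estimate can be derived from the parameter count and the KV-cache arithmetic above. The figures below ignore activation memory and runtime overhead, so treat them as lower bounds rather than exact requirements.

```python
# Back-of-the-envelope VRAM estimate for serving the 32B dense model.
# Ignores activation memory and framework overhead, so these are lower bounds.

PARAMS = 32e9  # nominal parameter count

def weight_gib(bits_per_param: int) -> float:
    """Memory for model weights at a given quantization width."""
    return PARAMS * bits_per_param / 8 / 2**30

def kv_cache_gib(context_tokens: int) -> float:
    """KV cache: 64 layers, 8 KV heads, head_dim 128 (inferred), fp16 entries."""
    return 2 * 64 * 8 * 128 * 2 * context_tokens / 2**30

for name, bits in [("fp16/bf16", 16), ("int8", 8), ("int4", 4)]:
    total = weight_gib(bits) + kv_cache_gib(65_536)
    print(f"{name:>9}: weights {weight_gib(bits):5.1f} GiB"
          f" + 64K KV cache {kv_cache_gib(65_536):4.1f} GiB"
          f" ≈ {total:5.1f} GiB total")
```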
