
MiniMax M3

Parameters

428B

Context Length

1M

Modality

Multimodal

Architecture

Dense

License

Proprietary

Release Date

1 Jun 2026

Knowledge Cutoff

-

System Requirements

VRAM requirements for different quantization methods and context sizes

1,024 tokens

900.43 GB VRAM

Consumer

51x RTX 4090

24GB VRAM

Datacenter

14x NVIDIA A100

80GB VRAM

Apple Silicon

11x Apple M3 Max

128GB VRAM

1,000,000 tokens

1029.32 GB VRAM

Consumer

60x RTX 4090

24GB VRAM

Datacenter

16x NVIDIA A100

80GB VRAM

Apple Silicon

13x Apple M3 Max

128GB VRAM
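As a rough cross-check on the figures above, the sketch below estimates weight memory from the parameter count and computes the minimum device count for a given total. It assumes 2 bytes per parameter for FP16 and 0.5 bytes for 4-bit weights, which the page does not state. The function names are illustrative. The device counts listed above are higher than this simple floor, and the page does not say what headroom it applies.

```python
import math

def weight_memory_gb(params_billion: float, bytes_per_param: float) -> float:
    """Memory for the weights alone, in GB (10^9 bytes)."""
    return params_billion * bytes_per_param

def min_devices(total_gb: float, device_gb: float) -> int:
    """Smallest number of devices whose combined memory covers total_gb.
    This is a floor: it ignores per-device overhead and fragmentation."""
    return math.ceil(total_gb / device_gb)

# Assumed precisions (not stated on the page): FP16 = 2 bytes, 4-bit = 0.5 bytes.
fp16_weights_gb = weight_memory_gb(428, 2.0)
int4_weights_gb = weight_memory_gb(428, 0.5)

# Floor on device counts for the listed 900.43 GB total at 1,024 tokens.
floor_rtx4090 = min_devices(900.43, 24)
floor_a100 = min_devices(900.43, 80)
floor_m3_max = min_devices(900.43, 128)

print(fp16_weights_gb, int4_weights_gb)
print(floor_rtx4090, floor_a100, floor_m3_max)
```

Under these assumptions the FP16 weights alone come to 856 GB, which accounts for most of the 900.43 GB listed at 1,024 tokens.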

Architecture Diagram

Input Tokens
Token Embedding (Position: Absolute · Hidden: 6.1k · Context: 1M · Vocab: 200.1k)
x 60 layers:
  RMSNorm (Pre-Attention)
  Multi-Head Attention (64 Q / 4 KV heads, head dim: 128)
  + (residual add)
  RMSNorm (Pre-FFN)
  Feed-Forward Network (SwiGLU, intermediate: 12.3k)
  + (residual add)
Final RMSNorm
Output Logits

Evaluation Benchmarks

Rank

#14

Category | Benchmark | Score | Rank

Web Development | WebDev Arena | 1521 | 10

General Text | Text Arena | 1451 | 25

Rankings

Overall Rank

#14

Coding Rank

#23

About MiniMax M3

MiniMax M3 is MiniMax's flagship multimodal model, released June 1, 2026. It is built on the MiniMax Sparse Attention (MSA) architecture, which replaces traditional full attention with a KV-block selection pattern and reduces compute cost to 1/20th of the previous generation's. The model is optimized for long-horizon agentic workflows, complex software engineering, and video understanding. It has a 1M-token context window, accepts text, image, and video inputs, and is priced at $0.30 per million input tokens and $1.20 per million output tokens.
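To make the pricing concrete, here is a minimal sketch that turns the two listed rates into a per-request cost. The function name and the example token counts are illustrative assumptions. It also assumes flat per-token billing with no tiering, caching discounts, or separate rates for image and video inputs, none of which the page describes.

```python
def request_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_price_per_million: float = 0.30,   # listed input rate
    output_price_per_million: float = 1.20,  # listed output rate
) -> float:
    """Cost of one request in USD, assuming flat per-token pricing."""
    input_cost = input_tokens / 1_000_000 * input_price_per_million
    output_cost = output_tokens / 1_000_000 * output_price_per_million
    return input_cost + output_cost

# Hypothetical request: a full 1M-token prompt with a 4,000-token reply.
cost = request_cost_usd(1_000_000, 4_000)
print(f"${cost:.4f}")
```

Under these assumptions, a prompt that fills the whole context window costs about $0.30 in input tokens, so for long-context use the input side dominates unless the reply is very long.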

Technical Specifications

Attention

Attention Structure

Multi-Head Attention

Attention Heads

64

Key-Value Heads

4

Attention Head Dimension

128

Position Embedding

Absolute Position Embedding

RoPE Theta

5,000,000

Sliding Window Attention

No

Sliding Window Size

-
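The head counts above determine how large a conventional key-value cache would be. The sketch below computes that size using the 60 layers listed under Dimensions. It assumes every layer stores full keys and values for every token, at 2 bytes per value (FP16). The page does not confirm either assumption, and the sparse attention scheme it describes may store or read the cache differently.

```python
def kv_cache_bytes(
    tokens: int,
    layers: int = 60,          # from the Dimensions section
    kv_heads: int = 4,         # listed key-value heads
    head_dim: int = 128,       # listed attention head dimension
    bytes_per_value: int = 2,  # assumption: FP16 cache
) -> int:
    """Bytes for a dense KV cache: one key and one value tensor per layer."""
    return 2 * layers * kv_heads * head_dim * bytes_per_value * tokens

per_token = kv_cache_bytes(1)
full_context_gb = kv_cache_bytes(1_000_000) / 1e9

# Same calculation if all 64 query heads had their own key-value heads.
per_token_64_kv = kv_cache_bytes(1, kv_heads=64)

print(per_token, full_context_gb, per_token_64_kv // per_token)
```

With 4 key-value heads shared across 64 query heads, the cache is 16 times smaller than it would be with one key-value head per query head.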

Normalization

RMS Normalization

Activation Function

SwiGLU

Dimensions

Hidden Dimension Size

6,144

Number of Layers

60

FFN Intermediate Size (Dense)

12,288

Multi-Token Prediction Heads

1
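For a sense of scale, the dimensions above can be plugged into the usual parameter count for a dense decoder-only transformer. This is a rough estimate. It assumes bias-free projections, a three-matrix SwiGLU feed-forward block, untied input and output embeddings, and no extra modules such as vision encoders or multi-token prediction heads, none of which the page specifies.

```python
def dense_param_estimate(
    hidden: int = 6144,
    layers: int = 60,
    q_heads: int = 64,
    kv_heads: int = 4,
    head_dim: int = 128,
    ffn: int = 12288,
    vocab: int = 200064,
    tied_embeddings: bool = False,  # assumption: separate input/output embeddings
) -> int:
    """Approximate parameter count for a plain dense decoder stack."""
    attn = hidden * q_heads * head_dim        # query projection
    attn += 2 * hidden * kv_heads * head_dim  # key and value projections
    attn += q_heads * head_dim * hidden       # output projection
    mlp = 3 * hidden * ffn                    # SwiGLU: gate, up, down
    norms = 2 * hidden                        # two RMSNorm weight vectors
    per_layer = attn + mlp + norms
    embeddings = vocab * hidden * (1 if tied_embeddings else 2)
    return layers * per_layer + embeddings + hidden  # + final RMSNorm

total = dense_param_estimate()
print(f"{total / 1e9:.1f}B parameters")
```

Under these assumptions the estimate comes to roughly 22.5 billion parameters, far below the 428B listed at the top of the page. The dimensions shown here therefore do not account for the full parameter count on their own, and the page does not say where the rest comes from.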

Tokenizer

Vocabulary Size

200,064

About MiniMax M3

MiniMax's flagship M3 model family, released June 1, 2026, is powered by MiniMax Sparse Attention (MSA) architecture, offering 1M context capabilities at exceptionally low compute cost and optimized for long-horizon agentic workflows.


MiniMax M3: Specifications and GPU VRAM Requirements