
GPT-5.3-Codex-Spark

Parameters

-

Context Length

131,072 tokens (128K)

Modality

Code

Architecture

Dense

License

Proprietary

Release Date

12 Feb 2026

Training Data Cutoff

-

Technical Specifications

Attention Structure

Multi-Head Attention

Hidden Dimension Size

-

Number of Layers

-

Attention Heads

-

Key-Value Heads

-

Activation Function

-

Normalization

-

Position Embedding

Absolute Position Embedding

GPT-5.3-Codex-Spark

GPT-5.3-Codex-Spark is a specialized, low-latency large language model designed for real-time, interactive software development. Developed through a collaboration between OpenAI and Cerebras Systems, it functions as a streamlined variant of the broader GPT-5.3-Codex family. The model is engineered to provide a responsive experience during live coding sessions, enabling immediate feedback for tasks such as targeted logic adjustments, interface refinements, and incremental refactoring. By prioritizing inference speed, the model facilitates a collaborative workflow where developers can steer code generation in real time, effectively reducing the temporal gap between intent and execution.

The technical foundation of GPT-5.3-Codex-Spark is defined by its deployment on the Cerebras Wafer-Scale Engine 3 (WSE-3). Unlike traditional distributed GPU architectures that are often constrained by the 'memory wall' and interconnect latency, the WSE-3 utilizes a single, massive silicon wafer with integrated high-bandwidth memory (SRAM) and hundreds of thousands of optimized cores. This hardware synergy allows the model to achieve a throughput exceeding 1,000 tokens per second. To further minimize end-to-end latency, the system utilizes persistent WebSocket connections and a revised inference stack that accelerates session initialization and reduces network overhead by approximately 80 percent compared to standard RESTful API implementations.
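The latency benefit of persistent connections described above can be illustrated with a simple back-of-the-envelope model. This is a hypothetical sketch, not OpenAI's implementation: the handshake and per-turn costs are assumed numbers chosen only to show how reusing one WebSocket connection amortizes setup cost across an interactive session, whereas a REST-style client pays it on every request.

```python
# Illustrative latency model (hypothetical numbers, not measured values):
# a REST-style client pays connection setup on every turn, while a
# persistent WebSocket pays it once and reuses the open socket.

HANDSHAKE_MS = 120.0   # assumed TCP + TLS + HTTP upgrade cost per new connection
TURN_MS = 40.0         # assumed time for the server to begin streaming a reply

def rest_total_ms(turns: int) -> float:
    """Every turn opens a fresh connection, so setup cost is paid each time."""
    return turns * (HANDSHAKE_MS + TURN_MS)

def websocket_total_ms(turns: int) -> float:
    """Setup is paid once; subsequent turns reuse the open socket."""
    return HANDSHAKE_MS + turns * TURN_MS

turns = 10
rest = rest_total_ms(turns)
ws = websocket_total_ms(turns)
print(f"REST: {rest:.0f} ms, WebSocket: {ws:.0f} ms, "
      f"overhead reduction: {1 - ws / rest:.0%}")
```

The more turns a session contains, the closer the WebSocket path gets to paying only the per-turn cost, which is why the gap matters most for rapid, interactive coding sessions rather than one-shot requests.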

Architecturally, the model is a dense transformer optimized for high-velocity text-only generation. It supports a 128k token context window, which is tailored to handle significant portions of active files and immediate project dependencies. The model's behavior is specifically tuned for a lightweight interaction style, favoring precise, minimal edits over extensive, autonomous code rewrites. This design choice ensures that the developer remains the primary driver of the logic while the model serves as an ultra-fast, interruptible completion engine. It is delivered via a dedicated latency-first serving tier that operates alongside OpenAI's existing GPU infrastructure.
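The "interruptible completion engine" behavior described above can be sketched as a streaming consumer that stops mid-generation and keeps the partial output. This is a minimal stand-in, assuming a generic token stream; `fake_token_stream` and `collect_until` are hypothetical names, not part of any real API.

```python
# Hypothetical sketch of an interruptible completion loop: the developer
# can stop generation mid-stream and keep the partial edit so far.
# The token source below is a stand-in generator, not a real API client.

from typing import Iterable, Iterator, List

def fake_token_stream() -> Iterator[str]:
    """Stand-in for a streamed model response, yielded token by token."""
    for tok in ["def ", "add(", "a, ", "b):", "\n    ", "return ", "a + b"]:
        yield tok

def collect_until(stream: Iterable[str], stop_after: int) -> str:
    """Consume tokens, interrupting after `stop_after` tokens are received."""
    parts: List[str] = []
    for i, tok in enumerate(stream):
        parts.append(tok)
        if i + 1 >= stop_after:
            break  # interruption: the rest of the stream is discarded
    return "".join(parts)

partial = collect_until(fake_token_stream(), stop_after=4)
print(partial)  # prints "def add(a, b):" - the partial edit kept after stopping
```

Because the stream is consumed incrementally, stopping early wastes no work on tokens the developer has already decided to reject, which is the property that makes a fast, steerable completion loop practical.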

About GPT-5.3 Codex

GPT-5.3 Codex is OpenAI's ultra-fast coding model family, developed in partnership with Cerebras on the Wafer Scale Engine 3 and purpose-built for real-time interactive coding with minimal latency.


Other GPT-5.3 Codex Models
  • No related models

Evaluation Benchmarks

Ranking

#58

Benchmark Score Rankings

Instruction Following (IFEval): 0.92, rank 2 🥈

Grade School Math (GSM8K): 0.97, rank 4

Software Engineering (SWE-bench Verified): 0.66, rank 20

General Knowledge (MMLU): 0.89, rank 21

Graduate-Level QA (GPQA): 0.51, rank 84

Rankings

Overall Ranking

#58

Coding Ranking

#73