GPT-5.3-Codex-Spark: An Ultra-Fast Model from OpenAI and Cerebras
14:29, 13.02.2026
OpenAI has released GPT-5.3-Codex-Spark, a specialized model for high-speed code generation. The new model stands out for its remarkable speed, delivering over 1,000 tokens per second.
Spark Performance
Spark's performance was made possible by a shift from traditional GPUs to Cerebras's specialized Wafer Scale Engine 3 chips. Thanks to the new architecture and a persistent WebSocket connection, token delivery latency has been reduced by 80%.
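The latency benefit of a persistent connection is easy to illustrate with a toy model. The sketch below uses made-up handshake and per-token costs (assumptions for illustration, not OpenAI's or Cerebras's actual figures) to show how a single long-lived WebSocket amortizes connection setup across many requests, whereas opening a fresh connection per request pays that setup cost every time:

```python
# Toy latency model: illustrative numbers only, not measured values.
HANDSHAKE_MS = 100  # assumed TCP + TLS setup cost per new connection
TOKEN_MS = 1        # ~1 ms per token, i.e. roughly 1,000 tokens/second

def per_request_connections(requests: int, tokens_each: int) -> int:
    """Total latency (ms) when every request pays the handshake again."""
    return requests * (HANDSHAKE_MS + tokens_each * TOKEN_MS)

def persistent_websocket(requests: int, tokens_each: int) -> int:
    """Total latency (ms) when one WebSocket handshake is amortized
    across all requests on the same open connection."""
    return HANDSHAKE_MS + requests * tokens_each * TOKEN_MS

# Ten short editing requests of 50 tokens each:
# per-request connections: 10 * (100 + 50) = 1500 ms
# persistent socket:       100 + 10 * 50  =  600 ms
```

With many small, interactive edits (the real-time coding workload Spark targets), the fixed connection cost dominates, which is exactly where a persistent socket pays off.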
This release is the first result of the collaboration between OpenAI and Cerebras; however, it is a variant of GPT-5.3-Codex rather than an entirely new model. Spark is designed with a focus on real-time coding, offering pinpoint edits and code adaptation up to 15 times faster than conventional models.
Benchmarks
The phenomenal speed comes at the cost of some accuracy.
On SWE-Bench Pro, Spark completes a task in 2–3 minutes with a score of 52%, while the flagship model reaches 57% but takes about 16 minutes.
On Terminal-Bench 2.0, Spark scores 58.4% compared to 77.3% for the flagship.
Nevertheless, Spark significantly outperforms the GPT-5.1-Codex-mini model. Cerebras suggests that the new model could make instant response times the new industry standard.
How can you test Spark?
The model is currently in a research preview and is available to ChatGPT Pro subscribers via the CLI, the VS Code extension, and the Codex app. It does not support images, and API access is open only to a small circle of partners.