Nvidia Is Preparing a New Generation of GPUs to Support Million-Token Contexts

13:36, 10.09.2025

Article Content

  • Disaggregated Inference Architecture
  • A Breakthrough for Business and Science
  • Focus on Inference, not Training
  • Market Launch

Nvidia has unveiled Rubin CPX, a graphics processor designed specifically for language and multimodal models that need to store and analyze huge amounts of data. The chip is optimized to process contexts of more than 1 million tokens, a figure that far exceeds the capabilities of current systems.
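To give a sense of why million-token contexts are demanding, here is a rough back-of-the-envelope sketch of the memory a transformer's key/value cache would need at that length. The formula is the standard one for attention caches; the model dimensions (32 layers, 8 key/value heads, head size 128, 16-bit values) are hypothetical and are not Rubin CPX or any specific model's figures.

```python
def kv_cache_bytes(num_tokens, num_layers, num_kv_heads, head_dim, bytes_per_value=2):
    """Estimate key/value cache size for one sequence.

    The factor 2 counts one key vector and one value vector
    per token, per layer, per key/value head.
    """
    return 2 * num_tokens * num_layers * num_kv_heads * head_dim * bytes_per_value


# Hypothetical model dimensions, chosen only for illustration.
size = kv_cache_bytes(num_tokens=1_000_000, num_layers=32, num_kv_heads=8, head_dim=128)
print(f"{size / 2**30:.1f} GiB")
```

Under these assumed dimensions a single 1-million-token request already needs on the order of a hundred gibibytes of cache, before counting the model weights themselves.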

Disaggregated Inference Architecture

The key innovation of Rubin CPX is its use of a disaggregated inference architecture. With this approach, different GPUs handle different parts of the task, typically separating the context phase that reads the input from the generation phase that produces the output, and their work is combined into a single answer. The aim is to increase speed, reduce latency, and use resources more efficiently. This is especially useful for document analysis, multimedia content generation, and working with large code projects.
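The idea can be sketched in a few lines of Python. This is a toy illustration under one common reading of disaggregated inference: one worker processes the whole prompt once and hands its cache to a second worker, which generates the output step by step. The class names, the `KVCache` structure, and the toy "model" are all hypothetical and do not reflect Nvidia's actual software or hardware interfaces.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class KVCache:
    """Toy stand-in for the attention cache handed between workers."""
    entries: List[int] = field(default_factory=list)


class PrefillWorker:
    """Hypothetical context-phase worker: reads the whole prompt once."""

    def run(self, prompt_tokens: List[int]) -> KVCache:
        # A real model would compute attention keys/values for every token;
        # here each token is simply recorded so the hand-off is visible.
        return KVCache(entries=list(prompt_tokens))


class DecodeWorker:
    """Hypothetical generation-phase worker: emits one token per step."""

    def run(self, cache: KVCache, max_new_tokens: int) -> List[int]:
        generated = []
        for _ in range(max_new_tokens):
            # Toy "model": the next token is the sum of the context mod 100.
            next_token = sum(cache.entries) % 100
            generated.append(next_token)
            cache.entries.append(next_token)  # the cache grows while decoding
        return generated


def disaggregated_generate(prompt_tokens: List[int], max_new_tokens: int) -> List[int]:
    cache = PrefillWorker().run(prompt_tokens)        # phase 1: context
    return DecodeWorker().run(cache, max_new_tokens)  # phase 2: generation


print(disaggregated_generate([1, 2, 3], 3))  # prints [6, 12, 24]
```

In a real deployment the two workers would run on separate devices and the cache would be transferred over an interconnect, which is where the scheduling and latency trade-offs arise.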

A Breakthrough for Business and Science

Nvidia says that Rubin CPX opens up new horizons for lawyers, doctors, and developers. In law, it will help work with hundreds of pages of legislation; in medicine, it will help compare large sets of patient data; and in IT, it will help analyze entire projects instead of individual files. In the creative field, the GPU will make it possible to generate long videos and complex multimedia projects.

Focus on Inference, not Training

Unlike traditional solutions, Rubin CPX is aimed primarily at inference: it speeds up the running of existing models rather than their training. This makes it attractive to companies that want to deploy AI in their real-world business faster while reducing costs.

Market Launch

Rubin CPX is expected to hit the market in late 2026. Experts suggest that this processor could set a new standard for the industry, one in which working with long contexts is the norm rather than a rarity.
