A new Blackwell architecture by Nvidia – a new milestone in the evolution of GPUs

21.03.2024
Author: HostZealot Team
2 min.

At the GTC event in March 2024, NVIDIA presented its new Blackwell chip architecture, the B200 GPU based on it, and the Grace Blackwell GB200 superchip, which combines a Grace CPU with Blackwell GPUs.

The B200 GPU packs 208 billion transistors, compared to the 80 billion of the H100/H200 previously used in data centers, and offers 20 petaflops of AI performance per GPU (vs. 4 petaflops for the H100). The chip features 192 GB of HBM3e memory with up to 8 TBps of bandwidth.

Unlike more conventional GPUs, the Blackwell B200 is effectively a dual processor: it is composed of two dies that work together as a single CUDA GPU, connected by NV-HBI (NVIDIA High Bandwidth Interface) at 10 TBps. The B200 is manufactured on TSMC's 4NP process. Each die carries HBM3e stacks of 24 GB, with 1 TBps of bandwidth per stack.
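As a quick sanity check, those per-stack figures line up with the 192 GB and 8 TBps totals quoted above. The Python sketch below rebuilds them; note that the per-die stack count is inferred from the totals, not taken from a spec sheet:

# Back-of-the-envelope check of the B200 memory figures quoted above.
# The per-die stack count is inferred from the totals (an assumption),
# not taken from an official NVIDIA spec sheet.
DIES_PER_GPU = 2
STACK_CAPACITY_GB = 24      # per HBM3e stack (quoted above)
STACK_BANDWIDTH_TBPS = 1    # per HBM3e stack (quoted above)
TOTAL_CAPACITY_GB = 192     # quoted B200 total
TOTAL_BANDWIDTH_TBPS = 8    # quoted B200 total

stacks_per_gpu = TOTAL_CAPACITY_GB // STACK_CAPACITY_GB   # 8 stacks
stacks_per_die = stacks_per_gpu // DIES_PER_GPU           # 4 per die
assert stacks_per_gpu * STACK_BANDWIDTH_TBPS == TOTAL_BANDWIDTH_TBPS
print(f"{stacks_per_gpu} stacks total, {stacks_per_die} per die")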

For now, the most powerful solution announced is the GB200 superchip, which pairs two B200 GPUs with a Grace CPU.

For connecting multiple nodes, NVIDIA presents the fifth-generation NVLink chip with 1.8 TBps of bidirectional bandwidth, built from 50 billion transistors on the same TSMC 4NP process.

Every Blackwell GPU features 18 NVLink links, twice the total NVLink bandwidth of the H100. Since each link provides 50 GBps in each direction, i.e. 100 GBps of bidirectional bandwidth per link, large groups of GPU nodes can function almost as one huge GPU.
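The per-GPU figure follows directly from those per-link numbers; here is the same arithmetic as a minimal Python sketch:

# Per-GPU NVLink bandwidth from the per-link figures quoted above.
LINKS_PER_GPU = 18
GBPS_PER_DIRECTION = 50
GBPS_PER_LINK = 2 * GBPS_PER_DIRECTION   # 100 GBps bidirectional

total_gbps = LINKS_PER_GPU * GBPS_PER_LINK
print(f"{total_gbps / 1000} TBps per GPU")   # 1.8 TBps, matching fifth-gen NVLink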

Furthermore, the chips and the new interconnect make up the NVIDIA GB200 NVL72, a full-fledged rack solution with 18 1U compute nodes. Each node houses two GB200 superchips, i.e. two Grace CPUs and four B200 GPUs, and delivers 80 petaflops of FP4 and 40 petaflops of FP8 AI performance.

A full GB200 NVL72 rack therefore has 36 Grace CPUs and 72 Blackwell GPUs, delivering 720 petaflops of FP8 and 1,440 petaflops of FP4 performance. With 130 TBps of multinode bandwidth, the system can run AI language models with up to 27 trillion parameters.
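Those rack-level totals are consistent with the per-node figures above. The short Python sketch below rebuilds them from the described topology; the 10 petaflops of FP8 per GPU is inferred as half of the quoted FP4 figure, not an official number:

# Rebuild the GB200 NVL72 totals from the topology described above.
NODES_PER_RACK = 18
SUPERCHIPS_PER_NODE = 2
CPUS_PER_SUPERCHIP = 1    # one Grace CPU per GB200
GPUS_PER_SUPERCHIP = 2    # two B200 GPUs per GB200
FP4_PFLOPS_PER_GPU = 20   # quoted B200 figure
FP8_PFLOPS_PER_GPU = 10   # assumed half of FP4, consistent with the node figures

superchips = NODES_PER_RACK * SUPERCHIPS_PER_NODE   # 36
cpus = superchips * CPUS_PER_SUPERCHIP              # 36 Grace CPUs
gpus = superchips * GPUS_PER_SUPERCHIP              # 72 B200 GPUs
print(f"{cpus} CPUs, {gpus} GPUs")
print(f"{gpus * FP4_PFLOPS_PER_GPU} PFLOPS FP4, {gpus * FP8_PFLOPS_PER_GPU} PFLOPS FP8")
# -> 36 CPUs, 72 GPUs; 1440 PFLOPS FP4, 720 PFLOPS FP8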
