NVIDIA unveils the HGX H200, a new AI accelerator built on the Hopper architecture with HBM3e memory
16:37, 14.11.2023
NVIDIA has announced the HGX H200, a new hardware computing platform for artificial intelligence based on the NVIDIA Hopper architecture and the H200 Tensor Core GPU.
The NVIDIA HGX H200 is the first accelerator to offer high-speed HBM3e memory: 141 GB with 4.8 TB/s of bandwidth, roughly 2.4 times the bandwidth of the NVIDIA A100. Major server makers and cloud providers are expected to ship H200-based systems in the second quarter of 2024.
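The 2.4x figure checks out against the A100's published memory bandwidth; a quick sanity check in Python, assuming the ~2.0 TB/s of the 80 GB A100 SXM variant as the baseline (the baseline model is not named in the announcement):

```python
# Sanity check of the bandwidth claim.
h200_bw_tbs = 4.8   # H200 HBM3e bandwidth, from the announcement
a100_bw_tbs = 2.0   # A100 80GB HBM2e bandwidth (assumed baseline)

print(f"{h200_bw_tbs / a100_bw_tbs:.1f}x")  # -> 2.4x
```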
Beyond hardware, NVIDIA is also developing the software side of AI, offering the open-source TensorRT-LLM libraries alongside the new accelerator.
For instance, the H200 runs inference on the 70-billion-parameter Llama 2 model roughly twice as fast as the H100, and upcoming software updates are expected to raise performance further.
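As a rough illustration of how such a model is served through TensorRT-LLM, here is a minimal sketch using the high-level `LLM` Python API from recent TensorRT-LLM releases; the model checkpoint, parallelism setting, and sampling parameters are illustrative assumptions, not details from the article:

```python
# Minimal TensorRT-LLM inference sketch (assumed setup, not the
# benchmark configuration NVIDIA used for its H200 numbers).
from tensorrt_llm import LLM, SamplingParams

# A 70B model does not fit on a single GPU; tensor parallelism across
# an eight-way HGX board is a typical deployment (assumption).
llm = LLM(model="meta-llama/Llama-2-70b-hf", tensor_parallel_size=8)

sampling = SamplingParams(temperature=0.8, max_tokens=128)
outputs = llm.generate(["Explain HBM3e memory in one sentence."], sampling)

for out in outputs:
    print(out.outputs[0].text)
```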
The NVIDIA H200, available in four- and eight-way configurations, is compatible with HGX H100 hardware and software. It can also be paired with NVIDIA Grace CPUs over the ultrafast NVLink-C2C interconnect to form the GH200 Grace Hopper Superchip with HBM3e. With NVLink and NVSwitch, the HGX H200 excels at LLM training and other heavy workloads: an eight-way board delivers over 32 petaflops of FP8 deep-learning compute and 1.1 TB of aggregate high-bandwidth memory.
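Those board-level numbers follow from the per-GPU figures; a back-of-the-envelope check, assuming roughly 4 petaflops of FP8 per H200 with sparsity (an assumption based on published Hopper specs, not stated in the article):

```python
# Aggregate specs for an eight-way HGX H200 board.
gpus = 8
mem_per_gpu_gb = 141        # HBM3e per H200, from the article
fp8_per_gpu_pflops = 4.0    # assumed per-GPU FP8 throughput (with sparsity)

print(f"memory: {gpus * mem_per_gpu_gb / 1000:.2f} TB")  # -> ~1.13 TB
print(f"FP8:    {gpus * fp8_per_gpu_pflops:.0f} PFLOPS") # -> 32 PFLOPS
```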
The accelerators can be deployed in any data center, and partners such as ASRock Rack, ASUS, and Dell, among others, can upgrade existing systems with the H200. Cloud providers including AWS, Google Cloud, Microsoft Azure, and Oracle will offer H200-based instances next year.