Red Hat Launches llm-d, a Kubernetes-Based Platform for Scalable AI Inference

13:10, 22.05.2025

Red Hat has introduced llm-d, a new open source project for high-performance distributed inference of large language models (LLMs). The platform is built on Kubernetes and aims to simplify scaling generative AI workloads. The source code is available on GitHub under the Apache 2.0 license.

Key Features of llm-d

The main features of the platform include:

  • An inference scheduler optimized for vLLM;
  • Disaggregated service architecture;
  • Reuse of prefix caches;
  • Flexible scaling depending on traffic, tasks, and available resources.
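To illustrate why prefix cache reuse and scheduling are connected, here is a minimal sketch of prefix-aware routing: a request is sent to the replica that already holds the longest cached prefix of its prompt, with ties broken by load. This is not llm-d's actual scheduler. The names `Replica`, `block_prefixes`, `pick_replica`, and `route`, as well as the block size of 4 tokens, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """A hypothetical model server that remembers which prompt prefixes it has cached."""
    name: str
    active_requests: int = 0
    cached_prefixes: set = field(default_factory=set)


def block_prefixes(tokens, block_size=4):
    """Return the cumulative prefixes of `tokens`, one per whole block, as tuples."""
    full = len(tokens) - len(tokens) % block_size
    return [tuple(tokens[:end]) for end in range(block_size, full + 1, block_size)]


def cached_blocks(replica, tokens, block_size=4):
    """Count how many leading blocks of the prompt the replica already holds."""
    hits = 0
    for prefix in block_prefixes(tokens, block_size):
        if prefix not in replica.cached_prefixes:
            break
        hits += 1
    return hits


def pick_replica(replicas, tokens, block_size=4):
    """Prefer the longest cached prefix; break ties by the lowest load."""
    return max(
        replicas,
        key=lambda r: (cached_blocks(r, tokens, block_size), -r.active_requests),
    )


def route(replicas, tokens, block_size=4):
    """Route one request and record its prefixes as cached on the chosen replica."""
    chosen = pick_replica(replicas, tokens, block_size)
    chosen.active_requests += 1
    chosen.cached_prefixes.update(block_prefixes(tokens, block_size))
    return chosen
```

In this toy model, two prompts that share the same system prompt land on the same replica, so the shared part would not need to be recomputed, while an unrelated prompt goes to the less loaded replica.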

Cooperation with Leading Players in the AI Industry

The project is being developed in partnership with companies including Nvidia, AMD, Intel, IBM Research, Google Cloud, CoreWeave, and Hugging Face. The breadth of this backing signals how seriously the project is being taken and its potential to become an industry standard.

Technology and Architecture

The project uses the vLLM library for distributed inference, together with components such as LMCache for KV cache offloading, AI-aware traffic routing, high-performance communication APIs, and autoscaling that adapts to load and the underlying infrastructure.
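The idea behind KV cache offloading can be sketched as a two-tier cache: entries evicted from a small fast tier are moved to a larger slow tier instead of being discarded, and are promoted back when requested again. The sketch below is a toy model under that assumption and does not use or reproduce LMCache's API. The class name `TwoTierKVCache`, the "gpu"/"cpu" tier labels, and the least-recently-used eviction policy are all hypothetical choices for illustration.

```python
from collections import OrderedDict


class TwoTierKVCache:
    """Toy cache: a small 'gpu' tier that spills least recently used entries to a 'cpu' tier."""

    def __init__(self, gpu_capacity):
        self.gpu_capacity = gpu_capacity
        self.gpu = OrderedDict()  # fast tier, ordered from least to most recently used
        self.cpu = {}             # slow tier, treated as unbounded in this sketch

    def put(self, key, value):
        """Store an entry in the fast tier, offloading the oldest entries if it overflows."""
        self.gpu[key] = value
        self.gpu.move_to_end(key)
        while len(self.gpu) > self.gpu_capacity:
            old_key, old_value = self.gpu.popitem(last=False)
            self.cpu[old_key] = old_value  # offload instead of discarding

    def get(self, key):
        """Return (value, tier), where tier is the place the entry was found."""
        if key in self.gpu:
            self.gpu.move_to_end(key)
            return self.gpu[key], "gpu"
        if key in self.cpu:
            value = self.cpu.pop(key)
            self.put(key, value)  # promote back to the fast tier
            return value, "cpu"
        return None, "miss"
```

With a fast tier of capacity 2, inserting three entries pushes the oldest one into the slow tier. A later lookup still finds it there, which in a real system would mean avoiding recomputation of that prompt's KV cache at the cost of a slower transfer.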

Together, these components let operators adapt the system to different usage scenarios and performance requirements. The launch of llm-d could be a significant step toward democratizing powerful AI systems and making them accessible to a wide audience of developers and researchers.
