Modern GPUs for AI and HPC come with a finite amount of high-bandwidth memory (HBM) built into the device, which limits their performance in memory-hungry workloads. New technology, however, will allow companies to expand GPU memory capacity by attaching additional memory devices over the PCIe bus — even SSDs — instead of being limited to the memory built into the GPU. Panmnesia, a company backed by South Korea's renowned KAIST research institute, has developed a low-latency CXL IP that could be used to expand GPU memory using CXL memory expanders.
The memory requirements of advanced AI training datasets are growing rapidly, which means that AI companies must either buy more GPUs, use less sophisticated datasets, or fall back on CPU memory at a significant performance cost. Although CXL is a protocol that formally runs on top of a PCIe link — enabling users to attach additional memory to a system via the PCIe bus — the technology has to be recognized by the ASIC and its subsystems, so simply adding a CXL controller is not enough to make it work, especially on a GPU.
Panmnesia faced challenges integrating CXL for GPU memory expansion because GPUs lack a CXL logic fabric and the subsystems needed to support DRAM and/or SSD endpoints. In addition, GPU cache and memory subsystems do not recognize any expansion except unified virtual memory (UVM), which tends to be slow.
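For context, the UVM path described above is exposed in CUDA through `cudaMallocManaged`: a single pointer is valid on both CPU and GPU, and pages migrate between host and device memory on demand. That on-demand page-fault handling is the main source of the latency the article refers to. A minimal sketch of the mechanism (the kernel and buffer size are illustrative, not taken from Panmnesia's work):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Simple kernel that touches every element of a managed buffer.
__global__ void scale(float *data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main() {
    const int n = 1 << 20;
    float *buf = nullptr;

    // Unified Virtual Memory: one allocation visible to both CPU and GPU.
    // Pages migrate between host DRAM and GPU HBM on first touch,
    // which is where UVM's latency overhead comes from.
    cudaMallocManaged(&buf, n * sizeof(float));

    for (int i = 0; i < n; ++i) buf[i] = 1.0f;     // CPU touch: pages land on host

    scale<<<(n + 255) / 256, 256>>>(buf, n, 2.0f); // GPU touch: pages fault over to device
    cudaDeviceSynchronize();

    printf("buf[0] = %f\n", buf[0]);               // CPU touch: pages fault back to host
    cudaFree(buf);
    return 0;
}
```

A CXL-attached expander, by contrast, aims to let the GPU's own memory subsystem issue load/store traffic to external DRAM or SSD endpoints directly, without this page-migration round trip.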