New Heterogeneous Memory Management For Linux Will Be Supported By NVIDIA/Nouveau
It's been a while since we last heard anything about Heterogeneous Memory Management; frankly, I had even forgotten about these pending HMM patches and about watching for new work from Jerome Glisse at Red Hat. For those who forgot, Jerome was one of the early contributors to the open-source AMD driver work, going back to the xf86-video-avivo (pre-RadeonHD) days when he wanted to create an open-source R500 graphics driver.
Anyhow, on Friday he posted version 13 of the HMM patches, containing various improvements. He also commented, "I am hoping that this can now be consider for inclusion upstream. Bottom line is that without HMM we can not support some of the new hardware features on x86 PCIE. I do believe we need some solution to support those features or we won't be able to use such hardware in standard like C++17, OpenCL 3.0 and others."
HMM is a kernel layer that lets a device mirror a process address space into its own MMU. It is designed for GPUs and other devices that need such mirroring to support the latest OpenCL versions. HMM also makes it possible to use discrete GPU memory transparently to the application/game, and more.
With earlier versions of the HMM patches, Jerome pointed to Mellanox as a user of some of the HMM functionality, but now the focus has turned to NVIDIA. Jerome commented, "I have been working with NVidia to bring up this feature on their Pascal GPU. There are real hardware that you can buy today that could benefit from HMM. We also intend to leverage this inside the open source nouveau driver."
John Hubbard of NVIDIA also commented:
We (NVIDIA engineering) have been working closely with Jerome on this for several years now, and I wanted to mention that NVIDIA is committed to using HMM. We've done initial testing of this patchset on Pascal GPUs (a bit more detail below) and it is looking good.
The HMM features are a prerequisite to an important part of NVIDIA's efforts to make writing code for GPUs (and other page-faulting devices) easier--by making it more like writing code for CPUs. A big part of that story involves being able to use malloc'd memory transparently everywhere. Here's a tiny example (in case it's not obvious from the HMM patchset documentation) of HMM in action:
int *p = (int*)malloc(SIZE); *p = 5; /* on the CPU */
int x = *p; /* on a GPU, or on any page-fault-capable device */
1. A device page fault occurs because the malloc'd memory was never allocated in the device's page tables.
2. The device driver receives a page fault interrupt, but fails to recognize the address, so it calls into HMM.
3. HMM knows that p is valid on the CPU, and coordinates with the device driver to unmap the CPU page, allocate a page on the device, and then migrate (copy) the data to the device. This allows full device memory bandwidth to be available, which is critical to getting good performance.
a) Alternatively, leave the page on the CPU, and create a device PTE to point to that page. This might be done if our performance counters show that a page is thrashing.
4. The device driver issues a replay-page-fault to the device.
5. The device program continues running, and x == 5 now.
When version 1 of this patchset was created (2.5 years ago! in May, 2014), one huge concern was that we didn't yet have hardware that could use it. But now we do: Pascal GPUs, which have been shipping this year, all support replayable page faults.
Great to see this come about, and that they will also contribute patches to Nouveau -- likely to demonstrate an open-source "client" of this code rather than just their binary driver, in order to avoid difficulties getting it mainlined. The catch is that this is only supported by Pascal, so for NVIDIA to support it with Nouveau they would first need to provide the signed firmware images for Pascal to bring up that hardware-acceleration support.
With the Linux 4.10 merge window just weeks away, it's not clear whether HMM will be reviewed and readied in time; otherwise it's looking more like Linux 4.11+ material.
More details on the kernel mailing list.