Intel Xe Developers Begin Looking At Cross-Device & Cross-Driver HMM


  • Intel Xe Developers Begin Looking At Cross-Device & Cross-Driver HMM

    Phoronix: Intel Xe Developers Begin Looking At Cross-Device & Cross-Driver HMM

    The Intel open-source engineers working on the modern Xe DRM kernel graphics driver have begun looking at Heterogeneous Memory Management (HMM) support for cross-device and cross-driver scenarios as the latest exciting feature work for this still-experimental driver...


  • #2
    Ummm… is this Intel tooling up something like AMD's Infinity Architecture?

    • #3
      Originally posted by Jumbotron
      Ummm… is this Intel tooling up something like AMD's Infinity Architecture?
      What is that (Infinity Architecture)?

      I thought SVM was, generically and conceptually, similar to (identical to?) having 64-bit virtual addresses (or some subset thereof) that, when accessed, resolve to memory residing in "any" location within the system: system RAM, VRAM, other device memory, mmapped storage, and so on. Reads, writes, DMA, and the like on such an address then just work, as one would hope when a memory region is meant to be shared between devices and processes in the system.

      There's some commentary about SVM in some of the ARC-related documentation, IIRC.
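
      To make that concrete in CUDA terms (a rough sketch, not Xe-specific: it assumes an HMM-capable kernel/driver stack, and the input file name is made up), even file-backed mmap memory can be faulted in by the GPU through the very same virtual address the CPU uses:

      #include <cstdio>
      #include <fcntl.h>
      #include <sys/mman.h>
      #include <sys/stat.h>
      #include <unistd.h>
      #include <cuda_runtime.h>

      // GPU kernel reads through an ordinary CPU virtual address; under
      // SVM/HMM the same pointer is valid on both processors.
      __global__ void sum_bytes(const unsigned char *buf, size_t n,
                                unsigned long long *total)
      {
          size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
          if (i < n)
              atomicAdd(total, (unsigned long long)buf[i]);
      }

      int main()
      {
          int fd = open("input.bin", O_RDONLY);   // hypothetical input file
          struct stat st;
          fstat(fd, &st);

          // File-backed mapping: with HMM the GPU can fault these pages in too.
          unsigned char *buf = (unsigned char *)mmap(NULL, st.st_size, PROT_READ,
                                                     MAP_PRIVATE, fd, 0);

          unsigned long long *total;
          cudaMallocManaged(&total, sizeof(*total));
          *total = 0;

          int threads = 256;
          int blocks = (int)((st.st_size + threads - 1) / threads);
          sum_bytes<<<blocks, threads>>>(buf, (size_t)st.st_size, total);
          cudaDeviceSynchronize();

          printf("byte sum = %llu\n", *total);
          munmap(buf, st.st_size);
          close(fd);
          return 0;
      }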

      I think a fairly ideal case would be something like the Heterogeneous Memory Management (HMM) that NVIDIA talks about:

      "...
      Heterogeneous Memory Management (HMM) is a CUDA memory management feature that extends the simplicity and productivity of the CUDA Unified Memory programming model to include system allocated memory on systems with PCIe-connected NVIDIA GPUs. System allocated memory refers to memory that is ultimately allocated by the operating system; for example, through malloc, mmap, the C++ new operator (which of course uses the preceding mechanisms), or related system routines that set up CPU-accessible memory for the application.

      Previously, on PCIe-based machines, system allocated memory was not directly accessible by the GPU. The GPU could only access memory that came from special allocators such as cudaMalloc or cudaMallocManaged.

      With HMM enabled, all application threads (GPU or CPU) can directly access all of the application’s system allocated memory. As with Unified Memory (which can be thought of as a subset of, or precursor to HMM), there is no need to manually copy system allocated memory between processors. This is because it is automatically placed on the CPU or GPU, based on processor usage.

      Within the CUDA driver stack, CPU and GPU page faults are typically used to discover where the memory should be placed. Again, this automatic placement already happens with Unified Memory—HMM simply extends the behavior to cover system allocated memory as well as cudaMallocManaged memory.

      This new ability to directly read or write to the full application memory address space will significantly improve programmer productivity for all programming models built on top of CUDA: CUDA C++, Fortran, standard parallelism in Python, ISO C++, ISO Fortran, OpenACC, OpenMP, and many others.

      In fact, as the upcoming examples demonstrate, HMM simplifies GPU programming to the point that GPU programming is nearly as accessible as CPU programming. Some highlights:
      • Explicit memory management is not required for functionality when writing a GPU program; therefore, an initial “first draft” program can be small and simple. Explicit memory management (for performance tuning) can be deferred to a later phase of development.
      • GPU programming is now practical for programming languages that do not distinguish between CPU and GPU memory.
      • Large applications can be GPU-accelerated without requiring large memory management refactoring, or changes to third-party libraries (for which source code is not always available).

      As an aside, new hardware platforms such as NVIDIA Grace Hopper natively support the Unified Memory programming model through hardware-based memory coherence among all CPUs and GPUs. For such systems, HMM is not required, and in fact, HMM is automatically disabled there. One way to think about this is to observe that HMM is effectively a software-based way of providing the same programming model as an NVIDIA Grace Hopper Superchip.

      ..."
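
      For what it's worth, here's a rough sketch of the quoted "first draft" idea (again assuming an HMM-enabled stack; without HMM the plain malloc below would have to be cudaMallocManaged, or be staged through cudaMalloc plus explicit cudaMemcpy):

      #include <cstdio>
      #include <cstdlib>
      #include <cuda_runtime.h>

      // No cudaMalloc, no cudaMemcpy: with HMM the GPU dereferences
      // ordinary malloc'd memory, and pages migrate on demand via faults.
      __global__ void scale(float *v, int n, float s)
      {
          int i = blockIdx.x * blockDim.x + threadIdx.x;
          if (i < n)
              v[i] *= s;
      }

      int main()
      {
          const int n = 1 << 20;
          float *v = (float *)malloc(n * sizeof(float)); // plain system allocation
          for (int i = 0; i < n; ++i)
              v[i] = 1.0f;

          scale<<<(n + 255) / 256, 256>>>(v, n, 2.0f);   // GPU touches it directly
          cudaDeviceSynchronize();

          printf("v[0] = %f\n", v[0]);                   // CPU sees the result
          free(v);
          return 0;
      }

      On a hardware-coherent platform like the Grace Hopper setup mentioned above, the same source should work unchanged, with the hardware providing what HMM emulates in software.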

