AMD has released ROCm 5.2 as the newest version of its open-source GPU compute stack for Linux.

New in ROCm 5.2 are a number of HIP APIs, support for device-side memory allocation (malloc) in the HIP-Clang compiler, the new rocWMMA library, new test/benchmark executables for various components, some new routines for rocSOLVER, the removal of Navi 12 / GFX1011 support from the rocBLAS fat binary, and OpenMP tracing (OMPT) target support for device tracing.

The new HIP APIs cover device management, HIP run-time memory management, graph management, and virtual memory management.

"rocWMMA provides a C++ API to facilitate breaking down matrix multiply accumulate problems into fragments and using them in block-wise operations that are distributed in parallel across GPU wavefronts. The API is a header library of GPU device code, meaning matrix core acceleration may be compiled directly into your kernel device code. This can benefit from compiler optimization in the generation of kernel assembly and does not incur additional overhead costs of linking to external runtime libraries or having to launch separate kernels.

rocWMMA is released as a header library and includes test and sample projects to validate and illustrate example usages of the C++ API. GEMM matrix multiplication is used as primary validation given the heavy precedent for the library. However, the usage portfolio is growing significantly and demonstrates different ways rocWMMA may be consumed."
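
To make the quoted description more concrete, here is a minimal CPU-only sketch in plain C++ of the general idea of breaking a matrix multiply-accumulate into small fragments and combining them block by block. It is not the rocWMMA API and does not run on a GPU: the names `Fragment`, `kTile`, `load_tile`, `mma_tile`, `store_tile`, and `tiled_gemm` are all hypothetical and chosen only for illustration, and the loops over output tiles stand in for work that a GPU library would spread across wavefronts.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical tile edge length for this sketch; not a rocWMMA constant.
constexpr std::size_t kTile = 4;

// A small square tile held in local storage, loosely analogous to a fragment.
struct Fragment {
    float v[kTile][kTile];
};

// Copy one kTile x kTile tile out of a row-major matrix with leading dimension ld.
Fragment load_tile(const std::vector<float>& m, std::size_t ld,
                   std::size_t row, std::size_t col) {
    Fragment f{};
    for (std::size_t i = 0; i < kTile; ++i)
        for (std::size_t j = 0; j < kTile; ++j)
            f.v[i][j] = m[(row + i) * ld + (col + j)];
    return f;
}

// Multiply-accumulate on fragments: acc += a * b.
void mma_tile(const Fragment& a, const Fragment& b, Fragment& acc) {
    for (std::size_t i = 0; i < kTile; ++i)
        for (std::size_t j = 0; j < kTile; ++j)
            for (std::size_t k = 0; k < kTile; ++k)
                acc.v[i][j] += a.v[i][k] * b.v[k][j];
}

// Write one accumulated tile back into a row-major matrix.
void store_tile(std::vector<float>& m, std::size_t ld,
                std::size_t row, std::size_t col, const Fragment& f) {
    for (std::size_t i = 0; i < kTile; ++i)
        for (std::size_t j = 0; j < kTile; ++j)
            m[(row + i) * ld + (col + j)] = f.v[i][j];
}

// C = A * B for row-major A (M x K), B (K x N), C (M x N).
// Assumes M, N, and K are multiples of kTile.
std::vector<float> tiled_gemm(const std::vector<float>& a,
                              const std::vector<float>& b,
                              std::size_t M, std::size_t N, std::size_t K) {
    assert(M % kTile == 0 && N % kTile == 0 && K % kTile == 0);
    assert(a.size() == M * K && b.size() == K * N);
    std::vector<float> c(M * N, 0.0f);
    // Each (bi, bj) output tile is independent of the others, which is
    // what makes this decomposition suitable for parallel execution.
    for (std::size_t bi = 0; bi < M; bi += kTile) {
        for (std::size_t bj = 0; bj < N; bj += kTile) {
            Fragment acc{};  // zero-initialized accumulator tile
            for (std::size_t bk = 0; bk < K; bk += kTile) {
                mma_tile(load_tile(a, K, bi, bk), load_tile(b, N, bk, bj), acc);
            }
            store_tile(c, N, bi, bj, acc);
        }
    }
    return c;
}
```

Calling `tiled_gemm(a, b, M, N, K)` with row-major inputs returns the product as a flat `std::vector<float>`. The load, multiply-accumulate, store sequence per output tile is the part that corresponds to the fragment-based workflow in the quote; everything else here is ordinary host code.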