AMD Releases ROCm 6.2 With New Components, Improves PyTorch & TensorFlow
As expected, AMD has released ROCm 6.2 as the newest version of its open-source GPU compute stack for Radeon graphics cards and Instinct accelerators. ROCm 6.2 is a big update that adds several new software components, improves the existing PyTorch and TensorFlow support, and brings a variety of other enhancements as AMD works to better compete with NVIDIA's CUDA.
The new components in ROCm 6.2 are Omniperf, Omnitrace, rocPyDecode, and ROCprofiler-SDK. Omniperf is a kernel-level profiling tool for machine learning and HPC workloads. Omnitrace is a new multi-purpose analysis tool for profiling and tracing on both the CPU and GPU(s). The rocPyDecode package is for accessing ROCm's rocDecode APIs from the Python programming language. Lastly, ROCprofiler-SDK is for profiling and tracing HIP and ROCm applications.
In addition to the new software components, ROCm 6.2 has its math libraries now defaulting to the Clang compiler rather than HIPCC. There is also better PyTorch support with v2.2/v2.3 enabled, and TensorFlow integration now works with TensorFlow 2.16. ROCm PyTorch support also now includes Autocast as the automatic mixed precision mode. There is also optimized native framework support introduced for OpenXLA.
ROCm 6.2 further provides memory savings for "bitsandbytes" model quantization, improves vLLM support, and enhances performance tuning for AMD Instinct accelerators. On the vLLM side there is now FP16 and BF16 precision support for large language models, and FP8 support is working for Llama. There is additional work around multi-GPU execution and other refinements in vLLM support.
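For readers unfamiliar with the two 16-bit formats mentioned above, here is a minimal sketch of how FP16 and BF16 differ. It uses only the Python standard library and does not touch ROCm or vLLM at all. The helper names to_fp16 and to_bf16 are hypothetical, and the BF16 conversion simply truncates a 32-bit float, whereas real hardware typically rounds to nearest.

```python
import struct


def to_fp16(x: float) -> float:
    """Round x to IEEE 754 half precision (FP16) and convert it back."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # Python refuses to pack finite values beyond the FP16 range (max 65504);
        # report them as infinity, which is what an FP16 overflow produces.
        return float("inf") if x > 0 else float("-inf")


def to_bf16(x: float) -> float:
    """Approximate bfloat16 by keeping the top 16 bits of the float32 encoding.

    Assumption for illustration: plain truncation instead of round-to-nearest.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]


if __name__ == "__main__":
    for value in (0.1, 70000.0):
        print(value, "FP16:", to_fp16(value), "BF16:", to_bf16(value))
```

FP16 has more mantissa bits, so it lands closer to 0.1, while BF16 keeps the 8-bit exponent of a 32-bit float and so can still represent 70000 approximately where FP16 overflows. That trade-off is the usual reason BF16 is favored for large language models.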
ROCm 6.2 also introduces an offline installer to help those running ROCm on systems without an active Internet connection. ROCm 6.2 is also the first release officially supporting Ubuntu 24.04 LTS.
Downloads and more information on the ROCm 6.2 compute stack release are available via rocm.docs.amd.com. On the AMD Radeon consumer side (besides Radeon PRO and Instinct), the officially supported graphics cards with ROCm 6.2 are the Radeon RX 7900 GRE / RX 7900 XT / RX 7900 XTX, along with deprecated support for the Radeon VII.