RADV's ACO Back-End Can Be A Massive Win For Vulkan Compute - Not Just Gaming
While the Mesa "RADV" Radeon Vulkan driver's "ACO" back-end was developed and funded by Valve with gaming in mind to optimize game load times and help with delivering optimal performance, it turns out ACO works damn well for some Vulkan compute workloads too.
The recent Vulkan neural network performance tests and the follow-up NCNN inference Vulkan tests were done on the AMD side with Mesa 20.3-devel, where ACO is already the default, and it delivered generally strong performance against NVIDIA. The performance was great on Mesa's RADV with ACO, but with one of the Tencent developers working on NCNN having mentioned that ACO is a big help, I was curious to see what the previous state was, and how things look when manually opting for the AMDGPU LLVM compiler back-end rather than ACO.
So first I ran some tests of Mesa 20.0 (as packaged on Ubuntu 20.04 LTS, where AMDGPU LLVM is the default back-end) compared to Mesa 20.2.0 stable and then Mesa 20.3-devel. All tests were done on an AMD Ryzen 5 4500U "Renoir" laptop.
Damn! Mesa 20.2/20.3-devel are dramatically faster than Mesa 20.0 as shipped by Ubuntu 20.04 for the Vulkan neural network framework benchmarks with NCNN... In many cases, the worst-case performance with the newer Mesa releases was around the speed of the best-case Mesa 20.0 numbers on this AMD Renoir system.
To confirm that the gain was indeed due to ACO and not other changes, I did a follow-up run on the same Mesa 20.3-devel build, once with the default (ACO) and once while forcing the AMDGPU LLVM back-end.
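For anyone wanting to repeat that A/B comparison, here is a minimal sketch of how the two runs could be scripted. It assumes a Mesa build where ACO is the default (20.2 or newer), on which RADV falls back to the AMDGPU LLVM back-end when the `RADV_DEBUG` environment variable contains `llvm`. The article does not say exactly how the back-end was forced, so treat this as one plausible way rather than the method used here. The helper names `radv_env` and `run_benchmark` are hypothetical, as is whatever benchmark command you pass in.

```python
import os
import subprocess


def radv_env(backend, base_env=None):
    """Return a copy of the environment selecting a RADV shader compiler.

    backend: "aco" (the default on Mesa 20.2+) or "llvm".
    Assumption: RADV_DEBUG=llvm selects the AMDGPU LLVM back-end.
    """
    if backend not in ("aco", "llvm"):
        raise ValueError("backend must be 'aco' or 'llvm'")

    env = dict(os.environ if base_env is None else base_env)

    # RADV_DEBUG is a comma-separated flag list; keep unrelated flags intact.
    flags = [f for f in env.get("RADV_DEBUG", "").split(",") if f and f != "llvm"]
    if backend == "llvm":
        flags.append("llvm")

    if flags:
        env["RADV_DEBUG"] = ",".join(flags)
    else:
        env.pop("RADV_DEBUG", None)
    return env


def run_benchmark(cmd, backend):
    """Run a benchmark command (a list of arguments) under the chosen back-end."""
    return subprocess.run(cmd, env=radv_env(backend), check=True)
```

Calling `run_benchmark(cmd, "aco")` and then `run_benchmark(cmd, "llvm")` with the same command on the same Mesa build isolates the compiler back-end as the only variable, which is the point of the follow-up run.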
RADV with ACO is massively faster across the board than the AMDGPU LLVM back-end officially developed by AMD. The size of these wins is even greater than the difference we see in gaming, the use case for which Valve originally funded ACO and on which it continues to focus. Granted, it takes RADV+ACO for AMD Radeon performance to be comparable to that of NVIDIA's GeForce GPUs on their proprietary Linux driver, as shown in the previous articles. Had those comparisons been done prior to RADV switching to ACO, NVIDIA would have won by a massive margin.
Now if only more deep learning software and other compute workloads supported Vulkan...