I would suggest forcing the clocks to high on the AMD cards as well. Some compute loads don't work well with the automatic settings.
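For anyone wanting to try this, here is a minimal sketch of forcing the clocks to high with the amdgpu kernel driver. It assumes the GPU is `card0` and that the driver exposes the `power_dpm_force_performance_level` sysfs file; the helper name `set_dpm_level` is made up for illustration, and writing to the real file needs root.

```python
from pathlib import Path

# Assumption: the GPU is card0; the index may differ on multi-GPU systems.
DPM_LEVEL_PATH = Path("/sys/class/drm/card0/device/power_dpm_force_performance_level")

# Levels accepted by this sketch; "auto" is the driver default.
VALID_LEVELS = {"auto", "low", "high"}


def set_dpm_level(level, path=DPM_LEVEL_PATH):
    """Write a DPM performance level and return the value read back."""
    if level not in VALID_LEVELS:
        raise ValueError(f"unsupported level: {level!r}")
    path = Path(path)
    path.write_text(level + "\n")
    return path.read_text().strip()


# Usage (as root), before and after a compute run:
#   set_dpm_level("high")
#   set_dpm_level("auto")
```

Setting it back to `auto` afterwards keeps the card from sitting at full clocks while idle.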
NVIDIA Linux OpenCL Performance vs. Radeon ROCm / AMDGPU-PRO
Originally posted by andrei_me View Post
Do you mean DPM not setting the relevant parts to higher clocks with OpenCL like they do on graphics workloads?
-
Originally posted by bug77 View Post
Hm, I thought Nvidia were gimping OpenCL to push CUDA instead. And yet, with only OpenCL 1.2 support, they're pretty competitive with AMD's OpenCL 2.0 solutions.
It's basically pretty evident that they view CL as a checkbox feature they have to support but don't really care about.
-
Originally posted by smitty3268 View Post
Nvidia isn't really gimping CL performance (though it's not great compared to the effort they put into other APIs) but rather features. As in, they only support 1.2.
It's basically pretty evident that they view CL as a checkbox feature they have to support but don't really care about.
-
Is there a guide/howto for building ROCm and using ROCm OpenCL with Darktable on Fedora 25?
For amdgpu-pro, the OpenCL-specific RHEL packages that AMD has on their site seem to work for Darktable.
This is for an RX 460, and I'm only interested in OpenCL acceleration for Darktable (and GIMP).
-
Being unable to install the full AMDGPU-PRO stack due to package dependencies and version conflicts, I have installed only the few shared libraries from opencl-amdgpu-pro-icd* and libdrm-amdgpu-pro-amdgpu1_*4. I am using kernel 4.9 and there is no need for DKMS. It seems the OpenCL libraries from AMDGPU-PRO work just fine with the in-kernel amdgpu driver.
I wanted to see if the OpenCL performance was the same as in the Phoronix article, so I tried to run the Phoronix Test Suite to verify it. Some of the tests didn't run (Rodinia and Luxmark), and some parts of other tests didn't run (SHOC Triad, FFT SP, Max SP Flops, and Texture Read Bandwidth). But I was able to run all but the Rodinia test manually from the command line. My card is an MSI RX 480 with a 1303 MHz clock, +3% compared to the reference card at 1266 MHz.
The largest gainer and loser were:
JuliaGPU 113365184 vs. 81972594 (+38%)
SHOC BusSpeedReadback 12.68 vs. 14.20 (-11%)
Luxmark varied from +2 to +18%. SHOC FFT sp +4%, SHOC MaxFlops +4%. SHOC BusSpeedDownload -9%.
It is as if there are two kinds of results.
GPU-oriented benchmarks, which run mostly in the card's memory and neither use the CPU nor transfer data between CPU and GPU while the benchmark is running, show a 2-4% improvement.
Tests that depend more on CPU speed or on data transfers between CPU and GPU show a larger, double-digit gain or loss compared to the Phoronix reference card figures.
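The percentages quoted above can be reproduced from the raw figures with a short script. This is only a sketch of the arithmetic (my result first, Phoronix reference second); the function name `percent_diff` is just for illustration.

```python
def percent_diff(measured, reference):
    """Relative difference of measured vs. reference, in percent."""
    return (measured / reference - 1.0) * 100.0


# (my result, Phoronix reference) pairs taken from the post above
results = {
    "core clock (MHz)": (1303, 1266),
    "JuliaGPU": (113365184, 81972594),
    "SHOC BusSpeedReadback": (12.68, 14.20),
}

for name, (mine, ref) in results.items():
    print(f"{name}: {percent_diff(mine, ref):+.0f}%")
# core clock (MHz): +3%
# JuliaGPU: +38%
# SHOC BusSpeedReadback: -11%
```

The +3% clock difference roughly accounts for the 2-4% gains on the GPU-bound tests, but not for the double-digit swings.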