NVIDIA Linux OpenCL Performance vs. Radeon ROCm / AMDGPU-PRO

  • jlavi
    replied
    Being unable to install the full AMDGPU-PRO stack due to package dependencies and version conflicts, I installed only the few shared libraries from opencl-amdgpu-pro-icd* and libdrm-amdgpu-pro-amdgpu1_*4. I am using kernel 4.9 and there is no need for DKMS. It seems the OpenCL libraries from AMDGPU-PRO work just fine with the in-kernel amdgpu driver.

    I wanted to see if the OpenCL performance was the same as in the Phoronix article, so I tried to run the Phoronix Test Suite to verify it. Some of the tests didn't run (Rodinia and LuxMark), and some parts of other tests didn't run (SHOC Triad, FFT SP, Max SP Flops, and Texture Read Bandwidth). But I was able to run everything except the Rodinia test manually from the command line. My card is an MSI RX 480 with a 1303 MHz clock, about 3% higher than the reference card's 1266 MHz.

    The largest gainer and loser were:
    JuliaGPU 113365184 vs. 81972594 (+38%)
    SHOC BusSpeedReadback 12.68 vs. 14.20 (-11%)

    LuxMark varied from +2% to +18%. SHOC FFT SP +4%, SHOC MaxFlops +4%. SHOC BusSpeedDownload -9%.

    It is as if there are two kinds of results.

    GPU-oriented benchmarks, which run mostly in the card's memory and neither use the CPU nor transfer data between CPU and GPU while the benchmark is running, show an improvement of some 2-4%.

    And the tests which depend more on CPU speed or on data transfers between CPU and GPU show larger, double-digit gains or losses compared to the reference card figures from Phoronix.
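
The percentages quoted in this post can be rechecked with a few lines of Python. This is only a sketch of the arithmetic: the helper name `percent_change` is made up for illustration, and the numbers are copied from the post (own result first, Phoronix reference second).

```python
def percent_change(measured: float, reference: float) -> float:
    """Relative difference of measured vs. reference, in percent."""
    return (measured / reference - 1.0) * 100.0


# (own result, reference figure) pairs as quoted in the post
results = {
    "core clock (MHz)": (1303, 1266),
    "JuliaGPU": (113365184, 81972594),
    "SHOC BusSpeedReadback": (12.68, 14.20),
}

for name, (measured, reference) in results.items():
    # Rounded to whole percent, as in the post
    print(f"{name}: {percent_change(measured, reference):+.0f}%")
```

The clock difference alone accounts for roughly 3%, which matches the 2-4% gain seen in the GPU-bound tests; the larger swings have to come from somewhere else.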


  • ariel
    replied
    Is there a guide/howto to build ROCm and use ROCm OpenCL for darktable on Fedora 25?

    For amdgpu-pro, the OpenCL-specific RHEL packages that AMD has on their site seem to work for darktable.

    This is for an RX 460, and I'm only interested in OpenCL acceleration for darktable (and GIMP).


  • bug77
    replied
    Originally posted by smitty3268

    Nvidia isn't really gimping CL performance (though it's not great compared to the effort they put into other APIs) but rather features. As in, they only support 1.2.

    It's basically pretty evident that they view CL as a checkbox feature they have to support but don't really care about.
    And yet they beat AMD in several benchmarks. That's a pretty nifty checkbox feature.


  • smitty3268
    replied
    Originally posted by bug77
    Hm, I thought Nvidia were gimping OpenCL to push CUDA instead. And yet, with only OpenCL 1.2 support, they're pretty competitive with AMD's OpenCL 2.0 solutions.
    Nvidia isn't really gimping CL performance (though it's not great compared to the effort they put into other APIs) but rather features. As in, they only support 1.2.

    It's basically pretty evident that they view CL as a checkbox feature they have to support but don't really care about.


  • duby229
    replied
    It's really not that surprising. AMD's proprietary OpenGL driver may have performance issues, but their OpenCL driver has always performed pretty well. In past hardware generations it put NVIDIA's to shame.


  • agd5f
    replied
    Originally posted by andrei_me

    Do you mean DPM not setting the relevant parts to higher clocks with OpenCL like they do on graphics workloads?
    DPM sets the clocks dynamically based on GPU load, but the load patterns of some compute tasks don't always create the conditions necessary to keep the clocks at optimal levels.


  • andrei_me
    replied
    Originally posted by agd5f
    I would suggest forcing the clocks to high on the AMD cards as well. Some compute loads don't work well with the automatic settings.
    Do you mean DPM not setting the relevant parts to higher clocks with OpenCL like they do on graphics workloads?


  • agd5f
    replied
    I would suggest forcing the clocks to high on the AMD cards as well. Some compute loads don't work well with the automatic settings.
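
One way to do this with the in-kernel amdgpu driver is, as far as I know, to write `high` to the card's `power_dpm_force_performance_level` sysfs file. The sketch below assumes the AMD card is `card0` (check `/sys/class/drm` on your own system) and must be run as root; the function name `set_performance_level` is only illustrative.

```python
from pathlib import Path

# Assumption: the AMD GPU is card0. The index differs on multi-GPU systems.
DEFAULT_NODE = Path("/sys/class/drm/card0/device/power_dpm_force_performance_level")


def set_performance_level(level: str = "high", node: Path = DEFAULT_NODE) -> str:
    """Write a DPM performance level to the sysfs node and return the value read back."""
    if level not in ("auto", "low", "high"):
        raise ValueError(f"unsupported level: {level}")
    node.write_text(level + "\n")
    return node.read_text().strip()


# Example (as root), before starting the benchmark:
#   set_performance_level("high")
# and afterwards, to return to dynamic clocking:
#   set_performance_level("auto")
```

Switching back to `auto` after the run avoids leaving the card at full clocks while idle.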


  • taxi_bs
    replied
    Michael, there are some indications that the OpenCL numbers could be a lot better on some NVIDIA cards with a changed configuration. Try raising opencl_memory_headroom from 300 to 600 (for example; maybe less or more) in the darktablerc file.
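
For repeated benchmark runs the setting can also be changed with a script. The sketch below assumes darktablerc is a plain `key=value` text file, usually found at `~/.config/darktable/darktablerc`; the helper name `set_rc_value` is made up for illustration. darktable may rewrite the file on exit, so it is safest to edit it while darktable is not running.

```python
from pathlib import Path


def set_rc_value(rc_path: Path, key: str, value: str) -> None:
    """Set key=value in a darktablerc-style file, replacing the entry or appending it."""
    lines = rc_path.read_text().splitlines() if rc_path.exists() else []
    prefix = key + "="
    updated = False
    for i, line in enumerate(lines):
        if line.startswith(prefix):
            lines[i] = prefix + value
            updated = True
    if not updated:
        lines.append(prefix + value)
    rc_path.write_text("\n".join(lines) + "\n")


# Example (path is an assumption, adjust to your setup):
#   set_rc_value(Path.home() / ".config/darktable/darktablerc",
#                "opencl_memory_headroom", "600")
```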


  • r1348
    replied
    Originally posted by devius

    And some are far worse... no idea what to make of these results.
    Yes, very inconsistent results across the spectrum. Right now there's no real best-buy if OpenCL computing is your thing.
