NVIDIA Linux OpenCL Performance vs. Radeon ROCm / AMDGPU-PRO


  • #11
    I would suggest forcing the clocks to high on the AMD cards as well. Some compute loads don't work well with the automatic settings.



    • #12
      Originally posted by agd5f
      I would suggest forcing the clocks to high on the AMD cards as well. Some compute loads don't work well with the automatic settings.
      Do you mean DPM not setting the relevant parts to higher clocks with OpenCL like they do on graphics workloads?



      • #13
        Originally posted by andrei_me

        Do you mean DPM not setting the relevant parts to higher clocks with OpenCL like they do on graphics workloads?
        DPM sets the clocks dynamically based on GPU load, but some compute tasks don't generate the load patterns needed to keep the clocks at optimal levels.
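A minimal sketch of forcing the clocks on the open amdgpu driver via sysfs; `card0` is an assumption here, so check which entry under /sys/class/drm actually matches your GPU:

```shell
# Force the highest clocks for a compute run (requires root).
# card0 is an assumed index; verify the right entry in /sys/class/drm/.
echo high | sudo tee /sys/class/drm/card0/device/power_dpm_force_performance_level

# Restore dynamic power management when the run is finished:
echo auto | sudo tee /sys/class/drm/card0/device/power_dpm_force_performance_level
```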



        • #14
          It's really not that surprising. AMD's proprietary OpenGL driver may have performance issues, but their OpenCL driver has always performed pretty well. In past hardware generations it put NVIDIA's to shame.



          • #15
            Originally posted by bug77
            Hm, I thought Nvidia were gimping OpenCL to push CUDA instead. And yet, with only OpenCL 1.2 support, they're pretty competitive with AMD's OpenCL 2.0 solutions.
            Nvidia isn't really gimping CL performance (though it's not great compared to the effort they put into other APIs) but rather features. As in, they only support 1.2.

            It's basically pretty evident that they view CL as a checkbox feature they have to support but don't really care about.



            • #16
              Originally posted by smitty3268

              Nvidia isn't really gimping CL performance (though it's not great compared to the effort they put into other APIs) but rather features. As in, they only support 1.2.

              It's basically pretty evident that they view CL as a checkbox feature they have to support but don't really care about.
              And yet they beat AMD in several benchmarks. That's a pretty nifty checkbox feature.



              • #17
                Is there a guide/howto to build ROCm and use its OpenCL stack for darktable on Fedora 25?

                For amdgpu-pro, the OpenCL-specific RHEL packages that AMD has on their site seem to work for darktable.

                This is for an RX 460, and I'm only interested in OpenCL acceleration for darktable (and GIMP).



                • #18
                  Being unable to install the full AMDGPU-PRO stack due to package dependencies and version conflicts, I installed only the few shared libraries from opencl-amdgpu-pro-icd* and libdrm-amdgpu-pro-amdgpu1_*4. I am using kernel 4.9, so there is no need for DKMS. The OpenCL libraries from AMDGPU-PRO seem to work just fine with the in-kernel amdgpu driver.
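What this amounts to is registering the extracted library with the system OpenCL ICD loader. A sketch, with assumed file names and install paths (the actual names depend on the package version, so adjust to what you extracted):

```shell
# Assumed library name and destination; adjust to the extracted
# AMDGPU-PRO package contents. The ICD loader discovers vendor
# drivers via small text files in /etc/OpenCL/vendors/ that each
# contain the path to one vendor library.
sudo cp libamdocl64.so /opt/amdgpu-pro/lib64/
echo /opt/amdgpu-pro/lib64/libamdocl64.so | sudo tee /etc/OpenCL/vendors/amdocl64.icd

# Verify the platform is visible to OpenCL applications:
clinfo | grep -i 'platform name'
# If darktable is installed, its diagnostic tool checks whether
# darktable's OpenCL support can initialize against this driver:
darktable-cltest
```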

                  I wanted to see if I could reproduce the OpenCL performance from the Phoronix article, so I tried running the Phoronix Test Suite. Some tests didn't run at all (Rodinia and LuxMark) and some subtests failed (SHOC Triad, FFT SP, MAX SP Flops, and Texture Read Bandwidth), but I was able to run everything except Rodinia manually from the command line. My card is an MSI RX 480 clocked at 1303 MHz, +3% over the 1266 MHz reference card.

                  The largest gainer and loser were:
                  JuliaGPU: 113365184 vs. 81972594 (+38%)
                  SHOC BusSpeedReadback: 12.68 vs. 14.20 (-11%)

                  LuxMark varied from +2% to +18%. SHOC FFT SP +4%, SHOC MaxFlops +4%, SHOC BusSpeedDownload -9%.

                  It is as if there are two kinds of results.

                  GPU-oriented benchmarks, which run mostly in the card's memory and don't transfer data between CPU and GPU while the benchmark is running, show roughly a 2-4% improvement.

                  And the tests that depend more on CPU speed or on data transfers between CPU and GPU show larger, double-digit gains or losses compared to Phoronix's reference card figures.
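The percentages above are just my score divided by the reference score, minus one; the two extremes can be checked with a one-liner:

```shell
# Percent difference of my result vs. the Phoronix reference result:
awk 'BEGIN { printf "%+.0f%%\n", (113365184 / 81972594 - 1) * 100 }'   # prints +38%
# Same formula for a transfer figure where lower is worse:
awk 'BEGIN { printf "%+.0f%%\n", (12.68 / 14.20 - 1) * 100 }'          # prints -11%
```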

