AMD OpenCL APP SDK Beats Intel's Own SDK On Ivy Bridge

  • AMD OpenCL APP SDK Beats Intel's Own SDK On Ivy Bridge

    Phoronix: AMD OpenCL APP SDK Beats Intel's Own SDK On Ivy Bridge

    Here are some OpenCL benchmarks from the Intel Ivy Bridge CPU. The comparison is between AMD's APP SDK, which supports running OpenCL on x86 CPUs, and Intel's CPU-based OpenCL SDK for Linux. Somewhat surprisingly, AMD's Accelerated Parallel Processing SDK is actually faster on the Ivy Bridge CPU than the Intel OpenCL SDK on the same hardware.


  • #2
    Irony

    Originally posted by Michael
    This is a bit ironic that AMD's OpenCL environment is actually faster on Intel hardware than Intel's own code.
    Even more ironic: a few years ago Intel crippled performance on AMD CPUs in the Intel compiler suite (ICC), and some would say it still does to some degree (although I haven't paid close attention or kept track).

    Now, it turns out AMD's OpenCL APP SDK actually improves performance on Intel's CPU. awesome

    funny shit.



    • #3
      Intel should just deprecate their SDK in favor of (and contribute to) the open source one, like they're doing with Mesa. I'm sure someone over at Intel has done some math and figured out that AMD is benefiting from Intel's work on Mesa, so now it's AMD's turn to return the favor, so to speak.

      1+ for the users.

      Plus Intel could reallocate some of its resources elsewhere. Like another dev for Mesa maybe? :-)
      Last edited by halfmanhalfamazing; 13 June 2012, 04:07 PM.



      • #4
        It's funny because AMD can't even write software, so what does that say about Intel?



        • #5
          The problem is Intel really doesn't like OpenCL.

          It highlights the weakest link in their hardware. AMD, though, can actually benefit from strong OpenCL support. It is a great selling point: use our software, and even if your code runs on Intel hardware you will get the best performance.

          As for benchmarking, I'd really love to see some graphs of the better x86 hardware and the better GPUs side by side, to determine whether there really is an advantage to those expensive GPUs. An APU or two thrown into the mix would be nice.



          • #6
            What's the point of OpenCL on the CPU? Using fairly "meh" AMD graphics cards with a rather poorly optimized OpenCL implementation (Catalyst), you can easily get more than twice the performance of a high-end CPU. If you switch to an Nvidia Kepler GPU on the Nvidia binary driver, holy cow, look out, can we say ZING? Or, you know, Tesla dedicated compute cards (with no graphics ports) are probably the best, since they are specifically designed for GPGPU.

            If we assign a Core i7 3770K using an ideal software OpenCL implementation a score of "1", you'd have a chart looking something like this:
            Core i7 3770K CPU: 1
            Core i7 3770K GPU: 1.5? (haven't actually tested but it isn't going to be nearly as fast as a discrete chip)
            Radeon HD6870: 2
            Radeon HD7970: 4 or 5
            Nvidia GTX 680: 8 to 10
            Nvidia Tesla K10 (single precision only): 20

            Considering you'd need 20 Ivy Bridge CPUs to equal the throughput of a K10 which costs $2900, but each of those CPUs would also need a motherboard and RAM and PSU, making it way more expensive... you're probably better off going with the K10. Just a guess.
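
            The arithmetic behind that guess can be sketched as follows. This is only an illustration: the relative scores are the rough guesses from the chart above, not measurements, the $2900 figure is the one quoted in the post, and the function names are hypothetical. It also ignores the host system a K10 itself would need.

```python
# Relative OpenCL throughput, copied from the rough chart above
# (Core i7 3770K CPU = 1). These are the poster's guesses, not measurements.
RELATIVE_SCORE = {
    "Core i7 3770K CPU": 1.0,
    "Radeon HD6870": 2.0,
    "Nvidia Tesla K10": 20.0,
}

K10_PRICE_USD = 2900.0  # price quoted in the post


def devices_needed(target: str, baseline: str) -> float:
    """How many `baseline` devices match one `target` device in throughput."""
    return RELATIVE_SCORE[target] / RELATIVE_SCORE[baseline]


def break_even_price(target_price: float, target: str, baseline: str) -> float:
    """Price per complete baseline system at which both options cost the same."""
    return target_price / devices_needed(target, baseline)


cpus = devices_needed("Nvidia Tesla K10", "Core i7 3770K CPU")
limit = break_even_price(K10_PRICE_USD, "Nvidia Tesla K10", "Core i7 3770K CPU")
print(f"{cpus:.0f} CPU systems match one K10")
print(f"CPU route is cheaper only below ${limit:.0f} per complete system")
```

            Under those assumed scores, each complete CPU system (CPU, motherboard, RAM and PSU together) would have to cost less than the break-even figure for the CPU route to win.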

            So the use cases for these devices turn out being something like:

            1. I want to run LuxMark for 5 minutes just for fun! ---> Use Ivy Bridge 3770K CPU OpenCL.
            2. I want to encode a casual, short video for YouTube faster than my CPU could do it! ---> Use a desktop or laptop Radeon or GeForce with a video codec supporting OpenCL. Or if you have working drivers, use Intel QSV.
            3. I want to get in on the next big IPO before Warren Buffett's cronies! ---> Use a Tesla K10 (or many of them).



            • #7
              Hmm

              Would it make any difference to say that the developer of LuxMark uses the AMD APP SDK to develop LuxMark, on an AMD GPU system? (It was a 5850 + 2x 5870s at one time.)

              Now I know OpenCL is meant to run on any supported hardware, but if LuxMark had been written with the Intel OpenCL SDK for Intel CPUs first (given that different optimizations could have made a difference), then perhaps it could be faster with the Intel OpenCL solution.

              I only suggest this because LuxMark was faster on AMD hardware (even more so before the crippled OpenCL), and I wonder whether Nvidia hardware could have been faster if it had been written in CUDA, or in OpenCL targeted at the Fermi architecture.

              I'm no programmer though, so input from someone who might be able to agree/debunk what I said would be good :P

              Also +1 AMD



              • #8
                Originally posted by allquixotic
                .......
                If we assign a Core i7 3770K using an ideal software OpenCL implementation a score of "1", you'd have a chart looking something like this:
                Core i7 3770K CPU: 1
                Core i7 3770K GPU: 1.5? (haven't actually tested but it isn't going to be nearly as fast as a discrete chip)
                Radeon HD6870: 2
                Radeon HD7970: 4 or 5
                Nvidia GTX 680: 8 to 10
                Nvidia Tesla K10 (single precision only): 20
                ......
                Isn't the GTX 680's compute performance driver-limited, to drive people to the Nvidia Tesla/Quadro range? I know the 7970 outperforms it in a number of compute tests. (Was the exception DirectCompute? That wasn't capped somehow?)

                Here are the top 20 results for the 'Room' scene. The top entries are 8x 580s and 4x 7970s, and you can see the 7970s are a lot faster than the 580s (OpenCL-crippled driver?).
                http://www.luxrender.net/luxmark/top/top20/Room



                • #9
                  Originally posted by zeealpal
                  Would it make any difference to say that the developer of LuxMark uses the AMD APP SDK to develop LuxMark, on an AMD GPU system? (It was a 5850 + 2x 5870s at one time.)

                  Now I know OpenCL is meant to run on any supported hardware, but if LuxMark had been written with the Intel OpenCL SDK for Intel CPUs first (given that different optimizations could have made a difference), then perhaps it could be faster with the Intel OpenCL solution.

                  I only suggest this because LuxMark was faster on AMD hardware (even more so before the crippled OpenCL), and I wonder whether Nvidia hardware could have been faster if it had been written in CUDA, or in OpenCL targeted at the Fermi architecture.

                  I'm no programmer though, so input from someone who might be able to agree/debunk what I said would be good :P

                  Also +1 AMD
                  You're not far off base here. The sad truth is that many "open standards" (OpenGL, OpenCL, HTML5, JavaScript, Java, etc) are written in such a way as to give the implementation a lot of leeway in how it implements the specifications. Obviously, each vendor is going to choose their own way: some will choose an easy-to-implement-but-slow way; some will choose to optimize for low RAM usage; some will choose to optimize for maximum cache locality; some will choose to optimize for fastest times in benchmarks; and so on. And there will be different parts of the implementation that will perform relatively better or worse for each given workload. You may also often encounter implementations that skip "checking" -- validation that would prevent the code from crashing, or allow it to print out a warning for incorrect/indeterminate behavior rather than doing something unpredictable. Sometimes that checking is required by the spec, and sometimes the implementors willfully violate the spec in the name of performance. All kinds of things can factor in.

                  So the task of writing a real world application against an open standard is much harder than you think, if performance matters to you: you almost need to write a separate version of your program code for every major implementation of the standard! Many companies don't bother to do this, and just allow their software to run poorly (slow or buggy or both) on certain implementations, but some top tier developers will invest the engineering effort to figure out what is the fastest path to accomplish their task for every major implementation of the spec.
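
                  As a rough illustration of what "a separate version for every major implementation" can look like in practice, here is a minimal sketch that picks a tuned code path from the platform vendor string. Everything in it is an assumption for illustration: the names `KernelTuning`, `TUNING_BY_VENDOR` and `select_tuning` are hypothetical, the work-group sizes are made-up placeholders rather than recommendations, and the vendor substrings are only typical of what implementations report. The two build options are standard OpenCL compiler options.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KernelTuning:
    """Per-implementation choices; every value here is a made-up placeholder."""
    work_group_size: int
    build_options: str


# Keys are substrings matched against the platform vendor string, which a real
# program would read via clGetPlatformInfo with CL_PLATFORM_VENDOR.
TUNING_BY_VENDOR = {
    "advanced micro devices": KernelTuning(64, "-cl-fast-relaxed-math"),
    "intel": KernelTuning(16, "-cl-mad-enable"),
    "nvidia": KernelTuning(128, "-cl-fast-relaxed-math"),
}

# Portable fallback for any implementation that has not been tuned for.
DEFAULT_TUNING = KernelTuning(32, "")


def select_tuning(vendor: str) -> KernelTuning:
    """Pick the tuned code path for a vendor, or fall back to the default."""
    lowered = vendor.lower()
    for key, tuning in TUNING_BY_VENDOR.items():
        if key in lowered:
            return tuning
    return DEFAULT_TUNING


print(select_tuning("Advanced Micro Devices, Inc."))
print(select_tuning("Some Other Vendor"))
```

                  The fallback entry matters: an untuned implementation still runs, just without the vendor-specific path, which is roughly the "runs poorly on certain implementations" case described above.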

                  To make a long story short, if you buy some extremely expensive enterprise software that does something interesting with OpenCL, chances are good that it will run well on the top 2 or 3 implementations from AMD, Intel and Nvidia. But if you run a benchmark that's written by a guy in his spare time, chances are it will be optimized to run on whatever hardware he wrote it on.

                  To the extent that OpenCL is successful in allowing the same program to run on disparate hardware (CPUs and GPUs from multiple vendors), it is also a failure, in that "vendor-specific optimizations" are very much a part of writing code that is going to run well on your customers' systems. LuxMark clearly has no incentive to do that, particularly not with a CPU implementation of OpenCL. I do hope they optimize to favor Gallium / Clover, though; that would rock.
                  Last edited by allquixotic; 13 June 2012, 10:51 PM.



                  • #10
                    Cool, but I don't really care.

                    Too bad Intel's OpenCL SDK and AMD's OpenCL SDK are both proprietary closed-source software.

                    @Intel, @AMD: source or gtfo!
