AMD OpenCL APP SDK Beats Intel's Own SDK On Ivy Bridge

  • Wilfred
    replied
    Originally posted by Ansla View Post
    Where did you get those values from? Your ass? Every benchmark I saw of those cards had results ranging from the GTX680 being insignificantly faster than the HD7970 for some tests to the HD7970 being more than twice as fast as the GTX680 for other tests.
    I agree with you, just look at:

    It appears Radeon is much better for GPGPU than GeForce.


  • Ansla
    replied
    Originally posted by allquixotic View Post
    Radeon HD7970: 4 or 5
    Nvidia GTX 680: 8 to 10
    Where did you get those values from? Your ass? Every benchmark I saw of those cards had results ranging from the GTX680 being insignificantly faster than the HD7970 for some tests to the HD7970 being more than twice as fast as the GTX680 for other tests.


  • wstorm
    replied
    Originally posted by chris200x9 View Post
    It's funny because AMD can't even write software, so what does that say about Intel?
    Hah, I have tried Intel Code Composer Studio once. That's lolware.


  • AnonymousCoward
    replied
    Originally posted by zeealpal View Post
    Isn't the GTX 680's compute performance driver-limited, to drive people to the nVidia Tesla/Quadro range? I know the 7970 outperforms it in a number of compute tests. (Was the exception DirectCompute? That wasn't capped somehow?)

    Here is the collection of the 'Room' scene top 20 benchmarks. Top is 8x 580s, then 4x 7970s, and you can see the 7970s are a lot faster than the 580s (OpenCL crippled driver?).
    http://www.luxrender.net/luxmark/top/top20/Room
    There was/is some performance regression in the OpenCL driver on nVidia's side. Also, the GeForce 680 and other chips from this family are crap for GPGPU. nVidia stripped out every feature that is not needed for DX/OGL rendering, so it is not driver/SW limited. nVidia will bring two different GPUs: one for gaming purposes and a second for Tesla GPGPU purposes. On the other hand, the new Southern Islands aka Radeon 7xxx chips are a huge improvement for GPGPU on AMD's side.


  • uid313
    replied
    Cool, but I don't really care.

    Too bad both Intel's OpenCL SDK and AMD's OpenCL SDK are proprietary closed-source software.

    @Intel, @AMD: source or gtfo!


  • allquixotic
    replied
    Originally posted by zeealpal View Post
    Would it make any difference to say that the developer of LuxMark uses the AMD APP SDK to develop LuxMark, on an AMD GPU system? (It was a 5850 + 2x 5870s at one time.)

    Now I know OpenCL is meant to run on any supported hardware, but if LuxMark had been written with the Intel OpenCL SDK for Intel CPUs first (given that different optimizations could have made a difference), then perhaps it would be faster with the Intel OpenCL solution.

    I only suggest this because of how LuxMark was faster on AMD hardware (before they crippled OpenCL even more), and because nVidia hardware could have been faster if it were written in CUDA, or in OpenCL targeted at the Fermi architecture.

    I'm no programmer though, so input from someone who might be able to agree/debunk what I said would be good :P

    Also +1 AMD
    You're not far off base here. The sad truth is that many "open standards" (OpenGL, OpenCL, HTML5, JavaScript, Java, etc) are written in such a way as to give the implementation a lot of leeway in how it implements the specifications. Obviously, each vendor is going to choose their own way: some will choose an easy-to-implement-but-slow way; some will choose to optimize for low RAM usage; some will choose to optimize for maximum cache locality; some will choose to optimize for fastest times in benchmarks; and so on. And there will be different parts of the implementation that will perform relatively better or worse for each given workload. You may also often encounter implementations that skip "checking" -- validation that would prevent the code from crashing, or allow it to print out a warning for incorrect/indeterminate behavior rather than doing something unpredictable. Sometimes that checking is required by the spec, and sometimes the implementors willfully violate the spec in the name of performance. All kinds of things can factor in.

    So the task of writing a real world application against an open standard is much harder than you think, if performance matters to you: you almost need to write a separate version of your program code for every major implementation of the standard! Many companies don't bother to do this, and just allow their software to run poorly (slow or buggy or both) on certain implementations, but some top tier developers will invest the engineering effort to figure out what is the fastest path to accomplish their task for every major implementation of the spec.

    To make a long story short, if you buy some extremely expensive enterprise software that does something interesting with OpenCL, chances are good that it will run well on the top 2 or 3 implementations from AMD, Intel and Nvidia. But if you run a benchmark that's written by a guy in his spare time, chances are it will be optimized to run on whatever hardware he wrote it on.

    To the extent that OpenCL is successful in allowing the same program to run on disparate hardware (CPUs and GPUs from multiple vendors), it is also a failure, in that "vendor-specific optimizations" are very much a part of writing code that is going to run well on your customers' systems. LuxMark clearly has no incentive to do that, particularly not with a CPU implementation of OpenCL. I do hope they optimize to favor Gallium / Clover, though; that would rock.
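    The "separate version per implementation" approach described above can be sketched roughly like this. This is a hypothetical illustration, not LuxMark's actual code: the kernel file names are made up, and the vendor strings are modeled on what an OpenCL platform typically reports via a CL_PLATFORM_VENDOR query.

    ```python
    # Hypothetical sketch: select a tuned kernel variant based on the vendor
    # string an OpenCL platform reports (e.g. via clGetPlatformInfo with
    # CL_PLATFORM_VENDOR). The kernel source file names are placeholders.

    KERNEL_VARIANTS = {
        "Advanced Micro Devices, Inc.": "kernel_amd.cl",  # tuned for AMD wavefronts
        "NVIDIA Corporation": "kernel_nvidia.cl",         # tuned for warp size 32
        "Intel(R) Corporation": "kernel_cpu.cl",          # tuned for CPU cache locality
    }

    def pick_kernel(vendor_string: str) -> str:
        """Return the kernel source tuned for this vendor, or a generic
        fallback that should run (slowly) on any conformant implementation."""
        return KERNEL_VARIANTS.get(vendor_string, "kernel_generic.cl")
    ```

    The generic fallback is what keeps the program portable; the per-vendor entries are the extra engineering effort that most spare-time benchmark authors understandably skip.
    
    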
    Last edited by allquixotic; 13 June 2012, 10:51 PM.


  • zeealpal
    replied
    Originally posted by allquixotic View Post
    .......
    If we assign a Core i7 3770K using an ideal software OpenCL implementation a score of "1", you'd have a chart looking something like this:
    Core i7 3770K CPU: 1
    Core i7 3770K GPU: 1.5? (haven't actually tested but it isn't going to be nearly as fast as a discrete chip)
    Radeon HD6870: 2
    Radeon HD7970: 4 or 5
    Nvidia GTX 680: 8 to 10
    Nvidia Tesla K10 (single precision only): 20
    ......
    Isn't the GTX 680's compute performance driver-limited, to drive people to the nVidia Tesla/Quadro range? I know the 7970 outperforms it in a number of compute tests. (Was the exception DirectCompute? That wasn't capped somehow?)

    Here is the collection of the 'Room' scene top 20 benchmarks. Top is 8x 580s, then 4x 7970s, and you can see the 7970s are a lot faster than the 580s (OpenCL crippled driver?).
    http://www.luxrender.net/luxmark/top/top20/Room


  • zeealpal
    replied
    Hmm

    Would it make any difference to say that the developer of LuxMark uses the AMD APP SDK to develop LuxMark, on an AMD GPU system? (It was a 5850 + 2x 5870s at one time.)

    Now I know OpenCL is meant to run on any supported hardware, but if LuxMark had been written with the Intel OpenCL SDK for Intel CPUs first (given that different optimizations could have made a difference), then perhaps it would be faster with the Intel OpenCL solution.

    I only suggest this because of how LuxMark was faster on AMD hardware (before they crippled OpenCL even more), and because nVidia hardware could have been faster if it were written in CUDA, or in OpenCL targeted at the Fermi architecture.

    I'm no programmer though, so input from someone who might be able to agree/debunk what I said would be good :P

    Also +1 AMD


  • allquixotic
    replied
    What's the point of OpenCL on the CPU? Using fairly "meh" AMD graphics cards with a rather poorly-optimized OpenCL implementation (Catalyst), you can easily get more than twice the performance of a high-end CPU. If you switch to an Nvidia Kepler GPU on the Nvidia binary driver, holy cow, look out, can we say ZING? Or, you know, Tesla dedicated compute cards (with no graphics ports) are probably the best, since they are specifically designed for GPGPU.

    If we assign a Core i7 3770K using an ideal software OpenCL implementation a score of "1", you'd have a chart looking something like this:
    Core i7 3770K CPU: 1
    Core i7 3770K GPU: 1.5? (haven't actually tested but it isn't going to be nearly as fast as a discrete chip)
    Radeon HD6870: 2
    Radeon HD7970: 4 or 5
    Nvidia GTX 680: 8 to 10
    Nvidia Tesla K10 (single precision only): 20

    Considering you'd need 20 Ivy Bridge CPUs to equal the throughput of a $2900 K10, and each of those CPUs would also need a motherboard, RAM and a PSU, making the CPU route way more expensive... you're probably better off going with the K10. Just a guess.
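    The cost argument can be checked with back-of-the-envelope arithmetic. The $2900 K10 price and the 20:1 throughput ratio are from the post above; the cost of a complete Ivy Bridge system (CPU + motherboard + RAM + PSU) is an assumed figure, not a quoted price.

    ```python
    # Back-of-the-envelope check of the K10-vs-20-CPUs cost claim.
    # K10_PRICE and CPUS_NEEDED come from the post; CPU_SYSTEM_PRICE is a guess.

    K10_PRICE = 2900          # USD, one Tesla K10 (from the post)
    CPU_SYSTEM_PRICE = 500    # USD, assumed cost of one complete 3770K system
    CPUS_NEEDED = 20          # CPUs needed to match one K10's throughput

    cpu_route = CPUS_NEEDED * CPU_SYSTEM_PRICE
    print(f"20 CPU systems: ${cpu_route}, one K10: ${K10_PRICE}")
    print(f"K10 saves: ${cpu_route - K10_PRICE}")
    ```

    Even with a conservative per-system estimate, the CPU route costs several times the K10 before counting power, space, and interconnect, which is the point being made.
    
    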

    So the use cases for these devices turn out being something like:

    1. I want to run LuxMark for 5 minutes just for fun! ---> Use Ivy Bridge 3770K CPU OpenCL.
    2. I want to encode a casual, short video for YouTube faster than my CPU could do it! ---> Use a desktop or laptop Radeon or GeForce with a video codec supporting OpenCL. Or, if you have working drivers, use Intel QSV.
    3. I want to get in on the next big IPO before Warren Buffett's cronies! ---> Use a Tesla K10 (or many of them).


  • wizard69
    replied
    The problem is Intel really doesn't like OpenCL.

    It highlights the weakest link in their hardware. AMD, though, can actually benefit from strong OpenCL support. It is actually a great selling point: use our software, and even if your code runs on Intel hardware you will get the best performance.

    As for benchmarking, I'd really love to see some graphs of better x86 hardware and the better GPUs side by side, to determine whether there really is an advantage to those expensive GPUs. An APU or two thrown into the mix would be nice.
