Looking At The OpenCL Performance Of ATI & NVIDIA On Linux


  • allquixotic
    replied
    Hmm, also interesting: I got an average of 31011133.23 for the mandelGPU test. The test was not running at 1920x1080, though, but at the default resolution, so that may account for the difference.

    If the resolution isn't important though, as it sometimes isn't, then I'm getting about 1.5x the performance of the GTX 460 with the HD5970. This is more in line with what I was expecting.


  • allquixotic
    replied
    Hmm. On my HD5970 for SmallPT 1.6 GPU Caustic3, I'm getting 45200 KSamples/sec on the GPU, and ~16000 KSamples/sec on my Core i7 920. Neither part is overclocked; they're at their factory default clock rates.

    The GPU number is lower than either of Michael's Radeons, but still a ways faster than Michael's GT 240. The numbers seem unaffected by whether Compiz is on. I find it hard to accept that an HD5970 gets poorer results than a 5770. Even if an HD5970 is two 5850 cores together, shouldn't even one of those cores single-handedly outperform a 5770? And wouldn't OpenCL have the smarts to use both cores automatically to make it nearly twice as fast?

    I noticed something funky about the tests, though. When the test is running, the on-screen output shows something like 52000 KSamples/sec at the bottom. This is substantially larger than the 45000 KSamples/sec reported by PTS in the output. I'm not sure why there is such a large discrepancy. Bug in PTS? Bug in the test?

    Either way, it seems (disappointingly) that an HD5970 is only about 3 times faster at this test than a Core i7? It is probably more economical to use a bunch of CPUs than to use GPUs for this kind of workload, seeing how a Core i7 is much cheaper than a dual-GPU HD5970. We already know from other tests that a GPU is many, many, many times faster than the CPU at OpenGL 3D rendering, so maybe the parts needed for GPGPU are kept to a modest level on Evergreen in order to support top-of-the-line 3D graphics. I'm not complaining, since I don't use GPGPU for anything other than PTS.
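
For reference, the ratios behind the figures quoted in this post can be worked out in a few lines. This is only a rough sketch: the variable names are made up for illustration, and the Core i7 and on-screen values are the approximate numbers given above, not fresh measurements.

```python
# Figures quoted in the post above, in KSamples/sec
# (SmallPT 1.6 GPU, Caustic3 scene). Names are illustrative only.
gpu_pts = 45200       # HD5970, as reported by PTS
cpu_pts = 16000       # Core i7 920, approximate ("~16000")
gpu_onscreen = 52000  # approximate figure shown in the test's own window

# How many times faster the GPU result is than the CPU result.
speedup = gpu_pts / cpu_pts

# How much higher the on-screen figure is, relative to the PTS figure.
overshoot = (gpu_onscreen - gpu_pts) / gpu_pts

print(f"GPU vs CPU speedup: {speedup:.1f}x")
print(f"On-screen figure above PTS figure by: {overshoot:.1%}")
```

With these inputs the GPU comes out a little under 3 times faster than the CPU, and the on-screen figure sits roughly 15% above what PTS reports.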


  • brent
    replied
    MandelGPU might suffer from something similar (redefining __constant to __global), but I haven't checked.


  • brent
    replied
    Michael, please keep in mind that SmallPtGPU contains a bug/incompatibility that seriously limits performance on NVIDIA hardware, especially pre-Fermi.

    Here's a diff that fixes it. This improves performance more than ten-fold on G80/GT200.


  • vrodic
    replied
    Great benchmark!

    Thank you Michael!

    It would be nice to compare it with some CPU, so we can actually see if low-end cards make sense for that processing work.

    I'm not sure which OpenCL CPU implementation is efficient and uses SSE2 etc. Maybe Intel has such an implementation of the OpenCL compiler, or LLVM has some OpenCL frontend/parser.


  • ssam
    replied
    Do any of those benchmarks use double precision floating point?

    (and as always: it would be nice if you could put error bars on the plots)


  • Michael
    replied
    Originally posted by kernelOfTruth
    ah - thanks for the clarification!

    it's just that the results look rather in favor of Nvidia

    do you have a 5830 available (afaik that's a Cypress LE)?

    Nope... Any and all GPUs I have, you would have seen in a review. The only Evergreen ASICs I have are the HD 5550 and HD 5570 and an HD 5450 that I actually bought and am in the process of reviewing.


  • kernelOfTruth
    replied
    Originally posted by Michael
    No access to such hardware...

    ah - thanks for the clarification!

    it's just that the results look rather in favor of Nvidia

    do you have a 5830 available (afaik that's a Cypress LE)?


  • numasan
    replied
    What about the FirePros? Could be fun to see if they are that much faster in OpenCL than the consumer cards.


  • Michael
    replied
    Originally posted by kernelOfTruth
    why weren't the Cypress GPUs included (5850, 5870), only the smaller Junipers?

    No access to such hardware...
