Gallium3D LLVMpipe Compared To Nine Graphics Cards

  • Kivada
    replied
    Well, on the consumer end, having something that can do ray tracing at an acceptable resolution and frame rate would be a massive boon for gaming companies, since it opens the floodgates on the levels of realism and physics they can do.

    Forgive the usual Intel marketing bullshit: this is actually only being displayed via remote desktop on the laptop; the rendering is actually being done on 4 bigass servers...



  • DrYak
    replied
    GPUs are optimized to do exactly the same stuff on a huge amount of data.
    Take the texels, apply the shaders, output the pixels. The same small amount of code multiplied over millions of pixels.
    Thus they are designed to apply more or less the exact same instruction simultaneously to big sets of data.
    Under the hood they are huge SIMD processors which keep dozens of copies of the exact same thread running at the same time.
    Things like conditionals are implemented with masks and serialization.
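
    As a rough illustration, here's a tiny sketch in plain C++ of that mask-and-serialize trick, assuming a made-up 8-lane group (real GPUs do this in hardware per warp/wavefront): both sides of the branch get executed by everyone, and the per-lane mask decides whose results stick.

        #include <array>
        #include <cstdio>

        constexpr int LANES = 8;   // hypothetical SIMD group width

        int main() {
            std::array<float, LANES> x    = {1, -2, 3, -4, 5, -6, 7, -8};
            std::array<bool,  LANES> mask = {};

            // "if (x < 0)" becomes a per-lane mask, not a real branch.
            for (int i = 0; i < LANES; ++i) mask[i] = (x[i] < 0.0f);

            // Then-branch runs for the whole group; results kept only where the mask is set.
            for (int i = 0; i < LANES; ++i) if (mask[i])  x[i] = -x[i];
            // Else-branch runs afterwards (serialized); results kept where the mask is clear.
            for (int i = 0; i < LANES; ++i) if (!mask[i]) x[i] = 2.0f * x[i];

            for (float v : x) std::printf("%g ", v);
            std::printf("\n");
        }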


    A GPU isn't that efficient when there is too much divergence in the code paths taken at execution time.
    It works very nicely for graphics, because you apply the same shader over and over again until you've output all the needed pixels.
    It doesn't work so well for raytracing, because neighbouring rays might end up taking different paths, so they reach different states and need to execute different code. Your execution diverges too much. If one single element takes much longer to process, all the thread copies sharing the same block stall until this "slow one" has finished.
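
    Same toy model, showing that "slow one stalls everyone" effect: if one ray in the group needs many more bounces than its neighbours (the counts below are invented), the lock-stepped group keeps issuing iterations for every lane until the slowest lane finishes.

        #include <algorithm>
        #include <array>
        #include <cstdio>

        constexpr int LANES = 8;   // hypothetical SIMD group width

        int main() {
            // Made-up per-ray bounce counts: one ray needs far more work than the rest.
            std::array<int, LANES> bounces = {2, 3, 2, 2, 40, 3, 2, 2};
            std::array<int, LANES> useful  = {};

            // Lock-step: the whole group iterates until the slowest lane is done,
            // with the finished lanes sitting masked-off and idle.
            int group_iters = *std::max_element(bounces.begin(), bounces.end());
            for (int it = 0; it < group_iters; ++it)
                for (int lane = 0; lane < LANES; ++lane)
                    if (it < bounces[lane]) ++useful[lane];

            int done = 0;
            for (int u : useful) done += u;
            std::printf("issued %d lane-iterations, %d useful (%.0f%% utilisation)\n",
                        group_iters * LANES, done, 100.0 * done / (group_iters * LANES));
        }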


    Things like Larrabee and Tilera (or on a smaller scale Sun's Niagara, or precursors like Cell) are designed to tolerate much more divergence. They are huge collections of tiny lightweight processors.
    These cores are much more independent, and basically each runs its own thread in its own corner. They *do* share cache and a lot of other resources in common, so it's not quite the same as having a server farm, but they are not forced to run the exact same instruction all at the same time.
    Thus they are much more efficient on heavily diverging code paths.
    They are good at ray tracing: if one ray takes more processing than its neighbours, the lightweight processor handling it keeps working on it while the others pick up other jobs.
    But they aren't that good for pixel churning: lots of resources are wasted on things that become redundant when you basically just run the same instruction at the same time over 64 pixels.
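
    A minimal sketch of that independence, assuming nothing more exotic than std::thread and an atomic counter as the work queue (the ray costs are invented): each core grabs the next ray on its own schedule, so one expensive ray only ties up the core that happened to pick it.

        #include <atomic>
        #include <cstdio>
        #include <thread>
        #include <vector>

        // Stand-in for tracing one ray: most are cheap, the odd one is very expensive.
        static void trace_ray(int id) {
            volatile long spin = (id % 97 == 0) ? 2000000 : 20000;
            while (spin-- > 0) {}
        }

        int main() {
            const int num_rays = 10000;
            const unsigned num_cores =
                std::thread::hardware_concurrency() ? std::thread::hardware_concurrency() : 4;
            std::atomic<int> next_ray{0};

            // Each independent "core" pulls the next ray off a shared counter;
            // a slow ray never stalls the other cores.
            std::vector<std::thread> cores;
            for (unsigned c = 0; c < num_cores; ++c)
                cores.emplace_back([&] {
                    for (int id; (id = next_ray.fetch_add(1)) < num_rays; )
                        trace_ray(id);
                });
            for (auto& t : cores) t.join();

            std::printf("traced %d rays on %u cores\n", num_rays, num_cores);
        }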


    So how would LLVMpipe look on a Tilera?
    Well, much better than on a regular CPU (there are many more cores to process pixels in parallel), but not as good as a GPU of similar transistor count / clock frequency / power usage: the Tilera spends way too many resources on giving each core its very own instruction pipeline, and so on. Those resources are good for increasing independence and tolerating more divergence (for tasks like raytracing), but are a complete waste for doing OpenGL (kill the extra pipelines and use the freed space to add more computing power to spit out more pixels at the same time).
    On the other hand, if all you have is a server/workstation with Tileras, it could be a nice substitute for an OpenGL desktop.

    What we should watch in the long term is what the GPU makers plan: the extra pipelines of such an architecture could be a waste that some GPU maker can afford, because graphics are fast enough anyway, and these extra capabilities would help tap into an HPC market currently served only by small players like Tilera.
    So perhaps Tilera-like architectures could become slightly more popular with some manufacturers.
    That's the path Intel seems to be taking with Larrabee.



  • Kivada
    replied
    Originally posted by ChrisXY View Post
    Well, without knowing that much about OpenGL rendering, I'm still interested in how it would perform on the Parallella with 64 of those cores: http://www.adapteva.com/products/sil...vices/e64g401/ and whether the latency could be acceptable (with the necessary modifications, of course).

    The performance would probably not be too impressive, but it could still be OK. Also, by 2014 they want to reach ~1.2 TFLOPS.
    Likely won't be that great, since they will hit the same wall that Intel did with Larrabee. It might be decent for ray tracing, but it will probably be terrible for standard graphics and will draw much more power than a dedicated GPU would to do the same task.



  • ChrisXY
    replied
    Originally posted by Kivada View Post
    LLVMpipe is not a substitute for a GPU in any case, save for doing single-frame rendering accuracy tests.
    Well, without knowing that much about OpenGL rendering, I'm still interested in how it would perform on the Parallella with 64 of those cores: http://www.adapteva.com/products/sil...vices/e64g401/ and whether the latency could be acceptable (with the necessary modifications, of course).

    The performance would probably not be too impressive, but it could still be OK. Also, by 2014 they want to reach ~1.2 TFLOPS.



  • Kivada
    replied
    Originally posted by droidhacker View Post
    And once again, a totally pointless test.

    1) Benchmark it doing something that it would actually be useful doing, like DESKTOP COMPOSITING.
    2) Benchmark it on a system typical of those that don't already have decent GPUs, like... an Intel Z520.

    Nobody cares how fast it can play games on a very fast 8-core processor.
    Well, if it can't run OpenArena 8.5 at more than a handful of FPS even on a very fast CPU, then any CPU you have without a supported GPU on the mobo would run compositing over LLVMpipe at around 1 frame every 4 seconds or less.

    LLVMpipe is not a substitute for a GPU in any case, save for doing single-frame rendering accuracy tests.



  • freedam
    replied
    And with a weaker CPU and a low-resolution display (like 1366x768, 1280x1024, etc.)? Does it (down)scale linearly?



  • Calinou
    replied
    Originally posted by droidhacker View Post
    Nobody cares how fast it can play games on a very fast 8-core processor.
    4 modules actually, but you still get 8 cores. You get the performance of a quad-core Intel (in multithreaded usage), similar to a 2600(K).



  • smitty3268
    replied
    Not so bad, actually

    Getting 5-15 fps at 1080p resolution is better than I expected to see from a software renderer.



  • DanL
    replied
    What's up with the Radeon HD 6450 results?



  • Nuc!eoN
    replied
    LOL that's lame. Stop complaining, do the tests yourself...
    Originally posted by droidhacker View Post
    Nobody cares how fast it can play games on a very fast 8-core processor.
    Exactly that was requested:
    Originally posted by chithanh View Post
    Also I would have liked to see llvmpipe in this comparison, as the 8350's 8 cores could give results close to the low-end cards.

