Announcement

**bridgman** · 08 January 2011, 03:33 PM

Originally posted by Qaridarium

does this mean the Catalyst team learns something from the OS team?

Um... yes, there's some head scratching going on (just like in the picture) but the associated question is more along the lines of "the code in the open source driver looks pretty good, but it seems to run a lot slower than Catalyst and we don't know why".

**bridgman** · 08 January 2011, 03:39 PM

Originally posted by Drago

Bridgman, I am sure that even if you do not know where the bottlenecks are, you (and other developers) have suspects. Is it in the kernel-mode or user-mode part of the stack?

The problem is that so far the test results aren't supporting our initial suspicions. Going in, I think most of us suspected that the bottlenecks were likely to be in the kernel driver (synchronization, memory mapping etc.) but test results seem to suggest that common Mesa code in the usermode 3D driver is a bigger factor. There's a lot more testing required though, and there are conflicting views re: how to interpret the test results so far.

Performance optimization is basically:

- run some benchmarks & save the results
repeat forever {
- do some profiling
- form a theory re: where the bottleneck is
- change some code to test the theory
- re-run the benchmarks to see if things go faster
- (4 times out of 5) curse and discard the theory (or save as the basis for a more complex theory)
- (1 time out of 5) make happy noises and get some sleep
}
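
As a rough illustration of the loop above (not anything from the actual driver work), here is a minimal Python sketch of the "benchmark, change, re-benchmark, keep or discard" step. The function names, the two stand-in workloads, and the 5% acceptance threshold are all assumptions made up for this example.

```python
import statistics
import time


def median_runtime(func, repeats=5):
    """Return the median wall-clock time (seconds) of func() over several runs."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def theory_holds(baseline_s, candidate_s, min_gain=0.05):
    """True if the candidate beats the baseline by at least min_gain (a fraction)."""
    return candidate_s <= baseline_s * (1.0 - min_gain)


N = 200_000  # hypothetical problem size


def workload_before():
    # Stand-in for the code path before the change: sum of squares by iteration.
    return sum(i * i for i in range(N))


def workload_after():
    # Stand-in for the changed code path: same result via the closed form.
    return (N - 1) * N * (2 * N - 1) // 6


if __name__ == "__main__":
    baseline = median_runtime(workload_before)
    candidate = median_runtime(workload_after)
    verdict = "keep" if theory_holds(baseline, candidate) else "discard"
    print(f"baseline={baseline:.6f}s candidate={candidate:.6f}s -> {verdict} the theory")
```

Taking the median of several runs and requiring a minimum gain is one simple way to avoid chasing noise; a real setup would also check that both versions still produce the same output before comparing times.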

**deanjo** · 08 January 2011, 04:28 PM

Originally posted by Qaridarium

o bad

thats not so funny..

Take heart Q, at least the poor performance isn't AMD specific but is common across all current solutions with FOSS drivers. That may mean the greatest increase in performance will come from a joint effort to find the common cause.

**Drago** · 08 January 2011, 04:29 PM

Do Radeons offer debug instrumentation like CPUs do: cache misses, instruction counting, pipeline stalls? Or basically, is there a way to know that your shader compiler is bad and is therefore causing stalls on more SIMDs than needed? In general, how do you evaluate the FOSS r600 shader compiler? Dismissing the shader compiler as the main bottleneck opens up space for more aggressive CPU micro-optimizations, e.g. branch prediction hints, preventing CPU cache thrashing etc. That would break portability to other platforms, but I am sure fglrx is full of that.

**bridgman** · 08 January 2011, 04:41 PM

Originally posted by Qaridarium

o bad

thats not so funny..

Yeah, I would have been happier if it was the other way

Part of the problem is that profiling only tells you what the CPU is doing, not what the GPU is doing.

Drago, there are some hardware bits that can help but they're mostly aimed at getting the most out of the GPU once core driver issues are worked out. I don't think they will help much here, but we are going to look at those as well. Right now the open source driver hacked to not run anything on the GPU is still slower than fglrx doing full rendering (even on a single CPU core, apparently).

The "good" news (such as it is) is that this means there is a bunch of useful work that can be done before getting into the nasties of performance tuning on a pile of asynchronous engines (CPU execution, CPU cache flusher thingy, command processor, graphics pipe, shader core, vertex fetcher, texture fetchers/filters, GPU memory controller, GPU cache flusher thingy etc...).

**pingufunkybeat** · 08 January 2011, 05:56 PM

If most of this work is in core Mesa (as I understood it, perhaps incorrectly), then all drivers will profit from this work?

**Drago** · 08 January 2011, 06:01 PM

Originally posted by pingufunkybeat

If most of this work is in core Mesa (as I understood it, perhaps incorrectly), then all drivers will profit from this work?

The focus is primarily on Gallium3D, so no Intel there. And they are the second biggest contributor to Mesa, VMware being the first.

**agd5f** · 08 January 2011, 06:07 PM

There are some sw and hw bits that can provide detail into the state of the GPU pipeline, but you generally need a fairly complex instrumentation infrastructure in place to get meaningful data out of them. Unfortunately, we don't really have this in place yet in the open source driver.

**marek** · 08 January 2011, 06:23 PM

Originally posted by Drago

The focus is primarily on Gallium3D, so no Intel there. And they are the second biggest contributor to Mesa, VMware being the first.

Given the latest events, I think Intel is now the biggest contributor. Lately VMware doesn't seem to work much on Gallium core components, compared to what Intel does for their GLSL compiler, which everybody happily uses.

**pingufunkybeat** · 08 January 2011, 07:05 PM

What is the relation between Intel's GLSL compiler and Jerome's (unfinished) shader compiler for r600g?

Does Intel's compiler produce Mesa IR code? Which is then converted to TGSI code for Gallium drivers?

A Big Comparison Of The AMD Catalyst, Mesa & Gallium3D Drive
