
A Big Comparison Of The AMD Catalyst, Mesa & Gallium3D Drive


  • Most of the redundancy is temporary, a consequence of the transitions going on now and the associated need to support old and new standards. For years there were just the two steps:

    - mesa converts from GLSL to hardware-independent Mesa IR
    - HW driver converts from Mesa IR to hardware instructions

    The conversion from Mesa IR to TGSI is needed as long as we have both "classic" and Gallium3D drivers but that can go away once (if) all the drivers transition to Gallium3D and whatever the corresponding IR turns out to be. At that point mesa can generate the appropriate IR directly and pass it to the HW driver and we're back to two steps.

    The "compiler IR to Mesa IR" step happens because the Intel devs were nice enough to write a new GLSL compiler and make it work for everyone's hardware, not just their own, even though they don't plan to use Mesa IR in the future. My understanding is that the conversion isn't very expensive but as you say it all makes a difference.

    You can find our IL specification on the Stream SDK documentation page :


    • Originally posted by HokTar
      Also, from bridgman's latest posts it seems to me that mesa might cause significant performance penalties. Are there any plans to write an ogl state tracker for gallium? I understand that it's not going to be the radeon devs to do that, but still.
      Mesa code has been very well optimized over the years - the problem here IMO is that hardware has become much more powerful in a relatively short amount of time, so there may be some code in mesa which was more than fast enough a few years ago but which is now starting to get in the way again. Or it may turn out that mesa is not a problem at all -- I don't think we really know yet.

      Given that mesa has evolved over the years from "a software renderer with support for HW acceleration" to "a GL state tracker that runs over hardware drivers but includes a couple of software-implemented hardware drivers", I think we are looking at a bit more optimization of mesa code rather than any kind of replacement.


      • Originally posted by RealNC
        Aren't shaders computed only once during loading? If yes, wouldn't the performance improvement be rather irrelevant?
        That's the way things are supposed to work. I'm told that there are still some apps which compile shaders while the application is running. Not sure if this is because of historical limitations (e.g. not enough room to store compiled shaders), or whether shader source is being dynamically generated as the game proceeds (haven't heard of this but anything is possible), or if it's even true.


        • Originally posted by bridgman
          shader source is being dynamically generated as the game proceeds (haven't heard of this but anything is possible), or if it's even true.
          That's fairly common. Modern effects engines combine shader fragments on demand for complex materials and effects that are loaded mid-level as the player moves around larger game maps/worlds. (Which is a pain in the ass with GLSL which makes that kind of dynamic combination harder than HLSL/Cg do.)

          No game should be recompiling shaders every frame, or even every few frames, but they will do it at runtime as the game is running. It shouldn't matter too much unless the shader compiler is taking a _really_ long time and can't finish between the times the game starts pre-loading the resource and when it actually needs it.

          Basically, inefficient shader compilation could cause stuttering as the player moves around, but should not affect general frame time. So actually measuring the effects of the shader compiler on application performance is going to be a bit more complex than just running a benchmark and seeing what the average frame rate is. At the very least, you need to measure minimum frame rates as the player moves around and make sure the frame rate never dips below a playable threshold.
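To make the average-versus-minimum point concrete, here is a minimal sketch of how a benchmark could report both. It assumes you already have a list of per-frame times in milliseconds; the function name `frame_stats` and the 30 fps "playable" threshold are made up for illustration and are not taken from the Phoronix Test Suite or any driver.

```python
def frame_stats(frame_times_ms, playable_fps=30.0):
    """Summarise per-frame times (in milliseconds) from one benchmark run.

    playable_fps is an assumed threshold, not a standard value.
    """
    if not frame_times_ms:
        raise ValueError("need at least one frame time")

    budget_ms = 1000.0 / playable_fps   # longest acceptable frame
    total_ms = sum(frame_times_ms)
    worst_ms = max(frame_times_ms)      # the single slowest frame

    return {
        # What a plain benchmark reports: frames divided by elapsed time.
        "avg_fps": 1000.0 * len(frame_times_ms) / total_ms,
        # Instantaneous frame rate during the slowest frame.
        "min_fps": 1000.0 / worst_ms,
        # How many frames blew the budget (visible stutter).
        "slow_frames": sum(1 for t in frame_times_ms if t > budget_ms),
    }


if __name__ == "__main__":
    # 99 smooth frames plus one long stall, e.g. a shader compile mid-level.
    print(frame_stats([10.0] * 99 + [100.0]))
```

In that example run the average stays above 90 fps even though one frame took a tenth of a second, which is the kind of stall an average-only benchmark would hide.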

          Some poorly written games that aren't using the separate shader objects support (which is an OpenGL 4.1 feature and not all implementations have the EXT/ARB separate_shader_objects extension) may be relinking shader objects very frequently instead of caching the linked programs. They're going to suffer bad performance even on Catalyst and other drivers, though, unless part of those millions of lines of driver code is internal caching of linked programs.
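A rough sketch of what "caching the linked programs" could look like on the application side. Everything here is hypothetical: `ProgramCache` is not a real API, and `link_fn` stands in for whatever the engine does to compile and link a vertex/fragment pair (in a real OpenGL app that would end in a `glLinkProgram` call).

```python
class ProgramCache:
    """Cache linked programs so each shader combination is linked only once.

    link_fn is a hypothetical callable (vertex_src, fragment_src) -> program.
    """

    def __init__(self, link_fn):
        self._link_fn = link_fn
        self._programs = {}
        self.link_count = 0  # number of real (expensive) links performed

    def get(self, vertex_src, fragment_src):
        key = (vertex_src, fragment_src)
        if key not in self._programs:
            # Cache miss: pay the link cost once and remember the result.
            self._programs[key] = self._link_fn(vertex_src, fragment_src)
            self.link_count += 1
        return self._programs[key]


if __name__ == "__main__":
    # Stand-in linker that just returns a label instead of a GL program.
    cache = ProgramCache(lambda vs, fs: "program(%s+%s)" % (vs, fs))
    cache.get("vs_basic", "fs_lit")
    cache.get("vs_basic", "fs_lit")    # served from the cache, no relink
    cache.get("vs_basic", "fs_unlit")  # new combination, links again
    print(cache.link_count)
```

Keying on the full source strings is the simplest option; a real engine would more likely key on shader object IDs or hashes.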

          Not sure if the Phoronix Test Suite dealy handles any of that or not, or whether it tests any games that have modern effects engines and larger game areas that need dynamic/streaming resource loading.


          • Originally posted by Wyatt
            Do the docs from ATI have any information on the IL used in fglrx? You seem to get pretty good performance, so in my limited experience with compiler development, it would seem like a good idea to learn from the specialist in the domain (Thought: A generic IR with driver-specific extensions where necessary?)
            AMD "IL" documentation and other shader related info is available here:


            • Originally posted by Pfanne
              what percentages are we talking about?
              5%, 10% or more like 50%?
              just rough guess.
              It really depends on the workload and where the current bottlenecks are; e.g., if the app is CPU limited, hw optimizations won't make much difference. Things like tiling and hyperZ improve cache and memory bandwidth utilization, so if an app is bandwidth limited, they should help significantly.


              • Well, 2D colour tiling boosted performance between 50% and 100%, which was a very significant leap.

                Other optimisations might not bring such drastic improvements, but I don't expect them to be in the 1-2% range.


                • My recollection was that tiling on its own was good for something closer to 7%*, and there were a heap of other improvements at around the same time which collectively gave a much larger improvement. Take this with the usual disclaimer: this is just what I remember reading on the internet.

                  * faster on some things, slower on others


                  • No, I think pingufunkybeat is right.
                    Tiling is essential for GPU performance and should have brought a big leap in performance. 7% is way too low for proper tiling support.


                    • If you look for the largest performance gain on a single app at a single (high) resolution, I agree completely, but if we are talking about average performance gain across a range of apps (say the ones used in this article) I think the gain would be quite a bit lower.

                      Alex also mentioned 2D tiling. Note that in some cases "1D tiling" is already supported, and going from "1D tiling" to "2D tiling" (more aggressive tiling) won't necessarily bring the same gain as going from linear to 1D. Going from linear to 2D is a big deal, though.
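For anyone wondering why tiling helps cache and bandwidth use at all, here is a generic sketch comparing a linear surface layout with a simple square-tile layout. This is an illustration only: the 8x8 tile size and the function names are assumptions, and it is not the actual Radeon "1D" or "2D" tiling mode, whose layouts are more involved.

```python
def linear_offset(x, y, width):
    """Pixel index in a plain row-major (linear) surface."""
    return y * width + x


def tiled_offset(x, y, width, tile=8):
    """Pixel index when the surface is stored as tile x tile blocks.

    Tiles are stored one after another in row-major order, and pixels
    inside each tile are row-major too. Assumes width is a multiple of
    the tile size (illustrative layout, not a real hardware format).
    """
    if width % tile != 0:
        raise ValueError("width must be a multiple of the tile size")
    tiles_per_row = width // tile
    tile_index = (y // tile) * tiles_per_row + (x // tile)
    within_tile = (y % tile) * tile + (x % tile)
    return tile_index * tile * tile + within_tile


if __name__ == "__main__":
    width = 1024
    # Distance in memory between a pixel and the one directly below it.
    print(linear_offset(0, 1, width) - linear_offset(0, 0, width))
    print(tiled_offset(0, 1, width) - tiled_offset(0, 0, width))
```

With the linear layout, vertically adjacent pixels are a full row apart in memory; with the tiled layout they are only one tile row apart, so a small 2D neighbourhood touches far fewer cache lines.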