Intel's Mesa Driver Is Going Faster For Unigine

  • Intel's Mesa Driver Is Going Faster For Unigine

    Phoronix: Intel's Mesa Driver Is Going Faster For Unigine

    Eric Anholt at Intel has a new Mesa GLSL patch to add a new pass for their compiler that decreases the number of instructions and can result in performance improvements...

  • #2
    CSE stands for Common Subexpression Elimination, not Constant Subexpression Elimination.

    If you have:
    Code:
    a = b+c
    d = b+c
    The code will become
    Code:
    e = b+c
    a = e
    d = e
    Apart from that, it is great to see such a large impact in this "real life" benchmark.
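
    The rewrite described in this post can be sketched on a toy instruction list. This is a minimal illustration, not Mesa's actual pass: the `(dest, op, lhs, rhs)` tuple format and the function name are assumptions made up for the example, and it reuses the first result directly instead of introducing a fresh temporary like `e`.

```python
from typing import Dict, List, Tuple

# Hypothetical toy IR (not Mesa's GLSL IR): (dest, op, lhs, rhs).
Instr = Tuple[str, str, str, str]


def eliminate_common_subexpressions(code: List[Instr]) -> List[Instr]:
    """Replace a repeated (op, lhs, rhs) with a copy of the earlier result."""
    available: Dict[Tuple[str, str, str], str] = {}  # expression -> name holding it
    out: List[Instr] = []
    for dest, op, lhs, rhs in code:
        key = (op, lhs, rhs)
        holder = available.get(key)
        if op != "mov" and holder is not None:
            # Same expression was already computed: copy instead of recomputing.
            out.append((dest, "mov", holder, ""))
            new_key = None
        else:
            out.append((dest, op, lhs, rhs))
            new_key = key if op != "mov" else None
        # Writing dest invalidates expressions that read dest or were stored in dest.
        available = {
            k: v for k, v in available.items()
            if dest not in (k[1], k[2]) and v != dest
        }
        if new_key is not None and dest not in (lhs, rhs):
            available[new_key] = dest
    return out


program = [("a", "add", "b", "c"), ("d", "add", "b", "c")]
optimized = eliminate_common_subexpressions(program)
```

    Here `optimized` keeps the first `add` and turns the second assignment into a copy of `a`. The invalidation step matters: if `b` were overwritten between the two lines, the second `b+c` would have to be recomputed.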



    • #3
      Not quite the whole story.

      I'm guessing Eric is looking at optimizing Tropics now because of a recent Intel regression:

      Hi Eric,
      >>> The frame rate of Unigine Tropics (with low shader quality) dropped
      >>> from 40.8 to 23.5 after this change.
      So while common optimizations like this are great for everyone, there's still a big regression, and it's likely this change won't be that big for other drivers. That is especially true for r600g, which already has a bunch of optimizations like CSE running in its own driver.
      Last edited by smitty3268; 19 October 2013, 04:17 AM.



      • #4
        Originally posted by smitty3268
        So while common optimizations like this are great for everyone, there's still a big regression, and it's likely this change won't be that big for other drivers. That is especially true for r600g, which already has a bunch of optimizations like CSE running in its own driver.
        We have CSE in the i965 backend as well, but it doesn't handle texturing operations at the moment. Those are obviously the most expensive operations, so they're important to handle. Of course, most people don't write redundant texture lookups, either.

        According to I965Todo, Unigine Tropics has a texturing operation in a loop which is actually loop-invariant. The core compiler doesn't yet implement decent loop-invariant code motion techniques, so it just goes ahead and unrolls the loop, duplicating the texture lookup many times.
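
        As a rough illustration of the paragraph above: unrolling a loop that contains a loop-invariant texture lookup duplicates the lookup once per iteration, and a CSE pass that is allowed to treat texture lookups as pure expressions collapses the copies again. This is a sketch under assumptions, not the i965 or core Mesa code. The tuple IR, the `tex` op name, and the operand names (`sampler`, `uv`, `k0`, ...) are all made up, every name is assumed to be assigned exactly once, and the lookup is assumed to have no side effects.

```python
from typing import Dict, List, Tuple

Instr = Tuple[str, str, str, str]  # hypothetical (dest, op, lhs, rhs)


def unroll_loop(trip_count: int) -> List[Instr]:
    """Emit the fully unrolled body of: acc += tex(sampler, uv) * k[i]."""
    code: List[Instr] = []
    for i in range(trip_count):
        code.append((f"t{i}", "tex", "sampler", "uv"))        # loop-invariant lookup
        code.append((f"w{i}", "mul", f"t{i}", f"k{i}"))       # varies per iteration
        code.append((f"acc{i + 1}", "add", f"acc{i}", f"w{i}"))
    return code


def cse_single_assignment(code: List[Instr]) -> List[Instr]:
    """CSE for code where each name is assigned once and all ops are pure."""
    seen: Dict[Tuple[str, str, str], str] = {}  # expression -> first name computing it
    rename: Dict[str, str] = {}                 # eliminated name -> surviving name
    out: List[Instr] = []
    for dest, op, lhs, rhs in code:
        lhs, rhs = rename.get(lhs, lhs), rename.get(rhs, rhs)
        key = (op, lhs, rhs)
        if key in seen:
            rename[dest] = seen[key]  # drop the duplicate, redirect later uses
        else:
            seen[key] = dest
            out.append((dest, op, lhs, rhs))
    return out


unrolled = unroll_loop(4)
optimized = cse_single_assignment(unrolled)
tex_before = sum(1 for ins in unrolled if ins[1] == "tex")
tex_after = sum(1 for ins in optimized if ins[1] == "tex")
```

        With a trip count of 4, the unrolled code contains four lookups and the optimized code contains one, with later multiplies reading `t0`. Loop-invariant code motion would avoid the duplication before unrolling; CSE only cleans it up afterwards.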

        If r600g has support for CSE on texturing operations, then it might already recover from this problem in the core compiler. Otherwise, it'll probably benefit too.
        Free Software Developer .:. Mesa and Xorg
        Opinions expressed in these forum posts are my own.



        • #5
          Why are drivers working around the GLSL compiler, instead of things being fixed in one place?



          • #6
            Originally posted by curaga
            Why are drivers working around the GLSL compiler, instead of things being fixed in one place?
            AFAIK Eric did implement this at the GLSL compiler level, i.e. as an additional optimization performed on the GLSL IR before it is passed to the drivers.
            Test signature



            • #7
              Aye, but it's been in r600sb for months.



              • #8
                Originally posted by curaga
                Why are drivers working around the GLSL compiler, instead of things being fixed in one place?
                The GLSL compiler isn't the only place generating code that the drivers might want to optimize. It's certainly an important one, but for example, the OpenCL code completely bypasses it. Do you want optimizations like this in every API front end? Or in every hardware backend? I'm not entirely sure, but if you can put it in the driver backend there's probably a good chance it will help you out somewhere, sometime.
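
                To make the trade-off concrete, here is a small sketch in which one front end optimizes and another does not, while the backend runs its own pass either way. Everything here is hypothetical: the function names, the tuple IR, and the idea that the front ends look like this are illustration only, and eliminated names are assumed to be temporaries that nothing outside the snippet reads.

```python
from typing import Dict, List, Tuple

Instr = Tuple[str, str, str, str]  # hypothetical (dest, op, lhs, rhs)


def cse(code: List[Instr]) -> List[Instr]:
    """Drop repeated pure expressions; assumes each name is assigned once."""
    seen: Dict[Tuple[str, str, str], str] = {}
    rename: Dict[str, str] = {}
    out: List[Instr] = []
    for dest, op, lhs, rhs in code:
        lhs, rhs = rename.get(lhs, lhs), rename.get(rhs, rhs)
        key = (op, lhs, rhs)
        if key in seen:
            rename[dest] = seen[key]
        else:
            seen[key] = dest
            out.append((dest, op, lhs, rhs))
    return out


def glsl_frontend(code: List[Instr]) -> List[Instr]:
    # Hypothetical front end that optimizes in the shared compiler.
    return cse(code)


def opencl_frontend(code: List[Instr]) -> List[Instr]:
    # Hypothetical front end that bypasses the shared compiler's passes.
    return list(code)


def driver_backend(code: List[Instr]) -> List[Instr]:
    # Hypothetical backend that runs its own CSE regardless of the source.
    return cse(code)


kernel = [("a", "add", "b", "c"), ("d", "add", "b", "c"), ("r", "mul", "a", "d")]
via_glsl = driver_backend(glsl_frontend(kernel))
via_opencl = driver_backend(opencl_frontend(kernel))
```

                Both paths end up with the same two instructions, but only because the backend repeats work the GLSL path had already done. That duplication is the cost of having the pass in two places.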



                • #9
                  That's a good argument for having it in both places. Still, it means there are two places, and each will have optimizations the other lacks, because people only work on one.
