Intel SNA Performance Continues To Be Compelling

  • #11
    Originally posted by devius View Post
    This is irrespective of which drivers are being used on the Radeon (binary or open source, doesn't make a difference in terms of desktop compositing).
    Well, it does make a difference for me on KDE. KWin won't work with fglrx but will work with radeon (though this is on openSUSE 12.1; I think they made improvements in later KDE versions).


    • #12
      Originally posted by tenzero View Post

      I would however like to see some kind of baseline against the usual suspects in discrete gpus. Nothing fancy, just the bottom of the range from AMD and Nvidia to give it all some kind of perspective.
      Here you go, IvyBridge (i7-3720qm) in comparison with an Nvidia GTX-550: http://openbenchmarking.org/result/1...SU-1207273SU39


      • #13
        Originally posted by ickle View Post
        Here you go, IvyBridge (i7-3720qm) in comparison with an Nvidia GTX-550: http://openbenchmarking.org/result/1...SU-1207273SU39
        Very nice. Basically SNA = Good, UXA = Bad. When SNA has regressions they are negligible, but when it performs better, it really performs a lot better.


        • #14
          Originally posted by devius View Post
          Very nice. Basically SNA = Good, UXA = Bad. When SNA has regressions they are negligible, but when it performs better, it really performs a lot better.
          Right, the regressions tend to be a consequence of choosing one method that gives better performance elsewhere, at a cost. Most of the regressions are within the measurement noise: IvyBridge is very sensitive to thermals (in some of those tests the initial run is 2x faster than the final run due to turbo). The only significant regression there is -compwinwin500. It regressed because last week it was 2x faster thanks to hitting the Render cache, but that path was missing a flush. With the flush added for correctness, it becomes faster to use the BLT for that particular test, a trivial change that has already been made.
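
          As a rough illustration of what "in the noise" means here, the Python sketch below flags a regression only when the mean score drops by more than a couple of standard deviations of the run-to-run spread. The function names, the threshold k and the sample numbers are all made up for illustration; they are not taken from the linked results or from any benchmarking tool.

```python
from statistics import mean, stdev


def thermal_drift(runs):
    """Ratio of the first run's score to the last run's score.

    A value near 2.0 means the first run was about twice as fast as the
    last one, e.g. because turbo was throttled as the chip warmed up.
    """
    return runs[0] / runs[-1]


def is_regression(baseline, candidate, k=2.0):
    """Return True only if the candidate is slower by more than the noise.

    Scores are "higher is better". The noise estimate is the larger of the
    two sample standard deviations; k is an arbitrary threshold chosen for
    this sketch. Each list needs at least two runs.
    """
    noise = max(stdev(baseline), stdev(candidate))
    return mean(baseline) - mean(candidate) > k * noise


if __name__ == "__main__":
    # Hypothetical scores (operations per second) from four repeated runs.
    old = [100, 98, 102, 99]
    new = [97, 101, 99, 100]
    print(is_regression(old, new))
    print(thermal_drift([200.0, 150.0, 110.0, 100.0]))
```

          With numbers like these the small difference between the two means is not reported as a regression, whereas a drift ratio near 2 suggests the runs should be repeated under steadier thermal conditions before drawing conclusions.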

          But what I find truly fascinating is how competitive we actually are with a discrete GPU that has a good driver, over 4x the fill rate of the igfx and several times the shader flops. With regard to 2D performance, the limitation tends not to be SNA (unlike UXA and glamor where they are the bottleneck), but the application - which is as it should be. :-)


          • #15
            Originally posted by ickle View Post
            unlike UXA and glamor where they are the bottleneck
            Is this statement related to intel hardware only or do you think there are general (significant) bottlenecks connected to Glamor?


            • #16
              Originally posted by entropy View Post
              Is this statement related to intel hardware only or do you think there are general (significant) bottlenecks connected to Glamor?
              There is a significant impedance mismatch between X and GL, which is tricky to overcome and adds lots of extra complexity, and with the extra abstraction layer you cannot exploit hardware features that are not exposed through a GL extension. You also need to leak many details through that abstraction layer in order to allocate shared objects between multiple clients and your acceleration routines (which is quite, quite scary and hairy). And there is the tiny issue of having a critical system process rely on several hundred thousand lines of code that have not been written with robustness in mind, with no failsafe method.

              With regard to performance, the current bottlenecks I see in glamor are due to the CPU overhead of the Intel mesa stack, and the many assumptions in it that interact extremely poorly with the 2D workload of glamor. Where you do find yourself mostly GPU bound (such as the fish-demo), glamor still falls short by 10-30% due to inefficiencies in the GPU programming (too many state changes and poor optimisation of shaders) and the multiple abstraction layers. However, being GPU bound is the exception, and typically you end up being rate-limited by one of the paths that are orders of magnitude slower. And then there is the issue that glamor is an absolute resource hog, as the Intel mesa driver's buffer management has never been used like that before...

              In a perfect world, glamor would equal the performance of a highly specialised driver like SNA: many of the routines used in SNA can be mapped directly onto the OpenGL API, and most have been copied over to glamor. Lots of work needs to be done to tune the entire mesa stack, a lot of which I suspect will only benefit glamor.

              And remember, RENDER acceleration is just one small part of the driver.


              • #17
                Thanks for taking the time and sharing your thoughts!
                This is very much appreciated.
