Intel SNA Acceleration Performance On Ironlake


  • Intel SNA Acceleration Performance On Ironlake

    Phoronix: Intel SNA Acceleration Performance On Ironlake

    Wondering how Intel's SNA acceleration architecture is performing for Ironlake hardware? Here are some benchmarks...

    http://www.phoronix.com/vr.php?view=MTAyOTE

  • #2
    Cool results, looks like overall performance is already much better. Can't wait until this gets enabled by default.



    • #3
      Originally posted by phoronix
      Phoronix: Intel SNA Acceleration Performance On Ironlake

      Wondering how Intel's SNA acceleration architecture is performing for Ironlake hardware? Here are some benchmarks...

      http://www.phoronix.com/vr.php?view=MTAyOTE
      "A VMA cache appears unavoidable thanks to compiz and an excruciatingly slow GTT pagefault, though it does look like it will be ineffectual during everyday usage. Compiz (and presumably other compositing managers) appears to be undoing all the pagefault minimisation as demonstrated on gen5 with large XPutImage. It also appears the CPU to memory bandwidth ratio plays a crucial role in determining whether going straight to GTT or through the CPU cache is a win - so no trivial heuristic."

      I'm wondering what Chris means and implies here. Is he saying that the compositing managers are eating up all the CPU-cycle gains because they simply aren't being benchmarked and refactored often enough to minimise their overall impact?
      Last edited by popper; 12-15-2011, 03:52 PM.



      • #4
        Originally posted by popper
        "A VMA cache appears unavoidable thanks to compiz and an excruciatingly slow GTT pagefault, though it does look like it will be ineffectual during everyday usage. Compiz (and presumably other compositing managers) appears to be undoing all the pagefault minimisation as demonstrated on gen5 with large XPutImage. It also appears the CPU to memory bandwidth ratio plays a crucial role in determining whether going straight to GTT or through the CPU cache is a win - so no trivial heuristic."

        I'm wondering what Chris means and implies here. Is he saying that the compositing managers are eating up all the CPU-cycle gains because they simply aren't being benchmarked and refactored often enough to minimise their overall impact?
        No, it is a limitation in how rendering is split between X and the DRI compositor. For the compositor to see all rendering performed by X, the ddx must flush its queues before the damage is broadcast to clients. The ddx only knows when X is about to reply to a client, but we don't know whether we are sending a damage report, so we have to assume the worst and flush the rendering before every reply to any client. This means that when a compositor is in use, or more generally whenever we have exported GEM buffers to other DRI applications (e.g. games), the ddx can only batch small amounts of rendering, so throughput suffers and CPU overhead increases.

        In this particular instance, PutImage is buffered onto a system copy of the pixmap and normally flushed in time for vblank. With a DRI compositor, however, we end up flushing the pixmap after each call to PutImage, causing many small uploads rather than one big one. Prior to the commit, the GPU buffer would be mmapped on each upload. The commit introduces a caching scheme so that those mappings (which are themselves a precious resource and have costs associated with keeping them open) are preserved between uploads.
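
To make the effect described above concrete, here is a toy model of it in Python. It is only a sketch under stated assumptions: `ToyBuffer`, `ToyPixmap` and `run` are hypothetical names invented for illustration, none of them come from the actual driver, and "mapping" is reduced to a counter. It compares one flush per vblank with a flush after every PutImage, with and without keeping the mapping around between uploads.

```python
# Toy model only: these classes are hypothetical and do not mirror the
# real driver code. They just count maps and uploads.

class ToyBuffer:
    """Stands in for a GPU buffer; counts maps and records upload sizes."""

    def __init__(self):
        self.map_calls = 0
        self.uploads = []  # number of rows written by each upload

    def map(self):
        self.map_calls += 1  # pretend this is the expensive mmap
        return self


class ToyPixmap:
    """Buffers PutImage data in a system copy until flush() is called."""

    def __init__(self, cache_mapping):
        self.buffer = ToyBuffer()
        self.cache_mapping = cache_mapping
        self._mapping = None
        self.pending_rows = 0

    def put_image(self, rows):
        self.pending_rows += rows

    def flush(self):
        if self.pending_rows == 0:
            return  # nothing buffered, nothing to upload
        if self.cache_mapping:
            # Keep the mapping alive between uploads.
            if self._mapping is None:
                self._mapping = self.buffer.map()
            mapping = self._mapping
        else:
            # Map the buffer again for every single upload.
            mapping = self.buffer.map()
        mapping.uploads.append(self.pending_rows)
        self.pending_rows = 0


def run(frames, calls_per_frame, rows_per_call, flush_every_call, cache_mapping):
    """Return (map_calls, upload_count, total_rows_uploaded)."""
    pixmap = ToyPixmap(cache_mapping)
    for _ in range(frames):
        for _ in range(calls_per_frame):
            pixmap.put_image(rows_per_call)
            if flush_every_call:
                pixmap.flush()  # forced flush, as with a DRI compositor
        pixmap.flush()  # the normal flush in time for vblank
    buf = pixmap.buffer
    return buf.map_calls, len(buf.uploads), sum(buf.uploads)


if __name__ == "__main__":
    print("batched:          ", run(2, 8, 4, False, False))
    print("per-call, no cache", run(2, 8, 4, True, False))
    print("per-call, cached: ", run(2, 8, 4, True, True))
```

In this model the same number of rows reaches the buffer in all three cases. What changes is how many uploads are issued and how many times the buffer is mapped, and the cache only removes the second cost.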

        This is also one of the major changes inherent in the design of Wayland; the clients push the damage to the compositor without any unnecessary round-trips, updates are always atomic, fast and only when required.



        • #5
          You can also compare your system's 2D performance to these Intel Ironlake numbers by simply running phoronix-test-suite benchmark 1112104-AR-INTELIRON40 from your system.
          Now that is a handy little feature, thanks Michael!



          • #6
            Originally posted by ickle
            Now that is a handy little feature, thanks Michael!
            For any result uploaded to OpenBenchmarking.org, you can pass that to the Phoronix Test Suite and it will automatically fetch the results, etc.
            Michael Larabel
            http://www.michaellarabel.com/
