AMD Catalyst vs. X.Org Radeon Driver 2D Performance


  • #31
    Well, Linux's 2D stuff has always been 'faster' than Windows', per se.

    But really nobody cares that much about 2D performance. In Vista, when you're running your composited desktop, you have zero hardware acceleration going into 2D rendering.

    That goes to show you how much it matters.

    With Linux 2D on a composited desktop I think it's more a matter of all the context changes that the drivers have to go through to convert the X Windows 2D driver's world to the Linux DRM/DRI managed world. You have to render the item off-screen, then capture the image, then convert the image to something that can be managed by the 3D drivers, then copy that image to a texture, and then render that texture as the image that we call the desktop.

    So many steps. At that point it doesn't really matter if you have a very fast CPU or very fast GPU or anything like that. It doesn't really matter that the conversion gets done fast, either. There can be thousands and thousands of CPU cycles wasted for each one of those steps... reading in instructions from main memory, loading them into cache, executing them, sucking in textures from memory, etc. Each time you do a context switch you're purging your cache, starting over, and wasting all sorts of CPU/GPU time.

    I mean RAM may seem fast, especially compared to disk, but your CPU/GPU will burn through thousands of wasted cycles waiting for information to come in from main memory or over that PCIe bus.


    The 'correct' way to manage all of that would be to render the application off-screen and have the output written directly to a 3D texture that is mapped to the 3D primitive that is then used as part of your desktop image.

    Something like that can be done in as close to a single operation as possible.
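
    To make the step counting concrete, here is a toy Python sketch, not real driver code. The step names are just labels lifted from the description above, `run_pipeline` is a made-up helper, and the model assumes every hand-off costs one full copy of the window image.

```python
# Toy model: count how many full-buffer copies each path needs.
# Step names are illustrative labels, not real X.org or DRM function names.

def run_pipeline(steps, pixels):
    """Pass the buffer through each step; every hand-off copies the whole image."""
    copies = 0
    buf = pixels
    for _name in steps:
        buf = list(buf)      # assumed cost model: one full copy per step
        copies += 1
    return buf, copies

MULTI_STEP = [
    "render off-screen",
    "capture image",
    "convert for 3D driver",
    "copy to texture",
    "render desktop",
]
DIRECT = ["render straight into texture", "render desktop"]

window = [0x336699] * (640 * 480)    # one 640x480 window, one int per pixel

_, multi_copies = run_pipeline(MULTI_STEP, window)
_, direct_copies = run_pipeline(DIRECT, window)
print(f"multi-step path: {multi_copies} full-buffer copies")
print(f"direct path:     {direct_copies} full-buffer copies")
```

    The real cost per step is not a plain memory copy, of course, but the shape of the problem is the same: the work scales with the number of hand-offs, not with how fast any single one is.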

    But the current driver model for Linux won't allow something like that. The X.org world and the Linux-DRM world are just too heavily split. They were never designed to work together very closely... instead you just assigned a hunk of the screen for X to render to, then assigned a smaller hunk of the screen for the OpenGL stuff to be rendered to. That's what the 'overlay' provides and it is fast, but it's ugly. That's how it's designed to work.

    ---------------------------------------

    You see, the trick with Linux right now is that you have two entirely different sets of drivers sharing the same single video card. You have the 2D X.org drivers and then the Linux-DRM/DRI-based 3D drivers.

    The X.org X server goes in and performs actions such as configuring PCI devices and modesetting outside of the Linux kernel's control.

    So you end up with situations where Linux is configuring PCI devices, X comes along and stomps on it, and your video card flakes out.

    ---------------------------------------


    So I suppose with Intel's UXA framework it will be much more efficient.

    Instead of worrying about getting the 2D drivers working better or porting the 2D X drivers to the 3D Linux-DRM world, they just rewrote the drivers from scratch and implemented the EXA API using the Linux-DRM 3D-related core.

    That way you end up with compatibility with current applications, but you render everything directly using the 3D engine. So instead of doing EXA in the 2D engine on the card you do it on the '3D' engine.

    That will probably actually end up being slower in benchmarks than just doing 2D-only with no composition, but it doesn't really matter because it'll be fast enough and it'll make it much easier to deal with the performance issues that matter... such as video playback acceleration, better composited desktops, a more stable/saner 1-driver design, faster 3D performance, etc.

    Comment


    • #32
      Sigh... Even on my 8MHz 68000 Atari ST, opening new windows would just POP onto the screen. It's amazing the amount of visible lag people are willing to put up with these days, and that's not even talking about the worst culprit of scrolling a browser window...

      2D performance affects *every* computer user, 3D performance only affects the minority of computer users who play 3D games. People have really got their priorities messed up.

      Comment


      • #33
        Most of EXA is done with 3D engines on r5xx and newer. In addition, UXA is completely useless for non-Intel. (I'd say it's completely useless, period, but it allows Intel stuff to do zero-copy EXA, which is useful for them.)

        DRI2 fixes a lot of things. So does KMS. Be on the lookout for those.

        Comment


        • #34
          Originally posted by highlandsun
          Sigh... Even on my 8MHz 68000 Atari ST, opening new windows would just POP onto the screen. It's amazing the amount of visible lag people are willing to put up with these days, and that's not even talking about the worst culprit of scrolling a browser window...

          2D performance affects *every* computer user, 3D performance only affects the minority of computer users who play 3D games. People have really got their priorities messed up.
          Well, you should pay attention to exactly what is causing those applications to not *pop* up on the screen. I doubt most of it has anything to do with EXA acceleration.

          If you're using Gnome:
          1. Add a "System Monitor: system load indicator" applet to your panel.
          2. Right click on it and go to preferences.
          3. Change the 'system monitor width' to something that is useful and change the colors to something that contrasts. Like leave the 'User' and 'Nice' colors blue, but make the 'System' color green and the 'IOwait' color red.

          You'll find that the majority of application start up time, including window draw, is going to be dominated by the CPU simply waiting on your system storage to read out its information.
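
          If you don't have the applet handy, roughly the same user/nice/system/iowait numbers can be pulled out of /proc/stat. This is a minimal Python sketch: the helper names `cpu_times` and `iowait_share` are my own, the two sample strings are made-up values rather than a real measurement, and it only looks at the first five counters on the aggregate `cpu` line (irq, softirq and the rest are ignored).

```python
def cpu_times(stat_text):
    """Return the first five aggregate CPU counters from /proc/stat text."""
    fields = ("user", "nice", "system", "idle", "iowait")
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":      # aggregate line, not cpu0/cpu1/...
            return dict(zip(fields, map(int, parts[1:6])))
    raise ValueError("no aggregate 'cpu' line found")

def iowait_share(before, after):
    """Fraction of the elapsed CPU ticks that went to iowait between two samples."""
    delta = {key: after[key] - before[key] for key in before}
    total = sum(delta.values())
    return delta["iowait"] / total if total else 0.0

# Made-up samples standing in for two reads of /proc/stat,
# one taken before launching an application and one after.
sample_before = "cpu  1000 10 300 5000 200 0 0 0 0 0\ncpu0 500 5 150 2500 100 0 0 0 0 0"
sample_after = "cpu  1100 10 350 5200 850 0 0 0 0 0\ncpu0 550 5 175 2600 425 0 0 0 0 0"

share = iowait_share(cpu_times(sample_before), cpu_times(sample_after))
print(f"iowait share while the application started: {share:.0%}")
```

          On a real system you would replace the sample strings with `open("/proc/stat").read()` taken before and after starting the application.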

          Linux applications use lots of little files in your file system. Lots of little configuration files, lots of little system libraries. They also spend a great deal of time polling various directories and files which are usually empty or missing. All of this causes a lot of I/O seek time. The drive spinning around looking at this or that directory.

          So it's not even a question of how fast your drive can read out information. It's all just blowing its time on seeking.
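
          A back-of-the-envelope sketch of why seeking dominates, in Python. All of the numbers here (500 files, 20 MB, 8 ms per seek, 50 MB/s) are assumptions picked for illustration, not measurements, and `load_time_ms` is a hypothetical helper that charges one seek per file.

```python
def load_time_ms(n_files, total_bytes, seek_ms, throughput_mb_s):
    """Split an estimated load time into its seek part and its transfer part."""
    seek_part = n_files * seek_ms                      # assumed: one seek per file
    transfer_part = total_bytes / (throughput_mb_s * 1_000_000) * 1000
    return seek_part, transfer_part

# Assumed workload: 500 small files, 20 MB in total,
# 8 ms per seek and 50 MB/s sequential throughput.
seek_total, transfer_total = load_time_ms(
    n_files=500, total_bytes=20_000_000, seek_ms=8, throughput_mb_s=50
)
print(f"seeking:      {seek_total:.0f} ms")
print(f"transferring: {transfer_total:.0f} ms")

# Doubling the drive's throughput only shrinks the small transfer part.
_, faster_transfer = load_time_ms(500, 20_000_000, 8, 100)
print(f"total at 50 MB/s:  {seek_total + transfer_total:.0f} ms")
print(f"total at 100 MB/s: {seek_total + faster_transfer:.0f} ms")
```

          Under those assumptions the seek part is ten times the transfer part, which is why a drive with better read throughput does little for start up time.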

          Windows, for example, has the registry. While the registry sucks for many different reasons, the thing is that it's a size-optimized, fast database that is stored almost entirely in RAM almost all the time. So when your Windows applications start up they read in Windows system files, which are already loaded into RAM, and then get their configuration from that fast little database. There is much, much less I/O seek time for most applications. This is one of the reasons why things like IE, MS Office, or whatever start up so much quicker in Windows than the open source alternatives do in Linux.

          And if you try to optimize your system by getting rid of Gnome and stripping it down to 'lighter' desktops that conserve RAM, you can actually make the problem worse, because when you're running Gnome and Gnome-only applications all the dependencies and libraries are already read into RAM from when you logged into your system, thus reducing the amount of load your system goes under when you actually start up individual applications. (And no... Firefox and OpenOffice.org are not Gnome applications and thus have their own set of libs and whatnot that they read from the drives at start up.)

          -------------------------------


          Then if you get rid of the I/O wait and seek times that Linux desktop applications tend to get penalized for, you still have system performance issues with scheduling that Linux has to deal with.

          Going back to what I was saying in my last post about "context changes"... The main memory in your system is very slow. It's much faster than disk, hundreds of times faster, but it's still much slower than your L2 and L1 cache. Each time the system needs to perform a different sort of processing, those caches need to be flushed and replaced with new information. That's a context switch and it's a big performance penalty, but it's necessary to maintain the illusion that you're running a multitasking computer.

          So Linux is heavily optimized for performance. Linux's goal is to get processing done as fast as possible. This means that kernel developers are going to be very careful about maintaining the L1/L2 caches for as long as possible.

          So if something is being processed it's usually best to let it get finished before you move onto the next thing.

          In a very performance-oriented environment you can actually end up having very lousy user interactivity. That is, the amount of time the system takes to respond to user input can be huge, making the system feel very sluggish and irritating to use.

          In a system optimized for 'realtime' performance... that is, performance where you have a set/required latency that the system has to conform to... you can dramatically increase the level of user interactivity. You can optimize the system to not drain your sound card's buffers and so avoid audio stuttering and other artifacts, make X Windows much more responsive, etc... but you'd actually end up REDUCING overall performance.

          That is, by increasing the realtime-like aspect of Linux you're actually going to end up slowing things down, slightly. That means you'll score lower on benchmarks, but you'll have a more responsive system.
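
          Here is a small Python simulation of that trade-off. It is a toy round-robin model, not how the Linux scheduler actually works: `round_robin` is a made-up function, the 0.5 ms switch cost is an assumed stand-in for refilling the caches, and both jobs are assumed to need the same 100 ms of CPU.

```python
def round_robin(work_ms, n_tasks, slice_ms, switch_cost_ms):
    """Run n_tasks equal jobs on one CPU; return (total_time_ms, switches)."""
    remaining = [work_ms] * n_tasks
    clock = 0.0
    switches = 0
    current = None
    while any(r > 0 for r in remaining):
        for i in range(n_tasks):
            if remaining[i] <= 0:
                continue
            if current is not None and current != i:
                clock += switch_cost_ms      # assumed cost of refilling the caches
                switches += 1
            current = i
            run = min(slice_ms, remaining[i])
            clock += run
            remaining[i] -= run
    return clock, switches

def worst_wait(n_tasks, slice_ms, switch_cost_ms):
    """Longest time a task can sit waiting before it gets the CPU again."""
    return (n_tasks - 1) * (slice_ms + switch_cost_ms)

# Throughput-oriented: let each job run to completion.
batch_total, batch_switches = round_robin(100, 2, slice_ms=100, switch_cost_ms=0.5)
# Latency-oriented: hop between the jobs every 5 ms.
snappy_total, snappy_switches = round_robin(100, 2, slice_ms=5, switch_cost_ms=0.5)

print(f"long slices:  {batch_total} ms total, {batch_switches} switches, "
      f"worst wait {worst_wait(2, 100, 0.5)} ms")
print(f"short slices: {snappy_total} ms total, {snappy_switches} switches, "
      f"worst wait {worst_wait(2, 5, 0.5)} ms")
```

          With short slices the same work takes longer overall, but neither job ever waits long for the CPU, which is the lower-benchmark, more-responsive result described above.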

          ------------------------

          It's like this:

          Say you're doing your chores at home. You have two things you need to accomplish:
          Doing laundry, sorting clothes, and whatnot, in the basement...
          Raking leaves in the backyard.

          So the fastest way to get it done would be to concentrate on one task until it's finished then work on the next.

          However, you have a wife that is angry at you for whatever reason. She goes downstairs and sees that you're not doing the laundry, so she yells at you to do the laundry. So you run downstairs and start that.

          So then she goes up to the kitchen and looks out at the back yard and sees that you're not finished raking the leaves, so she yells at you about that, and thus you end up running upstairs and start raking leaves...

          Then she goes into the basement to get some diet coke and then yells at you about that.


          So you see, if you're highly reactive and you jump quickly from one job to another, then while that can make things responsive it will actually take much longer to get anything done.

          ----------------------------------

          Of course the best way is to simply eliminate the jumping around.

          If you can figure out how to do the laundry AND the yard using a single set of operations then you'll win, even if somebody timed you and saw that it took you longer to do the yard or the laundry than it did before. You win because you've eliminated the time it takes to run in the house and up and down the stairs.


          With 2D drivers vs 3D drivers it's not really even a question of what 'core' they run on. It's the fact that they exist in such different worlds and are not compatible with each other. Same thing with video playback and decoding acceleration. Anything that requires GPU acceleration and drawing things out to the display.


          -------------------------------

          Or you can just use tricks to make it _seem_ like you're doing things faster, and simply not do things faster at all. This is what composition does for your Linux desktop, or any desktop.

          --------------------------------


          Traditionally speaking (this is a couple of years old and maybe it's changed with recent versions of OS X or Vista), if you were to run system benchmarks to compare display performance of Linux vs Windows vs OS X, you'd find that, generally speaking, for 2D application performance Linux is fastest, Windows is next, and OS X is slowest.

          But if you asked the users they'd say the exact opposite and say that OS X is the fastest and provides the best visual quality, Windows is next, and Linux is last.

          This is because with Linux they saw much more redraw time and visual tearing and all sorts of other ugliness, whereas with OS X they would see nothing but a solid and pretty-looking UI. The relative speed didn't matter so much.

          This is because OS X had a composited desktop.

          Instead of racing to keep window rendering up with the display, like Linux did, it simply ignored that. When people move windows about on the desktop they are not causing the windows to redraw and all that... they are simply moving a single solid image around... a square with an image of the application painted on it.

          And if the person tried to move the window too fast... it simply didn't go faster. In Linux it would do this sort of half-redraw and hop from one side of the display to the other, maybe leaving pieces of the image on top of the desktop or on a window underneath the one you're moving, but with OS X it simply lagged slightly. The mouse moves slower in OS X then, and the window moves slower than it does in Linux (in this example); however, because the window stays solid and pretty, nobody notices it.
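
          The difference can be sketched in a few lines of Python. This is a toy model with made-up names (`Window`, `move_composited`, `move_uncomposited`), not how any real compositor or X server is written, and it makes the simplifying assumption that in the uncomposited case every move exposes every other window.

```python
class Window:
    """A window whose contents live in a cached off-screen image."""
    def __init__(self, name, x=0, y=0):
        self.name = name
        self.x, self.y = x, y
        self.redraws = 0
        self.buffer = None
        self.redraw()                # initial paint

    def redraw(self):
        # Stand-in for the application repainting itself (the expensive part).
        self.redraws += 1
        self.buffer = f"<image of {self.name}>"

def move_composited(window, x, y):
    """Composited path: only the position changes, the cached image is reused."""
    window.x, window.y = x, y

def move_uncomposited(window, others, x, y):
    """Classic path: the move uncovers other windows, which must repaint."""
    window.x, window.y = x, y
    for other in others:
        other.redraw()               # simplification: every other window is exposed

# Drag a terminal across a browser 100 times with each approach.
term, browser = Window("terminal"), Window("browser")
for step in range(100):
    move_composited(term, step, step)
composited_redraws = term.redraws + browser.redraws

term2, browser2 = Window("terminal"), Window("browser")
for step in range(100):
    move_uncomposited(term2, [browser2], step, step)
uncomposited_redraws = term2.redraws + browser2.redraws

print(f"composited:   {composited_redraws} redraws")
print(f"uncomposited: {uncomposited_redraws} redraws")
```

          In the composited case the applications are never asked to repaint during the drag, so there is nothing to tear or half-draw even when the move itself is slow.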

          So even though in Linux the actual operations of moving windows and redrawing were much, much faster... it still loses because it just can't keep up on visual quality. If you're slower and pretty then people will think you're better than something that is fast and ugly.

          Within reason, of course.

          -------------------------------------------


          So the challenges for Linux are to:

          * Eliminate the 2-driver model... the X.org DDX vs Linux-DRM/Mesa-DRI drivers. Having two different and inconsistent sets of drivers that produce two different sets of graphics, which need all sorts of image conversion steps to work with each other, is just stupid.

          Even if 2D on DRI is much slower than 2D on the X.org DDX, you can still win. Having them unified makes it much easier to do fancy graphics things like animations, vector-based graphics, faster media playback, etc.

          * Better 3D application compatibility, better performance for media decoding, more stable and consistent performance. In other words: Simply higher quality drivers.


          ----------------------------------------


          I mean seriously... do you really go out and spend $150+ on a nice video card to make your GTK combo boxes draw 2msec faster?!

          Or do you want something that looks nicer and provides better OpenGL and media acceleration?

          Comment


          • #35
            Originally posted by d2kx
            Xserver 1.6 has extremely improved EXA performance, fglrx wouldn't have a chance with that, especially with Composite/RENDER
            EXA text rendering should indeed be several times faster in 1.6 thanks to the glyph cache, but I don't remember there being as significant improvements in other areas. Just trying to manage expectations.

            Also, I think it's indeed important to keep in mind that the motivation for EXA was compositing, whereas XAA is excessively optimized for non-compositing.

            and who knows what will be when the UXA stuff merges to EXA with Xserver 1.7...
            I'm not sure something like that will ever happen, but there's no need to wait anyway - the same benefits (storing pixmap contents in buffer objects, avoiding the EXA pixmap migration code overhead) can be had with EXA already if the driver so chooses.

            Comment


            • #36
              I mean seriously... do you really go out and spend $150+ on a nice video card to make your GTK combo boxes draw 2msec faster?!
              That would be just as reasonable as getting a new $150 graphics card to get 2 fps more. Yet see the hordes of people doing just that.

              Comment


              • #37
                You're missing the point. Even a $30 video card ought to be able to paint its screen instantaneously. And I appreciate your taking the time to post such a detailed and lengthy response, but I'll just comment that I ported X11R1 to the Apollo Domain/OS; I know very well what's involved in getting good performance out of X and a display driver on a multi-tasking OS. It's been almost 25 years since then, and the user experience has only gotten slower.

                Comment


                • #38
                  Originally posted by DoDoENT
                  It seems that commits to this branch are quite old (April 2008). Are you sure this isn't implemented in Ubuntu intrepid's default radeon driver? Because if it is, then what do I have to add to my xorg.conf to enable it? With stock configuration I get 30-40 minutes longer battery life with fglrx than with radeon driver.
                  One month ago I had to apply them by hand... check Xorg.log to see if it is enabled.

                  Comment


                  • #39
                    Originally posted by drag
                    Well you should pay attention to exactly what is causing those applications to not *pop* up on the screen. I doubt the majority of it has to do with anything that has to do with EXA acceleration.

                    snip to remove the "The text that you have entered is too long (10403 characters). "

                    Going back to what I was saying in my last post about "context changes"... The main memory in your system is very slow. It's much faster than disk, hundreds of times faster, but it's still much slower than your L2 and L1 cache. Each time the system needs to perform a different sort of processing, those caches need to be flushed and replaced with new information. That's a context switch and it's a big performance penalty, but it's necessary to maintain the illusion that you're running a multitasking computer.

                    snip

                    So if something is being processed it's usually best to let it get finished before you move onto the next thing.

                    In a very performance-oriented environment you can actually end up having very lousy user interactivity. That is, the amount of time the system takes to respond to user input can be huge, making the system feel very sluggish and irritating to use.

                    In a system optimized for 'realtime' performance... that is, performance where you have a set/required latency that the system has to conform to... you can dramatically increase the level of user interactivity. You can optimize the system to not drain your sound card's buffers and so avoid audio stuttering and other artifacts, make X Windows much more responsive, etc... but you'd actually end up REDUCING overall performance.
                    That is, by increasing the realtime-like aspect of Linux you're actually going to end up slowing things down, slightly. That means you'll score lower on benchmarks, but you'll have a more responsive system.

                    ------------------------

                    It's like this:

                    Say you're doing your chores at home. You have two things you need to accomplish:
                    Doing laundry, sorting clothes, and whatnot, in the basement...
                    Raking leaves in the backyard.

                    So the fastest way to get it done would be to concentrate on one task until it's finished then work on the next.

                    However, you have a wife that is angry at you for whatever reason. She goes downstairs and sees that you're not doing the laundry, so she yells at you to do the laundry. So you run downstairs and start that.

                    So then she goes up to the kitchen and looks out at the back yard and sees that you're not finished raking the leaves, so she yells at you about that, and thus you end up running upstairs and start raking leaves...

                    Then she goes into the basement to get some diet coke and then yells at you about that.


                    So you see, if you're highly reactive and you jump quickly from one job to another, then while that can make things responsive it will actually take much longer to get anything done.

                    ----------------------------------

                    Of course the best way is to simply eliminate the jumping around.

                    If you can figure out how to do the laundry AND the yard using a single set of operations then you'll win, even if somebody timed you and saw that it took you longer to do the yard or the laundry than it did before. You win because you've eliminated the time it takes to run in the house and up and down the stairs.


                    With 2D drivers vs 3D drivers it's not really even a question of what 'core' they run on. It's the fact that they exist in such different worlds and are not compatible with each other. Same thing with video playback and decoding acceleration. Anything that requires GPU acceleration and drawing things out to the display.


                    -------------------------------

                    Or you can just use tricks to make it _seem_ like you're doing things faster, and simply not do things faster at all. This is what composition does for your Linux desktop, or any desktop.

                    --------------------------------


                    Traditionally speaking (this is a couple of years old and maybe it's changed with recent versions of OS X or Vista), if you were to run system benchmarks to compare display performance of Linux vs Windows vs OS X, you'd find that, generally speaking, for 2D application performance Linux is fastest, Windows is next, and OS X is slowest.

                    But if you asked the users they'd say the exact opposite and say that OS X is the fastest and provides the best visual quality, Windows is next, and Linux is last.

                    This is because with Linux they saw much more redraw time and visual tearing and all sorts of other ugliness, whereas with OS X they would see nothing but a solid and pretty-looking UI. The relative speed didn't matter so much.

                    This is because OS X had a composited desktop.

                    Instead of racing to keep window rendering up with the display, like Linux did, it simply ignored that. When people move windows about on the desktop they are not causing the windows to redraw and all that... they are simply moving a single solid image around... a square with an image of the application painted on it.

                    snip

                    So even though in Linux the actual operations of moving windows and redrawing were much, much faster... it still loses because it just can't keep up on visual quality. If you're slower and pretty then people will think you're better than something that is fast and ugly.

                    Within reason, of course.

                    -------------------------------------------


                    So the challenges for Linux are to:

                    * Eliminate the 2-driver model... the X.org DDX vs Linux-DRM/Mesa-DRI drivers. Having two different and inconsistent sets of drivers that produce two different sets of graphics, which need all sorts of image conversion steps to work with each other, is just stupid.

                    Even if 2D on DRI is much slower than 2D on the X.org DDX, you can still win. Having them unified makes it much easier to do fancy graphics things like animations, vector-based graphics, faster media playback, etc.

                    * Better 3D application compatibility, better performance for media decoding, more stable and consistent performance. In other words: Simply higher quality drivers.


                    ----------------------------------------


                    I mean seriously... do you really go out and spend $150+ on a nice video card to make your GTK combo boxes draw 2msec faster?!

                    Or do you want something that looks nicer and provides better OpenGL and media acceleration?
                    in essence then, drag, perhaps you didn't realise it, but you're asking for a return to the AmigaOS way of doing things, with its microkernel, end-user realtime message passing, and co-processor handling of the different parts of the data chain....

                    sounds like a plan, so how do we encourage all the world's Linux/open code devs to get the current mammoth-sized Linux executables right down to AmigaOS/AROS http://aros.sourceforge.net/ microscopic size levels, re-implement the most basic "keep and reuse a library in memory until a flush cleanup is sent" behaviour, and put this old-is-new-again message passing at the core while doing so?

                    perhaps Linux needs to finally take this old AROS bounties concept and create its own central bounties program http://www.power2people.org/projects.html

                    as it stands now, only the commercial yearly GSoC etc. seems to bring out partial advances; a real ongoing bounties plan all year round might be the best option at this moment in time.... where anyone can contribute, world business and single user alike....

                    for the ST guy: if you want to see fast window opening, then this AROS on an old 3.1 GHz Athlon64 X2 6000+ is a fun reminder of how AmigaOS was far better

                    This is the latest version of Paolo Besser Aros distribution. It runs like a charm when installed on a hard disk and shows other OS suppliers how things can ...

                    Last edited by popper; 21 January 2009, 07:44 AM.

                    Comment


                    • #40
                      Originally posted by drag
                      So I suppose with Intel's UXA framework it will be much more efficient.

                      Instead of worrying about getting the 2D drivers working better or porting the 2D X drivers to the 3D Linux-DRM world, they just rewrote the drivers from scratch and implemented the EXA API using the Linux-DRM 3D-related core.
                      Sorry but this is bullshit.
                      UXA and EXA are both designed to accelerate certain features through the 3D engines.
                      The only difference is that UXA is a modified EXA (not a rewrite) better suited to shared-memory GPUs.

                      In fact it was often suggested to merge UXA back into EXA again, to not have so much duplicated code.

                      - Clemens

                      Comment
