Phoronix: Is AMD's New 2D Acceleration Architecture Still Slow?
Earlier this week AMD released the Catalyst 10.6 driver, which on the Linux side finally enabled their new 2D acceleration architecture by default, added official support for Red Hat Enterprise Linux 5.5, and formalized their OpenGL 3.3/4.0 support. Since the release of the Catalyst 10.6 Linux driver, we have been running a new set of tests on the new ATI 2D acceleration architecture, but the results may not be what you expect when compared to the open-source ATI Linux driver.
The architecture was designed to circumvent an XAA operation that's bloody slow and is triggered on unminimizing and other operations, where it incurs a visible delay. PTS doesn't test that, but it results in a noticeably better user experience. FPS isn't everything; with GUIs, responsiveness is much more important.
Originally Posted by bridgman
Agreed, the new 2D stack is so much more usable than the old XAA 2D stack.
I think it indicates that those benchmarks are not targeted correctly, since with the exception of the Render benchmark, the new stack lost everywhere.
This is so contrary to actual working experience.
I mean, for the last two releases even Kano (the biggest pessimist on the forums) seemed positive about the ATI drivers.
I can only say that despite these results everything still feels faster with the 10.6 drivers. And the minimize/maximize of windows with desktop effects on (KWin) is definitely smoother than what it used to be on my system with the OS drivers. I am using the integrated graphics on my 785G motherboard though, and on faster graphics cards maybe it's different. I do note that dragging windows around (again with desktop effects on) is a bit jerky, but everything else seems smooth and responsive enough. Without desktop effects all actions look and feel very smooth and responsive. That's my experience anyway.
Oh, I do lose window borders when activating/deactivating desktop effects, leaving me no choice but to terminate the session and restart, but that's the only downside to the Catalyst drivers so far.
Please don't use "QGears - Image Scaling" for 2D benchmarks.
Qt isn't able to leverage the XRender extension for image scaling, so the scaling isn't done in hardware at all.
This is the reason why the "old" Catalyst is faster in this test: Qt forces the driver out of its accelerated state, copying data back and forth between RAM and video RAM, which is painful.
I haven't seen any benchmarks that map particularly well onto the driver workload you see from a modern compositing UI and toolkit-based apps. I'm not sure if it's as simple as scripting up a typical interaction (e.g. open a browser, browse an on-disk HTML page, scroll around, minimize, maximize), but that might be a good place to start. Worst case, I guess some benchmarks need to be implemented using the latest toolkits?
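Scripting such an interaction is roughly possible by driving xdotool from a small wrapper. A minimal sketch, assuming xdotool is installed and a window with "Firefox" in its title exists; the dry_run flag is a convenience added here (not part of xdotool) so the command sequence can be inspected without a running X server:

```python
import subprocess

def interaction_script(window_name, dry_run=False):
    """Run (or just list) an xdotool command sequence for a typical
    desktop interaction: focus a window, scroll, minimize, re-raise."""
    steps = [
        # Find the window by title and activate it.
        ["xdotool", "search", "--name", window_name, "windowactivate"],
        # Scroll down a few pages, as a user reading a document would.
        ["xdotool", "key", "--clearmodifiers", "Next", "Next", "Next"],
        # Scroll back up.
        ["xdotool", "key", "--clearmodifiers", "Prior", "Prior", "Prior"],
        # Minimize, then raise the window again (the slow path on old XAA).
        ["xdotool", "search", "--name", window_name, "windowminimize"],
        ["xdotool", "search", "--name", window_name, "windowactivate"],
    ]
    if not dry_run:
        for cmd in steps:
            subprocess.run(cmd, check=True)
    return steps
```

Wrapping each step with timing (or just running the whole sequence under `time`) would give a crude responsiveness number for exactly the minimize/maximize path the synthetic benchmarks miss.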
Perversely enough, performance while drawing things on the screen seems to have less correlation with user experience than all the stuff that happens off screen.
"Is AMD's New 2D Acceleration Architecture Still Slow?"
It's meant in the sense that it might still be as slow as the Catalyst driver was for years. I agree it can be misleading at first.
Originally Posted by frische
I have to agree. Michael, you should at least mention that the primary reason people think this driver is faster is because the much-maligned minimize/maximize delay has been fixed. It is now possible to have a maximized Rhythmbox window in a GNOME desktop, with compositing enabled, and minimize/maximize it from the taskbar several times per second (if you're quick with the mouse). Before, it would take *several seconds* to perform *one* maximize after you had minimized it.
Originally Posted by rohcQaH
The other really slow operation before was resizing windows. On any Catalyst version before 10.6, try to resize a default gnome-terminal to anything but 80x25. Wait. Wait. Wait. OK, there it is -- finally!
I also had a rendering error on 10.5 and earlier which no longer appears. Sometimes, when a GTK or Qt4 application started maximized, all the controls in the window would be completely missing; only the background color was drawn. To see them, I had to minimize the window, maximize it, and wait the 2-3 seconds it took to restore. I haven't been able to reproduce this at all on 10.6. It was easy to reproduce using Rhythmbox and Quassel, for example.
I do want to remark that the text-rendering regression the quantitative tests showed is actually noticeable. The smoothness of scrolling a browser or IRC window was user-perceptibly better on 10.5 and earlier. In exchange for fixing rendering correctness and eliminating some rather serious performance problems, text rendering is now just barely acceptable performance-wise. There's a bit of a shimmering or stuttering effect when scrolling any text window containing a *lot* of glyphs (such as a maximized IRC window full of long sentences on a 1920x1200 monitor). It's more noticeable in Google Chrome (on Phoronix or Gmail) than in Quassel, though.
As for the other regressions, well, I think my HD5970 is proving fast enough to make up for the software slowness. I can't perceive a difference, because, like bridgman said, the tests measure operations that are not commonly done in standard desktop apps.
It would be nice to see a threshold of operations/sec or FPS on your tests, Michael -- a red line across each bar -- representing the minimum point below which the average user would easily notice performance problems in standard desktop apps.
For 3D games, this is really easy: for perfection, you would set the red line at 60 FPS; for playability, at 30 FPS. The tests that measure operations/sec for things like radio buttons or combo boxes, though, could probably have the red line set right at the bottom (a few hundred operations per second) as the user-perceptible barrier. I usually click about 0-1 radio buttons per second, so if the driver can draw 4700 of them per second, that's more than fine.
I think it's just plain wrong the way it is.