It's a lot trickier than you might think. As I pointed out, these techniques have been used for a long time in GPUs and in client-side rendering libraries, but they also require some help from the application to deliver any benefit.
Back to my browser example - Firefox has decided to start rendering full frames and pushing them to the surface regardless of how much of the screen was altered. When Firefox is fullscreen, it's going to be pushing full-resolution frames at 60 Hz non-stop, whether or not anything has changed at all. So if the OS wants to reduce power or latency by controlling bandwidth, the DRM layer is going to need, at the very least, some frame-to-frame comparison to detect which regions to send. The same goes if the desktop compositor is sending in full frames all the time regardless of how much has changed. (My 5-year-old Panther Point i965 supports Display Link Power Management and Framebuffer Compression to minimise bandwidth so the DLPM can engage as long as possible, and I assume it differences the frames as well as compressing the differences.)
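For what it's worth, here's a minimal sketch in C of the kind of frame-to-frame comparison I mean: walk both frames in tiles and collect the tiles that changed as damage rectangles. The 32-bit RGBA row-major layout and the 64x64 tile size are my assumptions for illustration, not how any particular driver actually does it.

```c
/* Illustrative sketch: tile-based frame differencing to recover damage
 * rectangles from a client that always submits full frames. Buffer layout
 * (32-bit RGBA, row-major) and the 64x64 tile size are assumptions. */
#include <stdint.h>
#include <string.h>

#define TILE 64

struct rect { int x, y, w, h; };

/* Compare one tile of two frames; returns nonzero if any pixel changed. */
static int tile_differs(const uint32_t *prev, const uint32_t *cur,
                        int stride_px, int x, int y, int w, int h)
{
    for (int row = y; row < y + h; row++) {
        const uint32_t *a = prev + (size_t)row * stride_px + x;
        const uint32_t *b = cur  + (size_t)row * stride_px + x;
        if (memcmp(a, b, (size_t)w * sizeof(uint32_t)) != 0)
            return 1;
    }
    return 0;
}

/* Walk the frame in tiles and collect the dirty ones as rectangles.
 * Returns the number of dirty tiles found (may exceed max_rects). */
static int diff_frames(const uint32_t *prev, const uint32_t *cur,
                       int width, int height,
                       struct rect *out, int max_rects)
{
    int n = 0;
    for (int ty = 0; ty < height; ty += TILE) {
        for (int tx = 0; tx < width; tx += TILE) {
            int w = (tx + TILE > width)  ? width  - tx : TILE;
            int h = (ty + TILE > height) ? height - ty : TILE;
            if (tile_differs(prev, cur, width, tx, ty, w, h)) {
                if (n < max_rects)
                    out[n] = (struct rect){ tx, ty, w, h };
                n++; /* keep counting so the caller can fall back to full damage */
            }
        }
    }
    return n;
}
```

Merging adjacent dirty tiles into larger rectangles, and falling back to full-frame damage when the list overflows, would be the obvious next refinements.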
One of the reasons I've never used Unity or GNOME 3+ for any significant time is that their compositors use far more CPU than KWin, Xfce or Compton. I haven't measured it, but if they're burning 14% of one core and keeping the GPU warm when literally nothing is changing on my 5520x2160 dual-monitor screen, then they are very plainly not limiting themselves to updating only the changed screen regions.
On the application front, Firefox is definitely stepping in the wrong direction. They're going to screw users' mobile battery life by pushing all that data to the compositor. Or maybe they assume the compositor will bypass composition for them, the way it does for a video player or a game... hard to say. But on top of that, they're also moving to a layout engine that uses 3-4x the compute to get 2x the performance, so that is also going to wreck battery life.
It'll be interesting to see how DRM damage rectangles get properly leveraged by applications. I really look forward to it, because there are a lot of gains to be made if the display stack and app developers are prepared to educate themselves and do the hard work to make it deliver results. Firefox is headed in the wrong direction, but an example of an app that's about as un-optimal as can be is Skype. Its notification area icon updates thousands (!) of times per second as you log in, and for a while after, driving your compositor into a meltdown. If you really want to lock up your UI, try disconnecting the internet a fraction of a second after you start Skype, and for fun leave top running so you can watch your compositor doing backflips.
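For applications that do want to cooperate, some of the plumbing already exists on the EGL side: the EGL_KHR_swap_buffers_with_damage extension lets a client tell the stack exactly which rectangles changed instead of implying the whole surface did. A hedged sketch - the display and surface setup is elided, and the fallback behaviour shown is my own choice, not mandated by the spec:

```c
/* Sketch of an app handing damage hints down the stack via the
 * EGL_KHR_swap_buffers_with_damage extension. `dpy` and `surf` are
 * assumed to be an already-initialised display and window surface. */
#include <EGL/egl.h>
#include <EGL/eglext.h>
#include <string.h>

static void swap_with_damage(EGLDisplay dpy, EGLSurface surf,
                             const EGLint *rects, EGLint n_rects)
{
    static PFNEGLSWAPBUFFERSWITHDAMAGEKHRPROC swap_damage;
    if (!swap_damage) {
        const char *exts = eglQueryString(dpy, EGL_EXTENSIONS);
        if (exts && strstr(exts, "EGL_KHR_swap_buffers_with_damage"))
            swap_damage = (PFNEGLSWAPBUFFERSWITHDAMAGEKHRPROC)
                eglGetProcAddress("eglSwapBuffersWithDamageKHR");
    }
    if (swap_damage && n_rects > 0)
        swap_damage(dpy, surf, rects, n_rects); /* only the dirty regions */
    else
        eglSwapBuffers(dpy, surf);              /* fall back: full frame */
}
```

Per my reading of the extension spec, each rectangle is x, y, width, height in surface coordinates with a bottom-left origin, so an app repainting only a small status area could pass something like { 0, 0, 256, 64 } and leave the rest of the surface alone.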
Yes, Skype is rendering the entire UI nonresponsive... but the compositor is equally to blame for actually rendering all those redundant updates off-screen instead of discarding the occluded ones. It gets 16+ updates to the exact same rectangle every single frame, and yet it doesn't choose to ignore the first 15 redundant events. If your GPU saw 15 updates to the same rectangle, it would simply discard the occluded ones.
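To be concrete about what "ignoring the first 15 redundant events" could look like, here's an illustrative C sketch of a compositor-side damage queue that coalesces duplicate and fully-occluded rectangles and repaints each surviving region once per frame. The structures and names are mine, not taken from any real compositor:

```c
/* Sketch: queue damage events as they arrive, collapse duplicates and
 * occluded entries, and repaint each surviving region once per vblank. */
#include <stdbool.h>

struct rect { int x, y, w, h; };

static bool rect_contains(struct rect outer, struct rect inner)
{
    return inner.x >= outer.x && inner.y >= outer.y &&
           inner.x + inner.w <= outer.x + outer.w &&
           inner.y + inner.h <= outer.y + outer.h;
}

#define MAX_PENDING 64
static struct rect pending[MAX_PENDING];
static int n_pending;

/* Called for every damage event a client posts. */
void queue_damage(struct rect r)
{
    /* Drop the event if something pending already covers it. */
    for (int i = 0; i < n_pending; i++)
        if (rect_contains(pending[i], r))
            return;
    /* Evict pending entries the new rect fully covers. */
    int kept = 0;
    for (int i = 0; i < n_pending; i++)
        if (!rect_contains(r, pending[i]))
            pending[kept++] = pending[i];
    n_pending = kept;
    if (n_pending < MAX_PENDING)
        pending[n_pending++] = r;
    /* (a real compositor would merge overflow into one bounding box) */
}

/* Called once per frame: repaint each surviving region exactly once. */
void flush_damage(void (*repaint)(struct rect))
{
    for (int i = 0; i < n_pending; i++)
        repaint(pending[i]);
    n_pending = 0;
}
```

With something like that in place, Skype's thousands of identical icon updates collapse into a single pending rectangle, repainted once at the next frame instead of sixteen times.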
Why does the GPU discard them? Because someone cared and took the time to make that optimisation, and it made their product more competitive as a result.
How do we get everyone at all levels of the stack to collaborate and make the effort to implement these optimisations - because they really do need to be considered at all levels? I have no clue. I don't think most developers care to understand the whole stack, to worry about what pixels are being pushed in what order and whether any redundant work is being done. That's basically the wheelhouse of those involved with rasterisation libraries for fonts, widgets and games. I know my own attitude many times has been along the lines of "It runs nicely on my workstation, on my iPhones, on my ICS tablet, and hardware just keeps getting faster, so I'd rather focus on features than tweaking."