On Thursday, Keith Packard published a rare post on his KeithP.com blog about Sandy Bridge LLC caching.
"What I wanted to know was how caching backbuffers affects application performance. The reason for this question is that when the driver uses page flipping to display a new frame, we force the new scanout buffer to be uncached because the display hardware only scans out from main memory. However, when we go back to using this buffer as the next back-buffer, we don’t turn the caching bits back on as that requires writing GTT entries to change caching modes."
Keith then wrote a kernel patch that can select between three behaviors: keeping the existing code, flipping the caching mode back and forth, or flushing the buffer to memory without disabling caching. His blog post describes these modes in more detail. He ran some glxgears and Nexuiz benchmarks with the patch. For the Nexuiz benchmark on Sandy Bridge with the different caching modes, he found: "None significantly faster, none significantly slower."
This means that swapping via page-flipping and swapping via copying has a large change in main memory access patterns during rendering — page flipping applications use an uncached back buffer while copying applications use a cached back buffer.
He ended his blog post with, "So, for these two tests, caching has no positive effect on overall rendering performance. Obviously, I need to collect data from more applications to see if the effect is general. I sure hope so, because the alternative will be to find some heuristic to direct when to enable caching."
Having a simple patch to apply against the Intel DRM driver from the Linux 3.5 kernel, and craving some new Linux benchmarks now that I returned from Munich last night, I ran some more OpenGL tests this morning on a Sandy Bridge notebook.
The results are from the stock Linux 3.5 kernel configuration and from kernels built with Keith's patch from his blog post, adjusting the JUST_FLUSH and FLIP_CACHING defines to hit the different code-paths for the caching modes. The test system was a Core i5 "Sandy Bridge" notebook running this morning's Ubuntu 12.10 development snapshot with the modified stable Linux 3.5 kernels, plus today's Git master of Mesa 8.1-devel, xf86-video-intel 2.20.2, and libdrm.
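For readers unfamiliar with how a pair of compile-time defines can steer a driver down different code-paths, here is a rough sketch in C. It is not Keith's patch: only the names JUST_FLUSH and FLIP_CACHING come from it, while the 0/1 values, the precedence between the two defines, the enum, and both helper functions are assumptions made purely for illustration.

```c
/*
 * Illustrative sketch only, NOT the actual kernel patch.
 * JUST_FLUSH and FLIP_CACHING are the define names mentioned above;
 * their 0/1 values, their precedence, and everything else here are
 * hypothetical.
 */
#ifndef JUST_FLUSH
#define JUST_FLUSH 0      /* assumed default: off */
#endif
#ifndef FLIP_CACHING
#define FLIP_CACHING 0    /* assumed default: off */
#endif

/* Hypothetical names for the three behaviors described in the article. */
enum flip_cache_mode {
    MODE_STOCK_UNCACHED,  /* existing code: buffer stays uncached after a flip */
    MODE_FLIP_CACHING,    /* caching switched back on for the next back buffer */
    MODE_JUST_FLUSH       /* buffer stays cached, flushed to memory for scanout */
};

/* Pick the behavior at compile time, as a driver built with these defines might. */
static enum flip_cache_mode selected_mode(void)
{
#if JUST_FLUSH
    return MODE_JUST_FLUSH;
#elif FLIP_CACHING
    return MODE_FLIP_CACHING;
#else
    return MODE_STOCK_UNCACHED;
#endif
}

/* Human-readable label for a mode, e.g. for tagging benchmark runs. */
static const char *mode_name(enum flip_cache_mode mode)
{
    switch (mode) {
    case MODE_JUST_FLUSH:
        return "just-flush";
    case MODE_FLIP_CACHING:
        return "flip-caching";
    default:
        return "stock-uncached";
    }
}
```

In this sketch, building with `-DFLIP_CACHING=1` or `-DJUST_FLUSH=1` would select one of the alternate paths, and building with neither leaves the stock behavior. Each kernel variant benchmarked here corresponds to one such build.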
Here are these quick benchmarks:
The results seem just like Keith's results: the caching was of no benefit to the overall OpenGL performance on the Intel Core i5 Sandy Bridge.