Eric Anholt worked on today's performance-enhancing patch, entitled "intel: Use a CPU map of the batch on LLC-sharing architectures." The commit description explains, "Before, we were keeping a CPU-only buffer to accumulate the batchbuffer in, which was an improvement over mapping the batch through the GTT directly (since any readback or other failure to stream through write combining correctly would hurt). However, on LLC-sharing architectures we can do better by mapping the batch directly, which reduces the cache footprint of the application since we no longer have this extra copy of a batchbuffer around."
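To make the difference described in the commit message concrete, here is a minimal sketch in C of the two strategies. It is not Mesa's code: the names `fake_bo`, `batch`, `batch_init`, `batch_emit`, `batch_flush`, and `batch_free`, the `has_llc` flag, and the `BATCH_DWORDS` size are all hypothetical, and plain `calloc` memory stands in for the mapping a real driver would obtain from the kernel. Without LLC sharing, commands accumulate in a CPU-only shadow buffer and are copied into the buffer object at flush time; with LLC sharing, commands are written straight into the mapping, so the shadow copy and the flush-time copy both disappear.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define BATCH_DWORDS 1024 /* hypothetical size, not Mesa's actual value */

/* Stand-in for a GPU buffer object. A real driver would get this pointer
 * from a kernel mapping call; here it is ordinary heap memory. */
struct fake_bo {
    uint32_t *map;
};

struct batch {
    struct fake_bo bo;
    uint32_t *cpu_shadow; /* NULL when writing directly into the mapping */
    uint32_t *write_ptr;  /* where emitted commands land */
    size_t used;          /* number of dwords emitted so far */
};

/* has_llc is an assumed flag saying whether CPU and GPU share the
 * last-level cache. Returns 0 on success, -1 on allocation failure. */
static int batch_init(struct batch *b, int has_llc)
{
    b->bo.map = calloc(BATCH_DWORDS, sizeof(uint32_t));
    if (!b->bo.map)
        return -1;

    b->cpu_shadow = NULL;
    if (!has_llc) {
        /* Old path: keep an extra CPU-only copy of the batch. */
        b->cpu_shadow = calloc(BATCH_DWORDS, sizeof(uint32_t));
        if (!b->cpu_shadow) {
            free(b->bo.map);
            return -1;
        }
    }

    b->write_ptr = has_llc ? b->bo.map : b->cpu_shadow;
    b->used = 0;
    return 0;
}

/* Append one command dword to the batch. */
static void batch_emit(struct batch *b, uint32_t dword)
{
    assert(b->used < BATCH_DWORDS);
    b->write_ptr[b->used++] = dword;
}

/* Finish the batch. Returns the number of bytes that had to be copied
 * into the buffer object: zero on the direct-mapping path. */
static size_t batch_flush(struct batch *b)
{
    size_t copied = 0;

    if (b->cpu_shadow) {
        copied = b->used * sizeof(uint32_t);
        memcpy(b->bo.map, b->cpu_shadow, copied);
    }
    b->used = 0;
    return copied;
}

static void batch_free(struct batch *b)
{
    free(b->cpu_shadow); /* free(NULL) is a no-op */
    free(b->bo.map);
}
```

In this sketch, a batch initialised with `has_llc` set allocates half as much memory and `batch_flush` copies nothing, which mirrors the reduced cache footprint the commit message attributes to dropping the extra copy.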
Eric went on to note that this patch, which changes fewer than three dozen lines of code, improves the performance of GLBenchmark, Lightsmark, and Cairo-GL by a couple of percent when running on Ivy Bridge hardware. However, the change won't have much effect on Intel's Mesa OpenGL performance overall.