A number of commits pertaining to the i965 driver's fragment shader back-end (i965/fs) were pushed this afternoon. Most notably, they should help Ivy Bridge with KWin when using the scaling-related effects with the OpenGL 2.x renderer.
The bug report for this Intel IVB OpenGL performance regression is on FreeDesktop.org. There are several commits relevant to this optimization work. One of the commits by Eric Anholt also mentions a significant performance improvement with one of his GL Shading Language demos.
"Like we have done for the VS and for constant-index uniform loads, we use the sampler engine to get caching in front of the L3 to avoid tickling the IVB L3 bug. This is also a bit of a functional change, as we're now loading a vec4 instead of a single dword, though we're not taking advantage of the other 3 components of the vec4 (yet)."
It also looks like the newly merged patch series takes care of some Sandy Bridge performance issues too.
"With the driver hacked to always take the varying-index path for all uniforms, [this] improves performance of my old GLSL demo by 315% +/- 2% (n=4). This [is] a major fix for some blur shaders in compositors from the varying-index uniforms support I introduced in 9.1."
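To make the "vec4 instead of a single dword" remark concrete, here is a toy Python model of the two kinds of load. It is only a sketch: the function names are hypothetical, and the addressing scheme (four consecutive values starting at the requested index, zero-padded at the end of the array) is an assumption for illustration, not something taken from the driver source. It says nothing about the caching behaviour, which is where the actual speed-up comes from.

```python
# Toy model only. Names and addressing are illustrative assumptions,
# not the i965 driver's actual code.

def load_dword(uniforms, index):
    """Single-dword access: fetch exactly one value at a per-pixel index."""
    return uniforms[index]


def load_vec4(uniforms, index):
    """Vec4 access: fetch four consecutive values starting at index.

    Assumes 0 <= index < len(uniforms). Reads past the end of the array
    are padded with 0.0 in this model.
    """
    block = uniforms[index:index + 4]
    return block + [0.0] * (4 - len(block))


def load_via_vec4(uniforms, index):
    """Use only the first component; the other three are fetched but unused."""
    return load_vec4(uniforms, index)[0]


# Example: blur weights indexed by a value that varies per pixel.
weights = [0.05, 0.10, 0.20, 0.30, 0.20, 0.10, 0.05]

for tap in range(len(weights)):
    # Both paths must return the same weight for every index.
    assert load_via_vec4(weights, tap) == load_dword(weights, tap)
```

In this model the result seen by the shader is unchanged; the only difference is that each fetch brings along three extra components, which is the part the commit message says is not yet exploited.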