In one commit that affects several hundred lines of code, the command submission and resource space checking were reworked. This rework eliminates some CPU overhead, and some OpenGL software runs measurably faster as a result, such as TORCS, where performance goes up by 10%. Right now this commit is for the Radeon R300g driver, which supports the R300/R400/R500 ASICs (up through the Radeon X1000 series), but Marek Olšák, the patch's author, says it could be ported to the R600g driver too.
In another commit today, Marek delivers a further performance boost by ensuring that the same user buffer is not uploaded to the GPU multiple times.
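To illustrate the general idea of skipping redundant user-buffer uploads, here is a minimal sketch in C. It is not the code from Marek's commit: the `upload_cache` structure, the `upload_user_buffer` function, and the simulated GPU offsets are all hypothetical names invented for this example, and the sketch assumes the buffer contents do not change between calls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cache remembering the last user buffer copied to the GPU. */
struct upload_cache {
    const void *user_ptr;     /* CPU address of the last uploaded buffer */
    size_t      size;         /* size of that buffer in bytes */
    uint32_t    gpu_offset;   /* simulated location of the GPU copy */
    uint32_t    next_offset;  /* next free simulated GPU offset */
    int         valid;        /* non-zero once something has been uploaded */
    unsigned    upload_count; /* number of real uploads, for illustration */
};

/*
 * Return the GPU offset for a user buffer, uploading it only if it is not
 * the buffer that was uploaded last time. A real driver would also have to
 * invalidate the cache whenever the application may have modified the data;
 * this sketch assumes the contents are unchanged.
 */
static uint32_t upload_user_buffer(struct upload_cache *c,
                                   const void *ptr, size_t size)
{
    assert(c != NULL && ptr != NULL && size > 0);

    /* Same pointer and size as last time: reuse the existing GPU copy. */
    if (c->valid && c->user_ptr == ptr && c->size == size)
        return c->gpu_offset;

    /* Otherwise perform the (simulated) upload and remember it. */
    c->user_ptr = ptr;
    c->size = size;
    c->gpu_offset = c->next_offset;
    c->next_offset += (uint32_t)size;
    c->valid = 1;
    c->upload_count++;
    return c->gpu_offset;
}
```

Binding the same vertex array for two consecutive draws would then trigger only one upload in this sketch, which is the kind of CPU and bus work such a check is meant to save.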
While not yet merged, Christian König is also working on some unrelated optimizations. For his XvMC state tracker on R600g, he has been optimizing the shaders used for the iDCT and MC code. This work takes the form of TGSI code optimizations as well as improvements to the generation of R600-specific hardware code from TGSI. The optimizations include removing the temporary register usage from most instructions, special constants, TEX and VTX joining, reworked swizzle code, fully implemented barrier handling, reworked literal handling, and register remapping. This chunk of work by Christian has resulted in a 25% performance boost for 720 x 480p video playback and 5 to 7% for 1080i/1080p video playback. You can read about this work on the Mesa mailing list.