On the mailing list, Chris has supplied a patch entitled "glx: Cache indirect opcode->index conversion." He starts out by saying, "Decoding the opcode into the appropriate index into the dispatch tables is quite expensive using the radix tree. By keeping a small cache, we can dramatically speed up indirect function dispatch."
Adding a cache for GLX opcode-to-index conversion may not sound exciting to end-users, but what he says next is. "World of Padman over the network increased from 28fps to 45fps, with an almost identical increase when run indirectly over a local socket." Yes, that's right: roughly a 60% increase in frame-rate (28 to 45 FPS is a gain of about 61%) from this simple patch. With the Intel driver and other un-optimized graphics drivers, or on low-power GPUs, this could mean the difference between a playable gaming experience and an unplayable one, as Chris' test demonstrates with the frame-rate going from below 30 FPS to 45 FPS.
This patch is in the common GLX code, so it is not specific to any one driver. However, to benefit from it you must be using indirect rendering.
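For readers curious what caching an opcode-to-index conversion looks like in general, here is a minimal sketch in C. It is not the code from Chris' patch: the cache size, the direct-mapped layout, the function names, and the stand-in "slow" lookup are all hypothetical, chosen only to illustrate the idea of putting a small cache in front of an expensive decode.

```c
#include <assert.h>

/* Hypothetical cache size; a power of two so a bit mask selects the slot. */
#define CACHE_SIZE 8

struct cache_entry {
    int opcode;   /* opcode stored in this slot, or -1 if the slot is empty */
    int index;    /* dispatch-table index previously computed for it */
};

static struct cache_entry cache[CACHE_SIZE];
static unsigned slow_calls;   /* counts how often the slow path is taken */

/* Stand-in for the expensive radix-tree decode. The mapping is made up. */
static int slow_opcode_to_index(int opcode)
{
    slow_calls++;
    return opcode * 2 + 1;
}

/* Mark every slot as empty before first use. */
static void cache_init(void)
{
    for (int i = 0; i < CACHE_SIZE; i++)
        cache[i].opcode = -1;
    slow_calls = 0;
}

/* Direct-mapped lookup: reuse the stored index on a hit, and fall back
 * to the slow decode (overwriting the slot) on a miss. */
static int cached_opcode_to_index(int opcode)
{
    assert(opcode >= 0);   /* -1 is reserved as the empty-slot marker */

    struct cache_entry *e = &cache[opcode & (CACHE_SIZE - 1)];
    if (e->opcode != opcode) {
        e->index = slow_opcode_to_index(opcode);
        e->opcode = opcode;
    }
    return e->index;
}
```

After calling cache_init(), repeated calls to cached_opcode_to_index() with the same opcode take the slow path only once. A game tends to issue the same handful of GL calls every frame, which is why even a tiny cache like this can pay off.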