That still relies upon Steam, where the game could be updated without you knowing and is not guaranteed to be at the same version as everyone else's.
True, and I agree with some of your points. Steam is a pain to move across machines, even though you can just copy the steamapps folder and Steam is clever enough to figure out the minimal download required. Still, just for a benchmark, that's a huge amount of data to be moving around.
Regarding auto-updating: it does defeat a stable baseline, since you can't compare against month-old results if a patch has landed in between. However, it's the same story when you upgrade your driver, Xorg, kernel, etc. It really means you have to treat the game as a moving target and make it part of what you are benchmarking. Updates will significantly reduce the comparability of results over long periods, if any comparability remains at all, but the updated build is what players are actually playing at the time you publish those articles.
If a game is painfully slow after an auto-update, it's a good thing if benchmarking reveals the regression. Or if a game becomes faster, it's good to know that things are improving. If L4D2 version 219 is significantly slower on kernel 3.12-rc1 with Nvidia driver 320.xx than version 210 was on kernel 3.10 with Nvidia 319, that's obviously something the game company, the chip company, or the kernel devs could pay attention to. If we skip benchmarking L4D2 just because it auto-updates, these problems might slip by unnoticed. Whether they pay attention is their problem, but it's always good if someone is raising the issue or even tracking it.
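To make the "moving target" idea concrete, here is a rough sketch of how each result could carry the game version next to the driver and kernel, so a slowdown is always reported together with everything that changed. The `Run` record, the function names, and the FPS numbers are all made up for illustration; only the version strings come from the example above.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Run:
    """One benchmark result plus the software stack it ran on (hypothetical record)."""
    game_version: str
    driver: str
    kernel: str
    avg_fps: float


def changed_components(old: Run, new: Run) -> list:
    """List which parts of the stack differ between two runs."""
    keys = ("game_version", "driver", "kernel")
    a, b = asdict(old), asdict(new)
    return [k for k in keys if a[k] != b[k]]


def relative_change(old: Run, new: Run) -> float:
    """Fractional change in average FPS from old to new (negative means slower)."""
    return (new.avg_fps - old.avg_fps) / old.avg_fps


# The FPS values below are invented placeholders, not measurements.
old = Run(game_version="210", driver="319", kernel="3.10", avg_fps=100.0)
new = Run(game_version="219", driver="320.xx", kernel="3.12-rc1", avg_fps=80.0)

print("changed:", changed_components(old, new))
print("fps change: {:+.1%}".format(relative_change(old, new)))
```

When all three components change at once, as they do here, the numbers can't say which one is at fault, but they do show that something regressed and which configurations to bisect between.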
Btw, there are ways to tell the version. For some games you need to check .ini files, and others might have in-game commands. I agree 100% that if the version check is in-game, automation becomes very hard to nearly impossible (unless there are command-line options for it, which might be the case).
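For the .ini case, something like the following would probably be enough. It's only a sketch: the key name `Version` and the sample contents are assumptions, since every game names and lays out its files differently.

```python
def read_version_from_text(text, key="Version"):
    """Return the value of `key` from simple key=value text, or None if missing.

    The key name is an assumption; adjust it per game.
    """
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and [section] headers.
        if not line or line.startswith(("#", ";", "[")):
            continue
        name, sep, value = line.partition("=")
        if sep and name.strip().lower() == key.lower():
            return value.strip()
    return None


# Hypothetical file contents, for illustration only.
sample = "[game]\n; build info\nName=Example\nVersion=219\n"
print(read_version_from_text(sample))
```

In a real run you would read the text from the game's own file and store the result with the benchmark numbers.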
I still think it would be informative to provide Steam game benchmarks even if they do not form a stable baseline over time. They would nevertheless reflect the state of the art in the gaming world, and help readers, especially Windows users considering trying out Linux gaming, feel more connected.
This is just my two cents, and thank you for your continued Linux benchmarking effort. As the only website I know of that is pushing Linux benchmarks, you definitely face an uphill battle getting the tools and workflow together.