Something I would like to see
It would be nice if, instead of just saying for example "xfce is faster than gnome", you said "xfce is faster than gnome because gnome uses heavier gtk effects and more RAM-hungry processes" (just as an example). Also, when talking about distros, I hear a lot of "A is faster than B" but little of "A is faster than B because A is using version X of package Y, which is better optimized or didn't have a regression." I mean, if the objective is to make Linux better, numbers alone are not enough; I think the explanation behind the numbers is more important.
About using defaults in tests, I would say "it depends", as some others said:
1-For driver testing: make the input and output the same, and if there are any important configuration options (xorg tweaks), test them if you have time for it; if not, use the defaults for that part (no xorg tweaks).
2-For distro testing: leaving everything at default is perfect, but as I said, it would be nicer if you said distro A is faster than B because of this configuration, so that the users/packagers of the slower distro can fix it.
3-For compiler testing: you should test the compilers with the recommended settings, not only the defaults. If llvm-gcc is designed to work with -O3, then use it with -O3; use the compilers as a real user or developer would use them.
4-For hardware reviews (this is the hardest): you should start by using the default options for everything, but take care that the input and output of everything are the same (in the case of games). Also, different distros and configurations should be used.
Imagine what happens if you compare 2 different cards and, by chance, you use a distro or package which has a BIG regression on 1 card. I think you should test hardware across at least 3 totally different distributions, like Fedora, Ubuntu, and Slackware.
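
To make the "at least 3 distros" point concrete, here is a rough Python sketch of how a one-distro regression could be spotted. Everything in it is my own assumption for illustration: the function name, the 20% tolerance, and especially the numbers, which are made up and not real benchmark results. The idea is just to compare the card A / card B ratio on each distro against the median ratio; with 3 distros, one bad result cannot drag the median with it.

```python
from statistics import median


def flag_suspect_distros(results, card_a, card_b, tolerance=0.20):
    """Return (typical_ratio, ratios, suspects).

    results maps distro name -> {card name -> score}, higher is better.
    A distro is a suspect when its card_a/card_b ratio differs from the
    median ratio by more than `tolerance` (a hypothetical threshold).
    """
    ratios = {
        distro: scores[card_a] / scores[card_b]
        for distro, scores in results.items()
    }
    typical = median(ratios.values())
    suspects = sorted(
        distro
        for distro, ratio in ratios.items()
        if abs(ratio - typical) / typical > tolerance
    )
    return typical, ratios, suspects


# Made-up scores, only to show the shape of the data.
example_results = {
    "fedora": {"card_a": 100.0, "card_b": 80.0},
    "ubuntu": {"card_a": 102.0, "card_b": 82.0},
    "slackware": {"card_a": 98.0, "card_b": 40.0},  # pretend regression on card_b
}

typical, ratios, suspects = flag_suspect_distros(
    example_results, "card_a", "card_b"
)
print("typical A/B ratio:", typical)
print("distros worth a closer look:", suspects)
```

A flagged distro does not tell you which card is "really" faster. It only tells you where to start looking for the package or configuration that explains the number, which is the part I would like to see in the articles.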