As a few readers in the Phoronix Forums had speculated, the Radeon GPU performance increase appears to come down to a CPUfreq change, specifically this Git commit:
cpufreq: ondemand: Change the calculation of target frequency
The ondemand governor calculates load in terms of frequency and increases it only if load_freq is greater than up_threshold multiplied by the current or average frequency. This appears to produce oscillations of frequency between min and max because, for example, a relatively small load can easily saturate minimum frequency and lead the CPU to the max. Then, it will decrease back to the min due to small load_freq.
Change the calculation method of load and target frequency on the basis of the following two observations:
- Load computation should not depend on the current or average measured frequency. For example, absolute load of 80% at 100MHz is not necessarily equivalent to 8% at 1000MHz in the next sampling interval.
- It should be possible to increase the target frequency to any value present in the frequency table proportional to the absolute load, rather than to the max only, so that:
Target frequency = C * load where we take C = policy->cpuinfo.max_freq / 100.
Tested on Intel i7-3770 CPU @ 3.40GHz and on Quad core 1500MHz Krait. Phoronix benchmark of Linux Kernel Compilation 3.1 test shows an increase ~1.5% in performance. cpufreq_stats (time_in_state) shows that middle frequencies are used more, with this patch. Highest and lowest frequencies were used less by ~9%.
[rjw: We have run multiple other tests on kernels with this change applied and in the vast majority of cases it turns out that the resulting performance improvement also leads to reduced consumption of energy. The change is additionally justified by the overall simplification of the code in question.]
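For anyone wanting to look at the same kind of frequency-residency data the commit message cites, cpufreq_stats exposes a time_in_state file per CPU (typically at /sys/devices/system/cpu/cpu0/cpufreq/stats/time_in_state), with one "frequency time" pair per line. Below is a minimal Python sketch that turns that text into a percentage share per frequency. The function name and the sample numbers are made up for illustration and are not from the patch or its testing.

```python
def residency_percent(time_in_state_text):
    """Map each frequency (kHz) to its share of total recorded time, in percent."""
    times = {}
    for line in time_in_state_text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or unexpected lines
        freq_khz, ticks = int(fields[0]), int(fields[1])
        times[freq_khz] = ticks
    total = sum(times.values())
    if total == 0:
        return {freq: 0.0 for freq in times}
    return {freq: 100.0 * t / total for freq, t in times.items()}


# Hypothetical sample in the "<frequency in kHz> <time>" layout; not real measurements.
SAMPLE = """3400000 2000
2400000 5000
1600000 3000
"""

shares = residency_percent(SAMPLE)
for freq_khz in sorted(shares, reverse=True):
    print(f"{freq_khz} kHz: {shares[freq_khz]:.1f}%")
```

Running this before and after a kernel change, under the same workload, would be one way to see whether the middle frequencies are being used more often.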
The Git commit changes how the CPUfreq ondemand governor calculates load. Load computation no longer depends on the current or average measured frequency, and the governor can now select any target frequency proportional to the load rather than only jumping to the maximum frequency. Interestingly, the developer behind this commit, Stratos Karafotis, used the Phoronix Test Suite to verify the performance changes while working on the patch. However, his testing used the build-linux-kernel test profile, where he found only a ~1.5% performance difference. There was no mention of any graphics testing during this work. The commit message also notes that in the vast majority of cases tested, the performance improvement came with reduced energy consumption.
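The proportional rule quoted in the commit message (target frequency = C * load, with C = max_freq / 100) can be sketched in a few lines of Python. This is only an illustration, not the kernel code: the function names and the frequency table are hypothetical, and picking the lowest table entry at or above the target is an assumption about how the target maps onto real frequency steps. The governor's thresholds and sampling logic are left out entirely.

```python
import bisect


def target_freq_khz(load_percent, max_freq_khz):
    """Target frequency = C * load, with C = max_freq / 100 (per the commit message)."""
    return load_percent * max_freq_khz // 100


def pick_table_freq(target_khz, freq_table_khz):
    """Assumption: use the lowest table entry at or above the target, else the highest."""
    table = sorted(freq_table_khz)
    i = bisect.bisect_left(table, target_khz)
    return table[min(i, len(table) - 1)]


# Hypothetical frequency table in kHz; real tables depend on the CPU and driver.
FREQ_TABLE_KHZ = [1600000, 2000000, 2400000, 2800000, 3400000]

for load in (10, 50, 80, 100):
    target = target_freq_khz(load, max(FREQ_TABLE_KHZ))
    chosen = pick_table_freq(target, FREQ_TABLE_KHZ)
    print(f"load {load}% -> target {target} kHz -> table entry {chosen} kHz")
```

With this rule a 50% load lands on a middle frequency instead of being pushed straight to the maximum, which fits the commit message's observation that middle frequencies are used more with the patch.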
It's somewhat interesting that the Radeon performance changes were due to the CPUfreq ondemand governor changes, while the other Phoronix.com testing of the Linux 3.12 kernel so far hasn't revealed anything else too exciting: besides some file-system performance changes, the Ivy Bridge and Haswell graphics performance has been the same, and in the CPU-bound tests we haven't found anything to get excited about. I haven't yet done any Nouveau comparison on the Linux 3.12 kernel, but that testing is now warranted and the results will be delivered on Phoronix shortly. Since this change is not Radeon-specific, the AMD Catalyst (and NVIDIA) drivers could also benefit from this CPU governor change, so next on my testing platter is running some new open vs. closed-source driver benchmarks in this configuration.