Performance regression in kernel 3.12
I noticed a performance regression for a raytracer between Fedora 20 kernel 3.11.10 and 3.12.6. On a Haswell system, for example, radiance is about 7% slower with the newer kernel. Note that this is an identical binary and identical glibc - I just have to boot the older kernel to get a speedup. The workload is 100% CPU-bound user-level code with almost no I/O or other system calls.
A more complete description is on Stack Overflow: http://stackoverflow.com/questions/2...kernel-version
Has anyone seen something like this in other distributions or with different programs? Any idea about the cause? I've looked at the overviews of the kernel changes, but nothing seemed relevant.
IIRC the Intel CPU frequency governor changed; check that both your kernels are using the same settings in that area.
If they are, then it's bisecting time for you
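Checking the governor settings the post above refers to can be sketched like this (the sysfs paths are what the acpi-cpufreq and intel_pstate drivers expose; they may be absent inside a VM or container):

```shell
# List the active cpufreq governor on each CPU.
governors_found=0
for f in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do
    if [ -r "$f" ]; then
        echo "$f: $(cat "$f")"
        governors_found=$((governors_found + 1))
    fi
done
echo "governors listed: $governors_found"
# To pin every CPU to the "performance" governor (as root):
#   for f in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do
#       echo performance > "$f"
#   done
```

Run this under both kernels; if the governors differ, that is a cheaper explanation to rule out than a bisect.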
I was using the default ondemand governor. Will try performance to see if it changes anything.
The perf output showed almost identical numbers of context switches and effective frequency (max turbo on both systems), so I have little hope it will have any effect.
Haven't tried bisecting yet - actually, the last time I compiled the kernel myself must be 15 - 20 years ago. I think I'll leave that one for others :-)
Tested it on the AMD system: Changing the cpufreq governor from "ondemand" to "performance" does not have any effect - same run time.
Since you asked: the most reliable way to find out what happened is bisecting.
But that can take a long time. You can probably save many reboots by first testing whether 3.12-rc1 was already bad.
I'd also test 3.13-rc7 to see whether it's maybe already been fixed again.
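For anyone unfamiliar with the mechanics, here is git bisect demonstrated on a throwaway repository (a real kernel bisect works the same way, except that judging each revision means a kernel build, install, and reboot, so "git bisect run" would wrap your benchmark script instead of a grep):

```shell
# Build a toy repo of ten commits; the "regression" lands at commit 7.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.email you@example.com
git config user.name you
for i in $(seq 1 10); do
    if [ "$i" -lt 7 ]; then echo fast > speed; else echo slow > speed; fi
    git add speed
    git commit -qm "commit $i"
done
# Mark the endpoints (HEAD is bad, the first commit is good) and let
# "git bisect run" drive the search: the given command must exit 0 for
# a good revision and non-zero for a bad one.
git bisect start HEAD "$(git rev-list --max-parents=0 HEAD)"
first_bad=$(git bisect run sh -c 'grep -q fast speed' | grep 'is the first bad commit')
git bisect reset >/dev/null 2>&1
echo "$first_bad"
```

With N commits in the range, bisecting needs roughly log2(N) build-and-test steps, which is why narrowing the range with 3.12-rc1 first pays off.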
This has now been resolved on Stack Overflow; see http://stackoverflow.com/questions/2...kernel-version
In short (and hoping that I get this right): a bug in the "soft dirty" page flag behavior in 3.12 led to transparent huge pages not being used, which led to more TLB misses, which explains the performance hit.
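For reference, here is one way to check whether transparent huge pages are actually backing a workload, and to confirm the TLB-miss theory with perf. The benchmark invocation and $PID are placeholders, not taken from the thread:

```shell
# System-wide anonymous huge page usage (kB); this drops toward zero
# when THP stops being used, as with the 3.12 soft-dirty bug.
thp_line=$(grep AnonHugePages /proc/meminfo || echo "AnonHugePages: 0 kB")
echo "$thp_line"
# Per-process total, summed over all mappings:
#   awk '/AnonHugePages/ {sum += $2} END {print sum " kB"}' /proc/$PID/smaps
# Measure TLB misses directly (event names vary; see `perf list`):
#   perf stat -e dTLB-load-misses,dTLB-loads ./radiance scene.oct
```

Comparing the dTLB-load-misses counts between the two kernels would have pointed at the cause without a bisect.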