Well, it lets you take a couple sips of coffee between all the merging, so I’d say it’s necessary.
Linus Torvalds Hits Nasty Performance Regression With Early Linux 6.8 Code
-
Originally posted by CommunityMember:
Linus eats his own dog food (as he makes it). He merged various patches, compiled the kernel, and then booted into the new kernel to continue his work. His subsequent compiles were slower, so he bisected the patches he had merged to identify a likely suspect.
While compiling a kernel does not exercise all possible kernel code paths (and any single result could be an anomaly), and additional performance testing is needed (there is a set of systems doing just that), going from 22 seconds to 44 seconds for an empty build was impossible to miss, and Linus was not pleased.
-
So, this is the culprit patch.
effective_cpu_util() interface changes and now returns the actual
utilization of the CPU with 2 optional inputs:
- The minimum performance for this CPU; typically the capacity to handle
the deadline task and the interrupt pressure. But also uclamp_min
request when available.
- The maximum targeting performance for this CPU which reflects the
maximum level that we would like to not exceed. By default it will be
the CPU capacity but can be reduced because of some performance hints
set with uclamp. The value can be lower than actual utilization and/or
min performance level.
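To make those two bounds concrete, here is a minimal C sketch of that floor/ceiling clamping. Purely illustrative: sugov_pick_util() and all the values are made up, and this is not the kernel's actual implementation.
Code:
/*
 * Illustrative sketch only -- not the kernel's code. It mimics the
 * semantics the commit message describes: take the actual utilization,
 * raise it to the minimum performance floor (deadline tasks, irq
 * pressure, uclamp_min), then clamp it to the maximum performance
 * ceiling (uclamp), which may sit below both of the other values.
 */
#include <stdio.h>

static unsigned long sugov_pick_util(unsigned long actual,
                                     unsigned long min, unsigned long max)
{
	unsigned long util = actual > min ? actual : min; /* apply the floor */

	return util < max ? util : max; /* apply the ceiling */
}

int main(void)
{
	/* A CPU that is 700/1024 busy with a floor of 128 and a uclamp
	 * ceiling of 512: the ceiling wins despite higher actual load. */
	printf("%lu\n", sugov_pick_util(700, 128, 512)); /* prints 512 */
	return 0;
}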
The only Zen 2 I have access to is a 4500U, and on that machine, which uses acpi-cpufreq on kernel 6.6.9, cpuinfo_max_freq is completely bogus:
Code:
sudo grep . cpu4/cpufreq/*freq
cpu4/cpufreq/cpuinfo_cur_freq:2375000
cpu4/cpufreq/cpuinfo_max_freq:2375000
cpu4/cpufreq/cpuinfo_min_freq:1400000
cpu4/cpufreq/scaling_cur_freq:3992447
cpu4/cpufreq/scaling_max_freq:2375000
cpu4/cpufreq/scaling_min_freq:1400000
IMO this is a combination of Arm people ignoring PC hardware and probably not dogfooding, and AMD being asleep at the console for years.
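For anyone who wants to reproduce that check, here is a small illustrative C helper. The sysfs file names are real cpufreq entries; everything else, including the hardcoded cpu4 path, is just example code.
Code:
/*
 * Illustrative sketch: read one CPU's cpufreq sysfs files and warn
 * when scaling_cur_freq exceeds the advertised cpuinfo_max_freq,
 * as seen on the 4500U above.
 */
#include <stdio.h>
#include <stdlib.h>

static long read_khz(const char *path)
{
	FILE *f = fopen(path, "r");
	long khz = -1;

	if (f) {
		if (fscanf(f, "%ld", &khz) != 1)
			khz = -1;
		fclose(f);
	}
	return khz;
}

int main(void)
{
	const char *base = "/sys/devices/system/cpu/cpu4/cpufreq";
	char path[256];
	long cur, max;

	snprintf(path, sizeof(path), "%s/scaling_cur_freq", base);
	cur = read_khz(path);
	snprintf(path, sizeof(path), "%s/cpuinfo_max_freq", base);
	max = read_khz(path);

	if (cur < 0 || max < 0)
		return EXIT_FAILURE;

	if (cur > max)
		printf("bogus: cur %ld kHz > advertised max %ld kHz\n", cur, max);
	else
		printf("ok: cur %ld kHz <= max %ld kHz\n", cur, max);
	return EXIT_SUCCESS;
}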
-
Off-topic, but it still bugs me that Intel is the reason ECC RAM is still not the norm for consumer PCs. We urgently need it given growing kernel complexity, software RAM requirements, and debugging needs. To put it in the words of Torvalds:
"We have decades of odd random kernel oopses that could never be explained and were likely due to bad memory. And if it causes a kernel oops, I can guarantee that there are several orders of magnitude more cases where it just caused a bit-flip that just never ended up being so critical."
-
Originally posted by uxmkt:
When you do this a thousand times over (which is what workstations and servers are normally used for!), it quickly adds up.
-
Originally posted by Hans Bull:
So sad that regressions like this are not discovered by automated testing, but by Linus compiling the kernel himself.
And if you only test the functionality that was changed, the regression may show up as just a slight overhead that you wouldn't normally notice.
I'm all for automated testing, but this looks like the kind of case you would expect to catch only with a human at the helm.
-
Originally posted by stormcrow:
...led Linus to 4 commits from Linaro, which happens to be an Arm hardware group. (Read the whole article :P ) It has nothing to do with AMD, so it shouldn't have affected any hardware but Arm to begin with. If it had screwed with performance on POWER9 CPUs, it would still have had to be reverted because someone forgot to check the CPUIDs.