Linus Torvalds Hits Nasty Performance Regression With Early Linux 6.8 Code


  • #11
    Well, it lets you take a couple sips of coffee between all the merging, so I’d say it’s necessary.

    Comment


    • #12
      This reminds me of the massive work done by Ingo Molnar to refactor the headers within the kernel to speed up compilation. Is that still being worked on, or was it dropped? I can't seem to find any new information on this subject.

      Comment


      • #13
        Originally posted by CommunityMember View Post

        Linus eats his own dog food (as he makes it). He merged various patches, compiled the kernel, and then booted into the new kernel to continue his work. His future compiles are slower. He then bisects the patches he merged to identify a likely suspect.

        While compiling a kernel does not exercise all possible kernel code paths (and could always be an anomaly), and additional performance testing is necessary (there is a set of systems doing just that), moving from 22 seconds to 44 seconds for an empty build was significantly noticeable, and Linus was not pleased.
        A Linux kernel build is a very useful benchmark for system performance. And it is also a hardware stability test. I call a new machine stable after it builds the kernel.
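
        For anyone who hasn't done that dance: the bisect step is mostly mechanical, even if each round is slow. A rough sketch, not Linus's actual workflow; the refs are assumptions, and since the slowdown is in the running kernel rather than in the tree being compiled, every step needs a reboot into the candidate kernel before timing anything:

        Code:
        # rough sketch; the ~22 s / ~44 s numbers are from the post above
        git bisect start
        git bisect bad                  # the freshly merged kernel that feels slow
        git bisect good v6.7            # last known-good kernel (assumption)
        # at each step: build, install and reboot into the candidate kernel, then:
        make -s -j"$(nproc)"            # warm build of an already-configured tree
        time make -s -j"$(nproc)"       # "empty" rebuild; ~22 s good, ~44 s bad here
        git bisect good                 # or "git bisect bad", depending on the timing
        # repeat until git bisect names the first bad commit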

        Comment


        • #14
          So, this is the culprit patch.

          effective_cpu_util() interface changes and now returns the actual utilization of the CPU with 2 optional inputs:

          - The minimum performance for this CPU; typically the capacity to handle the deadline task and the interrupt pressure. But also uclamp_min request when available.

          - The maximum targeting performance for this CPU which reflects the maximum level that we would like to not exceed. By default it will be the CPU capacity but can be reduced because of some performance hints set with uclamp. The value can be lower than actual utilization and/or min performance level.

          That smells fishy.

          The only Zen 2 I have access to is a 4500U, and on that machine, which uses acpi-cpufreq on kernel 6.6.9, cpuinfo_max_freq is completely bogus:

          Code:
          sudo grep . cpu4/cpufreq/*freq
          cpu4/cpufreq/cpuinfo_cur_freq:2375000
          cpu4/cpufreq/cpuinfo_max_freq:2375000
          cpu4/cpufreq/cpuinfo_min_freq:1400000
          cpu4/cpufreq/scaling_cur_freq:3992447
          cpu4/cpufreq/scaling_max_freq:2375000
          cpu4/cpufreq/scaling_min_freq:1400000
          Also, "cpuinfo_cur_freq" requires root privileges to read, but the one that actually shows the real frequency (and might expose a power side channel) is scaling_cur_freq.

          IMO this is a combination of Arm people ignoring PC hardware and probably not dogfooding, and AMD being asleep at the console for years.

          Comment


          • #15
            Originally posted by NeoMorpheus View Post
            I bet you good money that this code only affects Ryzens and works marvelously on Intels….

            Backtracking the bad code leads us to……😈
            Oh no, the smell of intel fanboys.

            Comment


            • #16
              Off-topic, but it still bugs me that Intel is the reason ECC RAM is still not the norm for consumer PCs. We urgently need it, given growing kernel complexity, software RAM requirements, and debugging. To put it in the words of Torvalds:

              "We have decades of odd random kernel oopses that could never be explained and were likely due to bad memory. And if it causes a kernel oops, I can guarantee that there are several orders of magnitude more cases where it just caused a bit-flip that just never ended up being so critical."

              Comment


              • #17
                So sad that regressions like this are discovered not by automated testing, but by Linus himself while compiling.

                Comment


                • #18
                  Originally posted by uxmkt View Post
                  When you do this a thousand times over (which is what workstations and servers normally are used for!), it quickly adds up.
                  Only Linus can answer that. But since he hasn't upgraded his system yet, he most likely thinks it is not worth it yet.

                  Comment


                  • #19
                    Originally posted by Hans Bull View Post
                    So sad that regressions like this are discovered not by automated testing, but by Linus himself while compiling.
                    How would you write a test case that measures total compilation time, considering the billion ways to build a Linux kernel?
                    And if you're thinking of testing just the functionality that was changed, that may show only a slight overhead you wouldn't normally notice.

                    I'm all for automated testing, but this looks like a case you would expect to only catch when testing with a human at the helm.
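
                    That said, on a farm with a pinned machine and a pinned config, booted into the kernel under test, even a crude null-build check would have flagged a 22 s to 44 s jump. A naive sketch; the baseline and the 25% threshold are made up for illustration:

                    Code:
                    # naive build-time regression check; run while booted into the kernel under test
                    baseline=22                          # seconds, recorded earlier on this machine/config
                    make -s -j"$(nproc)" >/dev/null      # warm the tree so the timed build is a no-op
                    start=$(date +%s)
                    make -s -j"$(nproc)" >/dev/null
                    elapsed=$(( $(date +%s) - start ))
                    echo "null build: ${elapsed}s (baseline ${baseline}s)"
                    if [ "$elapsed" -gt $(( baseline * 125 / 100 )) ]; then
                        echo "possible regression: null build more than 25% over baseline" >&2
                        exit 1
                    fi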

                    Comment


                    • #20
                      Originally posted by stormcrow View Post

                      ...led Linus to 4 commits from Linaro, which happens to be an Arm hardware group. (Read the whole article :P ) It has nothing to do with AMD, so it shouldn't have affected any hardware but Arm to begin with. If it had screwed with performance on POWER9 CPUs it would still have had to be reverted, because someone forgot to check the CPUIDs.
                      It was intended as a joke and a jab at Intel's old dirty practices. :-)

                      Comment
