Cleaning Up A Mess: Linux 6.9 Likely To Land Rework Of x86 CPU Topology Code

Written by Michael Larabel in Hardware on 16 February 2024 at 12:38 PM EST.
Longtime Linux kernel developer Thomas Gleixner of Intel-owned Linutronix has spent much of the past several months reworking the Linux kernel's x86 CPU topology evaluation code. The aim is to clean up a mess of aging kernel code, parts of which are incorrect in today's era of hybrid Intel Core processors: these mix P and E cores, and the E cores lack SMT/HT, which throws off prior kernel assumptions. With the code queued up in a TIP branch today, the CPU topology rework looks like it could be good to go for Linux 6.9.

Gleixner has been working since last summer to improve the x86 CPU topology evaluation code within the Linux kernel. As with much kernel code that was started long ago, it has grown hairy over time and carries assumptions that no longer hold for today's Intel Core hybrid processor designs. Gleixner explained last summer in his original patch cover letter:
"A recent commit to the CPUID leaf 0xb/0x1f parser made me look deeper at the way how topology is evaluated. That "fix" is just yet another cure the symptom hack which completely ignores the underlying disaster.

The way how topology evaluation works is to overwrite the relevant variables as often as possible. E.g. smp_num_siblings gets overwritten a gazillion times, which is wrong to begin with. The boot CPU writes it 3 times, each AP two times.

What's worse is that this just works by chance on hybrid systems due to the fact that the existing ones all seem to boot on a P-Core which has SMT. Would it boot on a E-Core which has no SMT, then parts of the early topology evaluation including the primary thread mask which is required for parallel CPU bringup would be completely wrong. Overwriting it later on with the correct value does not help at all.

What's wrong today with hybrid already is the number of cores per package. On an ADL with 8 P-Cores and 8 E-cores the resulting number of cores per package is evaluated to be 12. Which is not further surprising because the CPUID 0xb/0x1f parser looks at the number of logical processors at core level and divides them by the number of SMP siblings.

24 / 2 = 12

Just that this CPU has obviously 16 cores not 12.

It's even clearly documented in the SDM that this is wrong.
...
This "_NOT_ to use for topology evaluation" sentence existed even before hybrid came along and got ignored. The code worked by chance, but with hybrid all bets are off. The code completely falls apart once CPUID leaf 0x1f enumerates any topology level between CORE and DIE, but that's not a surprise.

The proper thing to do is to actually evaluate the full topology including the non-present (hotpluggable) CPUs based on the APICIDs which are provided by the firmware and a proper topology domain level parser. This can exactly tell the number of physical packages, logical packages etc. _before_ even booting a single AP. All of that can be evaluated upfront.

Aside of that there are too many places which do their own topology evaluation, but there is absolutely no central point which can actually provide all of that information in a consistent way. This needs to change."
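
To make the arithmetic in that cover letter concrete, here is a minimal Python sketch. It is not kernel code: the APIC ID layout, the SMT_SHIFT and PKG_SHIFT widths, and every function name are assumptions made up for illustration. It contrasts the "logical processors divided by SMT siblings" shortcut with counting distinct cores directly from a list of APIC IDs, which is the general direction the cover letter argues for.

```python
# Illustrative sketch only. The APIC ID layout and shift widths below are
# assumptions, not values read from real firmware or taken from the kernel.

SMT_SHIFT = 1  # assumed: the lowest APIC ID bit selects the SMT thread
PKG_SHIFT = 6  # assumed: bits from this position upward select the package


def hypothetical_apic_ids(p_cores=8, e_cores=8):
    """Build a made-up APIC ID list for one hybrid package."""
    ids = []
    for core in range(p_cores):
        base = core << SMT_SHIFT
        ids.extend([base, base + 1])   # P-core: two SMT threads
    for core in range(p_cores, p_cores + e_cores):
        ids.append(core << SMT_SHIFT)  # E-core: thread 0 only, no sibling
    return ids


def naive_cores_per_package(logical_cpus, smt_siblings):
    """The shortcut criticised in the cover letter: logical CPUs / siblings."""
    return logical_cpus // smt_siblings


def cores_from_apic_ids(apic_ids):
    """Count distinct cores per package by grouping APIC IDs."""
    cores = {}
    for apic_id in apic_ids:
        pkg = apic_id >> PKG_SHIFT    # drop everything below the package bits
        core = apic_id >> SMT_SHIFT   # drop only the SMT thread bits
        cores.setdefault(pkg, set()).add(core)
    return {pkg: len(core_set) for pkg, core_set in cores.items()}


def primary_threads(apic_ids):
    """APIC IDs whose SMT bits are zero, i.e. thread 0 of each core."""
    smt_mask = (1 << SMT_SHIFT) - 1
    return [a for a in apic_ids if (a & smt_mask) == 0]


if __name__ == "__main__":
    ids = hypothetical_apic_ids()
    print(len(ids))                               # 24 logical processors
    print(naive_cores_per_package(len(ids), 2))   # 12, the wrong answer
    print(cores_from_apic_ids(ids))               # {0: 16}
    print(len(primary_threads(ids)))              # 16
```

Because the grouping needs only the APIC IDs and the shift widths, a count like this can in principle be made before any secondary CPU is brought up. It also does not depend on whether the boot CPU happens to be a P-core or an E-core.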

Over the past half-year this big patch series improving the Intel / AMD / Hygon / Centaur / Zhaoxin CPU topology evaluation code has been revised six times.

With the series now in good shape, the many patches were queued within tip/tip.git's x86/apic branch. Now that it has made its way to a TIP branch, it's likely to be submitted for next month's Linux 6.9 merge window unless new issues come to light or Linus Torvalds raises objections.
