AMD CPU Microcode Loading On Linux Being Fixed Up To Be Per-Thread
Until now, loading updated CPU microcode on AMD processors under Linux has only ensured that every physical CPU core was loaded with the new microcode, without also updating the sibling threads on SMT processors. While logically that makes sense, it turns out some AMD microcode updates do carry out per-thread modifications, which means the microcode update needs to be applied on every thread. A Linux fix is on its way to the kernel to adjust that behavior.
Queued up today as part of TIP's x86/microcode branch is a patch so that AMD CPU microcode loading is now attempted on every logical thread, rather than checking at the physical core level that the update has been carried out and then skipping each core's sibling thread.
That AMD microcode updates can carry out per-thread modifications was figured out while kernel developers were debugging an issue. Since early July there has been a bug report about the Lightweight Profiling "LWP" instructions only being exposed on half of the CPU cores/threads of an AMD Bulldozer/Piledriver system under Linux. While the LWP instructions are seldom used, this can create problems when compiling code with "-march=native" and then seeing different behavior depending upon whether the code is executing on one of the threads that has the CPU feature exposed.
The situation initially perplexed kernel developers as well as an AMD Linux developer on the thread but then the bug report author discovered this LWP feature exposure difference was caused by the CPU microcode. Back when AMD was working on their Spectre V2 mitigation and introducing IBPB (Indirect Branch Prediction Barrier) in the microcode, they dropped LWP from K8 and K10 processor families due to that feature seeing little usage.
It turns out that when BIOSes on AMD systems carry out microcode updates at boot, they do so on a per-thread basis (and presumably Windows does as well). On Linux, however, AMD CPU microcode updates were only being checked on a per-physical-core basis, with the update skipped for the sibling thread. With this LWP bug report there is now evidence of per-thread modifications being performed. It's also possible other AMD CPU microcode updates have been carrying out per-thread modifications too, but that went unnoticed until this very evident difference stemming from LWP no longer being advertised on older AMD CPUs.
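For readers who want to check whether their own system shows this kind of per-thread difference, here is a minimal sketch in Python. It assumes an x86 Linux system where /proc/cpuinfo lists "processor", "microcode" and "flags" fields for each logical CPU, with the feature appearing as "lwp" in the flags. The helper names (parse_cpuinfo, group_by_microcode_and_flag, report) are made up for illustration and are not taken from the kernel patch or the bug report.

```python
from collections import defaultdict


def parse_cpuinfo(text):
    """Return one dict of 'key: value' fields per logical CPU block."""
    cpus = []
    # /proc/cpuinfo separates logical CPUs with a blank line.
    for block in text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        if "processor" in fields:
            cpus.append(fields)
    return cpus


def group_by_microcode_and_flag(cpus, flag="lwp"):
    """Group logical CPU numbers by (microcode revision, flag present)."""
    groups = defaultdict(list)
    for cpu in cpus:
        has_flag = flag in cpu.get("flags", "").split()
        revision = cpu.get("microcode", "unknown")
        groups[(revision, has_flag)].append(int(cpu["processor"]))
    return dict(groups)


def report(path="/proc/cpuinfo", flag="lwp"):
    """Print each group; more than one group means the threads disagree."""
    with open(path) as f:
        groups = group_by_microcode_and_flag(parse_cpuinfo(f.read()), flag)
    for (revision, has_flag), ids in sorted(groups.items()):
        print(f"microcode={revision} {flag}={has_flag} cpus={ids}")
    if len(groups) > 1:
        print("Logical CPUs disagree on microcode revision or feature exposure.")
    return groups
```

Calling report() on an affected system would be expected to print more than one group, for example with only every other logical CPU advertising the flag. On a consistent system it prints a single line covering all logical CPUs.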
In any event this patch is now working its way to the kernel for properly carrying out microcode updates on all AMD CPU threads.