New Linux CPU Hot-Plugging Works Out "Nightmare"

Posted by Michael Larabel on January 31, 2013

The current Linux kernel CPU hot-plugging support has been described as "an increasing nightmare full of races and undocumented behaviour", but fortunately it's in the process of being re-developed.

Thomas Gleixner has been one of the kernel developers looking to rework the Linux CPU hot-plug support and published a new patch-set today. Hot-plugging support in general has been a focus lately with work on a common system device hot-plug framework for the kernel, true CPU hot-plug support, ACPI hot-plug improvements, and other efforts in recent months.

Gleixner's CPU hot-plug re-work published today consists of 40 patches that change well over one thousand lines of kernel code. Below is his description of the massive CPU hot-plug changes for the Linux kernel.

The current CPU hotplug implementation has become an increasing nightmare full of races and undocumented behaviour. The main issue of the current hotplug scheme is the completely asymmetric startup/teardown process. The hotplug notifiers are mostly undocumented and the CPU_* actions in lots of implementations seem to be randomly chosen.

We had a long discussion in San Diego last year about reworking the hotplug core into a fully symmetric state machine. After a few doomed attempts to convert the existing code into a state machine, I finally found a workable solution.

The following patch series implements a trivial array based state machine, which replaces the existing steps in cpu_up/down and also the notifiers which must run on the hotplugged cpu are converted to a callback array. This documents clearly the ordering of the callbacks and also makes the asymmetric behaviour very obvious.

This series converts the stop_machine thread to the smpboot infrastructure, implements the core state machine and converts all notifiers which have ordering constraints plus a randomly chosen bunch of other notifiers to the state machine.

The runtime installed callbacks are immediately executed by the core code on or on behalf of all cpus which have already reached the corresponding state. A non executing installer function is there as well to allow simple migration of the existing notifier maze.

The diffstat of the complete series is appended below.

36 files changed, 1300 insertions(+), 1179 deletions(-)

We add slightly more code at this stage (225 lines alone in a header file), but most of the conversions are removing code and we have only tackled about 30 of 130+ instances. Even with the current conversion state, the resulting text size shrinks already.

The current patch-set was posted under the title "CPU hotplug rework - episode I".
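
For readers who want a feel for the design Gleixner describes, here is a rough sketch in Python of an array-based state machine with symmetric startup/teardown callbacks. It is not code from the patch series: the names `HotplugStep`, `HotplugCore`, `install`, `cpu_up`, `cpu_down` and `demo` are hypothetical, state 0 is assumed to mean "offline", and error handling and rollback are left out entirely. It only illustrates two ideas from the description above: callbacks live in an ordered array that is walked upwards to bring a CPU up and downwards to take it down, and a callback installed at runtime is run immediately for every CPU that has already reached that state, unless the non-executing variant is requested.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

CpuCallback = Optional[Callable[[int], None]]


@dataclass
class HotplugStep:
    """One slot of the state array (hypothetical name, not a kernel type)."""
    name: str
    startup: CpuCallback = None
    teardown: CpuCallback = None


class HotplugCore:
    """Toy model of an array-based hotplug state machine."""

    def __init__(self, num_states: int, num_cpus: int) -> None:
        # Index in this list is the state number; index 0 is "offline".
        self.states: List[Optional[HotplugStep]] = [None] * num_states
        # Highest state each CPU has reached so far.
        self.cpu_state: List[int] = [0] * num_cpus

    def install(self, index: int, step: HotplugStep, invoke: bool = True) -> None:
        """Register a step; optionally run it for CPUs already at or past it."""
        self.states[index] = step
        if invoke and step.startup is not None:
            for cpu, reached in enumerate(self.cpu_state):
                if reached >= index:
                    step.startup(cpu)

    def cpu_up(self, cpu: int) -> None:
        """Walk the array upwards, calling each startup callback in order."""
        for index in range(self.cpu_state[cpu] + 1, len(self.states)):
            step = self.states[index]
            if step is not None and step.startup is not None:
                step.startup(cpu)
            self.cpu_state[cpu] = index

    def cpu_down(self, cpu: int) -> None:
        """Walk the array downwards, calling teardown in reverse order."""
        for index in range(self.cpu_state[cpu], 0, -1):
            step = self.states[index]
            if step is not None and step.teardown is not None:
                step.teardown(cpu)
            self.cpu_state[cpu] = index - 1


def demo() -> List[Tuple[str, int]]:
    """Bring CPU 0 up, install a step late, then take CPU 0 down again."""
    log: List[Tuple[str, int]] = []

    def step(name: str) -> HotplugStep:
        return HotplugStep(
            name,
            startup=lambda cpu: log.append((name + ":up", cpu)),
            teardown=lambda cpu: log.append((name + ":down", cpu)),
        )

    core = HotplugCore(num_states=4, num_cpus=2)
    core.install(1, step("prepare"))
    core.install(3, step("online"))
    core.cpu_up(0)
    # CPU 0 is already past state 2, so this startup runs for it right away.
    core.install(2, step("late"))
    core.cpu_down(0)
    return log


if __name__ == "__main__":
    for event in demo():
        print(event)
```

Because the ordering is simply the array index, the teardown path in this sketch is the exact mirror of the startup path, which is the symmetry the description is after. Passing `invoke=False` to `install` stands in for the non-executing installer mentioned as a migration aid.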
