Linux 5.4 Kernel To Bring Improved Load Balancing On AMD EPYC Servers
Adding to the growing list of features for Linux 5.4, whose cycle officially kicks off in mid-September, is a kernel scheduler optimization designed to improve load balancing on AMD EPYC servers.
The scheduler topology improvement by SUSE's Matt Fleming changes the current behavior: it turns out that on EPYC hardware the kernel has been failing to properly load balance across NUMA nodes on different sockets.
AMD EPYC/Zen processors now override the node reclaim distance to better account for the CPU's architecture. From one of the code comments, "AMD EPYC machines use this because even though the 2-hop distance is 32 (3.2x slower than a local memory access) performance actually *improves* if allowed to reclaim memory and load balance tasks between NUMA nodes 2-hops apart."
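To make the threshold idea concrete, here is a small Python model of the comparison described in that comment. It is an illustrative sketch, not kernel code. The function names are hypothetical, the generic default of 30 is an assumption about the kernel's stock reclaim distance, and the EPYC value of 32 is assumed to be the override that lets 2-hop nodes qualify. Only the 2-hop distance of 32 and the 3.2x figure come from the quoted comment.

```python
# Illustrative model only; names and thresholds below are assumptions,
# not the kernel's actual implementation.

LOCAL_DISTANCE = 10             # NUMA distance convention: local access = 10
DEFAULT_RECLAIM_DISTANCE = 30   # assumed generic default threshold
EPYC_RECLAIM_DISTANCE = 32      # assumed override so 2-hop nodes qualify
EPYC_TWO_HOP_DISTANCE = 32      # 2-hop distance quoted in the code comment


def relative_cost(distance):
    """Return how many times slower a remote access is than a local one."""
    return distance / LOCAL_DISTANCE


def allows_cross_node_balancing(distance, reclaim_distance):
    """Nodes at or below the threshold may reclaim memory and balance tasks."""
    return distance <= reclaim_distance


if __name__ == "__main__":
    print("2-hop relative cost:", relative_cost(EPYC_TWO_HOP_DISTANCE))
    print("allowed with default threshold:",
          allows_cross_node_balancing(EPYC_TWO_HOP_DISTANCE,
                                      DEFAULT_RECLAIM_DISTANCE))
    print("allowed with EPYC override:",
          allows_cross_node_balancing(EPYC_TWO_HOP_DISTANCE,
                                      EPYC_RECLAIM_DISTANCE))
```

Under these assumptions, a 2-hop distance of 32 falls just outside a threshold of 30, which would explain why cross-socket nodes were previously excluded from balancing, and raising the threshold to 32 brings them back in.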
The change goes into more detail and is part of the core scheduler changes queued ahead of the Linux 5.4 merge window, which opens in two weeks.