AMD Aims To Squeeze More EPYC Performance Out Of Linux With User-Space Hinting For Tasks
Ahead of a Linux Plumbers Conference (LPC) session next week in Dublin, where AMD will be leading a discussion on enhancing the scheduler for split-LLC architectures, K Prateek Nayak of AMD's Linux server team has posted an early patch series adding user-space hinting for task placement.
The patch series is marked "request for comments" and "experimental," and it adds low-level knobs that let user-space supply hints to influence where the scheduler places user-space tasks.
The current API design is experimental and is only capable of setting low-level hints. This API is not meant for public consumption and only serves as a means to test and demonstrate the efficacy of hints in helping the scheduler make optimal placement decisions based on the requirements provided by the applications. Scheduler is free to ignore the hints set by the user if it believes that following the hints will put the system in a suboptimal state.
- Motivation
The heuristics used by the scheduler today, such as the WF_SYNC flag, wake_wide() logic, etc., fall short at accurately inferring the nature of the workload in terms of whether it is preferable to consolidate a group of threads close together or if they should be spread apart. The inability to infer the nature of the workload can lead to a series of incorrect placement decisions that can be detrimental to the workload performance. The penalty seems to be severe on systems with split-LLC such as AMD EPYC.
A year ago, Peter Zijlstra of Intel's Linux kernel team also suggested that a high-level hinting framework may be needed to help the kernel scheduler with task placement, given increasingly complex processors and workloads. The hints in AMD's initial patch series include preferring to place a task close to its parent if there is an idle core in the local group, and preferring the group with the least utilization in order to spread out the workload. Other potential hints are also being discussed. In its current form, the user-space hinting is done via the prctl() interface.
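To make the two hints described above more concrete, here is a toy Python model of the decision they imply. This is only an illustrative sketch and not AMD's kernel code: the names Hint.FORK_AFFINE, Hint.FORK_SPREAD, Group, and pick_group are hypothetical, and the fallback to the least-utilized group when a hint cannot be honored is an assumption made for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Hint(Enum):
    # Hypothetical names for illustration; not taken from the patch series.
    FORK_AFFINE = 1  # prefer placing the new task close to its parent
    FORK_SPREAD = 2  # prefer the least-utilized group to spread the load


@dataclass
class Group:
    """Toy stand-in for a group of cores sharing a last-level cache."""
    name: str
    utilization: float  # 0.0 (idle) to 1.0 (fully busy)
    idle_cores: int


def pick_group(groups: List[Group], parent: Group, hint: Hint) -> Group:
    """Choose a group for a new task based on a user-supplied hint."""
    # Affine hint: stay in the parent's group only if it has an idle core.
    if hint is Hint.FORK_AFFINE and parent.idle_cores > 0:
        return parent
    # Spread hint, or an affine hint that cannot be honored (assumed
    # fallback): pick the group with the least utilization.
    return min(groups, key=lambda g: g.utilization)


if __name__ == "__main__":
    local = Group("llc0", utilization=0.8, idle_cores=1)
    remote = Group("llc1", utilization=0.2, idle_cores=4)
    groups = [local, remote]
    print(pick_group(groups, local, Hint.FORK_AFFINE).name)  # llc0
    print(pick_group(groups, local, Hint.FORK_SPREAD).name)  # llc1
```

The affine case only keeps the task local when an idle core is available, which loosely mirrors the idea that a hint is a preference the scheduler may decline rather than a hard requirement.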
AMD continues ramping up its work on the Linux kernel to better optimize it for EPYC server workloads. AMD engineers have also been increasing their Ryzen Linux client contributions.
AMD's preliminary testing of the user-space hinting patches has shown the potential to increase EPYC server performance even further for various tested workloads such as Hackbench, Schbench, Tbench, and others. Some testing has also been done on Xeon Ice Lake, where this user-space hinting can be beneficial for some workloads.
Again, the work is still very experimental for now. It will be discussed further next week at LPC Dublin, and it will presumably take some months for this user-space hinting to get ironed out before it is possibly suitable for upstreaming. Those interested can find the current experimental patches on the kernel mailing list. I'll certainly be following this AMD patch work and will be testing it once the kernel changes appear to have settled down and been agreed upon by all the key kernel stakeholders and vendors.