Restartable Sequences "RSEQ" Additions Expected For Linux 6.3

Written by Michael Larabel in Linux Kernel on 27 December 2022 at 10:52 AM EST. 8 Comments

The Restartable Sequences "RSEQ" system call was merged into the Linux kernel a few years ago and is now used by the GNU C Library and friends for faster user-space operations on per-CPU data. Linux 6.3, due next year, is set to bring some notable additions to the RSEQ support.

By avoiding atomic operations in cases like incrementing per-CPU counters, modifying per-CPU spinlocks, and reading/writing per-CPU ring buffers, Restartable Sequences can provide a performance advantage, as benchmark results have shown.
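As a rough illustration of the per-CPU counter case, here is a minimal sketch of a counter sharded by CPU. It is not RSEQ itself: a relaxed atomic add stands in for the restartable critical section, so it shows the data layout RSEQ operates on and the atomic operation RSEQ would let you drop. The names `MAX_CPUS`, `percpu_slot`, `percpu_counter_inc`, and `percpu_counter_sum` are hypothetical and not taken from the kernel or glibc.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdatomic.h>
#include <stdint.h>

#define MAX_CPUS 1024 /* hypothetical upper bound chosen for this sketch */

/* One counter per CPU, aligned to 64 bytes to limit false sharing. */
struct percpu_slot {
    _Alignas(64) atomic_uint_fast64_t value;
};

static struct percpu_slot slots[MAX_CPUS];

/*
 * Increment the slot of the CPU we are (probably) running on.
 * The thread can migrate between sched_getcpu() and the add, so the
 * add must stay atomic here. An RSEQ critical section would instead
 * be restarted on migration or preemption, letting a plain add suffice.
 */
static void percpu_counter_inc(void)
{
    int cpu = sched_getcpu();
    if (cpu < 0 || cpu >= MAX_CPUS)
        cpu = 0; /* fall back to slot 0 on error or out-of-range CPU */
    atomic_fetch_add_explicit(&slots[cpu].value, 1, memory_order_relaxed);
}

/* Sum all slots; the result is a snapshot, not a linearizable read. */
static uint64_t percpu_counter_sum(void)
{
    uint64_t total = 0;
    for (int i = 0; i < MAX_CPUS; i++)
        total += atomic_load_explicit(&slots[i].value, memory_order_relaxed);
    return total;
}
```

Writers touch only their own CPU's cache line in the common case, while readers pay the cost of walking every slot.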

Mathieu Desnoyers, who has led much of the RSEQ effort, has recently been working to extend the Restartable Sequences ABI to expose the NUMA node ID, mm_cid, and mm_numa_cid fields. Desnoyers explained in the patch series:

The NUMA node ID field allows implementing a faster getcpu(2) in libc.

The per-memory-map concurrency id (mm_cid) allows ideal scaling (down or up) of user-space per-cpu data structures. The concurrency ids allocated within a memory map are tracked by the scheduler, which takes into account the number of concurrently running threads, thus implicitly considering the number of threads, the cpu affinity, the cpusets applying to those threads, and the number of logical cores on the system.

The NUMA-aware concurrency id (mm_numa_cid) is similar to the mm_cid, except that it keeps track of the NUMA node ids with which each cid has been associated. On NUMA systems, when a NUMA-aware concurrency ID is observed by user-space to be associated with a NUMA node, it is guaranteed to never change NUMA node unless a kernel-level NUMA configuration change happens. This is useful for NUMA-aware per-cpu data structures running in environments where a process or a set of processes belonging to cpuset are pinned to a set of cores which belong to a subset of the system's NUMA nodes.
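To make the mm_cid scaling point concrete, the sketch below estimates how many per-"CPU" slots a process might need if it indexed its data by concurrency id rather than by CPU number. It assumes, purely for illustration, that ids stay below min(threads, allowed CPUs); the quoted description does not state this as a hard bound, so real code should still handle larger ids. The helper names `cid_slot_count` and `allowed_cpu_count` are hypothetical.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stddef.h>

/*
 * Estimated slot count when indexing by concurrency id.
 * ASSUMPTION: ids stay below min(nr_threads, nr_cpus_allowed).
 */
static size_t cid_slot_count(size_t nr_threads, size_t nr_cpus_allowed)
{
    return nr_threads < nr_cpus_allowed ? nr_threads : nr_cpus_allowed;
}

/* Number of CPUs in the calling thread's affinity mask, or 0 on error. */
static size_t allowed_cpu_count(void)
{
    cpu_set_t set;
    if (sched_getaffinity(0, sizeof(set), &set) != 0)
        return 0;
    return (size_t)CPU_COUNT(&set);
}
```

For example, under this assumption a 4-thread process on a 64-core machine would need about `cid_slot_count(4, 64)` = 4 slots, compared with 64 when indexing by CPU number.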

The prospect of a faster getcpu() is especially useful for glibc users.
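For context, this is the information getcpu(2) returns today. The sketch below issues the raw system call, which is the slow path that a node ID field in the RSEQ area could let libc skip. The wrapper name `current_cpu_and_node` is hypothetical, and the code reads nothing from the RSEQ area itself.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Ask the kernel for the current CPU and NUMA node.
 * Returns 0 on success, -1 on failure with errno set.
 * The third getcpu argument is unused by modern kernels, so pass NULL.
 */
static int current_cpu_and_node(unsigned int *cpu, unsigned int *node)
{
    return (int)syscall(SYS_getcpu, cpu, node, NULL);
}
```

Both values are only a hint by the time the caller sees them, since the thread may migrate right after the call returns.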

As of this morning, the code introducing the extensible RSEQ ABI, adding these new fields, and making other RSEQ improvements has been queued into the sched/core branch of TIP. With the Linux 6.2 merge window now behind us, the TIP Git repository is beginning to queue various feature changes to send in for the Linux 6.3 merge window once it opens in two months' time.

These RSEQ additions can be checked out via the sched/core branch.
