Restartable Sequences "RSEQ" Seeing Up To 16.7x Speedup With Newest Linux Patch
For those making use of Restartable Sequences (RSEQ) on Linux systems, there is an enticing performance optimization on the way.
As a reminder, RSEQ is a low-level synchronization primitive for operating on per-CPU data in user-space. New work by Mathieu Desnoyers improves the cache locality of RSEQ concurrency IDs for intermittent workloads. Desnoyers explained in the patch:
"commit 223baf9d17f25 ("sched: Fix performance regression introduced by mm_cid") introduced a per-mm/cpu current concurrency id (mm_cid), which keeps a reference to the concurrency id allocated for each CPU. This reference expires shortly after a 100ms delay.
These per-CPU references keep the per-mm-cid data cache-local in situations where threads are running at least once on each CPU within each 100ms window, thus keeping the per-cpu reference alive.
However, intermittent workloads behaving in bursts spaced by more than 100ms on each CPU exhibit bad cache locality and degraded performance compared to purely per-cpu data indexing, because concurrency IDs are allocated over various CPUs and cores, therefore losing cache locality of the associated data."
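To make the behavior described in the quote more concrete, here is a toy Python model. It is not kernel code: the class `ToyMmCid`, the helper `ids_seen`, the "lowest free ID" allocation policy, and the exact expiry check are assumptions made purely for illustration. Only the 100ms window comes from the commit message.

```python
EXPIRY_MS = 100  # per-CPU reference lifetime mentioned in the quoted commit message


class ToyMmCid:
    """Toy model (assumption, not kernel code) of per-mm concurrency ID allocation."""

    def __init__(self, nr_cpus):
        self.cached = [None] * nr_cpus    # concurrency ID currently cached by each CPU
        self.last_run = [None] * nr_cpus  # last time (ms) a thread of this mm ran there

    def _expire(self, now_ms):
        # Drop per-CPU references that have been idle for longer than the window.
        for cpu, cid in enumerate(self.cached):
            if cid is not None and now_ms - self.last_run[cpu] > EXPIRY_MS:
                self.cached[cpu] = None

    def run_on(self, cpu, now_ms):
        """Return the concurrency ID seen by a thread scheduled on `cpu` at `now_ms`."""
        self._expire(now_ms)
        if self.cached[cpu] is None:
            # Hypothetical policy: hand out the lowest ID not held by any CPU.
            in_use = {c for c in self.cached if c is not None}
            self.cached[cpu] = next(
                i for i in range(len(self.cached)) if i not in in_use
            )
        self.last_run[cpu] = now_ms
        return self.cached[cpu]


def ids_seen(schedule, nr_cpus=8):
    """Replay (time_ms, cpu) events and list the IDs each CPU observed, in order."""
    mm = ToyMmCid(nr_cpus)
    seen = {}
    for now_ms, cpu in schedule:
        seen.setdefault(cpu, []).append(mm.run_on(cpu, now_ms))
    return seen


# Steady workload: CPUs 2 and 5 each run every 50ms, inside the 100ms window.
steady = [(t, cpu) for t in range(0, 400, 50) for cpu in (2, 5)]
# Bursty workload: two bursts 500ms apart, with the CPUs waking in a different order.
bursty = [(0, 2), (10, 5), (500, 5), (510, 2)]

print(ids_seen(steady))  # each CPU keeps the same ID for the whole run
print(ids_seen(bursty))  # {2: [0, 1], 5: [1, 0]}
```

In the steady case each CPU keeps its ID, so data indexed by that ID stays in that CPU's cache. In the bursty case both references expire between bursts and the IDs come back swapped, so each CPU ends up touching data that was last used on the other one. That is the loss of cache locality the quote describes.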
With the newest work to improve the per-MM-CID cache locality, intermittent workloads can see speed-ups of up to 16.7x.
Those interested in this performance optimization work for Restartable Sequences can see this patch for all the details.