Userspace RCU Will Be Much Faster For Its Next Release Paired With Linux 4.14+
The liburcu Userspace RCU data synchronization library should be significantly faster when built with a modern Linux kernel release.
Linux 4.14 added an expedited private command to the membarrier system call (MEMBARRIER_CMD_PRIVATE_EXPEDITED), which liburcu now takes advantage of. The Linux kernel documentation describes this new membarrier command as follows: "Execute a memory barrier on each running thread belonging to the same process as the current thread. Upon return from system call, the caller thread is ensured that all its running threads siblings have passed through a state where all memory accesses to user-space addresses match program order between entry to and return from the system call (non-running threads are de facto in such a state). This only covers threads from the same processes as the caller thread. This command returns 0. The "expedited" commands complete faster than the non-expedited ones, they never block, but have the downside of causing extra overhead."
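For readers who want to see what calling this command looks like, here is a minimal sketch in C. It is not liburcu's actual code: the helper names `sys_membarrier` and `try_private_expedited_barrier` are made up for illustration, and it assumes kernel headers from Linux 4.14 or newer so that `<linux/membarrier.h>` defines the expedited commands. The sketch queries the kernel for supported commands, registers the process (the kernel rejects the private expedited command with EPERM if the process has not registered first), and then issues the barrier.

```c
#define _GNU_SOURCE
#include <linux/membarrier.h>  /* MEMBARRIER_CMD_* (assumes 4.14+ headers) */
#include <sys/syscall.h>       /* __NR_membarrier */
#include <unistd.h>            /* syscall() */

/* Hypothetical thin wrapper around the raw system call. */
static int sys_membarrier(int cmd, int flags)
{
    return (int)syscall(__NR_membarrier, cmd, flags);
}

/*
 * Hypothetical helper.
 * Returns 1 if a private expedited barrier was issued,
 *         0 if the running kernel does not offer the command,
 *        -1 if registration or the barrier itself failed.
 */
int try_private_expedited_barrier(void)
{
    const int needed = MEMBARRIER_CMD_PRIVATE_EXPEDITED |
                       MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED;

    /* QUERY returns a bitmask of supported commands, or -1 on error. */
    int supported = sys_membarrier(MEMBARRIER_CMD_QUERY, 0);
    if (supported < 0 || (supported & needed) != needed)
        return 0;

    /* The process must register before using the expedited command. */
    if (sys_membarrier(MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED, 0) != 0)
        return -1;

    /* Execute a memory barrier on every running thread of this process. */
    if (sys_membarrier(MEMBARRIER_CMD_PRIVATE_EXPEDITED, 0) != 0)
        return -1;

    return 1;
}
```

A caller would typically treat a return value of 0 as a signal to fall back to a slower synchronization scheme, which is why the sketch distinguishes "not supported" from a genuine failure.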
The User-Space RCU library can now be built with this support, which offers faster performance and never blocks the calling thread. This functionality will be included in the upcoming liburcu 0.11 release.
In turn, liburcu's use of MEMBARRIER_CMD_PRIVATE_EXPEDITED should speed up the LTTng open-source tracing framework and other programs that rely on the user-space read-copy-update library.
A recent post on the LTTng Blog covers the impact of this work in much greater detail for those interested in the technical details. Those unfamiliar with the User-Space RCU library can learn more at liburcu.org, and this Git commit describes more about the MEMBARRIER_CMD_PRIVATE_EXPEDITED behavior.