IO_uring Adding Support For Vectored FUTEX Waits In Linux 6.6

Written by Michael Larabel in Linux Storage on 14 August 2023 at 10:16 AM EDT.
With the upcoming Linux 6.6 cycle, another exciting change was recently queued in the block subsystem's "for-next" branch: IO_uring futex/futexv support.

Linux storage expert and IO_uring lead developer Jens Axboe last week queued his code in linux-block.git's for-next branch to allow vectored FUTEX waits. In earlier postings of the patches for this futex/futexv support in IO_uring, Axboe explained:
As far as I can recall, the first request for futex support with io_uring came from Andres Freund, working on postgres. His aio rework of postgres was one of the early adopters of io_uring, and futex support was a natural extension for that. This is relevant from both a usability point of view, as well as for efficiency and performance. In Andres's words, for the former:

"Futex wait support in io_uring makes it a lot easier to avoid deadlocks in concurrent programs that have their own buffer pool: Obviously pages in the application buffer pool have to be locked during IO. If the initiator of IO A needs to wait for a held lock B, the holder of lock B might wait for the IO A to complete. The ability to wait for a lock and IO completions at the same time provides an efficient way to avoid such deadlocks."

and in terms of efficiency, even without unlocking the full potential yet, Andres says:

"Futex wake support in io_uring is useful because it allows for more efficient directed wakeups. For some "locks" postgres has queues implemented in userspace, with wakeup logic that cannot easily be implemented with FUTEX_WAKE_BITSET on a single "futex word" (imagine waiting for journal flushes to have completed up to a certain point). Thus a "lock release" sometimes needs to wake up many processes in a row. A quick-and-dirty conversion to doing these wakeups via io_uring led to a 3% throughput increase, with 12% fewer context switches, albeit in a fairly extreme workload."

Quite a nice efficiency and performance boost in making use of IO_uring futex/futexv support.
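
For readers unfamiliar with the "futex word" and FUTEX_WAKE_BITSET mentioned in the quote, here is a minimal sketch of the classic futex(2) system call interface that the new IO_uring operations make asynchronous. It does not use the IO_uring opcodes themselves and is not taken from Axboe's patches or from postgres; the helper names futex_wait_bitset and futex_wake_bitset are made up for illustration.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <limits.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: sleep while *word still equals `expected`, listening
 * only on the wake-up "channels" selected by the non-zero `mask`.
 * Returns 0 when woken, or -1 with errno set; errno is EAGAIN when the
 * word had already changed, in which case the caller never slept.
 * The NULL timeout means "wait indefinitely". */
static long futex_wait_bitset(uint32_t *word, uint32_t expected, uint32_t mask)
{
    return syscall(SYS_futex, word, FUTEX_WAIT_BITSET | FUTEX_PRIVATE_FLAG,
                   expected, NULL, NULL, mask);
}

/* Hypothetical helper: wake at most `max_waiters` sleepers on `word` whose
 * wait mask shares at least one bit with `mask`.
 * Returns the number of waiters woken, or -1 with errno set. */
static long futex_wake_bitset(uint32_t *word, int max_waiters, uint32_t mask)
{
    return syscall(SYS_futex, word, FUTEX_WAKE_BITSET | FUTEX_PRIVATE_FLAG,
                   max_waiters, NULL, NULL, mask);
}
```

Each call here is a blocking or at least synchronous system call on one futex word, which is the limitation the quote describes: waking several different waiters means several calls in a row, whereas requests submitted through IO_uring can be batched.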

Also queued in that branch last week is IO_uring IORING_OP_WAITID support. IORING_OP_WAITID is a fully asynchronous version of the waitid system call.

Linux 6.6 is shaping up to be another exciting kernel cycle with this kernel debuting toward the end of the year.
