IO_uring Zerocopy Send Is Ready For Linux 5.20 Networking
For months Pavel Begunkov has been working on IO_uring zero-copy send support and now it's all buttoned up and ready to be merged come Linux 5.20. The benchmarks have been looking great and the code is now mature enough for mainline.
As of yesterday the "io_uring-zerocopy-send" support was queued into net-next, the networking subsystem's branch of code for the next kernel merge window.
The patchset implements io_uring zerocopy send. It works with both registered and normal buffers; mixing is allowed but not recommended. Apart from the usual request completions, just as with MSG_ZEROCOPY, io_uring separately notifies userspace when buffers are freed and can be reused, and that notification is delivered into io_uring's Completion Queue. Those "buffer-free" notifications are not necessarily per request, but userspace has control over it and should explicitly attach a number of requests to a single notification. The series also adds some internal optimisations when used with registered buffers, like removing page referencing.
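To make the notification model described above more concrete, here is a toy Python simulation. It is not the kernel or liburing API: the names `ToyRing`, `Notifier`, `send_zc`, `flush` and `buffer_released` are all hypothetical, and the explicit flush step is an assumption about how userspace might signal that it has finished attaching requests. It only illustrates the idea that each send posts its own completion, while several sends can share one "buffer-free" notification that appears in the completion queue once every attached buffer has been released.

```python
from dataclasses import dataclass


@dataclass
class Notifier:
    """Hypothetical stand-in for one shared buffer-free notification."""
    user_data: int
    pending: int = 0        # attached sends whose buffers are still in use
    flushed: bool = False   # userspace is done attaching sends (assumed step)
    notified: bool = False  # the notification has already been posted


class ToyRing:
    """Toy completion-queue model; this is NOT the real io_uring API."""

    def __init__(self):
        self.cq = []  # completion queue entries as (kind, user_data)

    def send_zc(self, notifier, user_data):
        # Attach the send to the notifier and post its request completion.
        notifier.pending += 1
        self.cq.append(("send_done", user_data))

    def flush(self, notifier):
        # Assumed step: no further sends will be attached to this notifier.
        notifier.flushed = True
        self._maybe_notify(notifier)

    def buffer_released(self, notifier):
        # Simulates the network stack dropping its last use of one buffer.
        notifier.pending -= 1
        self._maybe_notify(notifier)

    def _maybe_notify(self, notifier):
        # Post a single notification once all attached buffers are free.
        if notifier.flushed and notifier.pending == 0 and not notifier.notified:
            notifier.notified = True
            self.cq.append(("buffers_free", notifier.user_data))


def demo():
    ring = ToyRing()
    notifier = Notifier(user_data=100)
    ring.send_zc(notifier, user_data=1)
    ring.send_zc(notifier, user_data=2)
    ring.flush(notifier)
    ring.buffer_released(notifier)  # first buffer free, one still pending
    ring.buffer_released(notifier)  # second buffer free, notification posts
    return ring.cq


if __name__ == "__main__":
    for entry in demo():
        print(entry)
```

In this sketch the two sends complete individually, but the application may only reuse their buffers after the single `buffers_free` entry shows up, which is the batching that the patch description says userspace controls.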
From the kernel networking perspective there are two main changes. The first one is passing ubuf_info into the network layer from io_uring (inside of an in-kernel struct msghdr). This allows extra optimisations, e.g. ubuf_info caching on the io_uring side, but also helps to avoid cross-referencing and synchronisation problems. The second part is an optional optimisation removing page referencing for requests with registered buffers.
The IO_uring side changes have been queued up by maintainer Jens Axboe in the for-5.20/io_uring-zerocopy-send branch. Linux 5.20 is shaping up to be a big kernel.
Jakub merged the prep net series for the io_uring tx zerocopy support: https://t.co/l6Axk5WUr9 and I staged the remaining bits on top of that: https://t.co/35veXfsMVM
5.20 is shaping up nicely!
- Jens Axboe (@axboe), July 20, 2022