Systemd/Microsoft Effort For A Global Counter On Block/Disk Changes Coming To Linux 5.15
A global counter for block device changes has been sought to better correlate events for devices whose names get re-used, commonly /dev/sda or /dev/loop0, when a device is detached and a possibly different device is later attached under the same name. User-space software like systemd could thus benefit from a system-wide numbering scheme to avoid confusion over device re-use and to cope with events arriving in user-space out of order.
The patches providing this global counter, written by Microsoft's Matteo Croce, were queued on Thursday in the block subsystem's "for-5.15" Git branch.
The main commit further sums up the motivation:
Associating uevents with block devices in userspace is difficult and racy: the uevent netlink socket is lossy, and on slow and overloaded systems has a very high latency. Block devices do not have exclusive owners in userspace, any process can set one up (e.g. loop devices). Moreover, device names can be reused (e.g. loop0 can be reused again and again). A userspace process setting up a block device and watching for its events cannot thus reliably tell whether an event relates to the device it just set up or another earlier instance with the same name.
Being able to set a UUID on a loop device would solve the race conditions. But it does not allow to derive orderings from uevents: if you see a uevent with a UUID that does not match the device you are waiting for, you cannot tell whether it's because the right uevent has not arrived yet, or it was already sent and you missed it. So you cannot tell whether you should wait for it or not.
Associating a unique, monotonically increasing sequential number to the lifetime of each block device, which can be retrieved with an ioctl immediately upon setting it up, allows to solve the race conditions with uevents, and also allows userspace processes to know whether they should wait for the uevent they need or if it was dropped and thus they should move on.
Additionally, increment the disk sequence number when the media change, i.e. on DISK_EVENT_MEDIA_CHANGE event.
The disk sequence number is exported via uevents and sysfs, and there is also a new BLKGETDISKSEQ ioctl. Assuming no last-minute design objections, this code is slated to land in Linux 5.15 as part of the pending block subsystem for-5.15 material.
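For a rough idea of how user-space might consume the new number, here is a minimal Python sketch. It is not taken from the patches or from systemd. The helper names (read_diskseq, ioctl_diskseq, classify_uevent) are hypothetical. The sysfs path /sys/block/<name>/diskseq and the ioctl request value (the usual asm-generic encoding of _IOR(0x12, 128, __u64)) are assumptions to verify against the running kernel's headers, and the wait-or-move-on rule is just one reading of the commit message.

```python
import fcntl
import os
import struct
from typing import Optional

# Assumption: asm-generic encoding of _IOR(0x12, 128, __u64); some
# architectures encode ioctl numbers differently, so check <linux/fs.h>.
BLKGETDISKSEQ = 0x80081280


def read_diskseq(name: str, sysfs_root: str = "/sys/block") -> Optional[int]:
    """Read the sequence number from sysfs; None if the attribute is missing."""
    path = os.path.join(sysfs_root, name, "diskseq")
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def ioctl_diskseq(fd: int) -> int:
    """Query the sequence number on an already-open block device fd."""
    buf = fcntl.ioctl(fd, BLKGETDISKSEQ, b"\0" * 8)
    return struct.unpack("=Q", buf)[0]


def classify_uevent(expected: int, seen: int) -> str:
    """Compare a uevent's sequence number with the one we got at setup time."""
    if seen == expected:
        return "match"    # the event belongs to the device we just set up
    if seen < expected:
        return "stale"    # an earlier instance of the name; keep waiting
    return "missed"       # a newer instance already exists; stop waiting
```

A process that sets up a loop device could call ioctl_diskseq() right away, then pass that value and each incoming uevent's sequence number to classify_uevent() to decide whether to keep waiting.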