SUSE Reworking Btrfs File-System's Locking Code
SUSE continues to back the Btrfs file-system and, as part of that, is investing in new and improved functionality for this Linux file-system once billed as the competitor to ZFS. This week a SUSE developer sent out a set of patches implementing a new "DRW" lock and wiring it into the file-system driver.
SUSE's Nikolay Borisov sent out patches on Thursday that refactor the locking between snapshot creation and nocow writers in Btrfs. The existing locking code is reworked to use a new DRW ("Double Reader Writer") lock. Borisov explained, "A (D)ouble (R)eader (W)riter lock is a locking primitive that allows to have multiple readers or multiple writers but not multiple readers and writers holding it concurrently. The code is factored out from the existing open-coded locking scheme used to exclude pending snapshots from nocow writers and vice-versa. Current implementation actually favors Readers (that is snapshot creators) to writers (nocow writers of the filesystem)."
He elaborated in the follow-up patch:
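To make the semantics concrete, here is a minimal user-space sketch of such a lock in Python. It is not the kernel patch: the class name `DRWLock`, its method names, and the condition-variable design are all assumptions made for illustration. It only models the behavior described in the quote, namely that any number of readers or any number of writers may hold the lock, never both kinds at once, and that waiting readers take priority over new writers.

```python
import threading


class DRWLock:
    """Toy double reader-writer lock: many readers OR many writers, never both.

    Hypothetical illustration only; names and design are not from the Btrfs patches.
    """

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0          # readers currently holding the lock
        self._pending_readers = 0  # readers waiting for writers to drain
        self._writers = 0          # writers currently holding the lock

    def read_lock(self):
        with self._cond:
            # Announce intent first so that new writers back off (reader preference).
            self._pending_readers += 1
            while self._writers > 0:
                self._cond.wait()
            self._pending_readers -= 1
            self._readers += 1

    def read_unlock(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()  # wake any blocked writers

    def try_write_lock(self):
        """Take the lock as a writer without blocking; return False on failure."""
        with self._cond:
            if self._readers > 0 or self._pending_readers > 0:
                return False
            self._writers += 1
            return True

    def write_lock(self):
        with self._cond:
            while self._readers > 0 or self._pending_readers > 0:
                self._cond.wait()
            self._writers += 1

    def write_unlock(self):
        with self._cond:
            self._writers -= 1
            if self._writers == 0:
                self._cond.notify_all()  # wake any blocked readers


if __name__ == "__main__":
    lock = DRWLock()
    lock.read_lock()                # first reader enters
    lock.read_lock()                # a second reader may join it
    print(lock.try_write_lock())    # False: readers hold the lock
    lock.read_unlock()
    lock.read_unlock()
    print(lock.try_write_lock())    # True: no readers left
    print(lock.try_write_lock())    # True: writers may share with each other
```

In this model the two sides are symmetric except for the `_pending_readers` counter, which is what tilts the lock toward readers: a writer that arrives while a reader is waiting has to queue behind it.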
"This patch removes all haphazard code implementing nocow writers exclusion from pending snapshot creation and switches to using the drw lock to ensure this invariant still holds. 'Readers' are snapshot creators from create_snapshot and 'writers' are nocow writers from buffered write path or btrfs_setsize. This locking scheme allows for multiple snapshots to happen while any nocow writers are blocked, since writes to page cache in the nocow path will make snapshots inconsistent.
"So for performance reasons we'd like to have the ability to run multiple concurrent snapshots and also favors readers in this case. And in case there aren't pending snapshots (which will be the majority of the cases) we rely on the percpu's writers counter to avoid cacheline contention.
"The main gain from using the drw is it's now a lot easier to reason about the guarantees of the locking scheme and whether there is some silent breakage lurking."
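The per-CPU writers counter mentioned in the patch description can be pictured as a counter split into independent slots, so that writers on different CPUs do not all update the same memory location. The sketch below is only an analogy under stated assumptions: Python has no per-CPU data, so slots are picked by thread id, and the `ShardedCounter` class and its methods are hypothetical names. It shows the structure of the idea (cheap local updates, a more expensive sum when the total is needed) and cannot demonstrate actual cache-line behavior.

```python
import threading


class ShardedCounter:
    """Counter split into slots so concurrent updaters rarely touch the same one.

    Hypothetical stand-in for a per-CPU counter; slots are chosen by thread id.
    """

    def __init__(self, shards=8):
        self._counts = [0] * shards
        self._locks = [threading.Lock() for _ in range(shards)]

    def _slot(self):
        # Map the calling thread to one slot; different threads usually differ.
        return threading.get_ident() % len(self._counts)

    def add(self, delta):
        """Fast path: update only the caller's own slot."""
        i = self._slot()
        with self._locks[i]:
            self._counts[i] += delta

    def total(self):
        """Slow path: visit every slot to compute the overall value."""
        result = 0
        for i, lock in enumerate(self._locks):
            with lock:
                result += self._counts[i]
        return result


if __name__ == "__main__":
    writers = ShardedCounter()

    def nocow_writer():
        for _ in range(1000):
            writers.add(1)    # "take" the write side
            writers.add(-1)   # "release" it again

    threads = [threading.Thread(target=nocow_writer) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(writers.total())    # 0: every increment was matched by a decrement
```

The trade-off is that the common operation (a writer coming and going) stays cheap, while the rare operation (a snapshot checking whether any writers remain) pays for summing all slots, which matches the patch's remark that pending snapshots are the minority case.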
If all goes well, this improved locking code could land as part of the Linux 5.3 kernel cycle.