Announcement

**NobodyXu** · 29 September 2022, 10:42 PM

Originally posted by linuxgeex

Background for other readers: Normally we do file atomics by creating a new inode, modifying its data to the new state we want, then making the directory entry point to the new inode, usually by renaming the new inode to the old one (or with locks and barriers, but the question is re: renames). This requires the FS to do a lot of work in preparing the (temporary) inode. If the changes exceed the VFS/mount's write-back caching interval, or some API is forcing unwanted fsyncs on the temporary file, it can hurt performance and cause write amplification. Then there is the COW scenario, where you might be modifying just a few bytes out of a terabyte-sized file, and using rename would be pathological.

Not sure how this "atomic replace" works in practice compared to atomically renaming a file to avoid a race condition.

But you can use reflink to avoid write amplification and improve performance while still having atomic rename.
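
For readers who want to see the reflink-then-rename idea concretely, here is a minimal Python sketch. It is an illustration under assumptions, not anything taken from the F2FS patches: the function names `clone_or_copy` and `patch_atomically` and the `.tmp-` prefix are made up, the `FICLONE` constant is the value from `<linux/fs.h>` on common Linux architectures such as x86 and ARM, and the code falls back to a plain copy when the filesystem refuses the clone.

```python
import fcntl
import os
import shutil
import tempfile

# FICLONE from <linux/fs.h>: _IOW(0x94, 9, int). This numeric value is
# assumed for x86/ARM Linux; other architectures may encode it differently.
FICLONE = 0x40049409


def clone_or_copy(src_path: str, dst_path: str) -> bool:
    """Make dst share src's blocks via reflink; fall back to a full copy.

    Returns True when the reflink succeeded, False when data was copied.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        try:
            fcntl.ioctl(dst.fileno(), FICLONE, src.fileno())
            return True
        except OSError:
            # Filesystem does not support reflinks (or the call was rejected).
            shutil.copyfileobj(src, dst)
            return False


def patch_atomically(path: str, offset: int, patch: bytes) -> bool:
    """Overwrite a few bytes of `path` so readers see old or new, never a mix."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must be on the same filesystem as the target.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    os.close(fd)
    try:
        reflinked = clone_or_copy(path, tmp_path)
        with open(tmp_path, "r+b") as tmp:
            tmp.seek(offset)
            tmp.write(patch)  # on a COW filesystem only these blocks are rewritten
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_path, path)  # atomic rename over the old name
    except BaseException:
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    return reflinked
```

When the clone succeeds, only the modified blocks are written, which is the write-amplification saving mentioned above. When it fails, the sketch degrades to the traditional copy-modify-rename. Note that `mkstemp` creates the file with mode 0600, so real code would also need to restore the original permissions and ownership.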

**sinepgib** · 30 September 2022, 06:11 AM

Originally posted by linuxgeex

I came here meaning to ask the same question, but a few seconds in, it occurred to me that this ioctl should be part of the VFS, and each filesystem should implement it in the most efficient manner for that FS.

TL;DR: Yes, I think this avoids unwanted metadata updates, but also: it can avoid unwanted data updates, especially for COW scenarios where blocks can be shared between the original and updated file data; it's a simpler API for userspace than traditional file atomics; and performance can be much higher.

Background for other readers: Normally we do file atomics by creating a new inode, modifying its data to the new state we want, then making the directory entry point to the new inode, usually by renaming the new inode to the old one (or with locks and barriers, but the question is re: renames). This requires the FS to do a lot of work in preparing the (temporary) inode. If the changes exceed the VFS/mount's write-back caching interval, or some API is forcing unwanted fsyncs on the temporary file, it can hurt performance and cause write amplification. Then there is the COW scenario, where you might be modifying just a few bytes out of a terabyte-sized file, and using rename would be pathological.

It makes sense. But, as you said, this makes much more sense as part of the VFS. The moment the user (programmer) needs to know the underlying filesystem, you can almost guarantee it's gonna be a fancy painting on the wall: pretty in theory, but ignored by most, most of the time.
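
For reference, the filesystem-agnostic pattern described in the quoted background (write a temporary file, then rename it over the old name) might look roughly like the Python sketch below. The helper name `atomic_write` and the `.tmp-` prefix are hypothetical, and the explicit `fsync` calls are the kind of forced flush the quote lists as a cost; whether a given application needs them is an assumption about its durability requirements.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Replace `path` so readers see either the old or the new content."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the target: rename is only atomic
    # within a single filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the new data to storage first
        os.replace(tmp_path, path)  # atomically repoint the directory entry
    except BaseException:
        # Best-effort cleanup of the temporary file on any failure.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    # Flush the directory so the rename itself survives a crash.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

This works on any POSIX filesystem without the programmer knowing what is underneath, which is why it remains the default. The price is that the whole file is rewritten even for a small change.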

**linuxgeex** · 30 September 2022, 01:43 PM

Originally posted by NobodyXu

Not sure how this "atomic replace" works in practice compared to atomically renaming a file to avoid a race condition.

But you can use reflink to avoid write amplification and improve performance while still having atomic rename.

I agree, and F2FS does support reflinks. I was making an aside to non-technical users who might understand why atomic updates are important but not understand the cost of doing business without a ton of boilerplate/optimisation work, including trying proprietary ioctls for performance and media life optimisation, which most programmers will avoid.

It's good that you pointed it out for those who want to learn and are involved in a project which allows writing platform-dependent code.

**yump** · 30 September 2022, 06:04 PM

Originally posted by linuxgeex

I agree, and F2FS does support reflinks. I was making an aside to non-technical users who might understand why atomic updates are important but not understand the cost of doing business without a ton of boilerplate/optimisation work, including trying proprietary ioctls for performance and media life optimisation, which most programmers will avoid.

It's good that you pointed it out for those who want to learn and are involved in a project which allows writing platform-dependent code.

Come on, admit that you hadn't thought of using reflink-modify-rename. Google is a silo with an enormous budget for throwing wide, but not necessarily deep, teams at problems. This feature seems to come from the Android side, which has a famous tendency to go its own way.

It's entirely plausible that the people tasked with solving whatever use case motivated this feature did not think of reflink-modify-rename either.

**arQon** · 04 October 2022, 04:18 AM

Originally posted by sinepgib

It's unclear to me what exactly the advantage of this is. Typically we'd just create a new file and move it as with any other filesystem. Is the intent to avoid creating extra metadata in the logs?

eMMC maybe?
I can't think of a good reason for this yet either: even if you have a severely constrained storage device that you're trying to dump a new monolithic update onto, overwriting it doesn't buy you anything over just deleting the old version first except a trivial amount of metadata updates.
Mind you, I never use F2FS, or Chromebooks etc., so maybe that potentially triggers some undesirable set of behavior like a TRIM, or zeroing out the file as part of deleting it, etc. This does feel a lot like a very specific hack for an observed problem, not a considered design concept.

**linuxgeex** · 06 October 2022, 02:44 PM

Originally posted by yump

Come on, admit that you hadn't thought of using reflink-modify-rename.

I considered it, and I also considered fs-managed dedupe, and even the old ReiserFS dancing trees, lol. I'm glad you put it forward. You may feel it belonged in my TLDR block, but I felt I was already far too long-winded.

F2FS Preparing Support For Atomic Replace