XFS File-System With Linux 5.10 Punts Year 2038 Problem To The Year 2486


  • tytso
    replied
    Originally posted by Weasel View Post
    And that's exactly why you shouldn't.
    As I said above, the reason why we do is that if userspace sets one or more of the inode's timestamps to a timespec value (using the utimensat(2) system call), it is highly desirable that when it is read back from the file system (using the statx(2) system call), you get a bit-identical result. That's a positive value to userspace applications today. Granted, some file systems can't make this guarantee; but if we can, it's a good thing to do. Sure, we could round it to, say, the nearest microsecond, or some arbitrary value based on the fastest clock tick today, but we thought it worthwhile to be able to do bidirectional encoding of struct timespec's.
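
    A minimal sketch of that round trip, for anyone who wants to try it: set the mtime with utimensat(2), read it back with statx(2), and compare. (The file name "testfile" is just an example; statx(2) is Linux-specific and needs glibc 2.28 or newer.)

    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/stat.h>
    #include <time.h>

    int main(void)
    {
        /* Set only the mtime; UTIME_OMIT leaves the atime untouched. */
        struct timespec times[2] = {
            { .tv_nsec = UTIME_OMIT },
            { .tv_sec = 2000000000, .tv_nsec = 123456789 }
        };

        if (utimensat(AT_FDCWD, "testfile", times, 0) != 0) {
            perror("utimensat");
            return 1;
        }

        struct statx stx;
        if (statx(AT_FDCWD, "testfile", 0, STATX_MTIME, &stx) != 0) {
            perror("statx");
            return 1;
        }

        /* On ext4/xfs with nanosecond timestamps this prints back exactly
         * the value set above. */
        printf("mtime = %lld.%09u\n",
               (long long)stx.stx_mtime.tv_sec, stx.stx_mtime.tv_nsec);
        return 0;
    }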

    Whereas whether or not the ext4 or xfs file systems will still be in use 500 years from now is quite a speculative matter, and if there is a hint that these file systems will still be in use five centuries from now, we can promulgate another format change sometime in the next 100 to 200 years. "Premature optimization is the root of all evil."

    Disagree with our choices? Feel free to implement your own file system. Or fork an existing file system. Then you can go to town with your design brilliance, and dazzle the world. :-)



  • Weasel
    replied
    Originally posted by tytso View Post
    Why not use a finer granularity? Because space in the on-disk inode structure is precious. We need 30 bits to encode nanoseconds.
    And that's exactly why you shouldn't.



  • tytso
    replied
    The reason why ext4 and xfs both use nanosecond resolution is that in the kernel the high-precision timekeeping structure is the timespec structure (which was originally defined by POSIX). That uses tv_sec and tv_nsec. Certainly in 2008, when ext4 was declared "stable", the hardware of the time was nowhere near having the necessary resolution to give us nanosecond accuracy. However, that's not really the point. We want to be able to store an arbitrary timespec value, encode it in the file system timestamp, and then decode it back to a bit-identical timespec value. So that's why it makes sense to use at least nanosecond granularity.

    Why not use a finer granularity? Because space in the on-disk inode structure is precious. We need 30 bits to encode nanoseconds. That leaves an extra two bits that can be added to the 32-bit "time in seconds since the Unix epoch". For full backwards compatibility, where a "negative" tv_sec corresponds to times before 1970, that gets you to the 25th century. If we *really* cared, we could get an extra 500 years by stealing a bit somewhere from the inode (maybe an unused flag bit, perhaps --- but since there are 4 timestamps, you would need to steal 4 bits for each doubling of the time range).

    However, there is no guarantee that ext4 or xfs will be used 400-500 years from now; and if either *is* being used, it seems likely that there will be plenty of time to do another format bump. XFS has had 4 incompatible format bumps in the last 27 years. ext2/ext3/ext4 has been around for 28 years, and depending on how you count, there have been 2-4 major version bumps (we use finer-grained feature bits, so it's a bit hard to count). In the next 500 years, we'll probably have a few more. :-)
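
    For the curious, here is a rough sketch of that packing, loosely modeled on the "extra" 32-bit timestamp word described above (illustrative only, not the literal ext4 on-disk layout or kernel code):

    #include <stdint.h>
    #include <stdio.h>
    #include <time.h>

    #define EPOCH_BITS 2
    #define EPOCH_MASK 0x3u

    struct disk_ts {
        uint32_t seconds; /* low 32 bits of tv_sec (signed; pre-1970 is negative) */
        uint32_t extra;   /* bits 0-1: epoch extension, bits 2-31: nanoseconds    */
    };

    static struct disk_ts encode(struct timespec ts)
    {
        struct disk_ts d;
        /* Epoch index 0..3 selects which 2^32-second window tv_sec falls in,
         * with window 0 being the signed 32-bit range around 1970. */
        uint32_t epoch = (uint32_t)((ts.tv_sec + 0x80000000LL) >> 32) & EPOCH_MASK;
        d.seconds = (uint32_t)ts.tv_sec;
        d.extra   = epoch | ((uint32_t)ts.tv_nsec << EPOCH_BITS);
        return d;
    }

    static struct timespec decode(struct disk_ts d)
    {
        struct timespec ts;
        ts.tv_sec  = (int32_t)d.seconds;                    /* sign-extend     */
        ts.tv_sec += (int64_t)(d.extra & EPOCH_MASK) << 32; /* epoch extension */
        ts.tv_nsec = (long)(d.extra >> EPOCH_BITS);
        return ts;
    }

    int main(void)
    {
        /* Round-trips bit-identically for tv_sec from -2^31 (year 1901) up to
         * about 15 billion seconds past the epoch, i.e. the 25th century.
         * Assumes 64-bit time_t. */
        struct timespec in = { .tv_sec = 15000000000LL, .tv_nsec = 999999999 };
        struct timespec out = decode(encode(in));
        printf("%lld.%09ld -> %lld.%09ld\n",
               (long long)in.tv_sec, in.tv_nsec,
               (long long)out.tv_sec, out.tv_nsec);
        return 0;
    }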



  • smitty3268
    replied
    Originally posted by billyswong View Post


    The problem is less about overkill and more about placebo precision that isn't actually accurate. For a computer today, timestamps that are accurate to the microsecond are already extremely hard to obtain (https://superuser.com/questions/1327...-it-compensate). Even for relative accuracy, if two threads read the computer's internal clock "simultaneously" and their readings differ by less than a microsecond (assuming there is a very high resolution clock on board), there are still so many possibilities, such as out-of-order execution and I/O caching, that I can imagine the events those two readings are trying to record having occurred in the reverse order of their values.

    Therefore, any nanosecond timestamp recorded on mainstream computers/servers should be considered a lie.
    It's not a lie, because the filesystem never claims that it's accurate to that degree, the same way a filesystem that only stores to 1 second precision isn't "lying" if a save actually occurred 500ms after the second started. It's merely a storage location that can potentially be used if future hardware ever gets to the point that it needs it.

    That's unlikely to occur in the near future. But given that the range goes out past 2400, maybe in a couple hundred years hardware will be more able to take advantage of it.



  • billyswong
    replied
    Originally posted by smitty3268 View Post

    This is the correct answer.

    Yes, nanoseconds is overkill, but you've got 64 bits so you might as well do something with them. You can either push out the range to an enormous size, or increase precision.

    Nanoseconds + 500 years is a good compromise in both directions, because a filesystem should never need to exceed either of those limits within the currently foreseeable future.

    The problem is less about overkill and more about placebo precision that isn't actually accurate. For a computer today, timestamps that are accurate to the microsecond are already extremely hard to obtain (https://superuser.com/questions/1327...-it-compensate). Even for relative accuracy, if two threads read the computer's internal clock "simultaneously" and their readings differ by less than a microsecond (assuming there is a very high resolution clock on board), there are still so many possibilities, such as out-of-order execution and I/O caching, that I can imagine the events those two readings are trying to record having occurred in the reverse order of their values.

    Therefore, any nanosecond timestamp recorded on mainstream computers/servers should be considered a lie.
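
    For anyone who wants to see how coarse real readings are, a rough sketch: take back-to-back clock_gettime() samples and look at the gaps. On typical hardware even consecutive reads with nothing in between are already tens of nanoseconds apart.

    #include <stdio.h>
    #include <time.h>

    int main(void)
    {
        struct timespec prev, cur;
        clock_gettime(CLOCK_MONOTONIC, &prev);

        /* Print the gap between consecutive clock reads; the deltas show
         * roughly how long it takes just to read the clock at all. */
        for (int i = 0; i < 10; i++) {
            clock_gettime(CLOCK_MONOTONIC, &cur);
            long delta_ns = (cur.tv_sec - prev.tv_sec) * 1000000000L
                          + (cur.tv_nsec - prev.tv_nsec);
            printf("delta %d: %ld ns\n", i, delta_ns);
            prev = cur;
        }
        return 0;
    }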



  • Weasel
    replied
    Originally posted by smitty3268 View Post
    because a filesystem should never need to exceed either of those limits within the currently foreseeable future.
    That's what everyone said in 1980 with 32-bit timestamps.



  • smitty3268
    replied
    Originally posted by jabl View Post
    Nanoseconds might be overkill per se, but OTOH second resolution isn't enough, and the next natural size up from 32 bits is 64, so you might as well use nanoseconds. Particularly as a lot of other timing stuff is using nanoseconds anyway, so less chance of messing up some conversion.
    This is the correct answer.

    Yes, nanoseconds is overkill, but you've got 64 bits so you might as well do something with them. You can either push out the range to an enormous size, or increase precision.

    Nanoseconds + 500 years is a good compromise in both directions, because a filesystem should never need to exceed either of those limits within the currently foreseeable future.
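
    Rough back-of-the-envelope numbers for the two ways of spending those bits (approximate 365.25-day years, not an exact calendar calculation; the "34+30" split is the encoding tytso describes elsewhere in the thread):

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        const double SECS_PER_YEAR = 365.25 * 24 * 3600;

        /* Option 1: a plain signed 64-bit nanosecond counter since 1970. */
        double plain_ns_years = (double)INT64_MAX / 1e9 / SECS_PER_YEAR;

        /* Option 2: ~34 bits of seconds starting at -2^31 (year 1901),
         * plus 30 bits of nanoseconds. */
        double split_years = ((double)((int64_t)1 << 34) - 2147483648.0)
                             / SECS_PER_YEAR;

        printf("64-bit ns counter: ~%.0f years past 1970 (~year %.0f)\n",
               plain_ns_years, 1970 + plain_ns_years);
        printf("34+30 bit split:   ~%.0f years past 1970 (~year %.0f)\n",
               split_years, 1970 + split_years);
        return 0;
    }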



  • FireBurn
    replied
    I'd better set an Outlook reminder



  • Weasel
    replied
    Originally posted by Zan Lynx View Post
    Do you even know how timers work?

    You take the last timer read and apply the current TSC to it.

    CPUs are running 5 cycles per nanosecond. So YES the timers can be that accurate.
    CPUs are also out of order, so your timestamp is way off compared to what you expect out of it. In fact, this could easily be a security hole (à la Spectre) if it truly were nanosecond-accurate, which it likely isn't.

    Anything below 100ns is likely to just be statistical measurement noise. Wanna do an experiment and randomize those bits and see if anything breaks? It's literally random and nobody gives a shit. In fact there wouldn't be a difference anyway between randomizing them and an actual measurement.
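
    A rough sketch for anyone who wants to eyeball this: back-to-back reads of the raw TSC (x86-only, using the GCC/Clang __rdtsc() intrinsic) are typically tens of cycles apart with nothing at all in between, before you even account for out-of-order execution reordering things around the reads.

    #include <stdio.h>
    #include <x86intrin.h>

    int main(void)
    {
        unsigned long long samples[16];

        /* Sample the TSC back to back; nothing happens between reads. */
        for (int i = 0; i < 16; i++)
            samples[i] = __rdtsc();

        /* The deltas show the measurement-noise floor of this clock. */
        for (int i = 1; i < 16; i++)
            printf("delta %2d: %llu cycles\n", i, samples[i] - samples[i - 1]);

        return 0;
    }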



  • Weasel
    replied
    Originally posted by Zan Lynx View Post
    I guess I won't bother trying to convince you. But you're wrong.

    Personally, I'd go for 128 bit Planck time. With that our timestamps would be at the limit of the resolution of the universe itself.
    You can already do that easily dude.

    Just store a 1024-bit timestamp since the start of the Universe, to be a cool kid.

    For any bits below 100ns precision just randomize them, not like it makes a fucking difference, since it's literally just measurement noise.

    But that makes you a cool kid, right? "oh look I can store picosecond precision I'm cool af"

