XFS Reverse-Mapping Proposed For Linux 4.8: Getting Ready For New File-System Features


    Phoronix: XFS Reverse-Mapping Proposed For Linux 4.8: Getting Ready For New File-System Features

    Last week saw the main XFS feature pull for Linux 4.8, and now, one day before the 4.8 merge window is expected to close, XFS maintainer Dave Chinner is hoping to land a big new feature...

    http://www.phoronix.com/scan.php?pag...se-Mapping-4.8

  • #2
    Missiles locked at Btrfs.. shoot!



    • #3
      Originally posted by tessio View Post
      Missiles locked at Btrfs.. shoot!
      Lol, my thoughts exactly. It's as if they looked at Btrfs and asked, "why would people use this over XFS?", and then set out to address every answer.



      • #4
        Oh, finally someone else steps up to the fray. Good, very fucking good.



        • #5
          Drives are block layer devices, right? Can somebody please explain to me what benefit CoW techniques provide? In every scenario I can imagine, it seems like it would perform much worse, because it would write-thrash the shit out of your drives and make free space fragmentation unmanageable. SSDs effectively solve the write-thrashing problem because they're solid state and there is no head seek latency. But the free space fragmentation problem exists regardless of any physical property a drive might have.



          • #6
            Originally posted by duby229 View Post
            Drives are block layer devices right?
            Yes. They have always been.
            Can somebody please explain to me what benefit does CoW techniques imbue?
            What about using Google? It's pretty old hat by now. https://en.wikipedia.org/wiki/Copy-on-write
            "Copy-on-write (COW), sometimes referred to as implicit sharing[1] or shadowing,[2] is a resource management technique used in computer programming to efficiently implement a "duplicate" or "copy" operation on modifiable resources.[3] If a resource is duplicated but not modified, it is not necessary to create a new resource; the resource can be shared between the copy and the original. Modifications must still create a copy, hence the technique: the copy operation is deferred to the first write. By sharing resources in this way, it is possible to significantly reduce the resource consumption of unmodified copies, while adding a small overhead to resource-modifying operations."

            It is also the base for neat filesystem online snapshots.
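            A quick way to see file-level CoW in action: on btrfs (or XFS formatted with reflink support, which is what this reverse-mapping work is paving the way for), cp can clone a file so that both names share the same data blocks until one side is modified. This is only a sketch and the filenames are made up; --reflink=auto falls back to a plain copy on filesystems without reflinks, so it is safe to run anywhere with GNU coreutils:

            ```shell
            # Create a file, then "copy" it with copy-on-write semantics.
            # On a reflink-capable filesystem the clone shares the original's
            # blocks; nothing is duplicated until one side is written to.
            printf 'original data\n' > original.txt
            cp --reflink=auto original.txt clone.txt

            # Writing to the clone triggers the actual copy; the original
            # file is untouched.
            printf 'modified data\n' > clone.txt
            cat original.txt    # still prints: original data
            ```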

            In every scenario I can imagine it seems like it would perform much worse due to it write thrashing the shit out of your drives and free space fragmentation being unmanageable because of that.
            As the above quote demonstrates, you cannot imagine real-life filesystem usage.

            But the free space fragmentation problem exists regardless of any physical property a drive might have.
            The fragmentation issue can be dealt with by using smart space allocation algorithms, like the ones in ext4, but in most cases you will also need online defragmentation to deal with that stuff. (Online = runs while the filesystem is in use; it's not a separate program run on a schedule.)



            • #7
              Originally posted by duby229 View Post
              Drives are block layer devices, right? Can somebody please explain to me what benefit CoW techniques provide? In every scenario I can imagine, it seems like it would perform much worse, because it would write-thrash the shit out of your drives and make free space fragmentation unmanageable. SSDs effectively solve the write-thrashing problem because they're solid state and there is no head seek latency. But the free space fragmentation problem exists regardless of any physical property a drive might have.
              The biggest benefit to CoW (to me), other than letting you do things like snapshotting, is the idea that you should -never- be without data. If you have Data A, and it's being updated to become Data B, and you lose power in the middle of that update, then what happens depends on your filesystem. In a CoW filesystem, the update was never marked as 'complete', so "Data" still points to Data A, which was never touched. In a non-CoW filesystem things get dicier. Depending on the design of the filesystem and when exactly you lost power, maybe you still have Data A, maybe you have Data B, or maybe you have no data at all because it's corrupted.

              Thrashing, as you said, is pretty much a moot point. Fragmentation does happen more, but that is partially offset by SSDs, and btrfs lets you do online defragging: the filesystem waits for the system to be idle and then starts defragging the FS.
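              The never-lose-Data-A property described above can be mimicked in userspace with the classic write-new-then-rename pattern: the new version is written out of place, and a single atomic rename flips the name over, so a crash at any moment leaves either A or B on disk, never a half-written mix. A minimal sketch (filenames are made up for illustration):

              ```shell
              # Out-of-place update: write Data B to a temp file, flush it,
              # then atomically rename it over the old file. rename() is
              # atomic on POSIX filesystems, so readers see either the old
              # or the new content, never a partial write -- the same
              # guarantee a CoW filesystem gives per update.
              printf 'Data A\n' > state.txt          # existing data

              printf 'Data B\n' > state.txt.tmp      # new version, written elsewhere
              sync                                   # flush before the switch
              mv state.txt.tmp state.txt             # atomic name flip
              cat state.txt                          # prints: Data B
              ```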



              • #8
                Originally posted by Ericg View Post

                The biggest benefit to CoW (to me), other than letting you do things like snapshotting, is the idea that you should -never- be without data. If you have Data A, and it's being updated to become Data B, and you lose power in the middle of that update, then what happens depends on your filesystem. In a CoW filesystem, the update was never marked as 'complete', so "Data" still points to Data A, which was never touched. In a non-CoW filesystem things get dicier. Depending on the design of the filesystem and when exactly you lost power, maybe you still have Data A, maybe you have Data B, or maybe you have no data at all because it's corrupted.

                Thrashing, as you said, is pretty much a moot point. Fragmentation does happen more, but that is partially offset by SSDs, and btrfs lets you do online defragging: the filesystem waits for the system to be idle and then starts defragging the FS.
                So I'm trying to be as realistic here as possible, and honestly, in my experience data integrity in power failure situations highly depends on the drive. Drives are designed specifically to handle those circumstances. So the way I see it, the only benefit you listed is snapshotting. It might be OK as a stage in your backup policy, but it is in no way adequate by itself. And then there has been dd for a long time. I personally do complete backups once a week and incremental backups every day.

                I'm personally convinced my backup strategy is a lot safer and probably faster too. I think a filesystem's job is to store files accurately, and that doesn't mean redundantly.



                • #9
                  Originally posted by duby229 View Post
                  Drives are block layer devices right?
                  Anything that abstracts the physical layer somehow (i.e. there is a digital controller that deals with how to write stuff to the hardware) is a block device. Hardware RAID cards likewise take X hard drives (themselves block devices), combine them into a single block device, and present it to the operating system.

                  Can somebody please explain to me what benefit does CoW techniques imbue? In every scenario I can imagine it seems like it would perform much worse due to it write thrashing the shit out of your drives
                  Well, the performance hit isn't as bad if you design the filesystem around that feature.
                  You can try it yourself: CoW is similar to running ext4 with the data=journal mount option (so that ext4 writes data to the journal first, then commits it to the filesystem). ext4 performance with that option gets nuked, while btrfs manages to stay relatively close to ext4 running with only metadata journalling (the default option, the feature set ext4 was designed for).
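                  For anyone who wants to reproduce that comparison, full data journalling is switched on via the mount options. The device and mount point below are placeholders; this is a config fragment, not a recommendation:

                  ```
                  # /etc/fstab entry forcing ext4 into full data journalling
                  # (/dev/sdb1 and /mnt/test are placeholders):
                  /dev/sdb1  /mnt/test  ext4  data=journal  0  2

                  # Or for a one-off mount, as root (the data= mode cannot be
                  # changed on an already-mounted filesystem via remount):
                  #   mount -o data=journal /dev/sdb1 /mnt/test
                  ```

                  Then run the same write benchmark against btrfs on the same hardware.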



                  • #10
                    Originally posted by duby229 View Post

                    So I'm trying to be as realistic here as possible, and honestly, in my experience data integrity in power failure situations highly depends on the drive. Drives are designed specifically to handle those circumstances. So the way I see it, the only benefit you listed is snapshotting. It might be OK as a stage in your backup policy, but it is in no way adequate by itself. And then there has been dd for a long time. I personally do complete backups once a week and incremental backups every day.

                    I'm personally convinced my backup strategy is a lot safer and probably faster too. I think a filesystem's job is to store files accurately, and that doesn't mean redundantly.
                    Can dd make backups while the filesystem is still online and in use?

                    Because CoW filesystems can do so. Snapshot, then start transferring the data away while other processes still work on it.

                    Taking a server offline to do backups isn't cool.
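                    The snapshot-then-transfer workflow sketched here uses the btrfs tools; the paths are placeholders, and it needs root on a btrfs volume, so treat it as illustrative only:

                    ```
                    # 1. Take a read-only snapshot -- effectively instantaneous
                    #    thanks to CoW; the server keeps running:
                    btrfs subvolume snapshot -r /srv/data /srv/data/.backup-snap

                    # 2. Stream the frozen snapshot to a backup disk while other
                    #    processes keep writing to /srv/data:
                    btrfs send /srv/data/.backup-snap | btrfs receive /mnt/backup

                    # 3. Drop the snapshot once the transfer finishes:
                    btrfs subvolume delete /srv/data/.backup-snap
                    ```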
