XFS With Linux 6.13 Sees Major Rework To Real-Time Volumes

  • XFS With Linux 6.13 Sees Major Rework To Real-Time Volumes

    Phoronix: XFS With Linux 6.13 Sees Major Rework To Real-Time Volumes

    The XFS file-system updates were merged yesterday for the ongoing Linux 6.13 merge window...


  • #2
    What is this?
    Is this for when the Linux kernel is compiled to run as real-time?

    Comment


    • #3
      Originally posted by uid313 View Post
      What is this?
      Is this for when the Linux kernel is compiled to run as real-time?
      This has nothing to do with the real-time (PREEMPT_RT) kernel, but it could be beneficial in conjunction with it: https://blogs.oracle.com/linux/post/xfs-realtime-device

      Comment


      • #4
        From the XFS filesystem structure documentation:
        The performance of the standard XFS allocator varies depending on the internal state of the various metadata indices enabled on the filesystem. For applications which need to minimize the jitter of allocation latency, XFS supports the notion of a “real-time device”. This is a special device separate from the regular filesystem where extent allocations are tracked with a bitmap and free space is indexed with a two-dimensional array. If an inode is flagged with XFS_DIFLAG_REALTIME, its data will live on the real time device. The metadata for real time devices is discussed in the section about real time inodes.
        By placing the real time device (and the journal) on separate high-performance storage devices, it is possible to reduce most of the unpredictability in I/O response times that come from metadata operations.
        So put your most frequently used or time-sensitive data on the real-time device and you should experience "smoother" overall storage performance. I think in a world where SSDs just keep getting cheaper and faster, this is a highly specialized filesystem feature. At the very least, you will need to profile data access patterns to identify what data is constraining performance and what your desired threshold is, then use that to figure out what data needs to go onto a real-time device and what kind of device that needs to be (3D XPoint would have been a candidate, but that is gone now).
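
        For what it's worth, a quick way to check whether a mounted XFS filesystem actually has a realtime section is xfs_info (a minimal sketch; /mnt is a placeholder mount point):

        # Print the filesystem geometry; a populated "realtime ="
        # line means a realtime device is attached
        xfs_info /mnt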

        Comment


        • #5
          This is a machine translation of a post of mine from another forum.

          Dear pon4ik recently raised the topic here of how to make an FS that "... would store metadata on one disk, and the data itself on the second. So that while all sorts of listdir and fstat calls are happening, the hard drive does not spin up and can sleep soundly."

          I remembered hearing something similar about XFS, quickly googled a couple of links about the realtime partition, and dropped them in the comments. But since I'm an XFS guy, I went to see how it is actually implemented. Here is my report. The realtime partition in XFS is an additional partition to which only file data is written (not inodes or the log: inodes go to the main partition, and the log goes either to the main partition or to a separate one, if specified). Accordingly, you can direct large files with sequential access to one partition, and all IOPS-intensive operations to a second partition on an SSD, or even in RAM (if the data only needs to survive until a reboot, that happens too).

          How to implement:

          mkfs.xfs -r rtdev=/dev/sdb /dev/sdc
          or
          mkfs.xfs -l logdev=/dev/sdd -r rtdev=/dev/sdb /dev/sdc
          Where:
          /dev/sdb is the realtime partition (file data only, and only for files that are flagged for it; more on that below)
          /dev/sdc is the main partition (file data, inodes, log)
          /dev/sdd is a separate partition for the FS log

          The log has a size limit, so it is easy enough to move it to its own device, given that we are already moving file data off the "main" partition.

          Next, we mount:

          mount -o rtdev=/dev/sdb /dev/sdc /mnt
          or
          mount -o logdev=/dev/sdd,rtdev=/dev/sdb /dev/sdc /mnt
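
          To make this survive reboots, the same options can go in /etc/fstab (a sketch reusing the device names above; stable /dev/disk/by-id paths would be safer than sdX names in practice):

          # device   mountpoint  type  options                         dump pass
          /dev/sdc   /mnt        xfs   rtdev=/dev/sdb,logdev=/dev/sdd  0    0
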
          How do you force the system to write files to the realtime partition? There are three options (a combined example follows the list):

          1. The mkfs.xfs option
          mkfs.xfs -d rtinherit=1
          an undocumented option that makes every file created on the new FS go to the realtime partition.
          2. The command
          xfs_io -c "chattr +t" /mnt/
          sets the "realtime inheritance" attribute on a directory. All files created in that directory afterwards are written to the rt partition. The attribute can be set on the directory where the FS is mounted (and it even survives remounting).
          3. The command
          xfs_io -c "chattr +r" /mnt/file_name
          sets the "realtime" attribute on a file. The file must still be empty when the flag is set (touch /mnt/file_name works).
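
          Putting it together, a full session might look like this (a sketch reusing the device names above; /mnt/media and /mnt/bigfile are made-up names, and the final lsattr just verifies that the flag took):

          mkfs.xfs -r rtdev=/dev/sdb /dev/sdc
          mount -o rtdev=/dev/sdb /dev/sdc /mnt

          # Option 2: everything created under /mnt/media inherits the rt flag
          mkdir /mnt/media
          xfs_io -c "chattr +t" /mnt/media

          # Option 3: flag a single, still-empty file explicitly
          touch /mnt/bigfile
          xfs_io -c "chattr +r" /mnt/bigfile

          # Verify: an 'r' among the printed flags means the file's
          # data will be allocated on the realtime device
          xfs_io -c "lsattr" /mnt/bigfile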

          How stable is the solution? After the patches for realtime partitions were discussed a while back (more details here: https://patchwork.kernel.org/patch/9933237/ ), active testing of this functionality in XFS picked up, several bugs were fixed, and test coverage for filesystems with a realtime partition was added.

          Comment


          • #6
            So this affects only the writing of metadata?

            Comment


            • #7
              Originally posted by Gonk View Post
              From the XFS filesystem structure documentation:


              So put your most frequently used or time-sensitive data on the real-time device and you should experience "smoother" overall storage performance.
              Just for clarification: real-time means roughly the same as predictable latency, so this seems to minimize jitter (variation in latency) during allocation.

              If anybody knows where this is used in practice, please speak up. I'm interested!

              Comment


              • #8
                Originally posted by Gonk View Post
                From the XFS filesystem structure documentation:


                So put your most frequently used or time-sensitive data on the real-time device and you should experience "smoother" overall storage performance. I think in a world where SSDs just keep getting cheaper and faster, this is a highly specialized filesystem feature. At the very least, you will need to profile data access patterns to identify what data is constraining performance and what your desired threshold is, then use that to figure out what data needs to go onto a real-time device and what kind of device that needs to be (3D XPoint would have been a candidate, but that is gone now).
                The feature dates back to the days when SGI was still around as a big-iron concern, when hard drives were still the primary source for data-on-demand (spinning rust has different access times and throughput across different physical areas of the platter), when SSDs were in their infancy with price tags of $10k+ US for a mere few hundred megabytes (in the 90s), and when Unix systems other than Macs (IRIX on Onyx systems in this case) were being used to computationally build CGI scenes for movies. In some cases it was necessary to have data that could be deterministically depended on to be available within certain defined slices of time.

                Real-time systems (whether it's the scheduler, the file system, whatever) are meant for applications with tight timing constraints and deterministic timing behavior. Data movement and decisions that must be completed within defined time constraints are "real time". Please note it's not about quickly dispatching actions: if there isn't a defined time constraint, it's not real time. Low latency isn't deterministic; it's just quick dispatch with no real boundary on when (jitter is allowed within unspecified limits, so dispatch can miss decision gates). In real-time systems, jitter is only allowed within defined limits, and dispatch may or may not be "quick", but it is not allowed to miss a predefined time boundary by which a decision depending on that signal must be made.

                (Edit to add) Someone ( rawr ) convinced me to strike the physics example... too contrived on consideration.

                One realm is event reconstruction: forensic re-examination of safety events involving computer monitoring of physical systems. Consider this: normally, as a forensic data retrieval person, I'm really only tasked with collecting metadata suggesting certain events happened or were altered/deleted, what have you. This doesn't really require consistent metadata, only that it exists.

                However, once you get into the problem of reconstructing a physical event where timestamp consistency matters, as in safety-critical arenas like pipeline monitoring or accident reconstruction, being able to tell whether a computer was unable to handle the events within specification is part of the process (as in, on the verge of failing but not yet failed). Part of that process is trying to match up the jitter of log writes with the metadata the system creates about those events. For example: the file was created on date/time by <user>, it was last accessed at date/time, it has been modified at dates/times in the past, it was last modified at date/time by <user>, and whether or not those events occurred within a specified time of each other. If there's a pattern of log writes where metadata writes fall outside of a defined window, you might have problems with the computer, something may have interfered with those events, or the logs themselves could have been altered. This isn't really a contrived example (look up how the XZ backdoor was discovered for a case where timing discrepancies mattered). This is how forensic reconstruction at least should proceed, even if the real world is a lot messier than this. But I can at least answer a question in court about whether I believe the computer monitoring system itself may have been part of the problem, based in part on the jitter of metadata and logged events: not just what the system can theoretically manage, but what it was doing at the time of the accident.
                Last edited by stormcrow; 22 November 2024, 08:03 PM.

                Comment


                • #9
                  Originally posted by unwind-protect View Post
                  So this affects only the writing of metadata?
                  Depends on what you mean by "this". If you mean the pull request in the article, I have no idea what it is all touching. If you mean XFS real time devices as a thing, no: XFS RT devices hold data and metadata.

                  Comment


                  • #10
                    XFS is used in banking and other high-end applications where latency matters and needs to be predictable.

                    That said, honestly, XFS really should be considered as a replacement for ext4 as a general-purpose filesystem now. Its only missing features are shrinking and fscrypt; unless you need those, it's better than ext4 in every way for all practical purposes.

                    Comment
