Intel Has Been Working On A New User-Space File-System For Persistent Memory


  • Intel Has Been Working On A New User-Space File-System For Persistent Memory

    Phoronix: Intel Has Been Working On A New User-Space File-System For Persistent Memory

    Intel developers have been working on a new user-space file-system designed for persistent memory. This user-space file-system is designed to be high-performance and does not make use of FUSE...

    http://www.phoronix.com/scan.php?pag...Intel-PMEMFILE

  • #2
    So if I understand this correctly, it basically creates a RAM disk with persistent storage?



    • #3
      This makes sense for limited situations like databases, in which Optane or a high-end flash device provides persistent storage like a filesystem but the workload's access patterns are probably closer to those of RAM than to those of a traditional filesystem like EXT4. Since it's a pretty specialized workload, keeping it out of the kernel may be a good thing. People who have worked with large Oracle databases will of course know that in many instances Oracle prefers to have raw device access without any underlying filesystem whatsoever. This appears to be an abstraction that can provide similar functionality to a wider range of applications.
      Last edited by chuckula; 10-26-2017, 09:35 AM.



      • #4
        Typo:

        Originally posted by phoronix View Post
        Emebedded Linux Conference Europe



        • #5
          Could be nice for training deep NNs with millions of images using toy Python data layers that are constantly reading images from disk! I swear, sometimes these data layers choke the system and sshd takes forever to service a new ssh connection (and then bash takes forever to start up once you do log in!). Just shove all the images onto one of these pmemfile file systems and worry no more! Even worse, the GPU systems we have can service up to 4 independent GPU-based training runs, and all the post docs like to write their own data layers that do just as I described. Imagine 4 of these constantly reading lots and lots of images from a set of millions of images at a time. It's so inefficient that nvidia-smi often reports the respective GPUs as underutilized. Personally, I just package everything up into an LMDB and let it do all the magic with mmap(). But the problem with this solution is that computing and storing all the training data in an LMDB can result in a prohibitively large database (e.g. half a terabyte for a "small" data set). With a custom data layer, you can at least compute some of this on the fly.

          For some of my older machine learning work, I just loaded the images once and cached them in RAM (sped up training by a factor of 10-100 depending on data and task!). More code is involved to do this than just shoving all the data onto a RAM-based file system though.
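
          To give an idea of what "load once and cache in RAM" looks like, here is a minimal sketch. The class and helper names (InMemoryImageCache, make_fake_dataset, run_epochs) are made up for illustration, and the "images" are just files of random bytes; a real data layer would also decode and preprocess them.

```python
import os
import tempfile
from pathlib import Path


class InMemoryImageCache:
    """Read each file from disk once, then serve it from a dict in RAM."""

    def __init__(self):
        self._cache = {}      # path string -> raw file bytes
        self.disk_reads = 0   # number of times the disk was actually read

    def load(self, path):
        key = str(path)
        if key not in self._cache:
            # First access: go to disk and remember the bytes.
            self._cache[key] = Path(path).read_bytes()
            self.disk_reads += 1
        return self._cache[key]


def make_fake_dataset(directory, count=4, size=1024):
    """Write `count` files of random bytes to stand in for images."""
    paths = []
    for i in range(count):
        p = Path(directory) / f"img_{i}.bin"
        p.write_bytes(os.urandom(size))
        paths.append(p)
    return paths


def run_epochs(cache, paths, epochs=3):
    """Touch every file once per epoch, as a training loop would."""
    total_bytes = 0
    for _ in range(epochs):
        for p in paths:
            total_bytes += len(cache.load(p))
    return total_bytes


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        paths = make_fake_dataset(d)
        cache = InMemoryImageCache()
        run_epochs(cache, paths)
        print("disk reads:", cache.disk_reads)
```

          However many epochs you run, each file is read from disk only once. The catch is that the whole data set has to fit in RAM.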



          • #6
            Originally posted by schmidtbag View Post
            So if I understand this correctly, it basically creates a RAM disk with persistent storage?
            This thing is supposed to work with very high-speed non-volatile memory. RAM is volatile and can't be used for persistent storage without stupid contraptions with batteries and microcontrollers refreshing its contents.

            But it is a "dumb" way of using such new technology, as it still basically uses the thing as storage only, not as RAM/storage at the same time.
            Last edited by starshipeleven; 10-26-2017, 03:37 PM.



            • #7
              https://www.snia.org/sites/default/f..._NVDIMMsv2.pdf
              https://github.com/NVSL/NOVA

              This is the horrible thing. They say there is no data for NOVA. The scary part is that the performance boost documented in the NOVA presentation might mean that a kernel driver can keep up and provide all the features that this user-space solution cannot. Yes, NOVA is a kernel-space module.

              So I am not exactly sure how much of an advantage this Intel attempt really is.

              Userspace filesystem? The problem is right there. Always has been. People who think that userspace filesystems are realistic for anything but toys are just misguided.
              Linus's exact words. This refers not only to FUSE but to stuff like this pmemfile as well. Linus Torvalds might still be right that all user-space file systems are toys, and the presentation did not include the right benchmarks to prove that pmemfile isn't one.

              We have not seen a proper head-to-head between NOVA and pmemfile to show whether pmemfile really provides enough benefit for all the safety stuff you are giving up, or whether pmemfile provides any real benefit at all other than not needing a kernel module. Yes, NOVA is quite a bit faster than ext4 on solid-state storage in all its benchmarks, and none of the numbers in the pmemfile presentation show it as definitely faster.



              • #8
                Originally posted by oiaohm View Post
                https://www.snia.org/sites/default/f..._NVDIMMsv2.pdf
                https://github.com/NVSL/NOVA

                This is the horrible thing. They say there is no data for NOVA. The scary part is that the performance boost documented in the NOVA presentation might mean that a kernel driver can keep up and provide all the features that this user-space solution cannot. Yes, NOVA is a kernel-space module.
                Looks pretty badass too. It's a CoW thing with checksumming and single-drive RAID5.

                Yeah it's still very immature, but I like where they are going.



                • #9
                  Originally posted by oiaohm View Post
                  We have not seen a proper head-to-head between NOVA and pmemfile to show whether pmemfile really provides enough benefit for all the safety stuff you are giving up, or whether pmemfile provides any real benefit at all other than not needing a kernel module. Yes, NOVA is quite a bit faster than ext4 on solid-state storage in all its benchmarks, and none of the numbers in the pmemfile presentation show it as definitely faster.
                  The promise of a pure-userspace approach is no context switches or other syscall overhead. That becomes very significant as the IOPS rate increases.

                  That said, I'm completely unfamiliar with the specifics of these implementations, so I can only opine in broad platitudes and generalities.
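
                  As a rough illustration of the general point (not of how pmemfile or NOVA actually work), the sketch below reads the same records two ways: with one pread() system call per record, and by mapping the file once with mmap() and then slicing records out of memory without further read syscalls. The function names and the 64-byte record size are arbitrary, and it assumes a Unix-like system where os.pread is available.

```python
import mmap
import os
import tempfile

RECORD = 64  # bytes per record; an arbitrary size for the demo


def read_with_syscalls(fd, n_records):
    """Issue one pread() system call per record."""
    return [os.pread(fd, RECORD, i * RECORD) for i in range(n_records)]


def read_with_mmap(fd, n_records):
    """Map the file once, then slice records straight out of memory."""
    with mmap.mmap(fd, 0, access=mmap.ACCESS_READ) as m:
        return [m[i * RECORD:(i + 1) * RECORD] for i in range(n_records)]


def demo(n_records=1000):
    """Return True if both access paths yield identical records."""
    with tempfile.NamedTemporaryFile() as f:
        f.write(os.urandom(n_records * RECORD))
        f.flush()
        via_syscalls = read_with_syscalls(f.fileno(), n_records)
        via_mmap = read_with_mmap(f.fileno(), n_records)
    return via_syscalls == via_mmap


if __name__ == "__main__":
    print("same data from both paths:", demo())
```

                  Both paths return the same bytes; the difference is that the first crosses into the kernel n_records times, while the second pays for one mapping and then only for page faults.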



                  • #10
                    Originally posted by oiaohm View Post
                    Linus's exact words. This refers not only to FUSE but to stuff like this pmemfile as well. Linus Torvalds might still be right that all user-space file systems are toys, and the presentation did not include the right benchmarks to prove that pmemfile isn't one.
                    The thing to keep in mind is that he said this over 6 years ago:

                    https://www.phoronix.com/scan.php?pa...item&px=OTYwMA

                    Back then, most SSDs weren't even SATA 3. Now, we have 3D XPoint memory that's allegedly only about an order of magnitude slower than DRAM. At those speeds, syscall overhead will eat you alive. So, even if he was right at the time, and even if there aren't yet any benchmarks to show that Intel's efforts can't be matched with a kernel-based alternative, that doesn't mean these conditions will always hold. IMO, Linus will probably end up eating those words sooner, rather than later.
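
                    To put the "eat you alive" point in arithmetic terms, here is a tiny sketch. The latency figures in it are made-up placeholders, not measurements of any real device or kernel; the only thing it shows is how a fixed per-call overhead grows as a share of the total once the device itself gets fast.

```python
def syscall_share(device_latency_us, syscall_overhead_us):
    """Fraction of each I/O's total time spent on fixed syscall overhead."""
    return syscall_overhead_us / (device_latency_us + syscall_overhead_us)


if __name__ == "__main__":
    overhead_us = 1.0  # hypothetical fixed cost per system call
    # Hypothetical device latencies, from slow storage to near-memory speeds.
    for device_us in (10000.0, 100.0, 1.0):
        share = syscall_share(device_us, overhead_us)
        print(f"device {device_us:>8.1f} us -> overhead share {share:.2%}")
```

                    With the same fixed overhead, the share goes from negligible on a slow device to half the total when the device latency equals the overhead.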

                    As for filesystems being a crude way to use NV DIMMs, I think that's taking a rather archaic view of filesystems. You need some way to manage nonvolatile memory, even if it's being used via something more like a malloc()-style interface. Building a foundation capable of functioning as a filesystem not only preserves the option to use it as a filesystem, but also gives you a means of observing and managing NV memory utilization by programs accessing it through other interfaces. Let's say a program gets into a bad state (or has an NV memory leak!) and you want to reclaim all the NV memory it allocated: you could essentially just kill the process and unlink the subtree corresponding to what it allocated. Then, when it restarts, it does so from a clean slate.

                    Of course, this is just a simplistic example, but I think having a filesystem view of NV memory would be a powerful interface and one which enables legacy software to benefit from it, if only as a super-fast storage device.
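
                    A toy version of the reclaim idea, assuming each program's NV allocations show up as files under its own directory. The layout and the names allocate, usage and reclaim are hypothetical, an ordinary temp directory stands in for the NV memory filesystem, and the "kill the process" step is left out.

```python
import shutil
import tempfile
from pathlib import Path


def allocate(root, owner, name, size):
    """Pretend to allocate `size` bytes of NV memory for `owner` as a file."""
    owner_dir = Path(root) / owner
    owner_dir.mkdir(parents=True, exist_ok=True)
    (owner_dir / name).write_bytes(b"\0" * size)


def usage(root):
    """Bytes held per owner, found simply by walking the tree."""
    return {
        d.name: sum(f.stat().st_size for f in d.rglob("*") if f.is_file())
        for d in Path(root).iterdir()
        if d.is_dir()
    }


def reclaim(root, owner):
    """Drop everything an owner allocated by unlinking its subtree."""
    shutil.rmtree(Path(root) / owner, ignore_errors=True)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        allocate(root, "app-a", "buf0", 4096)
        allocate(root, "app-b", "buf0", 512)
        print("before:", usage(root))
        reclaim(root, "app-a")
        print("after: ", usage(root))
```

                    The observing part (usage) and the managing part (reclaim) both fall out of ordinary directory operations, which is the appeal of keeping a filesystem view.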

