READFILE System Call Rebased For More Efficient Reading Of Small Files

  • READFILE System Call Rebased For More Efficient Reading Of Small Files

    Phoronix: READFILE System Call Rebased For More Efficient Reading Of Small Files

    Over the past year, Greg Kroah-Hartman has led work on a "READFILE" system call for efficiently reading small files, such as data exposed via sysfs. While it is not yet mainlined, the patches for this new system call were rebased this week, giving us hope that perhaps we'll see it in Linux 5.13...

    https://www.phoronix.com/scan.php?pa...ebase-READFILE
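
For context, here is a minimal sketch of the three-call pattern that a readfile-style call is meant to collapse into one. Python's `os.open`, `os.read`, and `os.close` are thin wrappers over the corresponding system calls, so each line below costs roughly one kernel entry. The helper name `read_small_file`, the 4096-byte limit, and the temporary file standing in for a sysfs attribute are all assumptions for illustration; the proposed syscall's actual signature is not given in this thread.

```python
import os
import tempfile


def read_small_file(path: str, max_bytes: int = 4096) -> bytes:
    """Read up to max_bytes from path using three separate system calls."""
    fd = os.open(path, os.O_RDONLY)      # call 1: open the file
    try:
        return os.read(fd, max_bytes)    # call 2: a single read
    finally:
        os.close(fd)                     # call 3: close the descriptor


# A temporary file stands in for a small sysfs attribute (hypothetical data).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"42\n")
    demo_path = tmp.name

value = read_small_file(demo_path)
os.unlink(demo_path)
print(value)
```

For a file of a few bytes, the fixed cost of entering the kernel three times can dominate the cost of copying the data, which is the overhead the patches aim to cut.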

  • #2
    oh man, are they still beating this dead horse? :-/ https://www.youtube.com/watch?v=84Uyh3KwY-w

    • #3
      Originally posted by rene
      oh man, are they still beating this dead horse? :-/ https://www.youtube.com/watch?v=84Uyh3KwY-w
      For anyone (99%) not willing to watch 30 minutes of your talk - in short what are you proposing instead?

      • #4
        Originally posted by cl333r

        For anyone (99%) not willing to watch 30 minutes of your talk - in short what are you proposing instead?
        He mentions it 5 minutes in (and in the video title): vectorized system calls. Incidentally, I believe Solaris/illumos does something similar to what's being proposed by Greg, and the docs just call them "vectorized". But Rene's approach is far more flexible and comprehensive. There's at least one research paper showing they give a decent performance bump (~30%). No clue how realistic it is to implement. I'm guessing nobody is rushing to retool how syscalls work.

        • #5
          Originally posted by rene
          oh man, are they still beating this dead horse? :-/
          To be accurate, the vectored syscall idea is already covered by BPF in the Linux kernel. The problem with vectored syscalls is how to handle errors, particularly when you start stacking several different syscalls.

          It is easy to suggest vectored syscalls until you realize that a Linux kernel developer did implement a vectored syscall attempt back in 2001, and that work evolved into today's eBPF/BPF.

          Remember that with the vectored syscall idea you have an arbitrary mix of different syscalls stacked up in a single call. Each can produce different error results, and in combination they may not yield one clean, unambiguous error for the caller. That is why things evolved in the eBPF/BPF direction: bytecode uploaded to the kernel can contain logic to process the errors and produce something the application can understand.

          The readfile syscall needs no special validation because its functionality is limited and well defined.

          The vectored syscall idea is broad in functionality, and that breadth brings a broad set of problems.

          • #6
            Originally posted by oiaohm

            To be accurate, the vectored syscall idea is already covered by BPF in the Linux kernel. The problem with vectored syscalls is how to handle errors, particularly when you start stacking several different syscalls.

            It is easy to suggest vectored syscalls until you realize that a Linux kernel developer did implement a vectored syscall attempt back in 2001, and that work evolved into today's eBPF/BPF.

            Remember that with the vectored syscall idea you have an arbitrary mix of different syscalls stacked up in a single call. Each can produce different error results, and in combination they may not yield one clean, unambiguous error for the caller. That is why things evolved in the eBPF/BPF direction: bytecode uploaded to the kernel can contain logic to process the errors and produce something the application can understand.

            The readfile syscall needs no special validation because its functionality is limited and well defined.

            The vectored syscall idea is broad in functionality, and that breadth brings a broad set of problems.


            I mention that even Linus Torvalds himself had a stupid-simple vector syscall idea some 15 or so years ago. I also propose keep-it-stupid-simple ways to deal with error cases (e.g. abort on first error, or flag and mark the failed ones). Basically anything is more useful than this useless readfile(). I also would not call BPF a direct vectored syscall replacement; it is certainly an order of magnitude or two more complex than that ;-)

            • #7
              Originally posted by rene
              I mention that even Linus Torvalds himself had a stupid-simple vector syscall idea some 15 or so years ago. I also propose keep-it-stupid-simple ways to deal with error cases (e.g. abort on first error, or flag and mark the failed ones). Basically anything is more useful than this useless readfile(). I also would not call BPF a direct vectored syscall replacement; it is certainly an order of magnitude or two more complex than that ;-)
              You really must never have read the mailing list answers. Abort on first error was answered there with exactly this example: "Open file, read file, close file." If the read errors for some reason and you abort at the read, you fail to close. Marking the failed ones can end up performing operations when you should not.

              This was all covered on the Linux kernel mailing list. The stupid-simple vector syscall thread on the list proposed the two solutions you just suggested, and got answers explaining why neither approach to error handling works.

              The error handling problem is twofold: do not leave resources open when they should be closed, and do not perform operations you should not because some other syscall failed. Error handling is the universal problem with vectored syscalls, and no one has really proposed a better solution for it than BPF. There may be a simpler error handling solution for vectored syscalls that works correctly, but I have not seen a proposal with one. And yes, the video you linked does not cover the error handling problems.
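
The "open file, read file, close file" objection can be reproduced with a small user-space simulation. This is only a sketch under loud assumptions: `run_batch_abort_on_error` is a hypothetical stand-in for a vectored syscall (no such kernel interface is used here), the batch is just a list of Python callables, and the read is forced to fail by opening the file write-only.

```python
import errno
import os
import tempfile


def run_batch_abort_on_error(calls):
    """Run callables in order and stop at the first OSError.

    Returns one slot per submitted call: its return value, the negated
    errno of the call that failed, or None for calls that never ran.
    """
    results = [None] * len(calls)
    for i, call in enumerate(calls):
        try:
            results[i] = call()
        except OSError as exc:
            results[i] = -exc.errno
            break  # abort: every later call is skipped
    return results


# A scratch file opened write-only, so that the read step fails with EBADF.
tmp_fd, path = tempfile.mkstemp()
os.close(tmp_fd)
state = {}


def do_open():
    state["fd"] = os.open(path, os.O_WRONLY)
    return state["fd"]


def do_read():
    return os.read(state["fd"], 16)


def do_close():
    os.close(state["fd"])
    return 0


results = run_batch_abort_on_error([do_open, do_read, do_close])

# The close step never ran, so the descriptor is still valid here.
try:
    os.fstat(state["fd"])
    leaked = True
except OSError:
    leaked = False

print(results, "fd leaked:", leaked)

# Cleanup that the caller now has to remember to do itself.
if leaked:
    os.close(state["fd"])
os.unlink(path)
```

With a plain abort-on-first-error policy the close entry is skipped, so the descriptor stays open until the caller inspects the results and closes it.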

              • #8
                Does io_uring not have the same functionality as a vectored syscall for file I/O? In a sense, isn't that what it is?

                • #9
                  Originally posted by indepe
                  Does io_uring not have the same functionality as a vectored syscall for file I/O? In a sense, isn't that what it is?
                  Sorta. However, io_uring is way more complex to set up and, the last time I checked, it did not directly support this open, read, close sequence. The proposed solution for that was rather low-tech: they wanted a new open flag and argument to "open with fixed fd=667" or so, so that read and close could use a hardcoded fd like 667. What has been going on in Linux kernel land for some time now seems rather bizarre to me :-/

                  • #10
                    Originally posted by oiaohm

                    You really must never have read the mailing list answers. Abort on first error was answered there with exactly this example: "Open file, read file, close file." If the read errors for some reason and you abort at the read, you fail to close. Marking the failed ones can end up performing operations when you should not.

                    This was all covered on the Linux kernel mailing list. The stupid-simple vector syscall thread on the list proposed the two solutions you just suggested, and got answers explaining why neither approach to error handling works.

                    The error handling problem is twofold: do not leave resources open when they should be closed, and do not perform operations you should not because some other syscall failed. Error handling is the universal problem with vectored syscalls, and no one has really proposed a better solution for it than BPF. There may be a simpler error handling solution for vectored syscalls that works correctly, but I have not seen a proposal with one. And yes, the video you linked does not cover the error handling problems.
                    I can't follow, let alone read, all the random stuff discussed on each mailing list. However, all of my proposals would just work with your examples. Of course they would return which syscalls were run and their respective error codes. So depending on which proposal you look at, the user space app would either learn that read() failed and close() was not executed, and thus do the close itself if needed, or, for my other proposal, the app would mark open to abort on fail, and read and close not. In either case the user space app would get a vector of return values. As simple as that, and certainly more universal than the bloody readfile() that nobody really needs or asked for, and than the "open with fixed fd" proposal for io_uring. I wonder how that is supposed to scale and work flawlessly in a sophisticated application.

                    It is not rocket science, and all of it is certainly far more generally usable than readfile(). Sigh.
                    Last edited by rene; 03 April 2021, 06:02 PM.
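
One possible reading of the per-call flag idea (mark open as abort-on-fail, leave read and close unmarked, hand back a vector of return values) can be sketched in user space. Everything here is hypothetical: `run_batch`, `make_batch`, and the `(callable, abort_on_fail)` entry format are illustration names, not a real or proposed kernel interface, and the read failure is forced by opening the file write-only.

```python
import errno
import os
import tempfile


def run_batch(entries):
    """Run (callable, abort_on_fail) pairs in order.

    Returns a vector with one slot per entry: the return value, the
    negated errno on failure, or None if the entry was never executed.
    """
    results = [None] * len(entries)
    for i, (call, abort_on_fail) in enumerate(entries):
        try:
            results[i] = call()
        except OSError as exc:
            results[i] = -exc.errno
            if abort_on_fail:
                break
    return results


def make_batch(path, flags):
    """Build an open/read/close batch sharing one descriptor slot."""
    state = {}

    def do_open():
        state["fd"] = os.open(path, flags)
        return state["fd"]

    def do_read():
        return os.read(state["fd"], 16)

    def do_close():
        os.close(state["fd"])
        return 0

    # Only open aborts the batch; read and close always run after it.
    return state, [(do_open, True), (do_read, False), (do_close, False)]


workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "attr")
with open(path, "wb") as f:
    f.write(b"42\n")

# Case 1: read fails (write-only descriptor), close still runs.
state, batch = make_batch(path, os.O_WRONLY)
read_failed = run_batch(batch)
try:
    os.fstat(state["fd"])
    fd_closed = False
except OSError:
    fd_closed = True

# Case 2: open fails, so nothing after it is attempted.
_, batch = make_batch(os.path.join(workdir, "missing"), os.O_RDONLY)
open_failed = run_batch(batch)

print(read_failed, fd_closed, open_failed)

os.unlink(path)
os.rmdir(workdir)
```

In this sketch a failed read no longer leaves the descriptor open, and a failed open stops the batch before read or close are attempted. Whether continuing past a failed entry is always safe for arbitrary mixes of calls is the point the thread disputes.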
