Announcement

Collapse
No announcement yet.

Linux READFILE System Call Revived Now That It Might Have A User

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • #21
    There's nothing bad in the syscall itself and it's definitely an optimization in terms on number of calls required to load a multitude of small files or access small file quick and repeatedly - and indeed procfs/sysfs land there well.

    Comment


    • #22
      But another noticeable thing we see again is so this syscall was not even thoroughly considered until I* did some extra crap again that involves userspace hitting vfs too much. It definitely more and more looks like I* does crap after crap and maintainers try to cover for them. That does not smell nor look good.

      Comment


      • #23
        Originally posted by Alex/AT View Post
        But another noticeable thing we see again is so this syscall was not even thoroughly considered until I* did some extra crap again that involves userspace hitting vfs too much. It definitely more and more looks like I* does crap after crap and maintainers try to cover for them. That does not smell nor look good.
        Humans only perform actions when they're required to. Welcome to the world son.

        P.S. I'm drunk. (I feel like I should state this when I'm drunk because I feel more likely to post when I'm intoxicated.)

        Comment


        • #24
          Originally posted by Ironmask View Post
          Humans only perform actions when they're required to. Welcome to the world son.
          More clever humans tend to learn and take preventive actions, but yeah, I agree, the absolute majority does.

          Comment


          • #25
            Originally posted by atomsymbol

            Some notes:
            • Anything more complicated than the rules of a bare universal Turing machine is "just an optimization"
            • A readfile() function implemented using open-read-close has to execute read() inside a loop. If the size passed to readfile() is greater than the actual size of the file - which is highly probable when reading a sysfs/procfs file - the loop body has to execute at least twice and the last iteration will read zero bytes.
            False. https://git.kernel.org/pub/scm/linux...efbd587bf93f42

            Comment


            • #26
              Originally posted by atomsymbol

              What is your point exactly? That, given the implementation you are referring to, people won't be using readfile() outside of toy programs?
              I am referring to your claim about the read being executed in a loop. It is a factually false.

              Comment


              • #27
                Originally posted by atomsymbol

                My claim is factually true in the future (i.e: a prediction). For programs not to fail due to a random reason, there will have to be either a loop in userspace - for { readfile(); } ---- or a loop - for { vfs_read(); } - in kernel space. Do you believe there exists a different possible future?

                1) You can see in the source code of `readfile` kernel implementation, that there is no loop. I am assuming `vfs_read` does provide stronger guarantees than normal 'read' semantic.

                2) Implementation in user space would need to do two reads. But that is the same if you don't have `readfile` available ever. You need to do two reads today anyway. So even if you emulated `readfile` in user space with `open+read+read+close`, it is same as not using `readfile` emulation.

                Comment


                • #28
                  Originally posted by atomsymbol

                  I don't understand your line of reasoning.

                  Code:
                  /usr/src/linux-5.9.11/fs/read_write.c:
                  
                  ssize_t ksys_read(unsigned int fd, char __user *buf, size_t count) {
                  struct fd f = fdget_pos(fd);
                  ssize_t ret = -EBADF;
                  if (f.file) {
                  loff_t pos, *ppos = file_ppos(f.file);
                  if (ppos) {
                  pos = *ppos;
                  ppos = &pos;
                  }
                  ret = vfs_read(f.file, buf, count, ppos);
                  if (ret >= 0 && ppos)
                  f.file->f_pos = pos;
                  fdput_pos(f);
                  }
                  return ret;
                  }
                  
                  SYSCALL_DEFINE3(read, unsigned int, fd, char __user *, buf, size_t, count) {
                  return ksys_read(fd, buf, count);
                  }
                  What is there not to understand?

                  readfile uses a single vfs_read. Check Greg's patch.

                  If emulated in user space, to read a small file, you will perform two reads. But if you use own way of reading files, you will also perform a read twice. So providing readfile and readfile fallback in a library will never be a regression in terms of a performance.

                  Comment


                  • #29
                    Excuse my ignorance: why is it necessary to use two reads or even a loop, if using a pre-allocated buffer large enough to hold the max expected size ?

                    EDIT: Or to put it differently, why would be there no internal API, and why is there no user-space API, that allows doing this with a single read call? Or why doesn't a result code allow to distinguish EOF vs incomplete read?

                    EDIT 2: Comparing above source for readfile by Greg KH and read/ksys_read in read_write.c suggests that read on Linux (in the current implementation) will always read the whole content of a file, even though the official semantics are that it doesn't necessarily do so. Is it perhaps valid to take advantage of this in Linux-only code? (Or to have two versions depending on platform?) Specifically sysfs code would tend to be Linux-only.
                    Last edited by indepe; 25 November 2020, 08:15 PM.

                    Comment


                    • #30
                      Originally posted by curfew View Post
                      The "optimization" part is also dubious, as also noted by the author of the readfile patch. While simple synthetic benchmarks show an improvement, his own conclusion was that within any real apps the "system noise" will mask any benefits anyway.
                      I'd expect there will be io_uring support for readfile, and that asynchronous reading of multiple sysfs values will show an increased benefit for using readfile, and certainly be easier to use.

                      Originally posted by curfew View Post
                      Well this rather sounds like some invasion technique by Microsoft.. Inject a piece of code that only works on your platform, on the pretense that everyone will know to avoid it when writing "portable" apps, but of course most people either don't know or don't care, and so the vendor lock-in happens.

                      (Especially when, curiously enough, the function declaration was placed in the POSIX header file unistd.h...)
                      Maybe that's not where the header should be, but other performance related API's (io_uring) are also Linux specific. The proposed use for sysfs certainly is Linux-specific in the first place.
                      Last edited by indepe; 25 November 2020, 09:05 PM.

                      Comment

                      Working...
                      X