Linux Kernel To Get AIO Performance Improvements


  • #11
Originally posted by ryao:
    The kernel's VFS layer provides AIO functions. However, writing programs that take advantage of asynchronous operations requires more effort than those using their traditional UNIX counterparts. Utilizing AIO provides no benefit unless your program is a daemon that can do other things while waiting for IO.
Well, it would be pretty cool on web and FTP servers.



    • #12
Well, you don't have asynchronous sendfile or splice. You don't have asynchronous stat/opendir/readdir/... either. And the interface is not the same depending on the fd type. I don't think that AIO works on anything other than regular files.

      Do you have any news about syslets?



      • #13
        IMHO asynchronous file I/O is still a mess on Linux.

As already stated in this thread, POSIX AIO is implemented in user space with a pool of blocking threads. The reason it does not use the kernel AIO is the kernel AIO's lack of features.

There are 3 reasons not to use the kernel AIO:
1. It only works with the O_DIRECT flag, which means no buffering and very slow I/O.
2. The kernel AIO needs support from the file system in the kernel. Which file systems are supported is a bit of a mystery. Do CIFS or NFS support it?
3. When you finally use it, it only runs on Linux, not on the BSD systems.

I come from a Windows background and I have to admit that I/O Completion Ports (IOCP) are a much superior API compared with what Linux has to offer. They have worked since Windows 2000 and support sockets and ALL kinds of file handles.
The kernel AIO is moving in the right direction, but progress is very slow and hardly anybody is using it.

I wrote a high-performance server on Windows and wonder how I would have achieved the asynchronous file I/O on Linux.
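
For anyone who wants to see what the kernel interface actually looks like, here is a minimal sketch using the libaio wrappers (compile with -laio). The file name data.bin, the 512-byte alignment and the 4096-byte read are assumptions for illustration; the real alignment requirement depends on the device and file system.

#define _GNU_SOURCE              /* for O_DIRECT */
#include <libaio.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    io_context_t ctx = 0;
    int rc = io_setup(1, &ctx);              /* create a kernel AIO context */
    if (rc < 0) { fprintf(stderr, "io_setup: %d\n", rc); return 1; }

    /* Without O_DIRECT, the submission may just be executed synchronously. */
    int fd = open("data.bin", O_RDONLY | O_DIRECT);
    if (fd < 0) { perror("open"); return 1; }

    void *buf;
    if (posix_memalign(&buf, 512, 4096))     /* O_DIRECT wants aligned buffers */
        return 1;

    struct iocb cb;
    struct iocb *cbs[1] = { &cb };
    io_prep_pread(&cb, fd, buf, 4096, 0);    /* async read of 4096 bytes at offset 0 */

    if (io_submit(ctx, 1, cbs) != 1) { fprintf(stderr, "io_submit failed\n"); return 1; }

    /* ...do other work here; we only block now to reap the completion... */
    struct io_event ev;
    io_getevents(ctx, 1, 1, &ev, NULL);
    printf("read returned %ld\n", (long)ev.res);

    free(buf);
    close(fd);
    io_destroy(ctx);
    return 0;
}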



        • #14
Why would you need AIO specifically? I.e., why would it be better than, say, epoll + threads?



          • #15
Originally posted by curaga:
Why would you need AIO specifically? I.e., why would it be better than, say, epoll + threads?
Because epoll, select and poll only work properly with sockets, not with regular files, on Linux.

Having blocking threads do the read/write operations does not scale.
An asynchronous reactor or proactor pattern is a much better approach.
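
You can verify the epoll limitation directly: registering a regular file fails outright with EPERM, because the kernel considers regular files always ready. A minimal sketch (the file name somefile is an assumption for illustration):

#include <sys/epoll.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int epfd = epoll_create1(0);
    int fd = open("somefile", O_RDONLY);       /* a regular file, not a socket */
    if (epfd < 0 || fd < 0) { perror("setup"); return 1; }

    struct epoll_event ev = { .events = EPOLLIN, .data.fd = fd };
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) < 0)
        perror("epoll_ctl on a regular file"); /* fails with EPERM */

    close(fd);
    close(epfd);
    return 0;
}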



            • #16
              It's also much more complex and so bug-prone.

Okay, so say you're writing a lot of files to a slow HDD. What do you gain from AIO compared to regular O_NONBLOCK?



              • #17
Originally posted by curaga:
                It's also much more complex and so bug-prone.
It only feels complex if you are not used to it. Programming with multiple threads is also more complex than writing single-threaded applications, but the benefit is worth the effort.
I have worked on big projects that used blocking I/O and on ones that used a completely asynchronous approach. In the end, the asynchronous design was accepted by many more developers because they saw the benefit.

IMHO it is not hard to develop programs that use an asynchronous I/O approach. The main reason a lot of people think it is more complex and bug-prone is that pretty much ALL examples of I/O (sockets or files) are blocking, and it is hard to find good examples of the async way. That's why everybody starts with blocking I/O until they find out that it is not the best solution.

I can encourage everybody who is interested in the topic to look at the boost::asio library, which is an excellent platform-independent API for asynchronous I/O (though it offers no async file I/O on Linux).

Originally posted by curaga:
Okay, so say you're writing a lot of files to a slow HDD. What do you gain from AIO compared to regular O_NONBLOCK?
Actually, I have not tried whether the O_NONBLOCK flag works properly with files on Linux. But even if it returned EWOULDBLOCK when an operation would block, you still would not know when you could call it again and when the operation would complete without blocking (select and epoll won't work). You still have to work around the same problems as with blocking I/O.
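
For what it's worth, O_NONBLOCK has no effect on regular files on Linux: read() never returns EAGAIN for them and simply blocks until the page cache or the disk satisfies the request. A tiny sketch (the file name bigfile is an assumption for illustration):

#include <fcntl.h>
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    char buf[4096];
    int fd = open("bigfile", O_RDONLY | O_NONBLOCK);
    if (fd < 0) { perror("open"); return 1; }

    /* For a regular file this read blocks on a cache miss anyway;
       EAGAIN is only returned for pipes, sockets, ttys, etc. */
    ssize_t n = read(fd, buf, sizeof buf);
    if (n < 0 && errno == EAGAIN)
        printf("would block\n");             /* never reached for regular files */
    else
        printf("read %zd bytes\n", n);

    close(fd);
    return 0;
}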



                • #18
Originally posted by jwilliams:

                  Do you have a lot of RAM?

It took me a long time to figure it out, but I noticed on my system (32GB RAM) that the vm.dirty_ratio and vm.dirty_background_ratio settings are not nearly low enough; even 1 is not low enough. What seems to be happening is that during heavy write activity, the system caches writes until it has hundreds of MiB or more in the cache, then decides to write it all to storage, and that seems to block other processes and can cause the system to nearly freeze for seconds at a time.

                  My partial solution is to use dirty_bytes and dirty_background_bytes instead. I currently set them to 96MiB and 32MiB:

                  echo "100663296" > /proc/sys/vm/dirty_bytes
                  echo "33554432" > /proc/sys/vm/dirty_background_bytes

That much can be written to storage in less than a second, and it seems to help. The multiple-second freezes are mostly gone.
A better solution may be to use a multi-queue scheduler like BFQ and stay with an insanely sized cache... Unfortunately, a small cache heavily affects some apps like OBS and Chrome even with BFQ if, for example, any file transfer is happening. I suspect it comes down to synchronous functions, and I suspect AIO would help here.



                  • #19
Originally posted by RomuloP:

A better solution may be to use a multi-queue scheduler like BFQ and stay with an insanely sized cache... Unfortunately, a small cache heavily affects some apps like OBS and Chrome even with BFQ if, for example, any file transfer is happening. I suspect it comes down to synchronous functions, and I suspect AIO would help here.
                    Holy necro-thread, Batman!

Thinking back, at the time I wrote my original post I was running an i7-2700K desktop with 8-16GB of RAM and a 500GB spinning-rust SATA drive (later upgraded to an SSD). This was also before BFQ even existed in the mainline kernel, so there's that.

Things are much better these days, and I credit the move to SSDs for the large majority of that improvement. That said, my current work laptop is practically equivalent in performance to the desktop it replaced (Haswell quad-core i7, 16GB RAM, 500GB SSD). It's up for replacement soon, but the main improvements I expect will come from jumping to an NVMe SSD, plus better battery life from 5-6 years of laptop design improvements.



                    • #20
Oops, sorry; it's sort of the result of an old problem that is still contemporary and still lacks attention.

SSDs and multi-queue schedulers helped, but problems still exist. A global dirty-page limit sucks for slow devices like pendrives; still today, Linux memory management washes its hands and gives the whole dirty-page limit over to caching an X GB ISO that is going to a 20 MB/s device. To make things worse, the queueing overhead just makes the transfer even slower, not to mention that no file manager can estimate the transfer progress correctly.

It is hard to point to a single culprit or solution. With smaller dirty limits that bring some sanity to external-device transfers, aggressively synced programs like OBS and Chrome, which have no user-space cache for some things, easily lag or drop frames when something like an aggressive file transfer dominates. Also, in my opinion, I/O-intensive programs doing things like file transfers could offer PSI-aware techniques to avoid over-pressuring the system or device so much and do a better job with slow devices. The kernel could deal with dirty pages more intelligently; allowing per-device or per-process limits would help a lot. I'm just waiting to see which solution comes first.

