Linux Kernel Getting io_uring To Deliver Fast & Efficient I/O


  • starshipeleven
    replied
    Originally posted by polarathene View Post
    I thought there was quite a bit of discussion about how FreeBSD handles network I/O better than Linux? (no links, or specifics that I can recall though)
    Lately, every time a certain someone has stated that, someone else has posted a benchmark where Linux either trades blows with or outright destroys BSD on networking with 10Gbit cards.



  • polarathene
    replied
    Originally posted by jabl View Post

    Efficient network I/O was already solved a long time ago with epoll. This patchset (io_uring) delivers efficient asynchronous file I/O, including buffered file I/O, which wasn't possible with the old file AIO interface (io_submit() etc.).
    I thought there was quite a bit of discussion about how FreeBSD handles network I/O better than Linux? (no links, or specifics that I can recall though)



  • rene
    replied
    Originally posted by oiaohm View Post

    The problem with your idea is that it has already been tried in the Linux kernel and did not provide anywhere near the expected performance boost. The Linux kernel is nowhere near as simple as you think it is. [...]
    A short note to a long story: the i386 and later have an I/O permission bitmap for supposedly fine-grained I/O permission control. Modern hardware mostly doesn't use classic I/O ports anyway; with (nearly) everything memory-mapped, ring 3 drivers can drive the hardware just fine without any extra I/O context switching. QNX is also quite fast, and was even a decade ago, so it is not as though it is impossible to do. More elegant architectures and algorithms can vastly improve performance too; look at the current graphics subsystem's performance, for example.
    Last edited by rene; 02-15-2019, 05:52 AM.



  • jabl
    replied
    Originally posted by polarathene View Post

    Oh? So this won't help improve network I/O much?
    Efficient network I/O was already solved a long time ago with epoll. This patchset (io_uring) delivers efficient asynchronous file I/O, including buffered file I/O, which wasn't possible with the old file AIO interface (io_submit() etc.).
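
    To make the contrast concrete, here is a minimal sketch of the readiness model epoll provides: register a descriptor, then block until the kernel reports it ready. It is demonstrated on a self-pipe, with error handling trimmed; the helper name `wait_readable` is made up for illustration, and this is Linux-only.

    ```c
    // Sketch of the epoll readiness model: add an fd to an epoll
    // instance, then block in epoll_wait() until it becomes readable.
    #include <assert.h>
    #include <stdio.h>
    #include <sys/epoll.h>
    #include <unistd.h>

    /* Returns 1 if fd becomes readable within timeout_ms, 0 on timeout. */
    int wait_readable(int fd, int timeout_ms) {
        int ep = epoll_create1(0);
        struct epoll_event ev = { .events = EPOLLIN, .data.fd = fd };
        epoll_ctl(ep, EPOLL_CTL_ADD, fd, &ev);
        struct epoll_event out;
        int n = epoll_wait(ep, &out, 1, timeout_ms);  /* blocks here */
        close(ep);
        return n > 0;
    }

    int main(void) {
        int p[2];
        pipe(p);
        write(p[1], "x", 1);  /* make the read end ready */
        printf("readable: %d\n", wait_readable(p[0], 100));
        return 0;
    }
    ```

    Note this only tells you *when* an fd is ready; the read/write itself is still a separate syscall, which is the per-operation overhead io_uring's submission/completion rings are designed to amortize.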



  • polarathene
    replied
    Originally posted by jpg44 View Post

    I've written servers before. Often network code uses a select loop and non-blocking I/O with buffering. When the buffer is full, writes will not succeed, so you keep the data queued and use select to watch for writability. AIO is also non-blocking and queues data in a buffer, and it notifies you when data hits the disk. That is useful for disk writes when you need to make sure the data actually reached the disk, which can be important for databases.
    Oh? So this won't help improve network I/O much?



  • oiaohm
    replied
    Originally posted by rene View Post
    And I was just discussing bundled system calls for improved multi-server microkernel performance on last night's livestream:
    The problem with your idea is that it has already been tried in the Linux kernel and did not provide anywhere near the expected performance boost. The Linux kernel is nowhere near as simple as you think it is.

    https://lwn.net/Articles/755919/

    bpfilter is one of the next generation of Linux kernel drivers. It mixes user-space code, kernel-mode code and in-kernel BPF in a single Linux .ko driver. The Linux kernel is turning into a strange form of hybrid.

    eBPF is an auditable, deliberately Turing-incomplete language that the kernel JITs to native code and runs in kernel space. It provides many times the performance boost that bundling syscalls can, as shown by eBPF and bundled syscalls both being used to try to speed up FUSE under Linux. The reason is that some basic logic can be performed kernel-side, so with BPF complete event runs can finish without any context switches.

    You also miss one of the big causes of context switching in microkernels. There are many operations a driver performs that genuinely need ring 0. This is not a want; it is a need. The IOPL (I/O privilege level) flag does not grant rings other than ring 0 the right to modify memory permissions and do the other things drivers must do with DMA-driven hardware, so those ring 0 transitions happen a lot.

    With a microkernel core at ring 0 and drivers at, say, ring 1/ring 2 under the IOPL flag, performance gets wrecked by memory-permission operations that must happen at ring 0, each forcing a mandatory context switch. That makes the Spectre performance losses look minor.

    The microkernel core could instead run as a hypervisor at ring -1, with each driver in its own ring 0 VM, but then the hypervisor transition overhead kills you. The arrangement that does perform is a microkernel acting as a watchdog at ring -1 over a big monolithic blob at ring 0.

    The CPUs we have today are not designed to run microkernels effectively. The Linux kernel's hybrid experiments might show a way out.

    https://access.redhat.com/articles/3311301
    The big thing you have missed is that the Linux kernel did add Spectre and other CPU-bug mitigations, but it also includes flags to turn them off for those who want speed.

    This is the hard bit of designing an OS: security and performance are at times mutually exclusive, so you have to let the user choose which one they need more of. The Spectre overhead as an argument for a microkernel therefore does not fly. Linus is willing to take a performance overhead for security as long as there is an off flag to reclaim the lost performance by decreasing security. That is where the microkernel fails: it is designed for security, so you are stuck with the overhead whether or not it suits your current problem.

    Last edited by oiaohm; 02-14-2019, 07:47 PM.



  • jpg44
    replied
    Originally posted by polarathene View Post
    So this is an I/O perf improvement for disk, memory and network? At least I think those three (and any other kinds) are all handled differently.

    Do applications have to specifically utilize it? I suppose if the application offloads I/O to a lib that handles it (which the dev may not need to do anything platform-specific to use), then it's a free improvement (as in no extra work required)? E.g. apps on KDE that use KIO, perhaps? (Assuming libs like KIO would first need to add support before their dependents benefit.)
    I've written servers before. Often network code uses a select loop and non-blocking I/O with buffering. When the buffer is full, writes will not succeed, so you keep the data queued and use select to watch for writability. AIO is also non-blocking and queues data in a buffer, and it notifies you when data hits the disk. That is useful for disk writes when you need to make sure the data actually reached the disk, which can be important for databases.
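
    The select-loop pattern described above can be sketched roughly as follows: a non-blocking descriptor, a write that may fail with EAGAIN when the kernel buffer is full, and select() to wait for writability before retrying. The helper name `write_all` is made up for illustration; error handling is minimal.

    ```c
    // Sketch of non-blocking writes with select()-based writability
    // waiting, as used in classic server event loops.
    #include <assert.h>
    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/select.h>
    #include <unistd.h>

    /* Write all of buf, waiting with select() whenever the fd is full. */
    ssize_t write_all(int fd, const char *buf, size_t len) {
        size_t done = 0;
        while (done < len) {
            ssize_t n = write(fd, buf + done, len - done);
            if (n > 0) {
                done += n;
            } else if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
                fd_set wfds;  /* kernel buffer full: block until writable */
                FD_ZERO(&wfds);
                FD_SET(fd, &wfds);
                select(fd + 1, NULL, &wfds, NULL, NULL);
            } else if (n < 0) {
                return -1;
            }
        }
        return (ssize_t)done;
    }

    int main(void) {
        int p[2];
        pipe(p);
        fcntl(p[1], F_SETFL, O_NONBLOCK);  /* non-blocking write end */
        printf("wrote %zd bytes\n", write_all(p[1], "hello", 5));
        return 0;
    }
    ```

    io_uring collapses this "try, fail, wait, retry" cycle into a single queued submission that completes asynchronously.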



  • polarathene
    replied
    So this is an I/O perf improvement for disk, memory and network? At least I think those three (and any other kinds) are all handled differently.

    Do applications have to specifically utilize it? I suppose if the application offloads I/O to a lib that handles it (which the dev may not need to do anything platform-specific to use), then it's a free improvement (as in no extra work required)? E.g. apps on KDE that use KIO, perhaps? (Assuming libs like KIO would first need to add support before their dependents benefit.)



  • discordian
    replied
    Originally posted by treba View Post
    This sounds really exciting. Is it meant for general-purpose use or rather for specific use cases? Like, would it make sense for a file manager to use it for copying? Or for Firefox for profile data / on-disk cache?
    Primarily for serving multiple outstanding I/Os with varying speeds/bottlenecks, which mostly means server workloads like sending files over the net.
    File managers should use `sendfile`; caches are ideally memory-mapped.
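
    As a concrete illustration of the `sendfile` suggestion: the kernel copies the data fd-to-fd with no userspace buffer in between. This is a Linux-only sketch with minimal error handling; the helper name `copy_with_sendfile` is made up for illustration.

    ```c
    // Sketch of an in-kernel file copy via sendfile(2): no read/write
    // round-trips through a userspace buffer.
    #include <assert.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/sendfile.h>
    #include <sys/stat.h>
    #include <unistd.h>

    /* Copy src to dst in-kernel; returns bytes copied, or -1 on error. */
    long copy_with_sendfile(const char *src, const char *dst) {
        int in = open(src, O_RDONLY);
        if (in < 0) return -1;
        struct stat st;
        fstat(in, &st);
        int out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (out < 0) { close(in); return -1; }
        long total = 0;
        while (total < st.st_size) {  /* kernel moves the bytes */
            ssize_t n = sendfile(out, in, NULL, st.st_size - total);
            if (n <= 0) break;
            total += n;
        }
        close(in);
        close(out);
        return total;
    }

    int main(void) {
        int fd = open("/tmp/sf_src", O_WRONLY | O_CREAT | O_TRUNC, 0644);
        write(fd, "hello sendfile\n", 15);
        close(fd);
        printf("copied %ld bytes\n", copy_with_sendfile("/tmp/sf_src", "/tmp/sf_dst"));
        return 0;
    }
    ```

    (sendfile into a regular file requires Linux 2.6.33 or later; older kernels only allowed a socket as the output fd.)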



  • danger
    replied
    The name of the ring sounds like a joke.

