Fedora 32 Looking At Using EarlyOOM By Default To Better Deal With Low Memory Situations


  • #71
    Originally posted by k1e0x
    Why not just fix the kernel's OOM killer?
    Because this "fix" is really just a 10% reduction in available memory, which is not the best solution for memory-starved systems.
    Originally posted by k1e0x
    Other OS's don't hang up like Linux does in that situation..
    Other OSes never reach that situation; they die much earlier. But if you prefer the inferior behaviour of other OSes, why not just use them?
    Last edited by pal666; 01-04-2020, 09:13 PM.


    • #72
      During these types of discussions, I tend to wonder why I don't seem to have similar problems, then I remember that I also have a set of sysctl settings to avoid them. I haven't done a fresh install on my main desktop for well over a decade; I just migrate the old one to new machines and use a rolling release distro. One would think that at least the distributions that target desktops would make an attempt at matching these settings to one's hardware. Do they? My settings are a little different from those already shared (32 GB RAM, 4 cores, NVMe root, 12 TB of spinning rust):

      Code:
      # Using a ratio instead of a fixed value as described above (you can only use one or the other)
      vm.dirty_background_ratio = 2
      vm.dirty_ratio = 5
      
      # Try to keep things in memory
      vm.swappiness = 5
      vm.vfs_cache_pressure = 25
      https://wiki.archlinux.org/index.php...Virtual_memory

      Contains some discussion on these values and others. For those that have swapping issues, vm.min_free_kbytes as described may help.
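To check whether a machine already runs with the values listed above, something like the following could be used. It is only a sketch: it assumes the usual Linux layout where each vm.* sysctl is exposed as a file under /proc/sys/vm, and the helper names read_vm_setting and diff_settings are made up for this example.

```python
from pathlib import Path

# Values quoted in the post above, used here only as a comparison target.
WANTED = {
    "dirty_background_ratio": 2,
    "dirty_ratio": 5,
    "swappiness": 5,
    "vfs_cache_pressure": 25,
}


def read_vm_setting(name, root="/proc/sys/vm"):
    """Return the integer value of vm.<name>, or None if it cannot be read."""
    try:
        return int((Path(root) / name).read_text().strip())
    except (OSError, ValueError):
        return None


def diff_settings(wanted, root="/proc/sys/vm"):
    """Map each setting whose current value differs to (current, wanted)."""
    result = {}
    for name, target in wanted.items():
        current = read_vm_setting(name, root)
        if current != target:
            result[name] = (current, target)
    return result
```

Calling diff_settings(WANTED) on a live system returns an empty dict when all four values already match. The root parameter exists only so the helpers can be pointed at a test directory.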
      Last edited by set135; 01-04-2020, 09:34 PM.


      • #73

        vm.dirty_background_ratio should be low so that the kernel starts writing dirty memory to disk earlier.
        vm.dirty_ratio should be high, so that the system doesn't stall when there is a lot of dirty memory. The difference between the two is the "buffer" the kernel has to work with. If it does not manage to write out the dirty memory before vm.dirty_ratio is reached, it will stall the system until it has done its task.

        Of course, setting vm.dirty_ratio too high will increase the risk of swapping.
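To put numbers on that "buffer", here is a rough sketch of the arithmetic. It assumes the two ratios are plain percentages of some amount of dirtyable memory; the kernel applies them to free plus reclaimable pages rather than to total RAM, so real thresholds will be somewhat lower than what you get by plugging in the installed memory size. The function name dirty_thresholds is made up for this example.

```python
def dirty_thresholds(available_bytes, background_ratio, dirty_ratio):
    """Return (background_start, hard_limit, headroom) in bytes.

    background_start: amount of dirty data at which background writeback begins.
    hard_limit:       amount of dirty data at which writers are made to wait.
    headroom:         the gap between the two, i.e. the "buffer".
    """
    if not 0 <= background_ratio <= dirty_ratio <= 100:
        raise ValueError("expected 0 <= background_ratio <= dirty_ratio <= 100")
    background_start = available_bytes * background_ratio // 100
    hard_limit = available_bytes * dirty_ratio // 100
    return background_start, hard_limit, hard_limit - background_start


# Example: the 2 / 5 settings from post #72, applied to 32 GiB as an upper bound.
GIB = 1024 ** 3
start, limit, headroom = dirty_thresholds(32 * GIB, 2, 5)
print(f"background writeback from {start / GIB:.2f} GiB, "
      f"writers wait at {limit / GIB:.2f} GiB, headroom {headroom / GIB:.2f} GiB")
```

Raising dirty_ratio widens the headroom, but it also lets more dirty pages pile up in RAM, which is the trade-off described above.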


        • #74
          Originally posted by k1e0x
          Why not just fix the kernel's OOM killer? Other OS's don't hang up like Linux does in that situation.. Wait.. I know.. systemd-oom-bandaid.
          There are some hints about your question on the earlyoom GitHub page: https://github.com/rfjakob/earlyoom
          It is hard for the kernel to know what to kill. You are doing a disservice to the kernel devs if you think it is easy.

          As for other OSes: I have stress tested Windows 10 and Ubuntu + earlyoom (2 GB, no swap, load 100 tabs in Chrome), and Windows 10 does not do very well; most of the time the VM just crashed, and sometimes Chrome died (all of it). earlyoom kills tabs, and the machine stays responsive. In other words, Linux with earlyoom is a much better experience than Windows 10, in my testing.
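For anyone wondering what a user-space check of this kind looks at, here is a minimal sketch. It is not earlyoom's actual code: it only assumes the standard /proc/meminfo format (MemTotal and MemAvailable in kB), and the 10% threshold is an assumption borrowed from the "10%" figure mentioned earlier in the thread. The function names are made up for this example.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field_name: integer value}."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])  # unit (usually kB) is dropped
    return values


def available_percent(meminfo):
    """Percentage of total RAM that the kernel reports as available."""
    return 100.0 * meminfo["MemAvailable"] / meminfo["MemTotal"]


def is_low(meminfo, threshold_percent=10.0):
    """True when available memory is at or below the (assumed) threshold."""
    return available_percent(meminfo) <= threshold_percent


# Made-up sample resembling a nearly full 2 GB machine.
SAMPLE = """MemTotal:        2048000 kB
MemFree:           50000 kB
MemAvailable:     102400 kB
"""
info = parse_meminfo(SAMPLE)
print(f"{available_percent(info):.1f}% available, low={is_low(info)}")
```

On a real system the text would come from reading /proc/meminfo in a polling loop. Deciding which process to signal once memory is low is the hard part, and it is deliberately left out here.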

          Also, there is a new project, nohang, which is more sophisticated. It provides desktop notifications out of the box, and it can use the new memory pressure KPIs and messages from zram to give more warning about low memory. In its default settings, it acts very much like earlyoom (but with desktop notifications about low memory). Desktop Linux needs a user-space killer, the kernel devs have made that clear, and we have earlyoom and a newer one, nohang, plus GNOME is building in notification support. Right now, anyone who complains about OOM deadlocks should install earlyoom; problem solved.
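Assuming the "memory pressure KPIs" are the kernel's pressure stall information (PSI), which kernels built with PSI support expose at /proc/pressure/memory, reading them takes very little code. This is just a sketch of parsing that file's "some"/"full" lines, not how nohang does it, and parse_pressure is a name made up for this example.

```python
def parse_pressure(text):
    """Parse PSI text into {"some": {...}, "full": {...}} with float values.

    Each line looks like:
        some avg10=0.00 avg60=0.00 avg300=0.00 total=0
    avg10/avg60/avg300 are percentages of time stalled over 10, 60 and 300
    seconds; total is the cumulative stall time in microseconds.
    """
    result = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        kind, pairs = fields[0], fields[1:]
        result[kind] = {
            key: float(value)
            for key, value in (pair.split("=", 1) for pair in pairs)
        }
    return result


# Made-up sample; on a live system read /proc/pressure/memory instead.
SAMPLE_PSI = (
    "some avg10=1.50 avg60=0.75 avg300=0.10 total=123456\n"
    "full avg10=0.25 avg60=0.00 avg300=0.00 total=7890\n"
)
psi = parse_pressure(SAMPLE_PSI)
print("some avg10:", psi["some"]["avg10"], "full avg10:", psi["full"]["avg10"])
```

A rising "full" average means tasks were stalled on memory for that share of the time, which gives a warning before the machine runs completely out.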

          The kernel killer only fails sometimes, too; there is a lot of exaggeration about how bad it is. I run countless 2 GB Linux servers, and I never get OOM deadlocks. Obviously it can happen, otherwise Facebook wouldn't have added pressure stall KPIs to the kernel, but it is rare.


          • #75
            Originally posted by Britoid

            systemd should remain operational as long as pid 1 is still there, which should never be killed or swapped out.

            anything else sounds like a kernel bug.
            All I know is that it's "running" but you can't use it or communicate with it in any way, shape, or form. New services cannot start, and existing services cannot be shut down. You can't "reboot". You have to hard kill the box. Call that whatever you want.


            • #76
              Originally posted by timrichardson

              There are some hints about your question on the earlyoom GitHub page: https://github.com/rfjakob/earlyoom
              It is hard for the kernel to know what to kill. You are doing a disservice to the kernel devs if you think it is easy.

              As for other OSes: I have stress tested Windows 10 and Ubuntu + earlyoom (2 GB, no swap, load 100 tabs in Chrome), and Windows 10 does not do very well; most of the time the VM just crashed, and sometimes Chrome died (all of it). earlyoom kills tabs, and the machine stays responsive. In other words, Linux with earlyoom is a much better experience than Windows 10, in my testing.

              Also, there is a new project, nohang, which is more sophisticated. It provides desktop notifications out of the box, and it can use the new memory pressure KPIs and messages from zram to give more warning about low memory. In its default settings, it acts very much like earlyoom (but with desktop notifications about low memory). Desktop Linux needs a user-space killer, the kernel devs have made that clear, and we have earlyoom and a newer one, nohang, plus GNOME is building in notification support. Right now, anyone who complains about OOM deadlocks should install earlyoom; problem solved.

              The kernel killer only fails sometimes, too; there is a lot of exaggeration about how bad it is. I run countless 2 GB Linux servers, and I never get OOM deadlocks. Obviously it can happen, otherwise Facebook wouldn't have added pressure stall KPIs to the kernel, but it is rare.
              Good, I'm glad to do a disservice to them. It's broken.

              The problem isn't avoiding the situation before it happens.
              The problem is also not choosing what to kill. That might be nice, but that's not the issue here.

              It's that the existing kernel oom killer doesn't work and hangs the system.
