Fedora 32 Looking At Using EarlyOOM By Default To Better Deal With Low Memory Situations


  • #11
    Originally posted by k1e0x View Post
    Why not just fix the kernel's OOM killer?
    Because this is a really hard problem? This has been discussed ad infinitum, without any good solution for the general case being found (it is sort of like poking a balloon: you can push one edge in, but it expands elsewhere). For all the proposals, while you can certainly improve things for specific cases, for others they only make things worse. If you believe you know how to finally solve the general problem, please go do it.



    • #12
      Originally posted by Raka555 View Post

      Not sure that systemd (or any init system) really has more information to make a better decision than the kernel.

      To me, a kind-of-OK solution on a GUI desktop would be to pop up a dialog informing the user about the situation and asking which process to kill (if you can get past the chicken-and-egg problem of actually getting something displayed)
      1. When the system is starving for RAM, there may be neither the resources nor the time to ask the user to terminate something.
      2. The user might be away, and by the time they decide which process is best to kill, the system will be completely down.
      3. The user might not be running a GUI at all, or their session might itself be down.

      We need a solution that works without user intervention. The kernel would be the best place for this feature, but kernel developers don't seem to care about this issue despite how crucial it is, and by now we already have at least four user-space daemons for the task.
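      The approach those daemons take is broadly the same. A minimal sketch in shell, assuming an illustrative "act below 10% MemAvailable" threshold and picking the victim by RSS (real daemons such as earlyoom consult /proc/*/oom_score and have saner defaults):

      #!/bin/sh
      # Poll available memory once a second and SIGTERM the largest process
      # before the kernel OOM killer has to step in. The threshold and the
      # victim selection here are illustrative only.
      while sleep 1; do
          avail=$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo)
          total=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
          if [ "$avail" -lt $((total / 10)) ]; then
              victim=$(ps -eo pid= --sort=-rss | head -n1)  # PID with the largest RSS
              kill -TERM $victim
          fi
      done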

      I feel like my mind is quickly disintegrating (no joke) and it's very nice to know that some of the worries I've voiced over more than 20 years of Linux use have been attended to and I haven't lived in vain. There's still a very long road ahead before Linux is desktop-ready without any ifs or buts. I also hold the very unpopular opinion that the Linux kernel (or at least its development model) has to go. The Linux kernel is great for supercomputers and other bespoke solutions; it's almost completely inappropriate for desktop, mobile and IoT, where it's needed most.
      Last edited by birdie; 03 January 2020, 06:29 PM.



      • #13
        At one point I could almost never copy a large-ish file on my laptop without it going into a fit and doing crazy OOM things.

        I traced it to the so-called "laptop mode", which delays writeback to the disk as much as possible.
        To me it is brain-dead to sacrifice robustness just to save some battery life.
        It delays the I/O to the point where the system is out of options.

        In general I hate that the amount of dirty buffers scales with the amount of memory you have.
        I now always run with the following settings in my /etc/sysctl.conf:
        vm.dirty_bytes = 16777216
        vm.dirty_background_bytes = 4194304

        I never have any issues while copying large files anymore.

        Edit: And I can still push 3 GB/s+ on my desktop with an NVMe drive.
        Edit2: This also fixes the stutter/sluggishness of the system under heavy disk I/O even when no OOM is happening.
        (don't forget to run sudo sysctl -p)
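        For reference, those values start background writeback once 4 MiB of dirty data accumulates and throttle writers into synchronous writeback at 16 MiB. To try them at runtime before committing them to /etc/sysctl.conf:

        # Apply immediately (not persistent across reboots):
        sudo sysctl -w vm.dirty_bytes=16777216
        sudo sysctl -w vm.dirty_background_bytes=4194304
        # Verify:
        sysctl vm.dirty_bytes vm.dirty_background_bytes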
        Last edited by Raka555; 04 January 2020, 08:48 AM.



        • #14
          Usually the problems arise when devices can't empty their buffers, or can't write to or allocate RAM quickly enough due to memory fragmentation. This happens when you're almost full. Swap won't help here.

          A band-aid is to defragment the RAM to create larger contiguous blocks of memory: "echo 1 > /proc/sys/vm/compact_memory"

          /proc/buddyinfo gives you some idea of the current level of fragmentation.
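          For example (each column counts free blocks of order 0 through 10, i.e. 4 KiB single pages up to 4 MiB contiguous blocks on x86-64):

          # Plenty of small blocks but zeros in the rightmost columns
          # means memory is fragmented.
          cat /proc/buddyinfo
          # Force a compaction pass (needs root):
          echo 1 | sudo tee /proc/sys/vm/compact_memory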



          • #15
            Originally posted by CommunityMember View Post
            Because this is a really hard problem? This has been discussed ad infinitum, without any good solution for the general case being found (it is sort of like poking a balloon: you can push one edge in, but it expands elsewhere). For all the proposals, while you can certainly improve things for specific cases, for others they only make things worse. If you believe you know how to finally solve the general problem, please go do it.
            That is my point. Linux is either up to the challenge of solving the hard problems in OS design... or it isn't, and needs to bolt on a userland solution.

            FreeBSD doesn't do this at all, btw. Last time this topic came up I ran some tests, and it wasn't clean... it wasn't graceful... it wasn't logical... but every single time the OOM killer worked, and the system never hung once like Linux does; it righted itself instantly.

            From my understanding its OOM killer code isn't even very complex, so I don't know what Linux is doing or why it hangs.
            Last edited by k1e0x; 03 January 2020, 06:17 PM.



            • #16
              Originally posted by Spam View Post
              Usually the problems arise when devices can't empty their buffers, or can't write to or allocate RAM quickly enough due to memory fragmentation. This happens when you're almost full. Swap won't help here.

              A band-aid is to defragment the RAM to create larger contiguous blocks of memory: "echo 1 > /proc/sys/vm/compact_memory"

              /proc/buddyinfo gives you some idea of the current level of fragmentation.
              According to https://www.uninformativ.de/blog/pos...OSTING-en.html the kernel compacts memory automatically when/if needed. Someone who can read the code should say whether that's true or not.



              • #17
                Originally posted by Raka555 View Post
                At one point I could almost never copy a large-ish file on my laptop without it going into a fit and doing crazy OOM things.

                I traced it to the so-called "laptop mode", which delays writeback to the disk as much as possible.
                To me it is brain-dead to sacrifice robustness just to save some battery life.
                It delays the I/O to the point where the system is out of options.

                In general I hate that the amount of dirty buffers scales with the amount of memory you have.
                I now always run with the following settings in my /etc/sysctl.conf:
                vm.dirty_bytes = 16777216
                vm.dirty_background_bytes = 4194304

                I never have any issues while copying large files anymore.

                Edit: And I can still push 3 GB/s+ on my desktop with an NVMe drive.
                I raised concerns that the kernel defaults in regard to dirty buffers were insane six years (!) ago, and even Linus Torvalds admitted it was a case that had to be addressed ASAP. If I'm not mistaken, the issue has since been mostly abandoned and nothing has truly been done to rectify it.
                Last edited by birdie; 03 January 2020, 07:42 PM.



                • #18
                  Originally posted by birdie View Post

                  It's almost completely inappropriate for desktop, mobile and IoT, where it's needed most.
                  Damn, I thought it was the most popular mobile kernel (some abstraction layers on top, trash or not, don't count). It's also wonderful on the desktop, which cannot be said of the Windows kernel, which is utter crap. What has to go are the morons who have no clue how to make a good desktop distribution.

                  (Embedded link: Phoronix benchmarks of AMD's Threadripper 2990WX on Linux, head to head against the Core i9-7980XE.)


                  Good luck with Windows being slow as crap.
                  Last edited by Volta; 03 January 2020, 07:07 PM.



                  • #19
                    Originally posted by birdie View Post

                    According to https://www.uninformativ.de/blog/pos...OSTING-en.html the kernel compacts memory automatically when/if needed. Someone who can read the code should say whether that's true or not.
                    Indeed, it's just that it doesn't work well. There are knobs to tweak for when it should do this, but my experience is that it doesn't act well enough or quickly enough to keep enough large blocks available without lots of extra work. When pressure is high, the extra work takes longer, so the pressure gets higher still... so it feels like a lock-up. A common situation is copying large files over a fast network, which means the network card and the HDD controller fight for resources.

                    Read up on /sys/kernel/debug/extfrag.
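                    For example, assuming debugfs is mounted at the usual place (the files are root-only):

                    # extfrag_index per zone and order: values near 0 mean an
                    # allocation failure would be due to lack of memory, values
                    # near 1 mean fragmentation; -1.000 means it would succeed.
                    sudo cat /sys/kernel/debug/extfrag/extfrag_index
                    sudo cat /sys/kernel/debug/extfrag/unusable_index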

                    Another irony is that while cgroups can help with CPU stealing and keep the system responsive during I/O, the slab caches are not shared between cgroups, so you can end up out of RAM because of this.
                    Last edited by S.Pam; 03 January 2020, 06:31 PM.



                    • #20
                      Originally posted by Raka555 View Post

                      Not sure that systemd (or any init system) really has more information to make a better decision than the kernel.

                      To me, a kind-of-OK solution on a GUI desktop would be to pop up a dialog informing the user about the situation and asking which process to kill (if you can get past the chicken-and-egg problem of actually getting something displayed)
                      systemd isn't really an init system, it's a system and service manager. It can use D-Bus to tell the desktop to show a warning about the system being low on memory, and it already has cgroups, so it can kill runaway services cleanly and knows whether they're important or not.
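                      The cgroup side is easy to demonstrate with a transient unit. A sketch (the unit name and command are placeholders; MemoryMax= and OOMPolicy= are real resource-control settings, the latter needing systemd 243+ and cgroup v2):

                      # Run a command in its own cgroup with a hard 512 MiB cap; if it
                      # exceeds that, the kernel OOM-kills it there, and OOMPolicy=kill
                      # makes systemd clean up the rest of the unit.
                      sudo systemd-run --unit=demo-hog -p MemoryMax=512M -p OOMPolicy=kill /usr/bin/some-memory-hog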

