Announcement

Collapse
No announcement yet.

Yes, Linux Does Bad In Low RAM / Memory Pressure Situations On The Desktop

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • Originally posted by aht0 View Post
    It might have been not "FreeBSD's problem" but fault in it's ZFS driver.
    Most likely the latter, FreeBSD has a OOM killer just like Linux, their approach to "not enough RAM" is the same.

    Comment


    • The stalling problem has been going on for at least 8 years. Ive noticed it as long as that. It happens with non ZFS systems.

      I think it has something to do with maybe the paging system, allocation, perhaps i/o scheduling or process scheduling getting caught in some sort of a lock up. If you can manage to get some control killing a big process usually fixes the problem, but I have doubts that it is a just problem with processes not being killed instead the memory situation triggering some other bug that causes thrashing and lock up.

      the LED lights up solidly as mentioned and the system becomes unresponsive and will remain so for hours.

      This is not acceptable and also should be categorized as a security vulnerability as this is a DENIAL OF SERVICE vulnerability as someone can bring down a system by causing these pressures. This is a very serious problem and the Linux kernel developers clearly do not care that Linux can be brought down.

      A way to improve Linux would be to allow default OOM behaviour to be augumented by allowing an external process to be notified of an OOM condition and decide what processes should be killed to free up memory, this can be configured to ask the user or to use a list of processes to kill and or a list of ones to be left alone. Monitor could use another API to know when the OOM system is satifisfied enough memory has been freed. Another feature is memory priority that X, console programs would have high memory priority and they would get access to being paged into memory so they can remain responsive , while other process may be suspended while memory is freed. On a desktop machine killing firefox usually suffices because this is the big memory consumer. X and other critical processes have to be left alone so there will still be a running machine.
      Last edited by Neraxa; 11 August 2019, 02:10 PM.

      Comment


      • Many have suggested this is caused by disabling swap however I have seen it with gigabytes of swap enabled and with most of the swap space being unused. What brings it on is things coming to within 200 MB or so of running out of RAM space. Its usually Firefox, and killing Firefox unlocks the system (it can take hours to actually get that done considering the system is in a virtually locked up state). The OOM killer is obviously not getting rid of Firefox itself, it would not obviously since there are gigabytes of swap still free. So, it kind of looks like the OOM killer is not even involved here since there is plenty of swap space available/. Looks like could be a problem involving i/o scheduling, allocation, process scheduling or something.
        Last edited by Neraxa; 11 August 2019, 08:10 PM.

        Comment


        • Originally posted by Neraxa View Post
          Many have suggested this is caused by disabling swap however I have seen it with gigabytes of swap enabled and with most of the swap space being unused. What brings it on is things coming to within 200 MB or so of running out of RAM space. Its usually Firefox, and killing Firefox unlocks the system (it can take hours to actually get that done considering the system is in a virtually locked up state). The OOM killer is obviously not getting rid of Firefox itself, it would not obviously since there are gigabytes of swap still free. So, it kind of looks like the OOM killer is not even involved here since there is plenty of swap space available/. Looks like could be a problem involving i/o scheduling, allocation, process scheduling or something.
          On my 16 GB laptop w/o swap where I do everything from multimedia to sw development to web browsing with FF and 20+ tabs open I haven't experienced any of the issues mentioned here. But if the default OOM killer is too slow, what about the already mentioned earlyoom or nohang (both with support for psi), or facebook's oomd - anybody tried them in these situations?
          Last edited by halo9en; 12 August 2019, 09:31 AM.

          Comment


          • I just tested this on FreeBSD 12. When a process hits the memory cap it instantly prints pid num, process name, uid num, was killed: out of swap space. Using moused to move the mouse around it doesn't even hickup. It took me less than 10 minutes to test this.

            FreeBSD's oom killer isn't very advanced but there is value in simplicity. This just works.. I know someone is out there writing a huge user land daemon to fix this.. But you don't need systemd-oombandaid. In fact the reason Linux is thrashing the disk is probably system-journal.

            Comment


            • Well I have something that works for me(ie. no disk thrashing), but only made it today(so not tested much) and that is a kernel patch(le9g.patch) to not evict `Active(file):`(see /proc/meminfo) if below 256 MiB (this should depend on your workload). I've tested it with linux-stable 5.2.4 because on linuxgit 5.3.0-rc4-gd45331b00ddb there's a yet-to-be-found-regression that freezes the whole system(without disk thrashing though) whether I use the patch or not, apparently before OOM-Killer would trigger.

              Code:
              diff --git a/mm/vmscan.c b/mm/vmscan.c
              index dbdc46a84f63..7a0b7e32ff45 100644
              --- a/mm/vmscan.c
              +++ b/mm/vmscan.c
              @@ -2445,6 +2445,13 @@ static void get_scan_count(struct lruvec *lruvec, struct mem_cgroup *memcg,
                           BUG();
                       }
               
              +    if (NR_ACTIVE_FILE == lru) {
              +      long long kib_active_file_now=global_node_page_state(NR_ACTIVE_FILE) * MAX_NR_ZONES;
              +      if (kib_active_file_now <= 256*1024) {
              +        nr[lru] = 0; //don't reclaim any Active(file) (see /proc/meminfo) if they are under 256MiB
              +        continue;
              +      }
              +    }
                       *lru_pages += size;
                       nr[lru] = scan;
                   }

              Comment


              • Originally posted by howaboutsynergy View Post
                Well I have something that works for me(ie. no disk thrashing), but only made it today(so not tested much) and that is a kernel patch(le9g.patch) to not evict `Active(file):`(see /proc/meminfo) if below 256 MiB (this should depend on your workload). I've tested it with linux-stable 5.2.4 because on linuxgit 5.3.0-rc4-gd45331b00ddb there's a yet-to-be-found-regression that freezes the whole system(without disk thrashing though) whether I use the patch or not, apparently before OOM-Killer would trigger.

                Code:
                diff --git a/mm/vmscan.c b/mm/vmscan.c
                index dbdc46a84f63..7a0b7e32ff45 100644
                --- a/mm/vmscan.c
                +++ b/mm/vmscan.c
                @@ -2445,6 +2445,13 @@ static void get_scan_count(struct lruvec *lruvec, struct mem_cgroup *memcg,
                BUG();
                }
                
                + if (NR_ACTIVE_FILE == lru) {
                + long long kib_active_file_now=global_node_page_state(NR_ACTIVE_FILE) * MAX_NR_ZONES;
                + if (kib_active_file_now <= 256*1024) {
                + nr[lru] = 0; //don't reclaim any Active(file) (see /proc/meminfo) if they are under 256MiB
                + continue;
                + }
                + }
                *lru_pages += size;
                nr[lru] = scan;
                }
                Noice!

                Now it would be even greater if it either were a tunable parameter (via sysctl) or computed based on the amount of installed RAM.

                Comment


                • "..Linux Does BadLY ...". Love ye Michael, but your grammar is better than that.

                  Comment


                  • Originally posted by birdie View Post

                    Noice!

                    Now it would be even greater if it either were a tunable parameter (via sysctl) or computed based on the amount of installed RAM.
                    Here you go my friend (up to date here: le9h.patch):

                    Code:
                    le9h.patch
                    
                    this is licensed under all/any of:
                    Apache License, Version 2.0
                    MIT license
                    0BSD
                    CC0
                    UNLICENSE
                    
                    diff --git a/Documentation/admin-guide/sysctl/vm.rst b/Documentation/admin-guide/sysctl/vm.rst
                    index 64aeee1009ca..d0f3f7080f03 100644
                    --- a/Documentation/admin-guide/sysctl/vm.rst
                    +++ b/Documentation/admin-guide/sysctl/vm.rst
                    @@ -68,6 +68,7 @@ Currently, these files are in /proc/sys/vm:
                     - numa_stat
                     - swappiness
                     - unprivileged_userfaultfd
                    +- unevictable_activefile_kbytes
                     - user_reserve_kbytes
                     - vfs_cache_pressure
                     - watermark_boost_factor
                    @@ -848,6 +849,69 @@ privileged users (with SYS_CAP_PTRACE capability).
                     The default value is 1.
                     
                     
                    +unevictable_activefile_kbytes
                    +=============================
                    +
                    +How many kilobytes of `Active(file)` to never evict during high-pressure
                    +low-memory situations. ie. never evict active file pages if under this value.
                    +This will help prevent disk thrashing caused by Active(file) being close to zero
                    +in such situations, especially when no swap is used.
                    +
                    +As 'nivedita' (phoronix user) put it:
                    +"Executables and shared libraries are paged into memory, and can be paged out
                    +even with no swap. [...] The kernel is dumping those pages and [...] immediately
                    +reading them back in when execution continues."
                    +^ and that's what's causing the disk thrashing during memory pressure.
                    +
                    +unevictable_activefile_kbytes=X will prevent X kbytes of those most used pages
                    +from being evicted.
                    +
                    +The default value is 65536. That's 64 MiB.
                    +
                    +Set it to 0 to keep the default behaviour, as if this option was never
                    +implemented, so you can see the disk thrashing as usual.
                    +
                    +To get an idea what value to use here for your workload(eg. xfce4 with idle
                    +terminals) to not disk thrash at all, run this::
                    +
                    +    $ echo 1 | sudo tee /proc/sys/vm/drop_caches; grep -F 'Active(file)' /proc/meminfo
                    +    1
                    +    Active(file):     203444 kB
                    +
                    +so, using vm.unevictable_activefile_kbytes=203444 would be a good idea here.
                    +(you can even add a `sleep` before the grep to get a slightly increased value,
                    +which might be useful if something is compiling in the background and you want
                    +to account for that too)
                    +
                    +But you can probably go with the default value of just 65536 (aka 64 MiB)
                    +as this will eliminate most disk thrashing anyway, unless you're not using
                    +an SSD, in which case it might still be noticeable (I'm guessing?).
                    +
                    +Note that `echo 1 | sudo tee /proc/sys/vm/drop_caches` can still cause
                    +Active(file) to go a under the vm.unevictable_activefile_kbytes value.
                    +It's not an issue and this is how you know how much the value for
                    +vm.unevictable_activefile_kbytes should be, at the time/workload when you ran it.
                    +
                    +The value of `Active(file)` can be gotten in two ways::
                    +
                    +    $ grep -F 'Active(file)' /proc/meminfo
                    +    Active(file):    2712004 kB
                    +
                    +and::
                    +
                    +    $ grep nr_active_file /proc/vmstat
                    +    nr_active_file 678001
                    +
                    +and multiply that with MAX_NR_ZONES (which is 4), ie. `nr_active_file * MAX_NR_ZONES`
                    +so 678001*4=2712004  kB
                    +
                    +MAX_NR_ZONES is 4 as per:
                    +`include/generated/bounds.h:10:#define MAX_NR_ZONES 4 /* __MAX_NR_ZONES */`
                    +and is unlikely the change in the future.
                    +
                    +The hub of disk thrashing tests/explanations is here:
                    +https://gist.github.com/constantoverride/84eba764f487049ed642eb2111a20830
                    +
                     user_reserve_kbytes
                     ===================
                     
                    diff --git a/kernel/sysctl.c b/kernel/sysctl.c
                    index 078950d9605b..c2726324a176 100644
                    --- a/kernel/sysctl.c
                    +++ b/kernel/sysctl.c
                    @@ -110,6 +110,15 @@ extern int core_uses_pid;
                     extern char core_pattern[];
                     extern unsigned int core_pipe_limit;
                     #endif
                    +#if defined(CONFIG_RESERVE_ACTIVEFILE_TO_PREVENT_DISK_THRASHING)
                    +unsigned long sysctl_unevictable_activefile_kbytes __read_mostly =
                    +#if CONFIG_RESERVE_ACTIVEFILE_KBYTES < 0
                    +#error "CONFIG_RESERVE_ACTIVEFILE_KBYTES should be >= 0"
                    +#else
                    +  CONFIG_RESERVE_ACTIVEFILE_KBYTES
                    +#endif
                    +;
                    +#endif
                     extern int pid_max;
                     extern int pid_max_min, pid_max_max;
                     extern int percpu_pagelist_fraction;
                    @@ -1691,6 +1701,15 @@ static struct ctl_table vm_table[] = {
                             .extra1        = SYSCTL_ZERO,
                             .extra2        = SYSCTL_ONE,
                         },
                    +#endif
                    +#if defined(CONFIG_RESERVE_ACTIVEFILE_TO_PREVENT_DISK_THRASHING)
                    +    {
                    +        .procname    = "unevictable_activefile_kbytes",
                    +        .data        = &sysctl_unevictable_activefile_kbytes,
                    +        .maxlen        = sizeof(sysctl_unevictable_activefile_kbytes),
                    +        .mode        = 0644,
                    +        .proc_handler    = proc_doulongvec_minmax,
                    +    },
                     #endif
                         {
                             .procname    = "user_reserve_kbytes",
                    diff --git a/mm/Kconfig b/mm/Kconfig
                    index 56cec636a1fc..d21b737ca32e 100644
                    --- a/mm/Kconfig
                    +++ b/mm/Kconfig
                    @@ -63,6 +63,39 @@ config SPARSEMEM_MANUAL
                     
                     endchoice
                     
                    +config RESERVE_ACTIVEFILE_TO_PREVENT_DISK_THRASHING
                    +    bool "Reserve some `Active(file)` to prevent disk thrashing"
                    +    depends on IKCONFIG_PROC && SYSCTL
                    +    def_bool y
                    +    help
                    +      Keep `Active(file)`(/proc/meminfo) pages in RAM so as to avoid system freeze
                    +      due to the disk thrashing(disk reading only) that occurrs because the running
                    +      executables's code is being evicted during low-mem conditions which is
                    +      why it also prevents oom-killer from triggering until 10s of minutes later
                    +      on some systems.
                    +    
                    +      Please see the value of CONFIG_RESERVE_ACTIVEFILE_KBYTES to set how many
                    +      KiloBytes of Active(file) to keep by default in the sysctl setting
                    +      vm.unevictable_activefile_kbytes
                    +      see Documentation/admin-guide/sysctl/vm.rst for more info
                    +
                    +config RESERVE_ACTIVEFILE_KBYTES
                    +    int "Set default value for vm.unevictable_activefile_kbytes"
                    +  depends on RESERVE_ACTIVEFILE_TO_PREVENT_DISK_THRASHING
                    +    default "65536"
                    +    help
                    +      This is the default value(in KiB) that vm.unevictable_activefile_kbytes gets.
                    +      A value of at least 65536 or at most 262144 is recommended for users
                    +      of xfce4 to avoid disk thrashing on low-memory/memory-pressure conditions,
                    +      ie. mouse freeze with constant disk activity (but you can still sysrq+f to
                    +      trigger oom-killer though, even without this mitigation)
                    +    
                    +      You can still sysctl set vm.unevictable_activefile_kbytes to a value of 0
                    +      to disable this whole feature at runtime.
                    +    
                    +      see Documentation/admin-guide/sysctl/vm.rst for more info
                    +      see also CONFIG_RESERVE_ACTIVEFILE_TO_PREVENT_DISK_THRASHING
                    +
                     config DISCONTIGMEM
                         def_bool y
                         depends on (!SELECT_MEMORY_MODEL && ARCH_DISCONTIGMEM_ENABLE) || DISCONTIGMEM_MANUAL
                    diff --git a/mm/vmscan.c b/mm/vmscan.c
                    index dbdc46a84f63..0dcd4e2dc02d 100644
                    --- a/mm/vmscan.c
                    +++ b/mm/vmscan.c
                    @@ -2445,6 +2445,16 @@ static void get_scan_count(struct lruvec *lruvec, struct mem_cgroup *memcg,
                                 BUG();
                             }
                     
                    +#if defined(CONFIG_RESERVE_ACTIVEFILE_TO_PREVENT_DISK_THRASHING)
                    +    extern unsigned int sysctl_unevictable_activefile_kbytes; //FIXME: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]
                    +    if (NR_ACTIVE_FILE == lru) { //FIXME: warning: comparison between ‘enum node_stat_item’ and ‘enum lru_list’ [-Wenum-compare]
                    +      long long kib_active_file_now=global_node_page_state(NR_ACTIVE_FILE) * MAX_NR_ZONES;
                    +      if (kib_active_file_now <= sysctl_unevictable_activefile_kbytes) {
                    +        nr[lru] = 0; //ie. don't reclaim any Active(file) (see /proc/meminfo) if they are under sysctl_unevictable_activefile_kbytes see Documentation/admin-guide/sysctl/vm.rst and CONFIG_RESERVE_ACTIVEFILE_TO_PREVENT_DISK_THRASHING and CONFIG_RESERVE_ACTIVEFILE_KBYTES
                    +        continue;
                    +      }
                    +    }
                    +#endif
                             *lru_pages += size;
                             nr[lru] = scan;
                         }
                    Last edited by howaboutsynergy; 05 November 2019, 09:32 AM. Reason: using archive org url because github account is deleted

                    Comment


                    • howaboutsynergy

                      This patch looks like it could be merged with mainline. Why don't you try sending it to linux-mm?

                      Comment

                      Working...
                      X