Yes, Linux Does Bad In Low RAM / Memory Pressure Situations On The Desktop


  • #31
    Originally posted by pininety

    The system should start by messaging processes and clearing up cached pages. At least, all multi-user systems I have seen up to now do not do this, which means that using something like mmap to map huge files partially into memory becomes pointless, because the pages never become unmapped again even if you read the data only once. The only way is to manually close the mmap and open it again with an offset. If the system would clear cached pages, it could automatically unload the ones that were used the least, making more space for other stuff. Not 100% sure if this is the Linux kernel messing up or the sysadmins misconfiguring their systems, but it is bloody annoying if you deal with bigger-than-RAM data files.
    echo 3 > /proc/sys/vm/drop_caches
    It is also, in the end, exactly what the kernel does when it encounters allocations that need to be resident (RSS): it slowly drops caches to fill RAM with resident pages. Eventually there is little to no cache left, and (more or less) everything remaining is pages that were explicitly asked to be resident.
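
    On the mmap point in the quote: the kernel does take per-range hints, so you do not have to close the mapping and re-open it at an offset. A minimal sketch, assuming a large read-only file (the file name "bigdata.bin" and the 64 MiB chunk size are made up for illustration), that walks the file once and tells the kernel to let go of each chunk afterwards, via madvise(MADV_DONTNEED) on the mapping and posix_fadvise(POSIX_FADV_DONTNEED) on the page cache:

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("bigdata.bin", O_RDONLY);
        if (fd < 0) { perror("open"); return 1; }

        struct stat st;
        if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }

        unsigned char *map = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (map == MAP_FAILED) { perror("mmap"); return 1; }

        /* We read sequentially: makes read-ahead aggressive and already
         * consumed pages cheap to reclaim. */
        madvise(map, st.st_size, MADV_SEQUENTIAL);

        const size_t chunk = 64UL << 20;          /* process 64 MiB at a time */
        unsigned long long sum = 0;
        for (off_t off = 0; off < st.st_size; off += chunk) {
            size_t len = (st.st_size - off) < (off_t)chunk
                       ? (size_t)(st.st_size - off) : chunk;
            for (size_t i = 0; i < len; i++)
                sum += map[off + i];

            /* Done with this chunk: drop it from our mapping... */
            madvise(map + off, len, MADV_DONTNEED);
            /* ...and ask the page cache to let go of it as well. */
            posix_fadvise(fd, off, len, POSIX_FADV_DONTNEED);
        }

        printf("checksum: %llu\n", sum);
        munmap(map, st.st_size);
        close(fd);
        return 0;
    }

    Whether the kernel actually reclaims those pages immediately is still up to it, but a single streaming pass done this way at least stops competing with everything else for the page cache.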

    Comment


    • #32
      Originally posted by milkylainen

      Umm, no. He is complaining that the situation gets out of hand.
      It's a technical analysis situation, not that he cannot enable swap.
      Ladies and gentlemen, the defence intentionally disabled swap on a low-RAM machine.
      I rest my case.

      Comment


      • #33
        This has been a pet peeve of mine and my number one annoyance with Linux on the desktop for as long as I have been running it on a regular basis.

        It is obviously worse on the desktop, where workloads are varied in nature, but I have also had the problem on embedded boards and servers. Having to physically reboot your server because memory pressure doesn't let you SSH in should be a big no-no. I know I could have used a hypervisor, but come on... this is a small home server.

        Also, I have to hit SysRq+F a couple of times (2-4) a day while working. THAT IS A SERIOUS ISSUE. Luckily it's the browser that gets killed most of the time, and "Form history control" helps me not lose data.

        One of the issues at play is that there is no easy way for an app developer to know when the system is under memory pressure. The only thing you're told is that malloc() can fail. In practice, it never fails: 90% of the time you just end up being OOM-killed after you have exhausted the whole of memory and swap, usually two hours after you filled them both, at which point you cannot even move the mouse cursor or switch to a TTY.
        I once wrote an app to simulate chemical reactions. Depending on the timestep, etc., it could take huge amounts of RAM, and I ended up leaving it that way, as it was difficult to know how much RAM I could work with beforehand (or in real time) anyway.

        We need better defaults, and better tools to deal with that.

        I had never heard of PSI; I will investigate (a rough sketch of how it can be used follows after this list). But a couple of points stand:
        • Recovery options (SSH, switching VT, launching basic programs such as "ps", "kill", or even "htop") should always work.
        • Desktop environments know better than the kernel how to prioritize resource allocation, so give them tools to provide hints and to preemptively swap applications out (or back in).
        Basically, I think namespaces should be extended to the other resources (that might be what cgroups are for; I'm not familiar with them yet): give a slice of memory to the user session, and the same for networking. Reset capabilities under the namespace so that a process can schedule everything the way it wants, create sub-namespaces, and drop capabilities as needed. Those capabilities could increase swappiness on the fly for a process in the namespace, send signals related to resource pressure (memory, I/O, network), and throttle everything.
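
        For the record, after a first look at PSI: the kernel exposes system-wide pressure in /proc/pressure/memory (and per-cgroup in memory.pressure with cgroup v2), and since roughly kernel 5.2 userspace can register a trigger there and get poll() wakeups when a stall threshold is crossed. A minimal sketch, closely modelled on the kernel's PSI documentation; the "150 ms stalled per 1 s window" threshold is arbitrary:

        #include <errno.h>
        #include <fcntl.h>
        #include <poll.h>
        #include <stdio.h>
        #include <string.h>
        #include <unistd.h>

        int main(void)
        {
            /* Wake us up when at least one task ("some") was stalled on memory
             * for more than 150 ms within any 1-second window. */
            const char trigger[] = "some 150000 1000000";
            struct pollfd fds;

            fds.fd = open("/proc/pressure/memory", O_RDWR | O_NONBLOCK);
            if (fds.fd < 0) {
                fprintf(stderr, "open: %s\n", strerror(errno));
                return 1;
            }
            fds.events = POLLPRI;

            if (write(fds.fd, trigger, strlen(trigger) + 1) < 0) {
                fprintf(stderr, "write trigger: %s\n", strerror(errno));
                return 1;
            }

            for (;;) {
                int n = poll(&fds, 1, -1);
                if (n < 0) {
                    fprintf(stderr, "poll: %s\n", strerror(errno));
                    return 1;
                }
                if (fds.revents & POLLERR) {
                    fprintf(stderr, "trigger file went away\n");
                    return 1;
                }
                if (fds.revents & POLLPRI) {
                    /* This is where a daemon or DE could start shedding load:
                     * notify apps, swap out background windows, etc. */
                    printf("memory pressure event\n");
                }
            }
        }

        This is exactly the kind of hook a desktop environment could use to ask applications to shed caches before the whole session stalls.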

        It's about time we finally got QoS in these areas as well! I am quite pleased to see the issue being acknowledged, and I hope we'll get some meaningful solution out of this. I've been contemplating getting my hands dirty on this myself.

        Bonus: a selection of previous rants/mentions of the problem I've made:
        https://www.phoronix.com/forums/foru...45#post1084845 (yeah, IO and swap are terrible)
        https://www.phoronix.com/forums/foru...47#post1085047 (worse without swap, compressed memory fares better)
        https://www.phoronix.com/forums/foru...98#post1055698 (includes some possible solutions I thought were interesting)
        https://www.phoronix.com/forums/foru...92#post1104492 (My father is a non-techie and was hit by this)

        Comment


        • #34
          Originally posted by tildearrow

          Sadly, this happened to me even with swap on.
          Because your machine cannot possibly have issues of its own. Also, swap is not a substitute for proper hardware.

          Comment


          • #35
            Originally posted by nivedita

            Executables and shared libraries are paged into memory, and can be paged out even with no swap.

            I think that’s part of the reason this is being considered a bug. The kernel is dumping those pages and likely immediately reading them back in when execution continues.
            Yes, that's what's happening here, IIRC. The stack is swapped out and re-read from disk, which obviously slows everything down to a crawl. And Linux doesn't even honor the sticky bit, so you have no recourse but to wait for the OOM reaper, hard-reboot, or SysRq+F your way out of this before you can even SSH in. (The fun part is that you technically can SSH in, but the handshake takes longer than the timeout.)

            Comment


            • #36
              Originally posted by nivedita
              I think that’s part of the reason this is being considered a bug. The kernel is dumping those pages and likely immediately reading them back in when execution continues.
              Agreed 100%. I think this could be considered a behavioral bug in the heuristic overcommit: it commits, finds itself under pressure, then keeps dumping pages and re-reading them, ending up in a severe (maybe unresolvable?) pressure stall instead of the application stopping when the allocation cannot actually be satisfied. In reality most programs have shitty handling of allocation failures, so maybe the kernel should invoke the OOM killer on the runaway process causing the situation? At least as an option? echo 666 > /proc/sys/vm/overcommit_memory ?
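
              Short of a new overcommit mode, you can already tell the existing OOM killer which process you consider the likely runaway, by raising its /proc/<pid>/oom_score_adj (+1000 makes it the preferred victim, -1000 exempts it). A minimal sketch; the PID is taken from the command line:

              #include <stdio.h>

              int main(int argc, char **argv)
              {
                  if (argc != 2) {
                      fprintf(stderr, "usage: %s <pid>\n", argv[0]);
                      return 1;
                  }

                  char path[64];
                  snprintf(path, sizeof(path), "/proc/%s/oom_score_adj", argv[1]);

                  FILE *f = fopen(path, "w");
                  if (!f) { perror("fopen"); return 1; }

                  /* Make this process the first thing the OOM killer reaches for. */
                  if (fprintf(f, "1000\n") < 0) { perror("fprintf"); return 1; }
                  if (fclose(f) != 0) { perror("fclose"); return 1; }
                  return 0;
              }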

              Comment


              • #37
                Originally posted by M@yeulC

                Yes, that's what's happening here, IIRC. The stack is swapped out and re-read from disk, which obviously slows everything down to a crawl. And Linux doesn't even honor the sticky bit, so you have no recourse but to wait for the OOM reaper, hard-reboot, or SysRq+F your way out of this before you can even SSH in. (The fun part is that you technically can SSH in, but the handshake takes longer than the timeout.)
                Um, no. Stack pages cannot be swapped out if swap is turned off. Only read-only pages, basically the text pages, can be.
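
                Which suggests a band-aid for the "can't even SSH in" scenario: a critical process can pin its own text and data so those pages are never candidates for eviction. A minimal sketch, assuming the process has CAP_IPC_LOCK or a big enough RLIMIT_MEMLOCK:

                #include <stdio.h>
                #include <sys/mman.h>

                int main(void)
                {
                    /* Pin everything mapped now and everything mapped later. */
                    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
                        perror("mlockall");
                        return 1;
                    }
                    puts("pages locked; this process will not be paged out");
                    /* ... do the actual work of the rescue shell / monitor here ... */
                    return 0;
                }

                A small rescue shell or monitoring agent locked this way stays responsive even while everything else is thrashing.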

                Comment


                • #38
                  Originally posted by M@yeulC
                  One of the issues at play is that there is no easy way for an app developer to know when the system is under memory pressure. The only thing you're told is that malloc() can fail. In practice, it never fails.
                  You can disable the default overcommit and hope your applications have a sane memory-allocation-failure strategy. With overcommit off, pressure should cause malloc to fail once there are no caches left to free. Otherwise it's probably a bug, or at least very odd behavior: without overcommit, the kernel should not be paging out seldom-used pages to resolve memory pressure.
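
                  For what that "sane failure strategy" could look like with vm.overcommit_memory=2: a rough sketch in which the application sheds its own caches before giving up on an allocation. drop_internal_cache() is a made-up placeholder for whatever the app can afford to throw away:

                  #include <stdio.h>
                  #include <stdlib.h>

                  /* Placeholder: a real app would free decoded images, parsed documents,
                   * undo history, etc. Returns nonzero if anything was released. */
                  static int drop_internal_cache(void)
                  {
                      return 0;
                  }

                  static void *robust_malloc(size_t n)
                  {
                      for (;;) {
                          void *p = malloc(n);
                          if (p)
                              return p;
                          if (!drop_internal_cache())
                              return NULL;   /* nothing left to shed: let the caller degrade */
                      }
                  }

                  int main(void)
                  {
                      void *buf = robust_malloc(64UL << 20);   /* try a 64 MiB working buffer */
                      if (!buf) {
                          fputs("allocation failed, running in degraded mode\n", stderr);
                          return 0;
                      }
                      free(buf);
                      return 0;
                  }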

                  Comment


                  • #39
                    Originally posted by milkylainen

                    You can disable the default overcommit and hope your applications have a sane memory-allocation-failure strategy. With overcommit off, pressure should cause malloc to fail once there are no caches left to free. Otherwise it's probably a bug, or at least very odd behavior: without overcommit, the kernel should not be paging out seldom-used pages to resolve memory pressure.
                    I agree, but since it is not doable on a per-app basis (that I know of), it isn't really within the app developer's responsibility. And I don't see, for instance, a game or web-browser developer telling the user to turn off the overcommit mechanism (if those apps would even work without it in the first place).
                    If there were a signal during high memory pressure (or really, any resource-contention issue), the application could drop some of its cached data: the browser could unload some tabs; the game... I don't know, reduce the corpse count or the AI's proficiency?
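
                    That per-app angle does partly exist with cgroup v2: every cgroup gets its own memory.pressure file (which should be pollable the same way as /proc/pressure/memory earlier in the thread), and setting memory.high on an app's cgroup makes reclaim pressure hit that app early, so it gets the signal before the whole session stalls. A minimal sketch of the memory.high side; the browser-scope cgroup path is invented for illustration, and writing it needs root or cgroup delegation:

                    #include <stdio.h>

                    int main(void)
                    {
                        /* Hypothetical path: a per-app scope under the user's systemd slice. */
                        const char *path = "/sys/fs/cgroup/user.slice/user-1000.slice/"
                                           "app-browser.scope/memory.high";

                        FILE *f = fopen(path, "w");
                        if (!f) { perror("fopen"); return 1; }

                        /* Start throttling and reclaiming this app above 2 GiB. */
                        if (fprintf(f, "%llu\n", 2ULL << 30) < 0) { perror("fprintf"); return 1; }
                        if (fclose(f) != 0) { perror("fclose"); return 1; }
                        return 0;
                    }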

                    Comment


                    • #40
                      The problem is that the OOM killer sucks at doing what it's supposed to do. Kill the browser I'm using when I run out of memory, for all I care; just do something rather than wasting 30 minutes constantly moving binaries in and out of memory before the kernel decides to kill something.
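
                      Which is exactly the gap userspace tools like earlyoom try to fill: watch /proc/meminfo and act long before the kernel's own heuristics give up. A stripped-down sketch of the monitoring half; the 200 MiB threshold is arbitrary, and a real tool would go on to pick a victim (e.g. the largest process) and SIGKILL it rather than just complain:

                      #include <stdio.h>
                      #include <unistd.h>

                      /* Parse MemAvailable (in KiB) out of /proc/meminfo; -1 on failure. */
                      static long mem_available_kib(void)
                      {
                          FILE *f = fopen("/proc/meminfo", "r");
                          if (!f)
                              return -1;

                          char line[256];
                          long kib = -1;
                          while (fgets(line, sizeof(line), f)) {
                              if (sscanf(line, "MemAvailable: %ld kB", &kib) == 1)
                                  break;
                          }
                          fclose(f);
                          return kib;
                      }

                      int main(void)
                      {
                          const long threshold_kib = 200 * 1024;   /* act below ~200 MiB available */

                          for (;;) {
                              long avail = mem_available_kib();
                              if (avail >= 0 && avail < threshold_kib)
                                  fprintf(stderr, "low memory: %ld KiB available, "
                                                  "time to kill something\n", avail);
                              sleep(1);
                          }
                      }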

                      Comment
