Facebook Developing "OOMD" For Out-of-Memory User-Space Linux Daemon

  • andreano
    replied
    Is there anything like a `nice` command for the OOM score, so it's possible to run a build system safely?

    I think the typical way I run out of memory is that I (momentarily) get a process tree a bit like this

    Code:
    make -j8
    ├─ ninja -j8
    │  └─ ...
    ├─ ninja -j8
    │  └─ ...
    ├─ ninja -j8
    │  └─ ...
    ├─ ninja -j8
    │  └─ ...
    ├─ ninja -j8
    │  └─ ...
    ├─ ninja -j8
    │  └─ ...
    ├─ ninja -j8
    │  └─ ...
    └─ ninja -j8
       └─ ...
    ↑ That's 64 jobs instead of the intended 8.

    And somebody needs to land a patch in make or ninja that relaxes the number of jobs when they start getting swapped out or something. There is no point in running more jobs than you have RAM for.
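
    (To be concrete about what I'm after: the closest thing I know of is the per-process oom_score_adj knob in /proc, which children inherit across fork/exec, so a `nice`-style wrapper is only a couple of lines; newer util-linux apparently ships a choom(1) command for the same job. A minimal sketch:)

    Code:
    # mark this shell and everything it spawns as a preferred OOM victim;
    # the valid range is -1000 (never kill) to +1000 (kill first)
    echo 500 > /proc/self/oom_score_adj
    make -j8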
    Last edited by andreano; 22 October 2018, 01:15 PM.



  • Guest
    Guest replied
    So why not just contribute to the kernel?



  • schmidtbag
    replied
    Originally posted by Spooktra:
    When I went with my R5 1600, because I was on a tight budget and only bought 8 GB of DDR4, I set Ubuntu up with a 20 GB swap file on a fast SSD and a 20 GB swap file on a spinning rust drive. Guess what? No more lockups or system freezes. Even when it's using all available system RAM, 10 GB of swap, and nearly 2 GB of VRAM (out of 2 GB total), the system stays responsive, and even if one app freezes, it recovers in less than a minute.

    I don't think there's anything wrong with the way the Linux kernel handles memory; I think the problem rests with the way people configure their systems.
    Well, let's put it this way: have you ever run out of swap space? Because if your swap ever gets maxed out, your system will be even more unresponsive than if you didn't have it at all. Keep in mind that the rate at which RAM fills up seems to correlate with how unresponsive your system becomes, especially if you're using a swap drive. I assume you would probably get by just fine with 16 GB of RAM, no swap, and a discrete GPU. Furthermore, increasing your swappiness may help with long-term performance. By default, Linux is reluctant to swap until it's under real memory pressure, but you can tell it to start swapping sooner, much like Windows does. This is good if you expect to run out of RAM, but it obviously hurts short-term performance.
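
    (For reference, the knob in question is vm.swappiness; a quick way to experiment with it, assuming a distro that reads /etc/sysctl.d:)

    Code:
    # check the current value (60 on most distros)
    cat /proc/sys/vm/swappiness
    # swap more eagerly, for this boot only
    sudo sysctl vm.swappiness=100
    # make it persistent across reboots
    echo 'vm.swappiness=100' | sudo tee /etc/sysctl.d/99-swappiness.conf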

    Also I'm a little bit confused - if you have a 1600, how exactly are you dedicating 2GB of RAM to the GPU? I ask because the 1600 doesn't have an IGP, and GPUs aren't used for system memory*. Or am I just misunderstanding what you meant there?


    * There is one exception:
    Last edited by schmidtbag; 22 October 2018, 09:39 AM.



  • nanonyme
    replied
    Originally posted by chithanh:
    I think the OOM analogy which Andries Brouwer came up with in 2004 is still the best one:

    [Brouwer's lkml analogy: an airline saves money by flying with too little fuel and, when it runs short mid-flight, ejects a passenger]

    That's especially funny since airlines routinely sell more tickets for a flight than there are seats, because on average enough people cancel late that everyone fits anyway. If the plane is full, people get bumped to the next flight (though not terminated).



  • Spooktra
    replied
    My experience with OOM and low-RAM scenarios: my primary system used to be a Haswell-based Xeon with 16 GB of RAM and an 8 GB swap file, and I used to lock that system up very easily by opening a bunch of Firefox tabs while starting a video encoding job and trying to play a 4K video at the same time. I can't remember how many hard reboots I had to do to recover the system.

    When I went with my R5 1600, because I was on a tight budget and only bought 8 GB of DDR4, I set Ubuntu up with a 20 GB swap file on a fast SSD and a 20 GB swap file on a spinning rust drive. Guess what? No more lockups or system freezes. Even when it's using all available system RAM, 10 GB of swap, and nearly 2 GB of VRAM (out of 2 GB total), the system stays responsive, and even if one app freezes, it recovers in less than a minute.

    I don't think there's anything wrong with the way the Linux kernel handles memory; I think the problem rests with the way people configure their systems.



  • skeevy420
    replied
    What's the point when we can always download more ram?

    But seriously, things like better-tuned default settings for your hardware (like the min_free_kbytes suggestion above), putting swap on an SSD, or setting memory limits with cgroups, chpst, node binding with numactl, etc. before running high-usage programs seem like a better solution to me -- set yourself up so that you're covered if something fails, and then you won't need failure mechanisms.
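
    (A concrete version of the cgroup idea, assuming systemd on a unified/cgroup-v2 hierarchy; the 6G/2G limits are just placeholders:)

    Code:
    # run the build in a transient scope with a hard memory ceiling, so the kernel
    # OOM-kills inside that scope instead of taking the whole desktop down with it
    sudo systemd-run --scope -p MemoryMax=6G -p MemorySwapMax=2G make -j8
    # or bind its allocations to a single NUMA node
    numactl --membind=0 make -j8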



  • schmidtbag
    replied
    I agree with others that one of the things Linux is especially bad at is figuring out how to handle OOM situations. It's one of the very few things Windows seems to have gotten right a looong time ago. That being said, the Windows approach isn't exactly perfect either, but at least the system [might] remain usable in the event a program has a memory leak.

    What I think the default Linux behavior should be is to SIGSTOP a process when it is about to push total memory usage (not including buffers) above 99% and it ranks within the top 10 memory-consuming processes. That way, your whole system should still remain usable, you don't lose everything that process was doing (though any new incoming data might be lost), and you get a chance to recover while still seeing what's going on. Then, if you feel like risking it, you can SIGCONT the process whenever you're ready.
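
    (A crude sketch of the idea, collapsing the 99% / top-10 rules into "freeze the single biggest process when MemAvailable drops below 1% of MemTotal":)

    Code:
    # freeze the single biggest memory hog once available memory gets critically low
    avail_kb=$(awk '/MemAvailable/ {print $2}' /proc/meminfo)
    total_kb=$(awk '/MemTotal/ {print $2}' /proc/meminfo)
    if [ $((avail_kb * 100 / total_kb)) -lt 1 ]; then
        pid=$(ps -eo pid --sort=-rss | awk 'NR==2 {print $1}')
        kill -STOP "$pid"   # inspect, then resume later with: kill -CONT "$pid"
    fi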



  • ynari
    replied
    Sounds like a good idea if implemented properly, i.e. where the processes to kill can be chosen. The kernel OOM killer is one of the most bonkers parts of Linux. Yes, I can see there are ways to disable it, or to mark some processes that shouldn't be killed, but it shouldn't be possible to get into that situation in the first place. The default should be to only kill processes that are identified as disposable, not to select one at random.

    As far as I'm aware, the OOM killer is also a hard kill, and what's really needed is something that will allow for an orderly process shutdown.
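
    (For the "shouldn't be killed" part, the knob that already exists is oom_score_adj; writing -1000 exempts a process from the kernel OOM killer entirely. Using the X server as an arbitrary example:)

    Code:
    # exempt an already-running process from the kernel OOM killer
    echo -1000 | sudo tee /proc/$(pidof -s Xorg)/oom_score_adj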



  • M@yeulC
    replied
    I gave a lot of thought to that problem. Here are my two cents:
    • Didn't distributions investigate increasing the default swappiness recently? Preemptively swapping out unused programs could make the system a lot more responsive.
    • I would like to see a few memory-management-related signals being introduced: about to swap, under memory pressure, etc. This would be useful for applications to delete some data when that helps, and I believe a similar mechanism exists on Android. Sometimes deleting cached data is cheaper than swapping it out. (The pressure-reading sketch after this list is the kind of kernel-side building block I have in mind.)
    • The shell (desktop environment or otherwise) should handle these signals and dispatch them accordingly (the default handler should broadcast to children). It could decide to send them to arbitrary processes.
    • A way to reliably estimate memory usage per application. As mentioned in this thread, it would be nice if the DE could detect high memory usage by an application, send it a SIGSTOP, and prompt the user whether to continue (best used together with the above, as a last resort after notifying the other applications doesn't improve the situation).
    • The desktop environment should be able to send hints about which programs to swap or not to swap. Please stop swapping out the compositor, for one. DEs likely have a good idea which programs are being used actively, which are background tasks, and which are required to retain interactivity.
    • Edit: DEs could already set the OOM priority among their child processes, couldn't they?
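
    (The closest existing kernel-side signal I'm aware of is the pressure-stall information that oomd itself polls; it assumes a kernel with the brand-new PSI patches:)

    Code:
    # memory pressure as oomd itself sees it (PSI interface, very recent kernels)
    cat /proc/pressure/memory
    # prints a "some avg10=... avg60=... avg300=... total=..." line plus a "full ..." line;
    # a rising "full" share means tasks are stalling on memory and it is time to react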
    Last edited by M@yeulC; 22 October 2018, 07:16 AM.



  • ssokolow
    replied
    Sounds like a more featureful version of the earlyoom daemon I use to keep leaks while I'm developing from sometimes essentially locking up my 16GiB system. ("Essentially" because I don't notice until the system starts thrashing and, once it's begun thrashing, it can go for hours without becoming responsive again on its own.)

    (Basically, I configured it so that, if memory consumption passes 90%, it picks one of the largest processes and kills it, preferring Firefox and Chrome content processes in the event there are several qualifying processes to choose from. That leaves me 10% of 16GiB guaranteed for disk cache on my system where all of the SATA ports plus one USB 3.0 port are occupied by high-capacity rotating platter drives.)
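
    (Roughly the relevant bit of my setup; the exact flag names may differ between earlyoom versions, and the regex is just an illustration:)

    Code:
    # kick in when less than 10% of RAM is available, preferring browser content processes
    earlyoom -m 10 --prefer '(Web Content|chrome|chromium)'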
    Last edited by ssokolow; 22 October 2018, 06:24 AM.

