MGLRU Patches Merged To "mm-stable" Ahead Of Linux 6.1 - New Benchmarks Look Good

  • timrichardson
    replied
    Originally posted by yuzhao@chromium.org

    timrichardson

    I vaguely remember you had similar questions. Hope this helps. Thanks.

    From the MGLRU admin guide:

    Thrashing prevention
    --------------------
    Personal computers are more sensitive to thrashing because it can cause janks (lags when rendering UI) and negatively impact user experience. The multi-gen LRU offers thrashing prevention to the majority of laptop and desktop users who do not have ``oomd``.

    Users can write ``N`` to ``min_ttl_ms`` to prevent the working set of ``N`` milliseconds from getting evicted. The OOM killer is triggered if this working set cannot be kept in memory. In other words, this option works as an adjustable pressure relief valve, and when open, it terminates applications that are hopefully not being used.

    Based on the average human detectable lag (~100ms), ``N=1000`` usually eliminates intolerable janks due to thrashing. Larger values like ``N=3000`` make janks less noticeable at the risk of premature OOM kills.

    The default value ``0`` means disabled.
    Just saw this. Thanks for participating in this forum; it is a great honour for Phoronix. Like so many others, I am very pleased to see MGLRU heading to mainstream Linux. Based on my low-memory testing, it is the biggest advance in a decade or longer.


  • arQon
    replied
    Originally posted by HD7950
    Should I disable the systemd-oomd service
    At a minimum, I'd say "obviously so". The whole point of improving memory pressure handling is to, well, handle the pressure. All oomds "handle" pressure by just killing processes semi-randomly based on fairly garbage heuristics, but the systemd one in particular makes famously poor choices about what to kill and may well be just as poor at choosing *when* to do so, i.e. prematurely.

    It's apparently sort-of semi-fixable with enough rework, judging by what Canonical has spent the last few months going through, but I personally wouldn't waste my time on that unless someone was paying me. YMMV.


  • HD7950
    replied
    Originally posted by yuzhao@chromium.org

    If MGLRU and min_ttl_ms work well for you, then feel free to disable/uninstall systemd-oomd.

    Theoretically, 4 devices of 2 GB each at the same priority, e.g., swapon -p 0 /dev/zram[0-3], would work better, since they increase parallelism. (Each zram device has a global lock. Although the lock has a fine granularity, it can still theoretically be contended.)
    Although I have enough memory to not get into those situations in 99.99% of scenarios, I will continue to experiment so that I don't have to hard reset my computer again. Thanks!


  • yuzhao@chromium.org
    replied
    Originally posted by HD7950
    What is your advice then? Should I disable the systemd-oomd service and start using the kernel OOM killer & MGLRU solution? What about the ZRAM size for 32 GB of RAM? 1 device of 8 GB or 4 devices of 2 GB? Which is better?

    Thanks.
    If MGLRU and min_ttl_ms work well for you, then feel free to disable/uninstall systemd-oomd.

    Theoretically, 4 devices of 2 GB each at the same priority, e.g., swapon -p 0 /dev/zram[0-3], would work better, since they increase parallelism. (Each zram device has a global lock. Although the lock has a fine granularity, it can still theoretically be contended.)
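
The multi-device layout suggested above can be sketched as a small helper that prints the setup commands for several equally sized zram swap devices at one priority. The function name `zram_swap_cmds` is made up for illustration, and the sketch assumes the `zram` module accepts the `num_devices` parameter and that `disksize` takes a size suffix such as `2G`, as described in the kernel's zram documentation. It only prints commands, so they can be reviewed before being run as root.

```shell
#!/bin/sh
# Sketch (hypothetical helper): print the commands that would create COUNT
# zram swap devices of SIZE each, all enabled at the same swap priority.
# Nothing is executed here; the output is meant to be reviewed first.
zram_swap_cmds() {
    count="$1"   # number of zram devices, e.g. 4
    size="$2"    # size of each device, e.g. 2G
    prio="$3"    # swap priority shared by all devices, e.g. 0

    # Load the zram module with the requested number of devices.
    echo "modprobe zram num_devices=$count"

    i=0
    while [ "$i" -lt "$count" ]; do
        # Size the device, format it as swap, then enable it.
        echo "echo $size > /sys/block/zram$i/disksize"
        echo "mkswap /dev/zram$i"
        echo "swapon -p $prio /dev/zram$i"
        i=$((i + 1))
    done
}
```

For the 32 GB case discussed here, `zram_swap_cmds 4 2G 0` prints the commands for four 2 GB devices at priority 0. Swap areas that share a priority are used in round-robin fashion, which is where the extra parallelism would come from.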


  • HD7950
    replied
    Originally posted by yuzhao@chromium.org

    oomd and min_ttl_ms are two different approaches:
    1. oomd is a userspace Out-Of-Memory (OOM) killer, and it monitors a kernel metric called Pressure Stall Information (PSI) to decide when to kill. There are two problems with this approach: a) oomd itself can be starved of memory and fail to make progress; b) the PSI thresholds (for oomd to kick in) vary between systems, so they need to be fine-tuned.
    2. min_ttl_ms is a kernel-space feature built on top of MGLRU (it only functions when using MGLRU) and the existing kernel OOM killer. It doesn't monitor PSI or inactive memory size (le9), which also vary between systems. Instead, it shifts the problem to the frequency domain, which makes it easier to solve. min_ttl_ms = minimum time to live in milliseconds, which is easier to tune.
    Since min_ttl_ms is not absolutely required, it's disabled in the kernel. Eventually when most distros pick up MGLRU, they will set it from userspace.
    What is your advice then? Should I disable the systemd-oomd service and start using the kernel OOM killer & MGLRU solution? What about the ZRAM size for 32 GB of RAM? 1 device of 8 GB or 4 devices of 2 GB? Which is better?

    Thanks.


  • yuzhao@chromium.org
    replied
    Originally posted by HD7950

    Thanks. It seems to be working, or at least it worked on the first attempt:

    kswapd0 invoked oom-killer: gfp_mask=0xcc0(GFP_KERNEL), order=0, oom_score_adj=0

    Out of memory: Killed process 171570 (ld.lld) total-vm:22883428kB, anon-rss:17890652kB, file-rss:6100kB, shmem-rss:2065328kB, UID:1000 pgtables:4264 4kB oom_score_adj:200

    To be more precise, I could compile Firefox in RAM a few minutes ago using the default PKGBUILD. The out-of-memory problem and hard reboots happened when I was trying to compile Firefox with LTO. It seems that by changing that value, systemd-oomd is more efficient.

    Why is 0 the default option for min_ttl_ms?
    oomd and min_ttl_ms are two different approaches:
    1. oomd is a userspace Out-Of-Memory (OOM) killer, and it monitors a kernel metric called Pressure Stall Information (PSI) to decide when to kill. There are two problems with this approach: a) oomd itself can be starved of memory and fail to make progress; b) the PSI thresholds (for oomd to kick in) vary between systems, so they need to be fine-tuned.
    2. min_ttl_ms is a kernel-space feature built on top of MGLRU (it only functions when using MGLRU) and the existing kernel OOM killer. It doesn't monitor PSI or inactive memory size (le9), which also vary between systems. Instead, it shifts the problem to the frequency domain, which makes it easier to solve. min_ttl_ms = minimum time to live in milliseconds, which is easier to tune.
    Since min_ttl_ms is not absolutely required, it's disabled in the kernel. Eventually when most distros pick up MGLRU, they will set it from userspace.
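
For reference, the PSI metric that oomd watches is exposed in /proc/pressure/memory on kernels built with PSI support. The sketch below pulls the 10-second average out of one line of that file. `psi_avg10` is a hypothetical helper name, and the optional file argument exists only so the function can be tried on a sample file. It illustrates the kind of number a threshold-based killer reads, not how systemd-oomd itself is implemented.

```shell
#!/bin/sh
# Sketch (hypothetical helper): print the avg10 value from a PSI file.
# Usage: psi_avg10 KIND [FILE]
#   KIND is "some" or "full"; FILE defaults to /proc/pressure/memory.
psi_avg10() {
    kind="$1"
    file="${2:-/proc/pressure/memory}"
    # Each line looks like:
    #   some avg10=0.00 avg60=0.00 avg300=0.00 total=0
    # Select the line whose first field matches KIND, strip the "avg10="
    # prefix from the second field, and print what is left.
    awk -v kind="$kind" '$1 == kind { sub(/^avg10=/, "", $2); print $2 }' "$file"
}
```

A userspace killer would compare a value like `psi_avg10 full` against a threshold, and that threshold is the part that needs per-system tuning. min_ttl_ms avoids it by expressing the limit as a time in milliseconds instead.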
    Last edited by yuzhao@chromium.org; 01 October 2022, 02:42 AM.


  • HD7950
    replied
    Originally posted by yuzhao@chromium.org

    Gotcha. That's what min_ttl_ms is designed for. Usually distros like XanMod hardcode a value (other than the default 0) for you. With the mainline, you have to set it yourself.
    Thanks. It seems to be working, or at least it worked on the first attempt:

    kswapd0 invoked oom-killer: gfp_mask=0xcc0(GFP_KERNEL), order=0, oom_score_adj=0

    Out of memory: Killed process 171570 (ld.lld) total-vm:22883428kB, anon-rss:17890652kB, file-rss:6100kB, shmem-rss:2065328kB, UID:1000 pgtables:4264 4kB oom_score_adj:200

    To be more precise, I could compile Firefox in RAM a few minutes ago using the default PKGBUILD. The out-of-memory problem and hard reboots happened when I was trying to compile Firefox with LTO. It seems that by changing that value, systemd-oomd is more efficient.

    Why is 0 the default option for min_ttl_ms?
    Last edited by HD7950; 01 October 2022, 01:52 AM.


  • yuzhao@chromium.org
    replied
    Originally posted by HD7950

    What I really wanted to know/test is if with MGLRU and ZRAM I could avoid once and for all hard reboots due to lack of memory.

    If more memory is needed at some point, I'm fine if the most memory consuming processes are killed, but what I don't want is a totally locked system.

    I'm compiling Firefox again with min_ttl_ms changed to 1000. Let's see what happens.
    Gotcha. That's what min_ttl_ms is designed for. Usually distros like XanMod hardcode a value (other than the default 0) for you. With the mainline, you have to set it yourself.

    timrichardson

    I vaguely remember you had similar questions. Hope this helps. Thanks.

    From the MGLRU admin guide:

    Thrashing prevention
    --------------------
    Personal computers are more sensitive to thrashing because it can cause janks (lags when rendering UI) and negatively impact user experience. The multi-gen LRU offers thrashing prevention to the majority of laptop and desktop users who do not have ``oomd``.

    Users can write ``N`` to ``min_ttl_ms`` to prevent the working set of ``N`` milliseconds from getting evicted. The OOM killer is triggered if this working set cannot be kept in memory. In other words, this option works as an adjustable pressure relief valve, and when open, it terminates applications that are hopefully not being used.

    Based on the average human detectable lag (~100ms), ``N=1000`` usually eliminates intolerable janks due to thrashing. Larger values like ``N=3000`` make janks less noticeable at the risk of premature OOM kills.

    The default value ``0`` means disabled.
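
As a minimal sketch of the guide's "write ``N`` to ``min_ttl_ms``" step, the helper below validates the value and writes it to the sysfs file mentioned elsewhere in this thread (/sys/kernel/mm/lru_gen/min_ttl_ms). The function name `set_min_ttl_ms` and the `LRU_GEN_DIR` override are assumptions added for illustration, so the function can be exercised against a scratch directory. Neither is part of the kernel interface.

```shell
#!/bin/sh
# Sketch (hypothetical helper): set MGLRU thrashing prevention.
# LRU_GEN_DIR can be overridden to point at a scratch directory for testing.
LRU_GEN_DIR="${LRU_GEN_DIR:-/sys/kernel/mm/lru_gen}"

# set_min_ttl_ms N: write N (milliseconds) to min_ttl_ms; 0 disables it.
set_min_ttl_ms() {
    ttl="$1"

    # Accept only a non-negative integer number of milliseconds.
    case "$ttl" in
        ''|*[!0-9]*)
            echo "min_ttl_ms must be a non-negative integer" >&2
            return 2
            ;;
    esac

    # The file is missing when the running kernel has no MGLRU support.
    if [ ! -e "$LRU_GEN_DIR/min_ttl_ms" ]; then
        echo "MGLRU not available: $LRU_GEN_DIR/min_ttl_ms not found" >&2
        return 1
    fi

    printf '%s\n' "$ttl" > "$LRU_GEN_DIR/min_ttl_ms"
}
```

Run as root, `set_min_ttl_ms 1000` corresponds to the ``N=1000`` case described in the guide. A value written to sysfs does not survive a reboot, so it would have to be reapplied at boot.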
    Last edited by yuzhao@chromium.org; 01 October 2022, 12:33 AM.


  • HD7950
    replied
    Originally posted by yuzhao@chromium.org

    So you want to test whether livelock can be avoided with MGLRU? (It's still not clear to me what exactly you want to test.)

    If so, did you set min_ttl_ms? E.g., echo 1000 >/sys/kernel/mm/lru_gen/min_ttl_ms
    What I really wanted to know/test is if with MGLRU and ZRAM I could avoid once and for all hard reboots due to lack of memory.

    If more memory is needed at some point, I'm fine if the most memory consuming processes are killed, but what I don't want is a totally locked system.

    I'm compiling Firefox again with min_ttl_ms changed to 1000. Let's see what happens.


  • yuzhao@chromium.org
    replied
    Originally posted by HD7950

    This was an attempt to experiment with ZRAM & MGLRU and check if hard reboots could finally be avoided on Linux.
    So you want to test whether livelock can be avoided with MGLRU? (It's still not clear to me what exactly you want to test.)

    If so, did you set min_ttl_ms? E.g., echo 1000 >/sys/kernel/mm/lru_gen/min_ttl_ms
