MGLRU Could Land In Linux 5.19 For Improving Performance - Especially Low RAM Situations


  • MGLRU Could Land In Linux 5.19 For Improving Performance - Especially Low RAM Situations

    Phoronix: MGLRU Could Land In Linux 5.19 For Improving Performance - Especially Low RAM Situations

    MGLRU is a kernel innovation we've been eager to see merged in 2022 and it looks like that could happen for the next cycle, v5.19, for improving Linux system performance especially in cases of approaching memory pressure...


  • #2
    Xanmod and a couple of other downstream kernels have already carried these patches for some time, so anyone who wants to enjoy the benefits sooner already has options... but here's hoping it finally lands upstream sooner rather than later.
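
    For anyone wanting to check whether the kernel they're already running has it wired up: the MGLRU series adds a runtime switch under /sys/kernel/mm/lru_gen/ (assuming a kernel built with CONFIG_LRU_GEN; the exact interface could still change before it lands upstream). A minimal sketch in C that just reads the knob:

    #include <stdio.h>

    /* Sketch: report whether the multi-gen LRU is present and enabled.
     * Assumes the sysfs path from the MGLRU patch series; the file simply
     * won't exist on kernels built without CONFIG_LRU_GEN. */
    int main(void)
    {
        FILE *f = fopen("/sys/kernel/mm/lru_gen/enabled", "r");
        if (!f) {
            perror("lru_gen not available on this kernel");
            return 1;
        }
        char buf[32] = {0};
        if (fgets(buf, sizeof(buf), f))
            printf("lru_gen enabled mask: %s", buf);
        fclose(f);
        return 0;
    }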

    Comment


    • #3
      Michael

      Typo "evicction" should be "eviction".

      You can't go faster than the speed of light (c). But type fast enough and you can go 2 x c.



      Comment


      • #4
        I'm not an expert, but I'd love to see what would happen if the kernel itself and ALL allocations were 'huge' instead of the processor's native regular page size. Not just 'transparent_hugepage=always', but the whole kernel and system right from boot. I have a feeling that even with Huge Pages being preferred in user space, the TLB probably gets clobbered by activity in 4k areas, including the kernel. I think this would allow for a shallower nesting of page tables and significant performance gains, at the price of some memory efficiency. I'll bet huge pages would compress and swap MUCH better.
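
        Not the whole-system experiment you're describing, but for reference, user space can already ask for explicit huge page mappings today. A minimal sketch (it assumes the hugepage pool has pages reserved, e.g. vm.nr_hugepages > 0, otherwise the mmap fails):

        #define _GNU_SOURCE
        #include <stddef.h>
        #include <stdio.h>
        #include <sys/mman.h>

        /* Minimal sketch: request one explicit 2 MiB huge-page mapping.
         * Fails unless the kernel's hugepage pool has pages reserved. */
        int main(void)
        {
            size_t len = 2 * 1024 * 1024;
            void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                           MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
            if (p == MAP_FAILED) {
                perror("mmap(MAP_HUGETLB)");
                return 1;
            }
            printf("huge-page-backed mapping at %p\n", p);
            munmap(p, len);
            return 0;
        }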

        Comment


        • #5
          Originally posted by ms178 View Post
          Xanmod and a couple of other downstream kernels have already carried these patches for some time, so anyone who wants to enjoy the benefits sooner already has options... but here's hoping it finally lands upstream sooner rather than later.
          If you tried it, is there any appreciable benefit to using Xanmod on an already reasonably fast machine (with enough RAM for my usage)?

          Comment


          • #6
            Sounds really promising. I need to try this on my 4GB MatePad, which has had constant swap hangs.

            Comment


            • #7
              Originally posted by cynic View Post

              If you tried it, is there any appreciable benefit to using Xanmod on an already reasonably fast machine (with enough RAM for my usage)?
              It depends on your use cases and how much performance you need. For me it's the difference between night and day for low-latency gaming, even on a fairly decent machine - low-spec systems should profit even more, as usually every percent of performance translates into a better experience. The default kernels of every major distribution I tried (Ubuntu, openSUSE, Arch) are not optimized at all for gaming.

              As an example, the in-game benchmark from Company of Heroes 2 shows an improvement on my system from around 40-45 fps to 86 fps average just by using the Xanmod kernel and tweaking the kernel config a bit. With a couple of additional patches on top and a hand-tuned toolchain/Mesa/DXVK/x86-64-v3 repository, I recently got to 100 fps. This is all on a Haswell-EP system with a Vega 56.

              All the people who claimed for years that optimizations like these didn't matter, with a mindset that -O2 is enough for the world, were wrong. At least for the best low-latency gaming experience it matters a lot. I have to admit, though, that optimizing a system to this level and keeping it stable is time-consuming and not easy - I had to learn a lot about compilers and how to build them and other performance-sensitive packages properly in the process. I'd rather see the distributions doing that work for a better out-of-the-box experience, e.g. with target-specific ISOs, profile-guided optimizations, etc.

              R41N3R The MGLRU patches should help you quite a bit, as they should reduce those swapping hangs. However, I haven't tested them on my 11-year-old Sandy Bridge notebook with 8GB RAM yet.
              Last edited by ms178; 27 March 2022, 04:53 AM.

              Comment


              • #8
                Originally posted by ms178 View Post

                It depends on your use cases and how much performance you need. For me it's the difference between night and day for low-latency gaming, even on a fairly decent machine - low-spec systems should profit even more, as usually every percent of performance translates into a better experience. The default kernels of every major distribution I tried (Ubuntu, openSUSE, Arch) are not optimized at all for gaming.

                As an example, the in-game benchmark from Company of Heroes 2 shows an improvement on my system from around 40-45 fps to 86 fps average just by using the Xanmod kernel and tweaking the kernel config a bit. With a couple of additional patches on top and a hand-tuned toolchain/Mesa/DXVK/x86-64-v3 repository, I recently got to 100 fps. This is all on a Haswell-EP system with a Vega 56.

                All the people who claimed for years that optimizations like these didn't matter, with a mindset that -O2 is enough for the world, were wrong. At least for the best low-latency gaming experience it matters a lot. I have to admit, though, that optimizing a system to this level and keeping it stable is time-consuming and not easy - I had to learn a lot about compilers and how to build them and other performance-sensitive packages properly in the process. I'd rather see the distributions doing that work for a better out-of-the-box experience, e.g. with target-specific ISOs, profile-guided optimizations, etc.

                R41N3R The MGLRU patches should help you quite a bit, as they should reduce those swapping hangs. However, I haven't tested them on my 11-year-old Sandy Bridge notebook with 8GB RAM yet.
                thank you for the reply!

                I used to tweak my kernel myself, but now I'm getting too old and busy :/
                I guess I should start doing it again.

                Comment


                • #9
                  Originally posted by mangeek View Post
                  I'm not an expert, but I'd love to see what would happen if the kernel itself and ALL allocations were 'huge' instead of the processor's native regular page size. Not just 'transparent_hugepage=always', but the whole kernel and system right from boot. I have a feeling that even with Huge Pages being preferred in user space, the TLB probably gets clobbered by activity in 4k areas, including the kernel. I think this would allow for a shallower nesting of page tables and significant performance gains, at the price of some memory efficiency. I'll bet huge pages would compress and swap MUCH better.
                  It's really not that simple. If it were, it would have been done already.
                  Variable-sized pages complicate a lot of things. THP (or plain hugepages) has been a constant source of strange performance regressions since it was introduced in Linux.
                  IMHO, a larger base page size is better. Like 64K.
                  I prefer predictable and manageable systems over optimized systems whenever the two are hard to combine.
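
                  If the base page size ever does move away from 4K (as it already has on some arm64 and ppc64 distributions), the practical takeaway for user space is to query it instead of hardcoding 4096. A trivial sketch:

                  #include <stdio.h>
                  #include <unistd.h>

                  /* The base page size is a property of the running kernel and
                   * architecture, so ask for it rather than assuming 4096. */
                  int main(void)
                  {
                      long page = sysconf(_SC_PAGESIZE);
                      printf("base page size: %ld bytes\n", page);
                      return 0;
                  }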

                  Comment


                  • #10
                    Originally posted by mangeek
                    the TLB probably gets clobbered by activity in 4k areas, including the kernel. I think this would allow for a shallower nesting of page tables and significant performance gains, at the price of some memory efficiency.
                    It does, and it does. It's not especially "visible" on x86 in most cases, but on weak ARM systems (where you can choose between 4K and 64K pages) the additional overhead is *really* noticeable once you're dealing with any non-small amount of data.

                    As far as swap goes, regardless of how much better you think that swap would compress (so you're obviously expecting zswap/etc, which isn't necessarily desirable), you'll be swapping a lot more "dead" space thanks to the increased page size. Whether that's a net win or not is going to be very system specific, but the additional IO is probably one of the bigger arguments against larger pages by default.
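
                    As a toy illustration of that "dead" space (made-up allocation size, not a measurement) - rounding a small buffer up to the page size wastes far more of the last page at 64K than at 4K:

                    #include <stdio.h>

                    /* Toy calculation: bytes wasted in the last page when an
                     * allocation is rounded up to the page size. */
                    static unsigned long waste(unsigned long bytes, unsigned long page)
                    {
                        unsigned long rounded = (bytes + page - 1) / page * page;
                        return rounded - bytes;
                    }

                    int main(void)
                    {
                        unsigned long sz = 5 * 1024;  /* e.g. a 5 KiB buffer */
                        printf("4K pages:  %lu bytes wasted\n", waste(sz, 4096));
                        printf("64K pages: %lu bytes wasted\n", waste(sz, 65536));
                        return 0;
                    }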

                    Originally posted by milkylainen View Post
                    IMHO, a larger base page size is better. Like 64K.
                    I prefer predictable and manageable systems over optimized systems whenever the two are hard to combine.
                    It does depend a lot on what the machine's doing, but in general, yes: I think that would be a good compromise, and it not being any more prone to random misbehavior than 4K is a compelling argument, provided you can afford the "wastefulness" of it (which just about everything can, even in the embedded space). The problem is, IIRC x86's only options are 4K and 4MB, and that's significantly different. IMO it's a bit of a Goldilocks situation: 4K is too small and as a result has far too much overhead, but 4MB is too large and has too much waste to really be a good default either, in a lot of cases.

                    Comment
