MGLRU Could Land In Linux 5.19 For Improving Performance - Especially Low RAM Situations


  • MGLRU Could Land In Linux 5.19 For Improving Performance - Especially Low RAM Situations

    Phoronix: MGLRU Could Land In Linux 5.19 For Improving Performance - Especially Low RAM Situations

    MGLRU is a kernel innovation we've been eager to see merged in 2022 and it looks like that could happen for the next cycle, v5.19, for improving Linux system performance especially in cases of approaching memory pressure...


  • #2
    Xanmod and a couple of other downstream kernels have already carried these patches for some time, so anyone who wants to enjoy the benefits sooner already has options... but here's hoping it finally lands upstream sooner rather than later.
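
    For anyone wanting to check whether the kernel they're already running has it wired up: the MGLRU series adds a runtime switch under /sys/kernel/mm/lru_gen/ (assuming a kernel built with CONFIG_LRU_GEN; the exact interface could still change before it lands upstream). A minimal sketch in C that just reads the knob:

    #include <stdio.h>

    /* Sketch: report whether the multi-gen LRU is present and enabled.
     * Assumes the sysfs path from the MGLRU patch series; the file simply
     * won't exist on kernels built without CONFIG_LRU_GEN. */
    int main(void)
    {
        FILE *f = fopen("/sys/kernel/mm/lru_gen/enabled", "r");
        if (!f) {
            perror("lru_gen not available on this kernel");
            return 1;
        }
        char buf[32] = {0};
        if (fgets(buf, sizeof(buf), f))
            printf("lru_gen enabled mask: %s", buf);
        fclose(f);
        return 0;
    }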

    Comment


    • #3
      Michael

      Typo "evicction" should be "eviction".

      You can't go faster than the speed of light (c). But type fast enough and you can go 2 x c.



      Comment


      • #4
        I'm not an expert, but I'd love to see what would happen if the kernel itself and ALL allocations were 'huge' instead of the processor's native regular page size. Not just 'transparent_hugepage=always', but the whole kernel and system right from boot. I have a feeling that even with Huge Pages being preferred in user space, the TLB probably gets clobbered by activity in 4k areas, including the kernel. I think this would allow for a shallower nesting of page tables and significant performance gains, at the price of some memory efficiency. I'll bet huge pages would compress and swap MUCH better.
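
        Not the whole-system experiment you're describing, but for reference, user space can already ask for explicit huge page mappings today. A minimal sketch (it assumes the hugepage pool has pages reserved, e.g. vm.nr_hugepages > 0, otherwise the mmap fails):

        #define _GNU_SOURCE
        #include <stddef.h>
        #include <stdio.h>
        #include <sys/mman.h>

        /* Minimal sketch: request one explicit 2 MiB huge-page mapping.
         * Fails unless the kernel's hugepage pool has pages reserved. */
        int main(void)
        {
            size_t len = 2 * 1024 * 1024;
            void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                           MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
            if (p == MAP_FAILED) {
                perror("mmap(MAP_HUGETLB)");
                return 1;
            }
            printf("huge-page-backed mapping at %p\n", p);
            munmap(p, len);
            return 0;
        }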

        Comment


        • #5
          Originally posted by ms178 View Post
          Xanmod and a couple of other downstream kernels have already carried these patches for some time, so anyone who wants to enjoy the benefits sooner already has options... but here's hoping it finally lands upstream sooner rather than later.
          If you tried it, is there any appreciable benefit to using Xanmod on an already reasonably fast machine (with enough RAM for my usage)?

          Comment


          • #6
            Sounds really promising. I need to try this on my 4GB MatePad, which has had constant swap hangs.

            Comment


            • #7
              Originally posted by cynic View Post

              If you tried it, is there any appreciable benefit to using Xanmod on an already reasonably fast machine (with enough RAM for my usage)?
              It depends on your use cases and how much performance you need. For me it's the difference between night and day for low-latency gaming, even on a fairly decent machine - low-spec systems should profit even more, as usually every percent of performance translates into a better experience. The default kernels of every major distribution I tried (Ubuntu, openSUSE, Arch) are not optimized at all for gaming.

              As an example, the in-game benchmark from Company of Heroes 2 shows an improvement on my system from around 40-45 fps to 86 fps average just by using the Xanmod kernel and tweaking the kernel config a bit. With a couple of additional patches on top and a hand-tuned toolchain/Mesa/DXVK/x86-64-v3 repository, I recently got to 100 fps. This is all on a Haswell-EP system with a Vega 56.

              All the people who claimed for years that optimizations like these didn't matter, with a mindset that -O2 is enough for the world, were wrong. At least for the best low-latency gaming experience it matters a lot. I have to admit, though, that optimizing a system to this level and keeping it stable is time-consuming and not easy - I had to learn a lot about compilers and how to build them and other performance-sensitive packages properly in the process. I'd rather see the distributions doing that work for a better out-of-the-box experience, e.g. with target-specific ISOs, profile-guided optimizations, etc.

              R41N3R The MGLRU patches should help you quite a bit, as they should reduce those swapping hangs. However, I haven't tested them on my 11-year-old Sandy Bridge notebook with 8GB RAM yet.
              Last edited by ms178; 27 March 2022, 04:53 AM.

              Comment


              • #8
                Originally posted by ms178 View Post

                It depends on your use cases and how much performance you need. For me it's the difference between night and day for low-latency gaming, even on a fairly decent machine - low-spec systems should profit even more, as usually every percent of performance translates into a better experience. The default kernels of every major distribution I tried (Ubuntu, openSUSE, Arch) are not optimized at all for gaming.

                As an example, the in-game benchmark from Company of Heroes 2 shows an improvement on my system from around 40-45 fps to 86 fps average just by using the Xanmod kernel and tweaking the kernel config a bit. With a couple of additional patches on top and a hand-tuned toolchain/Mesa/DXVK/x86-64-v3 repository, I recently got to 100 fps. This is all on a Haswell-EP system with a Vega 56.

                All the people who claimed for years that optimizations like these didn't matter, with a mindset that -O2 is enough for the world, were wrong. At least for the best low-latency gaming experience it matters a lot. I have to admit, though, that optimizing a system to this level and keeping it stable is time-consuming and not easy - I had to learn a lot about compilers and how to build them and other performance-sensitive packages properly in the process. I'd rather see the distributions doing that work for a better out-of-the-box experience, e.g. with target-specific ISOs, profile-guided optimizations, etc.

                R41N3R The MGLRU patches should help you quite a bit, as they should reduce those swapping hangs. However, I haven't tested them on my 11-year-old Sandy Bridge notebook with 8GB RAM yet.
                thank you for the reply!

                I used to tweak my kernel myself, but now I'm getting too old and busy :/
                I guess I should start doing it again.

                Comment


                • #9
                  Originally posted by mangeek View Post
                  I'm not an expert, but I'd love to see what would happen if the kernel itself and ALL allocations were 'huge' instead of the processor's native regular page size. Not just 'transparent_hugepage=always', but the whole kernel and system right from boot. I have a feeling that even with Huge Pages being preferred in user space, the TLB probably gets clobbered by activity in 4k areas, including the kernel. I think this would allow for a shallower nesting of page tables and significant performance gains, at the price of some memory efficiency. I'll bet huge pages would compress and swap MUCH better.
                  It's really not that simple. If it were, it would have been done already.
                  Variable-sized pages complicate a lot of things. THP (or plain hugepages) has been a constant source of strange performance regressions since it was introduced in Linux.
                  IMHO, a larger base page size is better. Like 64K.
                  I prefer predictable and manageable systems over optimized systems whenever the two are hard to combine.
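
                  If the base page size ever does move away from 4K (as it already has on some arm64 and ppc64 distributions), the practical takeaway for user space is to query it instead of hardcoding 4096. A trivial sketch:

                  #include <stdio.h>
                  #include <unistd.h>

                  /* The base page size is a property of the running kernel and
                   * architecture, so ask for it rather than assuming 4096. */
                  int main(void)
                  {
                      long page = sysconf(_SC_PAGESIZE);
                      printf("base page size: %ld bytes\n", page);
                      return 0;
                  }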

                  Comment


                  • #10
                    Originally posted by mangeek
                    the TLB probably gets clobbered by activity in 4k areas, including the kernel. I think this would allow for a shallower nesting of page tables and significant performance gains, at the price of some memory efficiency.
                    It does, and it does. It's not especially "visible" on x86 in most cases, but on weak ARM systems (where you can choose between 4K and 64K pages) the additional overhead is *really* noticeable once you're dealing with any non-small amount of data.

                    As far as swap goes, regardless of how much better you think that swap would compress (so you're obviously expecting zswap/etc, which isn't necessarily desirable), you'll be swapping a lot more "dead" space thanks to the increased page size. Whether that's a net win or not is going to be very system specific, but the additional IO is probably one of the bigger arguments against larger pages by default.
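
                    As a toy illustration of that "dead" space (made-up allocation size, not a measurement) - rounding a small buffer up to the page size wastes far more of the last page at 64K than at 4K:

                    #include <stdio.h>

                    /* Toy calculation: bytes wasted in the last page when an
                     * allocation is rounded up to the page size. */
                    static unsigned long waste(unsigned long bytes, unsigned long page)
                    {
                        unsigned long rounded = (bytes + page - 1) / page * page;
                        return rounded - bytes;
                    }

                    int main(void)
                    {
                        unsigned long sz = 5 * 1024;  /* e.g. a 5 KiB buffer */
                        printf("4K pages:  %lu bytes wasted\n", waste(sz, 4096));
                        printf("64K pages: %lu bytes wasted\n", waste(sz, 65536));
                        return 0;
                    }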

                    Originally posted by milkylainen View Post
                    IMHO, a larger base page size is better. Like 64K.
                    I prefer predictable and manageable systems over optimized systems whenever the two are hard to combine.
                    It does depend a lot on what the machine's doing, but in general, yes: I think that would be a good compromise, and it not being any more prone to random misbehavior than 4K is a compelling argument, provided you can afford the "wastefulness" of it (which just about everything can, even in the embedded space). The problem is, IIRC x86's only options are 4K and 4MB, and that's significantly different. IMO it's a bit of a Goldilocks situation: 4K is too small and as a result has far too much overhead, but 4MB is too large and has too much waste to really be a good default either, in a lot of cases.

                    Comment
