Clear Linux Set To Begin Offering EarlyOOM For Better Dealing With Memory Pressure
-
Originally posted by polarathene:
Isn't it just the OOM killer kicking in too late? Swap is apparently needed for kernel buffers, and without swap the system performs worse in this situation. I don't know how much memory those buffers take, so it's not necessarily a large amount of swap (although if that's the case, I'd also have thought the kernel could just reserve some memory for it).
Then others have said that programs are getting dropped from caches while they're still running, so they get read back into memory and dropped again (there's a project on GitHub just for handling this, IIRC, that briefly suspends processes to prevent this kind of thrashing so the kernel can avoid dealing with it and do its job), and something about context switches, which in this situation is pretty bad and degrades performance further.
So it just sounds like the issue is a lack of resources to work with and a lot of contention stressing and confusing the kernel, which causes the snail-pace slowdown; supposedly that's when the OOM killer is triggered, but by then it's too late to respond swiftly, so doing it in advance resolves it (a simple fix)?
I don't know kernel development, but I would have thought that Facebook does, so if they decided to go with a userspace solution instead of contributing a fix to the kernel like you suggest, presumably they had a reason. Do you know for sure that other OSes aren't using userspace at all to deal with the same situation?
I was just making the point that the underlying problem, why the system locks up and freezes before the kernel OOM killer kicks in, is still unresolved. There's still something fundamentally broken about how Linux pages to swap; even if the kernel OOM killer were disabled and earlyoom were used exclusively, that flaw would remain.
-
Yes, zram. I call it transparent because just by enabling it you have more effective RAM available. It also works as a scratch pad so that more stale pages get swapped out, which also gives the feel of less swapping.
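For anyone wanting to try this, a minimal zram-as-swap setup can be sketched as follows (the 4G size and the zstd choice are illustrative assumptions, and these commands need root):

```shell
# Load the zram module with a single device (requires zram support in the kernel)
modprobe zram num_devices=1

# Choose a compression algorithm and an uncompressed capacity for the device
echo zstd > /sys/block/zram0/comp_algorithm
echo 4G > /sys/block/zram0/disksize

# Format the device as swap and enable it ahead of any disk-backed swap
mkswap /dev/zram0
swapon -p 100 /dev/zram0
```

Most distributions also ship a packaged way to do this (e.g. a zram-generator or zram-config service), which is usually preferable to hand-rolled commands.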
-
Originally posted by grigi:
I say "perceived" because Linux manages memory much better than other desktop OSes.
Especially once transparent memory compression is enabled, which can give you 10-20% more effective RAM; that really makes a difference on a lowly 4 GB system.
-
Originally posted by duby229:
The other part of the story is that there is something fundamentally wrong with how Linux pages to swap, and that is the underlying cause of the hang and freeze before the kernel OOM killer kicks in.
Then others have said that programs are getting dropped from caches while they're still running, so they get read back into memory and dropped again (there's a project on GitHub just for handling this, IIRC, that briefly suspends processes to prevent this kind of thrashing so the kernel can avoid dealing with it and do its job), and something about context switches, which in this situation is pretty bad and degrades performance further.
So it just sounds like the issue is a lack of resources to work with and a lot of contention stressing and confusing the kernel, which causes the snail-pace slowdown; supposedly that's when the OOM killer is triggered, but by then it's too late to respond swiftly, so doing it in advance resolves it (a simple fix)?
I don't know kernel development, but I would have thought that Facebook does, so if they decided to go with a userspace solution instead of contributing a fix to the kernel like you suggest, presumably they had a reason. Do you know for sure that other OSes aren't using userspace at all to deal with the same situation?
-
Originally posted by timrichardson:
The Facebook solution depends on a yet-to-be-found default configuration that works well across all kinds of different situations. This is the hard part, of course. Killing stuff is easy; killing stuff intelligently is hard. Otherwise this debate would not exist (all the people who think this problem is due to negligence or arrogance on the part of the kernel developers just look stupid in my eyes).
earlyoom stops the deadlocking. A new project, nohang, can be tweaked to use memory pressure stats (Facebook's work, which is in recent kernels), and it can use zram stats too; both can provide more sophisticated warning of pending memory problems. It's important to remember that if your system becomes unusable because you ran out of RAM, you already know what the fundamental problem is: you need more RAM (or better code). The responsibility of the OS/user space is not to fix this; it is to fail gracefully. earlyoom does this, for a rough approximation of "gracefully". nohang is a bit more elegant (you can get desktop notifications out of the box, both for pending problems and for what was killed).
The other part of the story is that there is something fundamentally wrong with how Linux pages to swap, and that is the underlying cause of the hang and freeze before the kernel OOM killer kicks in. earlyoom and oomd seem to act before that paging flaw, whatever it might be, locks up the system. While they may well be a more elegant OOM solution than the kernel's, the way Linux pages to swap is still fundamentally broken somehow.
Last edited by duby229; 09 January 2020, 03:48 AM.
-
Originally posted by tomas:
Did you read my follow-up post?
In order for something to be labeled a "workaround", there must be some notion of what a "proper solution" would be and what the "root cause" of the "problem" is, at least on a conceptual level. What is your perception of what the "problem" is, and what would a "proper" solution to it look like? How could the "problem" be solved by the kernel? From my viewpoint, this is about user space allocating too much of something that is a finite resource, i.e. memory. The solution is either for user space to start releasing memory it does not need (caches etc.), hopefully freeing enough for the system to continue functioning, or, if user space keeps requesting more and more memory anyway, eventually to start killing processes, preferably "the offending ones" if that can be determined, and hopefully that will be enough for the system to continue functioning.
Finally, if this problem were easy to solve, don't you think it would already have been solved by now? It's not as if other operating systems like Windows or macOS handle this significantly better, do they?
Last edited by duby229; 09 January 2020, 03:45 AM.
-
The Facebook solution depends on a yet-to-be-found default configuration that works well across all kinds of different situations. This is the hard part, of course. Killing stuff is easy; killing stuff intelligently is hard. Otherwise this debate would not exist (all the people who think this problem is due to negligence or arrogance on the part of the kernel developers just look stupid in my eyes).
earlyoom stops the deadlocking. A new project, nohang, can be tweaked to use memory pressure stats (Facebook's work, which is in recent kernels), and it can use zram stats too; both can provide more sophisticated warning of pending memory problems. It's important to remember that if your system becomes unusable because you ran out of RAM, you already know what the fundamental problem is: you need more RAM (or better code). The responsibility of the OS/user space is not to fix this; it is to fail gracefully. earlyoom does this, for a rough approximation of "gracefully". nohang is a bit more elegant (you can get desktop notifications out of the box, both for pending problems and for what was killed).
Last edited by timrichardson; 08 January 2020, 06:59 PM.
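The memory pressure stats mentioned here are the PSI figures that recent kernels (4.20+) expose under /proc/pressure/. As a small sketch of what consuming them looks like, this extracts the 10-second average from a PSI-formatted line (the sample values are made up; a real tool would read /proc/pressure/memory):

```shell
# PSI memory lines look like:
#   some avg10=0.00 avg60=0.00 avg300=0.00 total=0
# The values below are illustrative placeholders.
psi_line="some avg10=12.34 avg60=5.00 avg300=1.00 total=123456"

# Pull out the 10-second average stall percentage
avg10=$(printf '%s\n' "$psi_line" | sed -n 's/.*avg10=\([0-9.]*\).*/\1/p')
echo "memory pressure (avg10): ${avg10}%"
```

A tool like nohang can watch this figure and act well before the kernel's own OOM path is reached.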
-
Originally posted by grigi:
I say "perceived" because Linux manages memory much better than other desktop OSes.
Especially once transparent memory compression is enabled, which can give you 10-20% more effective RAM; that really makes a difference on a lowly 4 GB system.
For me, zram with zstd usually compresses pages to better than half their size, so I would say it's closer to 70-80% more effective RAM.
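As a rough sanity check of that figure (the numbers are illustrative: assume half of a 4 GB machine's RAM ends up physically holding pages compressed at 2.5:1, while the other half stays uncompressed):

```shell
# effective capacity = uncompressed RAM + (zram's physical share * compression ratio)
out=$(awk 'BEGIN {
    ram = 4; zram_phys = 2; ratio = 2.5
    effective = (ram - zram_phys) + zram_phys * ratio
    printf "effective: %.1f GB (+%d%%)", effective, (effective / ram - 1) * 100
}')
echo "$out"
```

which lands in the same 70-80% ballpark the post describes, provided enough of the working set actually ends up in zram.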
-
My main worry with whatever systemd comes up with is whether it'll support an equivalent to -m 10 --avoid '(^|/)(init|X|sshd|leafpad|python(\d(.\d)?)?)$' --prefer '(^|/)firefox .*-contentproc( |$)'
(I have 16 GiB of RAM and no SSD, so -m 10 makes it kick in when memory usage hits 90%, before too much of my disk cache gets evicted. The rest just makes it avoid killing things that don't auto-save and prefer killing browser tabs that have gotten rude about their memory usage. If it's one of my Python creations leaking, better to notify me with a big "this tab died" message so I can manually decide what to do.)
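Those patterns are extended regexes, so they can be smoke-tested outside earlyoom with grep -E. A quick sketch using the --prefer pattern from above (the sample command lines are made up):

```shell
prefer='(^|/)firefox .*-contentproc( |$)'

# A Firefox content process matches the --prefer pattern...
echo 'firefox -contentproc -childID 3' | grep -qE "$prefer" && echo 'content process: preferred'

# ...while the main firefox process does not
echo 'firefox' | grep -qE "$prefer" || echo 'main process: not preferred'
```

(One caveat: the \d in the --avoid pattern is a PCRE-ism; a strictly POSIX engine would need it written as [0-9].)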
It used to be much more necessary before I realized that having some swap is critical to the kernel's RAM defragmentation strategy and enabled zram. Now it rarely triggers, but when it does, it's far preferable to letting things slow to a crawl as the disk cache and the runaway process fight over what remains in the lead-up to actually entering an OOM state.
I do wish there were a simple, intuitive way to request that things be pinned in RAM, so I could go through and request pinning for everything that gets accessed when I fire up a terminal window and run htop. As it is, before I had earlyoom, I bought a drive-bay LCD specifically so I could have a "top memory consumers" readout that couldn't get covered up by other windows, and I printed a reference for triggering the OOM killer via SysRq to tape to the bottom of my center monitor.
Last edited by ssokolow; 08 January 2020, 04:22 PM.
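For reference, the SysRq route mentioned here: when sysrq is enabled, Alt+SysRq+F asks the kernel to invoke the OOM killer immediately, and the same function can be fired from a root shell. A sketch (this really does kill a process, so treat it as a last resort):

```shell
# Allow all SysRq functions (1 enables everything; a bitmask can be more selective)
echo 1 > /proc/sys/kernel/sysrq

# Invoke the OOM killer right now, same as Alt+SysRq+F
echo f > /proc/sysrq-trigger
```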