While this sounds like a good idea in principle, I wonder whether it would not have been better to do this deduplication as part of zswap or zram (no, they are not the same thing). As I see it, there is no point in scanning for duplicates unless the kernel finds the data worth swapping out. zswap needs to process the memory anyway, so it could know the checksum for a block; if it matches an existing one, it can compare, deduplicate, and optionally compress as well.
UKSM Is Still Around For Data Deduplication Of The Linux Kernel
Originally posted by boxie: For shared systems I would agree - but if you control the host and the stuff running on it, why not? Might be a good way to reduce memory footprint.
Originally posted by starshipeleven: "you control the host and the stuff running on it" lol. Yeah right. Because you can stop malware by just looking at your PC intensely to remind it who's boss.
Shared memory is still a problem because of trust. Let's say you nitpicked through every application on your system and found everything memory safe except one app: the web browser. That web browser could wreak havoc with memory side-channel attacks on the vanilla kernel. So, do you trust all the people who have ever worked on that project? Even if you did, they are human, so they make mistakes. Do you trust every website you connect to not to exploit those mistakes to, say, get persistent ad tracking for money? I for one do not.
Originally posted by boxie: Sure, I can imagine plenty of cases where lots of stuff might be duplicated in memory and where you know there is no (ok, maybe a remote) chance of something nasty. Not every server is connected directly to the Internet!
KVM (a hypervisor) has implemented memory deduplication for a long time already (since 2009, according to Google), and there it's kinda OK as it's done at a different level. That is also where it makes the most sense: if you fire up a dozen cloned VMs, you can save a ton of RAM just by deduplicating them.
Last edited by starshipeleven; 27 February 2017, 07:04 AM.
Originally posted by sarfarazahmad: Is the CPU used while frequently running through the memory to find duplicates low enough to make this worthwhile? For virtual machines we already have KSM. For what kind of workloads can this be useful?
The big issue was that it was pulling the CPU out of low-power modes (not sleep modes; when the machine was sleeping, the scanner was also suspended) to do the scan, so the battery lasted less time. Not by a lot: by a few hours over a two-day charge (normal usage, so for most of those two days the machine was sleeping).
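The scan-frequency/power trade-off mentioned above is tunable for stock KSM through its sysfs knobs (requires root and CONFIG_KSM); making the scanner sleep longer between passes is the usual way to reduce wake-ups:

```shell
# Stock KSM tuning knobs under /sys/kernel/mm/ksm/ (values are examples)
echo 1    > /sys/kernel/mm/ksm/run              # start the ksmd scanner
echo 100  > /sys/kernel/mm/ksm/pages_to_scan    # pages scanned per wake-up
echo 5000 > /sys/kernel/mm/ksm/sleep_millisecs  # longer sleeps, fewer wake-ups
cat /sys/kernel/mm/ksm/pages_sharing            # pages currently deduplicated
```

Larger `sleep_millisecs` saves power at the cost of slower deduplication; how far UKSM exposed comparable tuning is not covered here.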
Originally posted by starshipeleven: Apart from some HPC, or maybe some industrial applications that will work disconnected from the internet, I don't see that large a userbase. KVM (a hypervisor) has implemented memory deduplication for a long time already (since 2009, according to Google), and there it's kinda OK as it's done at a different level. That is also where it makes the most sense: if you fire up a dozen cloned VMs, you can save a ton of RAM just by deduplicating them.
Not saying there aren't potential problems with it; side-channel attacks are definitely problematic. It does seem like an interesting option, especially if it can be enabled per container: then we could use Flatpak/Snap and dedupe memory just inside certain memory spaces.
Originally posted by boxie: How about databases? They shouldn't be connected directly to the Internet and could really benefit from dedupe. Heck, not every database backs a website either.
And a general rule of thumb in the trade here is: DO NOT fuck with databases in any way. If your idea really increased performance or data safety, they would usually be doing it already.