Linux 5.13 To Allow Zstd Compressed Modules, Zstd Update Pending With Faster Performance
Originally posted by timofonic: Zstd is becoming more interesting these days. I hope some crazy geniuses are able to optimize it even more than it currently is.
That would be Yann Collet, of LZ4, XXH64, and FSE fame.
A.k.a. the human being who can understand Jarek Duda's papers on ANS and turn them into code (the above-mentioned FSE).
That's crazy-genius enough in my book.
Originally posted by Joe2021: What is the situation with the compression ratio? Is it maxed out yet, or can we hope for an increased compression-ratio yield in the future?
- Entropy coding is done with FSE (a tANS variant), which is provably equivalent to range and arithmetic coding, i.e. it approaches the Shannon limit (unlike the Huffman trees of older compression methods like gzip's Deflate) but is much faster: decoding boils down to a few bit-twiddling operations and table look-ups, with no arithmetic involved.
- In the slower modes, the dictionary search is supposedly exhaustive and close to optimal.
The only way to improve beyond that is to incorporate some modelling / machine learning on top, which is what LZMA (remember that the "MA" stands for Markov chain) and PAQ do, but which is NOT what Zstd aims for (it aims for speed, so ML is out of the question).
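To make the Shannon-limit point above concrete, here is a small illustrative Python sketch (stdlib only, probabilities chosen arbitrarily): a Huffman code must spend a whole number of bits per symbol, so on a skewed two-symbol source it is stuck at 1 bit per symbol, while an ANS/arithmetic-style coder can approach the fractional entropy.

```python
import math

# A skewed two-symbol source: P(a) = 0.9, P(b) = 0.1.
probs = {"a": 0.9, "b": 0.1}

# Shannon entropy: the theoretical lower bound in bits per symbol.
# This is what FSE/tANS and arithmetic coding can approach.
entropy = -sum(p * math.log2(p) for p in probs.values())

# A Huffman code assigns each symbol a whole number of bits;
# with two symbols, both codewords are exactly 1 bit long.
huffman_bits = sum(p * 1 for p in probs.values())

print(f"entropy      : {entropy:.3f} bits/symbol")   # ~0.469
print(f"huffman cost : {huffman_bits:.3f} bits/symbol")
```

The more skewed the distribution, the wider the gap, which is why Zstd's FSE stage matters even though the dictionary/match-finding stage is the same LZ77 family as Deflate's.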
Originally posted by birdie: I've followed quite a large number of compression algorithms over the past 25 years, and none of them have improved in terms of compression ratio by more than 3% since their inception. Some have gained multithreaded compression/decompression, but that's it.
But with Zstd we're already pretty close to the optimum.
Originally posted by birdie: I only wonder if zstd's `-22 --ultra --long` mode can be sped up (it's currently very, very slow), but otherwise the algo is great.
You're basically limited by the look-up time to search the dictionary space.
Originally posted by birdie: The compression ratio is a tad lower than LZMA2's, but the decompression speed is up to eight times higher.
That's what up-to-date research in information theory has enabled.
(And no surprise about LZMA's speed: that's the cost at which machine learning/modelling comes.)
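The ratio-versus-speed trade-off described above can be sketched with Python's standard library (Zstd itself isn't in the stdlib, so this uses zlib as a stand-in for a fast LZ coder and lzma for the modelling-heavy end; the payload is synthetic):

```python
import lzma
import time
import zlib

# Synthetic compressible payload (repetitive text, ~110 KiB).
data = b"the quick brown fox jumps over the lazy dog " * 2500

fast = zlib.compress(data, 9)   # LZ77 + Huffman (Deflate)
heavy = lzma.compress(data)     # LZ77 + range coder + context modelling

# The modelling-heavy coder squeezes out more redundancy...
print(f"zlib : {len(fast)} bytes")
print(f"lzma : {len(heavy)} bytes")

# ...but pays for it in decompression time.
t0 = time.perf_counter(); zlib.decompress(fast); t_zlib = time.perf_counter() - t0
t0 = time.perf_counter(); lzma.decompress(heavy); t_lzma = time.perf_counter() - t0
print(f"zlib decompress: {t_zlib * 1e3:.2f} ms")
print(f"lzma decompress: {t_lzma * 1e3:.2f} ms")
```

Zstd's selling point is landing near zlib's speed (and well above it on decompression) while closing much of the ratio gap to LZMA.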
Originally posted by rene: This news is a bit misleading. Linux did support zstd compression before, as this is done in user space by kmod. This merely wires up compressing modules during the kernel build.
In user space, kmod 28 already supports dealing with Zstd-compressed modules.
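For reference, what the 5.13 build-time wiring amounts to is a Kconfig choice like the fragment below (option names as they appear in the patch series; verify against your own tree's Kconfig before relying on them):

```
# Kernel .config fragment: compress modules at "make modules_install" time.
# CONFIG_MODULE_COMPRESS_NONE is not set
# CONFIG_MODULE_COMPRESS_GZIP is not set
# CONFIG_MODULE_COMPRESS_XZ is not set
CONFIG_MODULE_COMPRESS_ZSTD=y
```

With that set, the installed modules end up as `*.ko.zst`, and kmod (28+) decompresses them transparently on load.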
Originally posted by rene: This also is not as useful as it sounds, as modern file systems with transparent compression have a similar benefit,
e.g. Btrfs compresses in stripes of 128 KiB.
And even after XZ compression, I have more than 130 modules larger than that in my distro. (Probably more than 350 uncompressed modules span multiple 128 KiB stripes, but I am too busy to measure this properly.)
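That "modules larger than a 128 KiB stripe" count is easy to reproduce; a small Python sketch (assuming the usual `/lib/modules` location; pass another path for your distro):

```python
import os
import sys

def count_large_files(root: str, threshold: int = 128 * 1024) -> int:
    """Count files under `root` whose size exceeds `threshold` bytes."""
    count = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.path.getsize(path) > threshold:
                    count += 1
            except OSError:
                pass  # skip dangling symlinks and unreadable entries
    return count

if __name__ == "__main__":
    root = sys.argv[1] if len(sys.argv) > 1 else "/lib/modules"
    print(count_large_files(root), "files larger than 128 KiB under", root)
```

Run it against `/lib/modules/$(uname -r)` to see how many of your installed modules span multiple Btrfs compression stripes.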
It's not the same result either (it trades random read/write access against compression performance).
Also, depending on the device, the bootloader might not be able to load from a partition that supports compression at all.
So your kernel is usually stuck on a particular partition format that the bootloader can understand, together with any further component needed to boot the system (be it the initramfs or modules).
E.g. Raspberry Pi models up to the 3 can only boot from FAT32, because the main ARM CPU is bootstrapped by the GPU and the GPU only understands FAT32. It also doesn't support an initramfs by default, either.
E.g. EFI can only boot from FAT. That's why most Linux distributions put a boot loader there (GRUB, ELILO, etc.), which then loads the actual kernel, initramfs, and extra boot-time modules from a proper Linux boot/root partition.
E.g. the U-Boot used on multiple Pine64 devices (such as my Pinebook Pro) only understands FAT and ext3/4 (without support for any compression extensions in ext).
Also note that most distributions' kernels don't come with every file system enabled in-kernel. Only a few are supported out of the box, and any other file system requires drivers to be loaded.
E.g. most embedded/ARM kernels only support ext4, F2FS, FAT, and UDF out of the box.
On a distro that doesn't use an initramfs, like Raspbian, that means you can only boot from root partitions that are either ext4 (the system default) or F2FS (undocumented but supported).
Transparent compression isn't mainstream in ext4 yet, and F2FS only added Zstd support recently.
The same also holds for some smartphones that block module loading over "safety" concerns.
So although in theory you could use Btrfs for your boot/root partition, turn compression on, and call it a day, there are numerous use cases where this isn't practically doable.
Originally posted by rene: And the distribution packages are smaller without kernel-module compression, as a pack of a thousand modules compressed as a whole compresses far better than a thousand individually compressed modules with no cross-file redundancy left.
The main purpose here is speed. Zstd is one of the few ultra-fast compressors whose decompression speed is higher than the storage bandwidth.
Individually Zstd-compressed files load much faster, no matter if the .rpm/.deb is slightly bigger.
(And speaking of speed: decompressing the whole LZMA-compressed package and then individually recompressing all the files with Zstd at max level in the file system would be a bit resource-intensive and time-consuming. One would need to balance the benefits and costs depending on the use case. Though it happens only once per update, so there might be some bandwidth-constrained use cases where it pays off, like updating over 2G, over satellite, or remote IoT sensors over LoRaWAN.)
On the other hand, an initramfs compressed with Zstd and containing most of the needed modules in uncompressed form would definitely benefit from whole-archive compression.
(I use that on my Pinebook Pro: the U-Boot doesn't support FS compression, but the default kernel in Manjaro ARM supports Zstd-compressed RAM disks.)
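As an illustration (assuming mkinitcpio, which Manjaro uses; dracut and initramfs-tools have equivalent knobs), switching the initramfs to Zstd is a one-line configuration change:

```
# /etc/mkinitcpio.conf (excerpt)
COMPRESSION="zstd"
```

Then regenerate the images (e.g. `mkinitcpio -P` for all presets); the kernel must have Zstd initramfs decompression enabled for the result to boot.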
Originally posted by gigi: only two
Is there any particular reason?
Originally posted by ms178: Not with certainty, but if you've read the introduction of the newest patch set, it gives an overview of the changes from the version used in the kernel to the updated version, and it shows significant improvements in the stated scenarios. It is an educated guess that similar improvements will materialize in future updates as well. Not necessarily in the compression ratio specifically, but that could see further improvements too.
What makes you so doubtful? It is an active project, and the devs seem to improve on all metrics each release.
I only wonder if zstd's `-22 --ultra --long` mode can be sped up (it's currently very, very slow), but otherwise the algo is great. The compression ratio is a tad lower than LZMA2's, but the decompression speed is up to eight times higher.
Originally posted by Joe2021: What is the situation with the compression ratio? Is it maxed out yet, or can we hope for an increased compression-ratio yield in the future?
A new experimental parameter, controlled by ZSTD_c_splitBlocks. Some rudimentary benchmarks (fastest of 4 runs with fullbench.c): (Decompression speed not measured, though I'd assume that to be a bit slo...
Originally posted by jabl: Do any distros actually make use of module compression? gz and xz have apparently been there a while, and at least Ubuntu seems to ship them uncompressed (of course the .deb package itself is compressed, so this just wastes a bit of disk space). Same for firmware, FWIW.
Their firmware files still seem uncompressed, though.
Last edited by zxy_thf; 04 May 2021, 08:00 AM.
This news is a bit misleading. Linux did support zstd compression before, as this is done in user space by kmod. This merely wires up compressing modules during the kernel build. This also is not as useful as it sounds, as modern file systems with transparent compression have a similar benefit, and the distribution packages are smaller without kernel-module compression, as a pack of a thousand modules compressed as a whole compresses far better than a thousand individually compressed modules with no redundancy left.
Originally posted by jabl: Do any distros actually make use of module compression? gz and xz have apparently been there a while, and at least Ubuntu seems to ship them uncompressed (of course the .deb package itself is compressed, so this just wastes a bit of disk space). Same for firmware, FWIW.
Where the module files end with .ko.xz.
Do any distros actually make use of module compression? gz and xz have apparently been there a while, and at least Ubuntu seems to ship them uncompressed (of course the .deb package itself is compressed, so this just wastes a bit of disk space).
Same for firmware, FWIW.