
Patches Revived For A Zstd-Compressed Linux Kernel While Dropping LZMA & BZIP2


  • #21
    Originally posted by hotaru View Post
    no, you couldn't. in bzip2, the blocks are compressed independently, so they don't need to be processed sequentially.
    Exactly. The blocks are the unit of work, so you can parallelize over them with any kind of coding.
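    The block-parallel idea can be sketched with Python's stdlib. This is a sketch of the principle only: stock bzip2 packs bit-aligned blocks inside one stream, so real parallel decompressors like pbzip2 either scan for block boundaries or, on the compression side, emit independently compressed chunks as below. The chunk size and helper names are illustrative.

    ```python
    import bz2
    from concurrent.futures import ThreadPoolExecutor  # bz2 releases the GIL

    CHUNK = 1 << 20  # 1 MiB per block -- an illustrative choice

    def compress_blocks(data: bytes) -> list[bytes]:
        """Compress fixed-size chunks independently, one stream per chunk."""
        chunks = [data[i:i + CHUNK] for i in range(0, len(data), CHUNK)]
        with ThreadPoolExecutor() as pool:
            return list(pool.map(bz2.compress, chunks))

    def decompress_blocks(blocks: list[bytes]) -> bytes:
        """Each block is a self-contained stream, so there is no sequential dependency."""
        with ThreadPoolExecutor() as pool:
            return b"".join(pool.map(bz2.decompress, blocks))
    ```

    The price of independent blocks is a slightly worse ratio, since the compressor's state resets at every chunk boundary.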



    • #22
      Originally posted by ermo View Post

      So far, I've seen two claims of parallel/multithreaded LZ4 implementations. One of them is even listed on lz4's home page.
      where? I've seen claims that they exist, but the lz4 utility in my Linux machine obviously doesn't do parallel decompression (never uses more than one core). its compression ratio is also terrible (a lot worse than even gzip), so it's not really useful even if it is faster than bzip2.



      • #23
        Originally posted by hotaru View Post

        where? I've seen claims that they exist, but the lz4 utility in my Linux machine obviously doesn't do parallel decompression (never uses more than one core). its compression ratio is also terrible (a lot worse than even gzip), so it's not really useful even if it is faster than bzip2.
        Not sure how you could miss it, but here you go (this is right in the link I provided, all you had to do was follow it and scroll):

        ---- from the lz4 home page ( https://lz4.github.io/lz4/ )

        Compatible CLI versions

        Here are a few compatible alternatives to lz4 command line utility :
        C++11 multi-threads Takayuki Matsuoka https://github.com/t-mat/lz4mt
        (...)

        ---- END

        My argument is that I'm personally fine with a kernel size of 8-10 MB instead of 4-5 MB if it means decompression is 3x faster. Keep in mind that I'm decompressing to a ~40-50 MB uncompressed image on a single core of an old Q9400, where this sort of performance delta is actually noticeable. What's more, ye olde Q9400 is more or less on par with a modern Pentium J5005, a 10 W quad-core CPU, so it's not as if you can't buy modern hardware where my scenario can be duplicated.

        I'm not asking you to adopt my use case -- I'm just pointing out that, if you desperately need it, a multithreaded/parallel implementation of the lz4 encoder/decoder does in fact exist.



        • #24
          Originally posted by ermo View Post
          ---- from the lz4 home page ( https://lz4.github.io/lz4/ )

          Compatible CLI versions

          Here are a few compatible alternatives to lz4 command line utility :
          C++11 multi-threads Takayuki Matsuoka https://github.com/t-mat/lz4mt
          (...)

          ---- END
          that one doesn't seem to be able to decompress lz4 at all. it just says "lz4mt: INVALID_MAGIC_NUMBER" on any lz4-compressed file I give it.
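          An "INVALID_MAGIC_NUMBER" error usually points to a container-format mismatch rather than corrupt data: the modern lz4 frame format starts with the magic number 0x184D2204, while the old legacy format (what `lz4 -l` produces) starts with 0x184C2102, and a tool that only understands one of the two will reject the other. A minimal sketch to check which format a file's header uses:

          ```python
          import struct

          LZ4_FRAME_MAGIC = 0x184D2204   # modern lz4 frame format
          LZ4_LEGACY_MAGIC = 0x184C2102  # legacy format (lz4 -l)

          def lz4_format(header: bytes) -> str:
              """Classify an lz4 file by its first four bytes (little-endian magic)."""
              (magic,) = struct.unpack("<I", header[:4])
              if magic == LZ4_FRAME_MAGIC:
                  return "frame"
              if magic == LZ4_LEGACY_MAGIC:
                  return "legacy"
              return "unknown"
          ```

          Reading the first four bytes of the rejected file and feeding them to a check like this would show which of the two formats the compressor actually wrote.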



          • #25

            Code:
            lz4c -l -c1
            You could get a slightly better compression ratio by using instead:
            Code:
            lz4 -l -12
            I'm not sure I see the point of multi-threaded decompression of lz4 for a command-line utility. Even on a RAM drive, lz4 is fast enough to need less than one core to saturate the drive bandwidth. Multi-threading is more useful when integrated into other products that require performance and are already multi-threaded. For example, ZFS uses multi-threaded LZ4 compression/decompression for transparent file system compression, but as the name implies, it's transparent, so no one can "see" it ...



            • #26
              Just found out that zstd (the command-line tool) actually supports multi-threaded compression. No big surprise so far.
              What I would call "black magic", if I weren't able to verify it myself, is that the speedup is nearly linear while the output is bit-exact to the file produced with one thread. No splitting into chunks and degrading the compression ratio, as most other tools (e.g. xz) do.

              I am rather impressed by that.
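              That reproducibility is easy to check from the shell (a sketch assuming the zstd CLI is installed; zstd sizes its internal compression jobs independently of the worker count, which is why the frame bytes come out the same):

              ```shell
              # Compress the same input with different worker counts; cmp succeeds
              # only if the two frames are byte-identical.
              yes "phoronix" | head -c 4M > sample.txt
              zstd -q -f -T2 -o t2.zst sample.txt
              zstd -q -f -T4 -o t4.zst sample.txt
              cmp t2.zst t4.zst && echo "bit-exact"
              ```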



              • #27
                Originally posted by poorguy View Post
                I'm not sure I see the point of multi-threaded decompression of lz4 for a command-line utility.
                there isn't a point of lz4 at all with lots of cores and storage connected through USB 2.0. bzip2 on enough cores is fast enough to maximize the drive bandwidth and ends up being faster than lz4 because of the higher compression ratio.
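                The bandwidth argument can be put into rough numbers. When the link is the bottleneck and enough cores keep decompression off the critical path, the effective read speed is link bandwidth times compression ratio. The figures below are illustrative assumptions, not measurements:

                ```python
                # Back-of-envelope sketch: all numbers are assumed, not benchmarked.
                USB2_BANDWIDTH_MBPS = 35.0  # realistic USB 2.0 throughput in MB/s
                RATIOS = {"lz4": 2.0, "gzip": 3.0, "bzip2": 3.5}  # assumed ratios

                def effective_read_mbps(link_mbps: float, ratio: float) -> float:
                    """Uncompressed MB/s delivered when the link is the bottleneck
                    and decompression keeps up (i.e., enough cores are available)."""
                    return link_mbps * ratio

                for codec, ratio in RATIOS.items():
                    print(f"{codec}: {effective_read_mbps(USB2_BANDWIDTH_MBPS, ratio):.0f} MB/s effective")
                ```

                Under these assumptions the higher-ratio codec delivers more uncompressed data per second over the same link, which is the point being made about bzip2 on many cores.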



                • #28
                  Originally posted by EmbraceUnity View Post
                  This link details a lot of arguments for why LZMA2 and XZ are poor quality archive formats, and why LZMA1 is superior to them.
                  LZMA2 is faster with 4 threads if you compress a big file (more than 256 MB), because 7-Zip can then split it into independent blocks. LZMA2 was created for the XZ format and it includes changes that are good for that stream compression format. Also, LZMA2 is better than LZMA if you compress already-compressed data. The LZMA decoder is simple, but the PPMd decoder is complex, so now I don't like the idea of PPMd / LZMA mixing.
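                  The block-splitting idea behind multithreaded LZMA2/xz can be sketched with Python's stdlib lzma module: compress independent chunks in parallel and concatenate the resulting xz streams. Each chunk resets the dictionary, which costs some ratio but lets every chunk be handled on its own core; `lzma.decompress` transparently handles concatenated xz streams. The chunk size here is an illustrative stand-in for 7-Zip's much larger block size.

                  ```python
                  import lzma
                  from concurrent.futures import ThreadPoolExecutor  # lzma releases the GIL

                  CHUNK = 256 * 1024  # illustrative; 7-Zip splits at far larger sizes

                  def parallel_xz(data: bytes) -> bytes:
                      """Compress chunks as independent xz streams and concatenate them."""
                      chunks = [data[i:i + CHUNK] for i in range(0, len(data), CHUNK)]
                      with ThreadPoolExecutor() as pool:
                          streams = pool.map(
                              lambda c: lzma.compress(c, format=lzma.FORMAT_XZ), chunks)
                      return b"".join(streams)

                  # Round trip: lzma.decompress(parallel_xz(data)) == data
                  ```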

