Arch's Switch To Zstd: ~0.8% Increase In Package Size For ~1300% Speedup In Decompression Time


  • #11
    Originally posted by Misel View Post
    I guess I should have put more emphasis on disk I/O and network speed rather than the actual required space. The actual space these days is not an issue. Most storage devices are more than large enough for the extra 0.8%.

    What I was getting at, or rather trying to get at, was that a 1300% speedup could mean 1ms instead of 13ms. As far as I can tell, they only mention the actual decompression, but not the download and disk access time. (Again, I may be wrong on that one.)

    So are there any benchmarks against other algorithms? E.g. bz2, which is much slower but has much better compression in comparison, so download time and disk reads should be much faster. The question is, though, how do they factor in?

    So again, are there any benchmarks with absolute numbers?
    I suspect the disk access time would be statistically the same. 0.8% more disk space shouldn't hurt performance by more than that amount, particularly if we're talking sequential reads from the same place on the disk (which matters more for HDDs than SSDs). So I'd gladly incur a 0.8% read penalty if decompression happens over an order of magnitude faster. If you're on an SSD, you probably wouldn't even notice the difference. Even with an HDD, I suspect seek latency would dominate I/O time, since sequential reads after seeking should be very fast, and that extra 0.8% is not really going to make a difference when push comes to shove.

    Stuff like this doesn't mean a lot for people booting Linux on a desktop or laptop computer. It means a whole lot more for devices that need to start quickly, like embedded applications where you might have very real startup time constraints.
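    To put rough numbers on that trade-off, here is a minimal back-of-envelope sketch. All figures are hypothetical (a 50 MB package, 100 MB/s sequential HDD reads, xz decompressing at 40 MB/s and zstd at roughly 13x that); only the ratios matter:

        # Hypothetical figures only: 50 MB package, 100 MB/s sequential read,
        # xz decompression at 40 MB/s, zstd at ~13x that (~520 MB/s).
        awk 'BEGIN {
            pkg    = 50                     # package size in MB
            read_s = (pkg * 1.008) / 100    # read time including the 0.8% size penalty
            xz_s   = pkg / 40               # time to decompress with xz
            zst_s  = pkg / 520              # time to decompress with zstd
            printf "read: %.3fs  xz: %.3fs  zstd: %.3fs\n", read_s, xz_s, zst_s
        }'

    On those made-up numbers the 0.8% size penalty costs about 4ms of extra reading, while the faster decompression saves over a second, so the extra space really is noise.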

    Comment


    • #12
      Originally posted by Misel View Post
      E.g. bz2 which is much slower
      bz2 can actually be significantly faster than zstd if you use multithreaded decompression. The last time I ran benchmarks, multithreaded bz2 decompression (with lbzip2) was about 30% faster than even lz4, which was faster than zstd.
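      For anyone who wants to check a claim like this on their own data, a rough sketch of the kind of timing run involved might look like the following (the file name is a placeholder, the tools need to be installed, and dropping the page cache requires root):

          # Compress one tarball with each tool, then time cold-cache decompression.
          f=big.tar                     # placeholder: any large tarball of your own data
          lbzip2 -k "$f"                # -> big.tar.bz2, using all cores
          lz4 -k "$f"                   # -> big.tar.lz4
          zstd -k "$f"                  # -> big.tar.zst
          for c in "lbzip2 -dc $f.bz2" "lz4 -dc $f.lz4" "zstd -dc $f.zst"; do
              sync; echo 3 | sudo tee /proc/sys/vm/drop_caches >/dev/null   # cold cache so disk reads count
              /usr/bin/time -f "%C -> %e s" sh -c "$c >/dev/null"
          done

      Whether lbzip2 or lz4 comes out ahead will depend heavily on core count and on whether the run is disk-bound, which is what the replies below argue about.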

      Comment


      • #13
        Originally posted by hotaru View Post

        bz2 can actually be significantly faster than zstd if you use multithreaded decompression. The last time I ran benchmarks, multithreaded bz2 decompression (with lbzip2) was about 30% faster than even lz4, which was faster than zstd.
        Doesn't sound right. Look at these https://larryreznick.com/2019/09/19/...g-compression/

        Besides, 30% faster than lz4 sounds almost as fast as memcpy https://github.com/lz4/lz4

        Comment


        • #14
          Originally posted by caligula View Post

          Doesn't sound right. Look at these https://larryreznick.com/2019/09/19/...g-compression/

          Besides, 30% faster than lz4 sounds almost as fast as memcpy https://github.com/lz4/lz4
          I think we should see whether the original poster can show what data files were used for compression to get the stated result. Some file types compress better or faster with some compression programs than with others. I did encounter one very particular (anomalous, imo) case last year where the ancient compress (zip/pkzip) algorithm outperformed zstd 1.3 (obviously dated) in both size and speed. I didn't take notes because the difference was negligible on a desktop, so I can't provide the evidence, but it makes me cautious about dismissing claims like the OP's out of hand.

          It does push home the point that any specific use case needs to be tested before drawing a broad conclusion when your hardware varies from the norm most developers use, such as when you're doing embedded work on MIPS, ARM, etc., where processor features vary and certain performance features taken for granted on x86-64 are absent.

          Comment


          • #15
            Originally posted by caligula View Post

            Doesn't sound right. Look at these https://larryreznick.com/2019/09/19/...g-compression/

            Besides, 30% faster than lz4 sounds almost as fast as memcpy https://github.com/lz4/lz4
            1. Those benchmarks use a lot fewer cores.

            2. In my benchmarks, lz4 and bz2 were both limited by the speed of reading the compressed file from the hard drive, but bz2's better compression ratio means it spent less time waiting for the hard drive.

            3. If I decompress from memory to memory, lbzip2 is close to memcpy speed (and sometimes even exceeds it), but that's not a realistic scenario for real-world use.
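
            A minimal sketch of that memory-to-memory case, assuming the files fit in RAM (the file names are placeholders): keeping everything on tmpfs and discarding the output takes the disk out of the equation entirely.

                # Copy both the compressed and the raw file onto tmpfs so no disk I/O is involved.
                cp big.tar big.tar.bz2 /dev/shm/
                /usr/bin/time -f "lbzip2 decompress: %e s" lbzip2 -dc /dev/shm/big.tar.bz2 >/dev/null
                /usr/bin/time -f "plain copy (cat):   %e s" cat /dev/shm/big.tar >/dev/null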

            Comment


            • #16
              Originally posted by AndyChow View Post
              I'll take a 1% size penalty any day when the trade is a more than tenfold speedup in decompression. I mean, it's literally an order of magnitude performance increase for a statistical rounding error in size.
              Honestly, I'm on the fence about this. For regular package downloads, I have no complaints about decompression time. I understand there are systems with fewer than 32 threads and systems without SSDs, but it really doesn't seem to be that much of a bottleneck. I'd rather see pacman use multiple threads for downloading and the like without having to pull in something from the AUR. Package databases could all be updated at the same time, and the number of download threads could easily be specified by a setting in pacman.conf if pacman implemented it.

              That being said, I have no complaints about Arch. It runs and works great. It provides the smoothest Linux experience I've ever had while remaining bloat free.

              Comment


              • #17
                Originally posted by stormcrow View Post

                I think we should see whether the original poster can show what data files were used for compression to get the stated result. Some file types compress better or faster with some compression programs than with others. I did encounter one very particular (anomalous, imo) case last year where the ancient compress (zip/pkzip) algorithm outperformed zstd 1.3 (obviously dated) in both size and speed. I didn't take notes because the difference was negligible on a desktop, so I can't provide the evidence, but it makes me cautious about dismissing claims like the OP's out of hand.

                It does push home the point that any specific use case needs to be tested before drawing a broad conclusion when your hardware varies from the norm most developers use, such as when you're doing embedded work on MIPS, ARM, etc., where processor features vary and certain performance features taken for granted on x86-64 are absent.
                This may be true in general, but as far as pacman is concerned, zstd is indeed a lot faster than xz on both AArch64 and x86_64 (Arch Linux only runs on ARM and x86, so that's fine).

                Comment


                • #18
                  Originally posted by atomsymbol
                  Is there information about which compression level (the default compression level, compression level 9, or compression level 19) was used to obtain the published 100.8% size with respect to xz?

                  COMPRESSZST=(zstd -c -T0 --ultra -20 -)
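
                  That is the makepkg COMPRESSZST setting; a sketch of how it sits in /etc/makepkg.conf is below (the PKGEXT line is my assumption about the matching package extension, not quoted from the announcement):

                      # Sketch of the relevant /etc/makepkg.conf lines.
                      # -c: write to stdout, -T0: use all threads, --ultra -20: allow and use level 20, -: read from stdin.
                      COMPRESSZST=(zstd -c -T0 --ultra -20 -)
                      PKGEXT='.pkg.tar.zst'      # assumed: the extension that selects zstd-compressed packages

                  So, to answer the question: level 20 with --ultra and all threads, rather than the default, 9, or 19.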

                  Comment


                  • #19
                    Originally posted by Misel View Post
                    I guess I should have put more emphasis on disk I/O and network speed rather than the actual required space. The actual space these days is not an issue. Most storage devices are more than large enough for the extra 0.8%.

                    What I was getting at, or rather trying to get at, was that a 1300% speedup could mean 1ms instead of 13ms. As far as I can tell, they only mention the actual decompression, but not the download and disk access time. (Again, I may be wrong on that one.)

                    So are there any benchmarks against other algorithms? E.g. bz2, which is much slower but has much better compression in comparison, so download time and disk reads should be much faster. The question is, though, how do they factor in?

                    So again, are there any benchmarks with absolute numbers?
                    I did benchmarks at https://community.centminmod.com/thr...-xz-etc.18669/ (round 4 of my ongoing compression algorithm comparison benchmarks), and I'm not surprised zstd has such awesome decompression speed, especially with zstd 1.4.4.

                    I benchmarked the following on CentOS 7.7 64-bit:
                    • zstd v1.4.4 - Facebook-developed realtime compression algorithm. Run in multi-threaded mode.
                    • brotli v1.0.7 - Google-developed Brotli compression algorithm
                    • gzip v1.5
                    • bzip2 v1.0.6
                    • pigz v2.4 - multi-threaded version of gzip
                    • pbzip2 v1.1.13 - multi-threaded version of bzip2
                    • lbzip2 v2.5 - multi-threaded version of bzip2
                    • lzip v1.21 - based on the LZMA compression algorithm
                    • plzip v1.8 - multi-threaded version of lzip
                    • xz v5.2.2
                    • pxz v5.2.2 - multi-threaded version of xz
                    I've already added support for zstd to all my backup scripts and switched logrotate compression over from gzip to zstd as well.
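
                    For anyone wanting to do the same, a minimal sketch of the logrotate side, assuming zstd lives at /usr/bin/zstd (the compression level is just an example of my setup, not a recommendation):

                        # Fragment for /etc/logrotate.conf: compress rotated logs with zstd instead of gzip.
                        compress
                        compresscmd /usr/bin/zstd
                        compressoptions -T0 -19
                        uncompresscmd /usr/bin/unzstd
                        compressext .zst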

                    Hope that helps ^_^

                    Comment


                    • #20
                      Originally posted by Misel View Post
                      I guess I should have put more emphasis on disk I/O and network speed rather than the actual required space. The actual space these days is not an issue. Most storage devices are more than large enough for the extra 0.8%.

                      What I was getting at, or rather trying to get at, was that a 1300% speedup could mean 1ms instead of 13ms. As far as I can tell, they only mention the actual decompression, but not the download and disk access time. (Again, I may be wrong on that one.)

                      So are there any benchmarks against other algorithms? E.g. bz2, which is much slower but has much better compression in comparison, so download time and disk reads should be much faster. The question is, though, how do they factor in?

                      So again, are there any benchmarks with absolute numbers?
                      Your words give me the feeling you have not updated your system for at least 5 years (or maybe 10 for areas with fast Internet like Japan/South Korea).

                      The network has not been the bottleneck for a long time. Currently the drpm rebuild takes the majority of my time during an update.
                      Asking for absolute numbers makes zero sense when every user clearly understands that these numbers are NOT on the critical path.

                      Comment
