Zlib "Next Generation" Preparing Massive Decompression Speed-Up


  • #11
    Originally posted by Joe2021
    I was recently surprised by bzip3, which achieved much better compression ratios with some data types than its competitors - but with other data types, quite the opposite.

    Looks like we are still far away from the perfect all-purpose compressor.
    I don't think there will ever be something like that (but I might be wrong, of course!)
    Probably we will still need to use different algorithms for different sets of data.

    Comment


    • #12
      Originally posted by avis
      Nice improvements, but the core algorithm is outdated and ZSTD nowadays runs circles around zlib. It has basically made zlib obsolete.

      I wonder if the PNG standard could add ZSTD compression. That would be fantastic.

      Actually, someone did that over five years ago, but it hasn't gained any traction: https://github.com/catid/Zpng
      And Betamax will obsolete VHS any day now. ZSTD might be better than zlib, but the fact is that zlib is used in A LOT of software, systems and formats, and in many of those places ZSTD will not be added because the format is incompatible.

      Comment


      • #13


        Originally posted by Joe2021
        I was recently surprised by bzip3, which achieved much better compression ratios with some data types than its competitors - but with other data types, quite the opposite.

        Looks like we are still far away from the perfect all-purpose compressor.
        Originally posted by cynic
        I don't think there will ever be something like that (but I might be wrong, of course!)
        Probably we will still need to use different algorithms for different sets of data.
        Simple. You only need to determine the Kolmogorov complexity by algorithmic means*.

        *This is a joke.

        Comment


        • #14
          Originally posted by cynic

          I don't think there will ever be something like that (but I might be wrong, of course!)
          Probably we will still need to use different algorithms for different sets of data.
          Well, viewed abstractly, a script that tries all available algorithms and picks the best result IS actually such a (meta-)algorithm. It is very inefficient in terms of CPU cycles, but it nevertheless returns the best result.
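
          For illustration, here is a minimal Python sketch of that brute-force meta-compressor, limited to the codecs that ship with the standard library (zstd, bzip3 and friends would have to be plugged in separately; the names and structure here are my own, not anything from zlib-ng):

          Code:
          import bz2, gzip, lzma, zlib

          # "Try everything, keep the smallest": a deliberately wasteful meta-compressor.
          CODECS = {
              "zlib": zlib.compress,
              "gzip": gzip.compress,
              "bz2": bz2.compress,
              "lzma": lzma.compress,
          }

          def meta_compress(data: bytes) -> tuple[str, bytes]:
              """Compress `data` with every codec and return the smallest result."""
              results = {name: fn(data) for name, fn in CODECS.items()}
              best = min(results, key=lambda name: len(results[name]))
              return best, results[best]

          if __name__ == "__main__":
              sample = open(__file__, "rb").read()
              name, blob = meta_compress(sample)
              print(f"best codec: {name}, {len(sample)} -> {len(blob)} bytes")

          Of course, this is exactly the "burn CPU cycles for the best result" trade-off described above; a practical archiver would pick a codec per file with heuristics instead of compressing everything several times.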

          Comment


          • #15
            Originally posted by Joe2021

            Well, viewed abstractly, a script that tries all available algorithms and picks the best result IS actually such a (meta-)algorithm. It is very inefficient in terms of CPU cycles, but it nevertheless returns the best result.
            hey, that's not fair!

            Comment


            • #16
              Originally posted by F.Ultra

              And Betamax will obsolete VHS any day now. ZSTD might be better than zlib, but the fact is that zlib is used in A LOT of software, systems and formats, and in many of those places ZSTD will not be added because the format is incompatible.
              It'll probably eventually be added to some common compression programs, but it's unlikely to become the default in any of them. '1337 h4X0rz' will still be using RAR just like they were in the 90s, as if it were new and counterculture. Mundanes will still use zip because it's built into Windows and Mac; that covers 99.999% of users, since every OS in widespread use can decompress zip out of the box. Zip generally sucks for Unix systems, as archives made on Windows usually don't carry Unix-style file permissions, but it's still there. Those "in the know" will still use LZMA2/7z on Windows.

              Zstd will mostly stay in the realm of open-source background services largely invisible to users, like filesystem-level compression. In many communities and popular programs, zip, rar and 7z are already "good enough" that nobody cares to change the defaults. Just as xz, bzip2 and other common compression algorithms never visibly reached mainstream Windows and Mac users while being extremely common in the open-source world, zstd will likely remain the same. It won't gain much user-facing use even in the open-source world until the tar programs natively support piping data streams to zstd the same way they do for bzip2, xz and gzip.

              Comment


              • #17
                Originally posted by Anux
                Huh, that doesn't sound logical at all. I just compressed a random PNG (15.7 MB) down to 14.9 MB with 7z, and a random JPEG from 103 KB to 76 KB. You could do such things yourself in a few seconds before posting wild claims on the internet.
                That's not a wild claim, although no proof has been shown.

                Starting from the fact that JPEG and PNG both have an entropy coder/compression stage as the last pass of their pipeline, theory says a good compressor/encoder by definition produces output with high entropy per byte; otherwise it would not be a good compression algorithm in the first place.

                JPEG usually uses a Huffman coder as its entropy encoder, which is not the best thing around for compression ratio, but it is quite fast and simple.
                PNG, AFAIK, uses zlib (DEFLATE), which is far more complex than plain Huffman coding but has a chance to compress much better, since it is a full compression algorithm.

                In your benchmark the random PNG shrinks by ~5% when recompressed to .7z, while the JPEG shrinks by about a quarter, but you did not consider the metadata the files carry around, which is stored uncompressed and may be very compressible - hence the large gain in the JPEG case.

                I did a quick benchmark myself: I stripped all metadata out of a JPEG file with GIMP (ICC color profile, thumbnail, EXIF data, etc.), obtaining a 168.6 KB file. After compressing it to .7z, it becomes 168.5 KB.
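
                A rough way to reproduce this kind of check from a script, using the Python standard library's LZMA (roughly what 7z uses by default) instead of 7z itself - the file name is just a placeholder:

                Code:
                import lzma
                from pathlib import Path

                # How much does an already-compressed image shrink when run
                # through a strong general-purpose compressor a second time?
                def recompression_gain(path: str) -> float:
                    data = Path(path).read_bytes()
                    again = lzma.compress(data, preset=9)   # LZMA at maximum preset
                    return 1.0 - len(again) / len(data)     # fraction of bytes saved

                if __name__ == "__main__":
                    # "photo_stripped.jpg" stands in for a metadata-free JPEG.
                    print(f"saved: {recompression_gain('photo_stripped.jpg'):.2%}")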

                Side note: an estimate of a file's entropy can be calculated quite easily (any random answer from a web search shows how). From that number you get, by definition, a bound on the compression ratio you can achieve without losing information under that model, no matter which compression algorithm you choose. JPEG/PNG data blocks (as seen) already produce high-entropy output, so compressing them again with a lossless compressor is basically pointless.
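
                As a minimal sketch of that side note (this is only the order-0, byte-frequency Shannon entropy - real compressors also exploit context, so treat it as a quick sanity check; the file name is a placeholder):

                Code:
                import math
                from collections import Counter
                from pathlib import Path

                # Order-0 (byte-frequency) Shannon entropy in bits per byte.
                # A value close to 8.0 means the bytes look like random noise
                # to a memoryless model - typical for JPEG/PNG payloads.
                def shannon_entropy(path: str) -> float:
                    data = Path(path).read_bytes()
                    total = len(data)
                    return -sum((c / total) * math.log2(c / total)
                                for c in Counter(data).values())

                if __name__ == "__main__":
                    bits = shannon_entropy("photo.jpg")   # placeholder file name
                    print(f"{bits:.3f} bits/byte (8.0 = incompressible for an order-0 model)")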

                Comment


                • #18
                  Just gzip... gz

                  Comment


                  • #19
                    Originally posted by avis
                    Nice improvements, but the core algorithm is outdated and ZSTD nowadays runs circles around zlib. It has basically made zlib obsolete.

                    I wonder if the PNG standard could add ZSTD compression. That would be fantastic.

                    Actually, someone did that over five years ago, but it hasn't gained any traction: https://github.com/catid/Zpng
                    Then why don't they merge the projects or move the developers from zlib over to zstd?

                    Comment


                    • #20
                      Great project; however, the code is split into way too many small files, IMHO.

                      Comment
