Zlib "Next Generation" Preparing Massive Decompression Speed-Up

  • #31
    Originally posted by timtas View Post

    As JPEG XL was initially supported by Google, it uses brotli for compression. So, the comparison is between brotli and gzip, then.
    I can't find a reference for JPEG XL using Brotli, but anyway, that's not really my point. PNG is basically a simple predictor + LZ + Huffman, and swapping in zstd gives you a simple predictor + LZ + ANS. Lossless JPEG XL is, IIRC, essentially a more sophisticated predictor + LZ + ANS. So I'd expect the latter to fare a bit better, but details (e.g. the LZ window size) tend to matter a lot.



    • #32
      So, has anyone bothered to benchmark this and post a link here? I found nothing on OpenBenchmarking (OBM).
      Hi



      • #33
        Originally posted by Anux View Post
        Huh, that doesn't sound logical at all. I just took a random PNG (15.7 MB) to 7z (14.9 MB) and a random JPEG from 103 KB to 76 KB. You could do such things yourself in a few seconds before posting wild claims on the internet.
        Well, especially for PNG, that's expected. PNG is simply not an optimal way to compress images, and almost no PNG-producing tool actually compresses PNG efficiently or uses the PNG features that allow efficient compression. Try it: pick a collection of PNGs, run oxipng on them, and you'll save between 20% and 50% of the size per PNG, yet both before and after it's still a PNG.

        Even common software may break if you start using PNG features to compress better, because nobody ever used them, so nobody really tested them.

        So yes, you can compress something that is already compressed, but that just means it wasn't really compressed in the first place (like 99.999% of the PNGs in the world, which are not really compressed).

        So yes, it makes sense to compress at the file system level even to store compressed files; I do it myself anyway. But the rule should be “what's already compressed should not be meaningfully compressible further”. The fact that compressed files can be recompressed and sometimes save a lot of size isn't logical at all, because it means those compressed files weren't really compressed, which makes no sense.



        • #34
          Originally posted by illwieckz View Post

          Well, especially for PNG, that's expected. PNG is simply not an optimal way to compress images, and almost no PNG-producing tool actually compresses PNG efficiently or uses the PNG features that allow efficient compression. Try it: pick a collection of PNGs, run oxipng on them, and you'll save between 20% and 50% of the size per PNG, yet both before and after it's still a PNG.

          Even common software may break if you start using PNG features to compress better, because nobody ever used them, so nobody really tested them.

          So yes, you can compress something that is already compressed, but that just means it wasn't really compressed in the first place (like 99.999% of the PNGs in the world, which are not really compressed).

          So yes, it makes sense to compress at the file system level even to store compressed files; I do it myself anyway. But the rule should be “what's already compressed should not be meaningfully compressible further”. The fact that compressed files can be recompressed and sometimes save a lot of size isn't logical at all, because it means those compressed files weren't really compressed, which makes no sense.
          You fail to understand that compression is a compromise between time and size.
          Applications that create PNG files choose a compression level that gives an acceptable response time and a decent compression ratio.
          Not the best compression possible, since that would be too slow.
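
          As a rough illustration (a minimal, untested sketch against zlib's standard compress2() API, error handling omitted), the whole speed/size trade-off is exposed as a single level parameter:

          Code:
          #include <stdio.h>
          #include <string.h>
          #include <zlib.h>

          int main( void ) {
              unsigned char input[65536];
              memset( input, 'A', sizeof( input ) ); /* trivially compressible test data */

              unsigned char out_fast[131072], out_best[131072];
              uLongf fast_len = sizeof( out_fast ), best_len = sizeof( out_best );

              /* Level 1 (Z_BEST_SPEED): fast, bigger output. Level 9 (Z_BEST_COMPRESSION): slow, smaller output. */
              compress2( out_fast, &fast_len, input, sizeof( input ), Z_BEST_SPEED );
              compress2( out_best, &best_len, input, sizeof( input ), Z_BEST_COMPRESSION );

              printf( "level 1: %lu bytes, level 9: %lu bytes\n",
                      (unsigned long)fast_len, (unsigned long)best_len );
              return 0;
          }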

          And it is not true that PNG files will break if you use advanced optimization options. Almost all software reads and writes PNG via libpng; nobody writes their own PNG implementation. You can optimize a PNG with optipng and use its advanced optimization features. The resulting file is still a valid PNG, and will be read without problems by all software that uses libpng. And that is practically all software that supports PNG.



          • #35
            Originally posted by dpeterc View Post
            You fail to understand that compression is a compromise between time and size.
            You're just revealing that you don't know where PNG's efficiency comes from. That's fine, nobody is required to know unless they develop PNG-processing software. I'm not talking about increasing the zlib compression level (like -5 or -9).

            Most of PNG's compression efficiency doesn't come from zlib or its compression level.

            PNG's compression efficiency comes from the fact that it provides something like 16 (I'm too lazy to count) different profiles that work better or worse depending on the input: for the colors you can choose between 1-bit, 2-bit, 4-bit, or 8-bit palettes, 1-bit or 8-bit grayscale, and 24-bit RGB, and then for the alpha channel 1-bit alpha, 8-bit alpha, or none. Just for images with an alpha channel there are 8 combinations, and I haven't counted the ones without. Picking the right profile for the data is not like running zlib with -9, but it's what gives you the 25% or 50% bump in compression.
            I'm not saying libpng will choose; I'm saying you will choose, as a developer writing software with libpng.
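
            To make that concrete, here is a rough write-side sketch (untested, error handling omitted, assuming fp, width, height and row_pointers are already set up) of where that choice lives in the standard libpng write API; nothing picks the color type or bit depth for you:

            Code:
            /* Rough sketch: writing pixels as 8-bit grayscale instead of the usual 32-bit RGBA. */
            png_structp png_ptr = png_create_write_struct( PNG_LIBPNG_VER_STRING, NULL, NULL, NULL );
            png_infop info_ptr = png_create_info_struct( png_ptr );
            png_init_io( png_ptr, fp );

            /* This is the "profile" decision: color type and bit depth are yours to make. */
            png_set_IHDR( png_ptr, info_ptr, width, height,
                          8,                            /* bit depth: 1, 2, 4, 8 or 16 */
                          PNG_COLOR_TYPE_GRAY,          /* vs PNG_COLOR_TYPE_RGB_ALPHA, _PALETTE, ... */
                          PNG_INTERLACE_NONE,
                          PNG_COMPRESSION_TYPE_DEFAULT,
                          PNG_FILTER_TYPE_DEFAULT );

            png_write_info( png_ptr, info_ptr );
            png_write_image( png_ptr, row_pointers );   /* one byte per pixel here, not four */
            png_write_end( png_ptr, info_ptr );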

            It happens that all those variants are poorly tested. I myself had to fix a piece of software that had broken PNG support for some of those variants nobody uses. 99% of the software out there produces RGBA PNGs even when there is no alpha data to store. Just recently I identified a bug in the Python Pillow module (the de-facto standard Python imaging library) that reads a certain grayscale PNG variant as 1-bit (black and white). That's not some random software nobody uses; every Python application handling PNG with that library is affected. But nobody cares, because nobody produces anything other than RGBA PNGs.

            Applications that create PNG files choose a compression level that gives an acceptable response time and a decent compression ratio.
            They don't even choose the compression level. Almost everything produces RGBA with the default compression level.

            And it is not true that PNG files will break if you use advanced optimization options.
            Now that I develop and contribute to software that processes PNG, I understand why Internet Explorer 6 had incomplete PNG support, and I feel compassion for the Internet Explorer developers.

            The resulting file is still a valid PNG, and will be read without problems by all software that uses libpng.


            Yes, it's a valid PNG. No, it will not be read without any problems by all software that uses libpng. The NetRadiant code quoted earlier was using libpng and still had a bug. Why? Because libpng is a library you have to micromanage. I'm not talking about project management, but about how the developer deals with the libpng library in the code he writes: one has to write code to configure libpng for this or that PNG profile.

            One doesn't do:

            Code:
            pixmap = libpng.read(filepath);
            One does:

            Code:
            png_read_info( png_ptr, info_ptr );
            int bit_depth = png_get_bit_depth( png_ptr, info_ptr );
            int color_type = png_get_color_type( png_ptr, info_ptr );
            /* Normalize every color type to RGB so the rest of the loader deals with one layout only. */
            if ( color_type == PNG_COLOR_TYPE_GRAY || color_type == PNG_COLOR_TYPE_GRAY_ALPHA ) {
                png_set_gray_to_rgb( png_ptr );
            } else if ( color_type == PNG_COLOR_TYPE_PALETTE ) {
                png_set_palette_to_rgb( png_ptr );
            }
            /* Expand 1-, 2- and 4-bit grayscale to 8 bits per channel. */
            if ( color_type == PNG_COLOR_TYPE_GRAY && bit_depth < 8 ) {
                png_set_expand_gray_1_2_4_to_8( png_ptr );
            }
            /* Ensure an alpha channel exists: either from a tRNS chunk, or by compositing on a
               background and padding with an opaque filler byte. */
            if ( png_get_valid( png_ptr, info_ptr, PNG_INFO_tRNS ) ) {
                png_set_tRNS_to_alpha( png_ptr );
            } else if ( !( color_type & PNG_COLOR_MASK_ALPHA ) ) {
                png_color_16 my_background = { 0 }, *image_background; /* default to black if the file has no bKGD */
                if ( png_get_bKGD( png_ptr, info_ptr, &image_background ) ) {
                    png_set_background( png_ptr, image_background, PNG_BACKGROUND_GAMMA_FILE, 1, 1.0 );
                } else {
                    png_set_background( png_ptr, &my_background, PNG_BACKGROUND_GAMMA_SCREEN, 0, 1.0 );
                }
                png_set_filler( png_ptr, 0xff, PNG_FILLER_AFTER );
            }
            /* Apply the requested transformations, then re-read the now-normalized image parameters. */
            png_read_update_info( png_ptr, info_ptr );
            color_type = png_get_color_type( png_ptr, info_ptr );
            bit_depth = png_get_bit_depth( png_ptr, info_ptr );
            int width = png_get_image_width( png_ptr, info_ptr );
            int height = png_get_image_height( png_ptr, info_ptr );
            RGBAImage* image = new RGBAImage( width, height );
            And I'm not excluding the possibility that there is a mistake in that code. And if there is, you would not know there is one, nor would you know what the mistake is.

            Choosing the compression level of a PNG is not about turning one knob, it's about turning between 10 and 20 knobs. Most applications don't even turn the first one.
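
            For the record, here are a few of those knobs on the libpng write side (just a sketch, not a recommendation; Z_FILTERED comes from zlib.h):

            Code:
            png_set_compression_level( png_ptr, 9 );                          /* zlib level, 0-9 */
            png_set_compression_strategy( png_ptr, Z_FILTERED );              /* zlib strategy */
            png_set_filter( png_ptr, PNG_FILTER_TYPE_BASE, PNG_ALL_FILTERS ); /* PNG row filters to try */
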
            Last edited by illwieckz; 01 May 2023, 10:07 PM.



            • #36
              Here I produced a list of reference PNG files for the various PNG formats, the ones named `test-*.png`:



              There are only 8 of them because I haven't produced the non-alpha variants, so that's only half of the PNG formats; a complete collection would have 16 of them, if not more in case I missed some. There is not one PNG format, there are at least 16 PNG formats.
              Last edited by illwieckz; 01 May 2023, 10:13 PM.



              • #37
                Originally posted by ll1025 View Post
                and if it helps, you're probably doing something sub-optimal.
                I never said it is the optimal solution, just that there is always another compression algorithm that gives you better results, simply because there is no perfect compression algorithm out there and therefore some redundancy is always left in a compressed file.

                For instance if you use the fastest zip compression against a text file, you will be leaving some compression headroom on the table.
                Sure, and even if you use the best zip compression there is always another algorithm with a bigger search window that compresses better/further.
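
                For instance, deflate (the algorithm behind zip, gzip and PNG) caps its search window at 32 KiB no matter the level, which is one reason formats with larger windows (7z/LZMA, zstd, ...) can still find redundancy it missed. A minimal zlib sketch of that limit (untested):

                Code:
                #include <zlib.h>

                z_stream strm = { 0 };  /* zalloc/zfree/opaque must stay Z_NULL */
                /* level 9, deflate method, 2^15 = 32 KiB window (the maximum allowed),
                   maximum memory level, default strategy */
                deflateInit2( &strm, 9, Z_DEFLATED, 15, 9, Z_DEFAULT_STRATEGY );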

                So in your examples, you probably could have used a higher compression level in PNG and ended with a file smaller than 14.9MB-- or you could have used a better image format.
                That was already GIMP's highest compression level. The fact that there are better algorithms just proves that it is possible to compress further, and if we wait a few years there will be another, even better algorithm.

                Also, if we already have a big list of algorithms that "don't count", how does that still fall under "Data compressed using most compression algos"?

                Originally posted by blackshard View Post
                I did a quick benchmark myself: stripped all metadata out of a JPEG file with GIMP (ICC color profile, thumbnail, EXIF data, etc.), obtaining a 168.6 KB file. After compressing it to .7z, it becomes 168.5 KB.
                Uh, I totally forgot about metadata, but even without metadata my argument still holds: 89 KB -> 77 KB. And that's only logical, because the compression algorithm is pretty old and there should be room for improvement.

                Originally posted by illwieckz View Post
                So, yes, you can compress something compressed, but that just mean that was not really compressed (like 99.999% of PNG in the world are, not really compressed).
                But avis said "Data compressed using most compression algos", not "Data compressed using best possible compression algos with maximum settings".

                In my language "most" means "all with a few special exceptions", but I'm not a native English speaker.



                • #38
                  Originally posted by Anux View Post
                  But avis said "Data compressed using most compression algos", not "Data compressed using best possible compression algos with maximum settings".
                  Yeah, but in the case of PNG, the software doesn't even try to use the actual features of the PNG format. Even without using the best possible compression algorithm with maximum settings, what most software does with PNG is no better than a TGA or a BMP in a zip. That's why, for example, a TGA in a 7z is smaller than a PNG: just because 7z compresses better than zlib, not because 7z is better for images. Almost all software only uses PNG as a custom zip format, no more, and in fact parsing a TGA out of a zip would be easier than parsing a PNG. So the PNG situation is very sad, we get the worst of both worlds: something no better than a TGA in a zip, but with a non-standard zip format. To get something better than a TGA in a zip, one has to use PNG features, and almost no software does.

                  Anyway, there are usually ways left to compress things that are already compressed. For example, a zip archive compresses files separately, so if you put the same file into a zip 3 times, it is stored 3 times compressed and not deduplicated. If you zip the zip, you can actually deduplicate those compressed copies. Other tricks are crazier: setting all files in a zip to the same date and time lets you save more space when compressing the zip with zip, and the best result is obtained if you set the file date and time within the zip to 0 (the zip epoch) and zip the zip.



                  • #39
                    Originally posted by illwieckz View Post
                    ... just because 7z compresses better than zlib, not because 7z is better for images ...
                    But independent of implementation details, this just shows that most compressed files can actually be compressed further. Anyone can do a simple test: take all your data (a large enough sample size), compress it with zstd or zlib, then compress that archive with 7z (max settings), and you will see a small improvement. I'm not saying one should do this, just use the better algorithm to begin with, or if you need a certain algorithm for compatibility, by all means use that one.

                    Only the current best algorithms in their fields are able to compress without further possible improvement (JXL for lossless images or PPMd for text files), and that's far from most. I'm especially not arguing that PNG is the best.



                    • #40
                      Originally posted by ayumu View Post
                      Interesting there's LZOP and ZSTD but no LZ4.
                      It's just a convenience option; you can always use `tar --use=` or `tar ... | ...` anyway.

