Zlib "Next Generation" Preparing Massive Decompression Speed-Up


  • #41
    Originally posted by illwieckz View Post

You're just revealing that you don't know where PNG efficiency comes from. That's fine; no one is required to know unless they are developing PNG processing software. I'm not talking about increasing the zlib compression level (like -5 or -9).

Most of PNG's compression efficiency doesn't come from zlib and its compression level.

PNG's compression efficiency comes from the fact that it provides something like 16 (I'm too lazy to count) different profiles that work better or worse depending on the input. For the colors you can choose between 1-bit, 2-bit, 4-bit, and 8-bit palettes, 1-bit or 8-bit grayscale, or 24-bit RGB; then for the alpha channel, 1-bit alpha, 8-bit alpha, or none. Just for images with an alpha channel there are 8 combinations, and I haven't counted the combinations without one. Picking the right profile for the data is not like running zlib with -9, but that's what gives you a 25% or 50% bump in compression.
I'm not saying libpng will choose; I'm saying you will choose, as a developer writing software with libpng.
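For illustration only (this sketch is not from the original post): on the write side, "choosing the profile" boils down to what you pass to png_set_IHDR(). The helper below writes 8-bit grayscale; the function name and setup are hypothetical, and error checks are trimmed.

Code:
#include <png.h>
#include <setjmp.h>
#include <stdio.h>

/* Write `height` rows of `width` 8-bit grayscale samples to `path`. */
static int write_gray8( const char *path, const unsigned char *pixels, int width, int height ) {
    FILE *fp = fopen( path, "wb" );
    if ( !fp ) return -1;

    png_structp png_ptr = png_create_write_struct( PNG_LIBPNG_VER_STRING, NULL, NULL, NULL );
    png_infop info_ptr = png_create_info_struct( png_ptr );

    if ( setjmp( png_jmpbuf( png_ptr ) ) ) { /* libpng reports errors via longjmp */
        png_destroy_write_struct( &png_ptr, &info_ptr );
        fclose( fp );
        return -1;
    }

    png_init_io( png_ptr, fp );

    /* The profile choice: 8-bit grayscale, no alpha. Passing
       PNG_COLOR_TYPE_RGB_ALPHA here would store the same picture in
       four times as many bytes before compression even starts. */
    png_set_IHDR( png_ptr, info_ptr, width, height, 8,
                  PNG_COLOR_TYPE_GRAY, PNG_INTERLACE_NONE,
                  PNG_COMPRESSION_TYPE_DEFAULT, PNG_FILTER_TYPE_DEFAULT );
    png_write_info( png_ptr, info_ptr );

    for ( int y = 0; y < height; y++ ) {
        png_write_row( png_ptr, (png_const_bytep)( pixels + (size_t)y * width ) );
    }

    png_write_end( png_ptr, NULL );
    png_destroy_write_struct( &png_ptr, &info_ptr );
    fclose( fp );
    return 0;
}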

It happens that all those variants are poorly tested. I myself had to fix a piece of software that had broken PNG support for some of those variants no one uses. 99% of the software out there produces RGBA PNGs even when there is no alpha data to store. Just recently I identified a bug in the Python Pillow module (the de facto standard Python imaging library) that reads some grayscale PNG variants as 1-bit (black-and-white). That's not some random software no one uses; all Python applications handling PNG with that library are affected. But no one cares, because no one produces anything other than RGBA PNGs.



They don't even choose the compression level. Almost everything writes RGBA with the default compression level.



Now that I develop or contribute to software that processes PNG, I understand why Internet Explorer 6 had incomplete PNG support, and I feel some compassion for the Internet Explorer developers.



Yes, it's a valid PNG. No, it will not be read without problems by all software that uses libpng. The NetRadiant software quoted earlier was using libpng but had a bug. Why? Because libpng is a micromanaged library. I'm not talking about project management, but about how the developer has to deal with libpng in the software they write: one has to write code to configure libpng for this or that PNG profile.

    One doesn't do:

    Code:
    pixmap = libpng.read(filepath);
    One does:

    Code:
png_read_info( png_ptr, info_ptr );
int bit_depth = png_get_bit_depth( png_ptr, info_ptr );
int color_type = png_get_color_type( png_ptr, info_ptr );

/* Expand grayscale and palette images to RGB. */
if ( color_type == PNG_COLOR_TYPE_GRAY || color_type == PNG_COLOR_TYPE_GRAY_ALPHA ) {
    png_set_gray_to_rgb( png_ptr );
} else if ( color_type == PNG_COLOR_TYPE_PALETTE ) {
    png_set_palette_to_rgb( png_ptr );
}

/* Expand 1-, 2- and 4-bit grayscale to 8 bits per channel. */
if ( color_type == PNG_COLOR_TYPE_GRAY && bit_depth < 8 ) {
    png_set_expand_gray_1_2_4_to_8( png_ptr );
}

/* Turn tRNS transparency into a full alpha channel, or synthesize
   an opaque alpha channel when the file has no alpha at all. */
if ( png_get_valid( png_ptr, info_ptr, PNG_INFO_tRNS ) ) {
    png_set_tRNS_to_alpha( png_ptr );
} else if ( !( color_type & PNG_COLOR_MASK_ALPHA ) ) {
    png_color_16 my_background = { 0 }; /* fallback: opaque black */
    png_color_16 *image_background;

    if ( png_get_bKGD( png_ptr, info_ptr, &image_background ) ) {
        png_set_background( png_ptr, image_background, PNG_BACKGROUND_GAMMA_FILE, 1, 1.0 );
    } else {
        png_set_background( png_ptr, &my_background, PNG_BACKGROUND_GAMMA_SCREEN, 0, 1.0 );
    }

    png_set_filler( png_ptr, 0xff, PNG_FILLER_AFTER );
}

png_read_update_info( png_ptr, info_ptr );
color_type = png_get_color_type( png_ptr, info_ptr );
bit_depth = png_get_bit_depth( png_ptr, info_ptr );

int width = png_get_image_width( png_ptr, info_ptr );
int height = png_get_image_height( png_ptr, info_ptr );
RGBAImage* image = new RGBAImage( width, height );
And I'm not excluding the possibility that there is a mistake in that code. And if there is, you would not know that there is one, nor what the mistake is.

Choosing the compression level of a PNG is not about turning one knob; it's about turning between 10 and 20 knobs. Most applications don't even turn the first one.
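As an aside (illustrative only; the values are arbitrary examples, not recommendations), here are a few of the write-side knobs stock libpng exposes on top of the profile choice itself:

Code:
#include <png.h>
#include <zlib.h>

static void tune_png_writer( png_structp png_ptr ) {
    png_set_compression_level( png_ptr, 9 );              /* zlib level 0..9 */
    png_set_compression_strategy( png_ptr, Z_FILTERED );  /* zlib strategy */
    png_set_compression_mem_level( png_ptr, 8 );          /* zlib memory/speed trade-off */
    png_set_compression_window_bits( png_ptr, 15 );       /* zlib window size (8..15) */

    /* Which row filters the encoder is allowed to try on each scanline. */
    png_set_filter( png_ptr, PNG_FILTER_TYPE_BASE,
                    PNG_FILTER_NONE | PNG_FILTER_SUB | PNG_FILTER_UP |
                    PNG_FILTER_AVG | PNG_FILTER_PAETH );
}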
    Everything you describe is a fault of application programmers, not of PNG compression.
The format and the PNG library are well defined and tested in all their options.
If some application does not support writing 1-bit images, or 8-bit images with an alpha channel, that is a problem of the application. Or maybe not, since each application has a purpose and developers may choose to ignore some variants when saving. It is easy to upscale everything to RGBA during loading, but on saving it is questionable whether the application should do image depth conversion if that depth is not natively supported within the application.

In a typical graphics application, the program internally supports a couple of different image formats (for example: 8-bit colormap, 8-bit grayscale, 24-bit RGB, 32-bit RGBA...). Some functionality is available for certain image depths, other functionality is not. Some applications may choose to support only RGBA, to simplify the code.
Then the program outputs the image in various formats like PNG, TIFF, GIF, JPEG, WebP... each with its own color space and depth limitations. Supporting every option in every format is usually not feasible; we support the typical use cases, which benefit users and fit the application's internal image model.
In my software, there is an option to post-process the saved PNG with optipng after saving. That way users get reasonably fast saving and small archived files.
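A minimal sketch of that kind of post-processing hook (hypothetical helper, not the actual application code; real code would need proper quoting and error handling):

Code:
#include <stdio.h>
#include <stdlib.h>

/* Hand the freshly written file to optipng so saving stays fast
   while the archived file ends up small. */
static void optimize_saved_png( const char *path ) {
    char cmd[4096];
    /* -o2 is a moderate optimization level. */
    snprintf( cmd, sizeof( cmd ), "optipng -quiet -o2 \"%s\"", path );
    system( cmd );
}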

I also use zlib to compress undo steps. Here, top-level compression is not necessary; we need reasonable compression and fast response times. File size is not the most important thing, since undo steps are deleted on exit.
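A hedged sketch of what that can look like with plain zlib (hypothetical function, not the actual application code): a fast level keeps the UI responsive, and the undo data is thrown away on exit anyway.

Code:
#include <zlib.h>
#include <stdlib.h>

/* Compress an undo buffer with a fast zlib level, trading a bit of
   ratio for latency. Returns a malloc'd buffer or NULL on failure. */
static unsigned char *compress_undo_step( const unsigned char *data, size_t size, size_t *out_size ) {
    uLongf bound = compressBound( (uLong)size ); /* worst-case output size */
    unsigned char *out = malloc( bound );
    if ( !out ) return NULL;

    if ( compress2( out, &bound, data, (uLong)size, Z_BEST_SPEED ) != Z_OK ) {
        free( out );
        return NULL;
    }

    *out_size = (size_t)bound;
    return out;
}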

So we must view compression in the context of application usage.
    Last edited by dpeterc; 03 May 2023, 11:17 AM.



    • #42
      Originally posted by stiiixy View Post
      So, anyone bothered to benchmark this shit and post a link here? I found nothing on OpenBenchmarking (OBM).
So, I might be wrong about this, but I think the PTS is meant to test hardware against a shared set of benchmarks, not to compare different versions of software. I contribute code to zlib-ng and would be happy to provide some benchmarks/comparisons if someone provided me with a means of submitting that data so it conveys which particular fork of zlib was used.

I'll say that for the average case (and on x86) you're usually twice as fast or more for both decompression and compression, but it's all very data dependent. Apart from aarch64, for which we have a potential enhancement from someone else, we might actually have the fastest adler32 checksum implementation of anything I've seen out there.
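If anyone wants a rough, do-it-yourself comparison (hypothetical harness, not an official zlib-ng benchmark): link the same tiny program against stock zlib and against zlib-ng built in zlib-compat mode, and time the checksum over a large buffer.

Code:
#include <zlib.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main( void ) {
    size_t len = 256u * 1024u * 1024u; /* 256 MiB of synthetic data */
    unsigned char *buf = malloc( len );
    if ( !buf ) return 1;
    for ( size_t i = 0; i < len; i++ ) {
        buf[i] = (unsigned char)( ( i * 2654435761u ) >> 24 );
    }

    uLong sum = adler32( 0L, Z_NULL, 0 ); /* initial checksum value */
    clock_t t0 = clock();
    sum = adler32( sum, buf, (uInt)len );
    clock_t t1 = clock();

    printf( "adler32 = %08lx, %.3f s\n", sum, (double)( t1 - t0 ) / CLOCKS_PER_SEC );
    free( buf );
    return 0;
}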



      • #43
        Originally posted by avis View Post
        Nice improvements but the core algorithm is outdated and ZSTD nowadays runs circles around zlib. It has basically made zlib obsolete.

        I wonder if the PNG standard could add ZSTD compression. That would be fantastic.

Actually someone did that over five years ago but it hasn't gained any traction: https://github.com/catid/Zpng
That's actually pretty cool; it's unfortunate it didn't catch on as a new image standard. What's a bit interesting about that project is that he seems to have written a new, perhaps more intelligent compression filter. The filters in libpng have always been one of the slower parts of compression/decompression, but it looks like this one might not be bad.

That having been said, there are other, more popular lossless image formats out there that are also intended as PNG replacements, and they haven't really garnered that much adoption either (e.g. WebP, Apple's format, FLIF, AVIF). Not that all of these are similar; they all have their pros and cons. That first-adopter momentum and ubiquity is hard to overcome.



        • #44
          Originally posted by dpeterc View Post
          Everything you describe is a fault of application programmers, not of PNG compression.
Yes, totally; it just happens that I cannot trust any PNG application before actually testing it. That's just a fact.

It just happens that the world almost never uses non-RGB/RGBA PNG variants, and so applications are almost never tested with the other variants.

Something that remains, though, is that libpng may not be that convenient to use if so few people know how to use it properly, compared to the very large amount of software that supports PNG (even through libpng). But that's not really the topic of PNG not being used properly (though it may explain it).

The libpng library may have no bugs, and the PNG format may not be that bad on paper, but it happens that the real world almost never uses the PNG format and the libpng library in an efficient way, and very often uses them in a buggy way. I don't care about the hypothetical world the PNG format and library could offer; I care about the actual world we live in.

The format and the PNG library are well defined and tested in all their options.
This has nothing to do with PNG applications. libpng may have no bugs and the format variants may be well tested (no one said the contrary); there will still be many buggy applications out there, and it will still be true that almost no one outside of the libpng people is using, and therefore testing, the non-RGB/RGBA variants. No one said the PNG format is not well defined, nor that libpng is not tested in all its options. The PNG format being well defined and the PNG library being properly tested can't help with the problem if the people using the format or the library don't care to begin with.



          • #45
            Originally posted by Anux View Post
I never said it is the optimal solution, just that there is always another compression algorithm that gives you better results, simply because there is no perfect compression algorithm out there and therefore some redundancy is always left in a compressed file.


Sure, and even if you use the best zip compression, there is always another algorithm with a bigger search window that compresses better/further.


That was already GIMP's highest compression level. The fact that there are better algorithms just proves that it is possible to compress further, and if we wait a few years there will be another, even better algorithm.

Also, if we already have a big list of algorithms that "don't count", how does that still fall under "Data compressed using most compression algos"?


Uh, I totally forgot about metadata, but even without metadata my argument still holds: 89 kB -> 77 kB, and that's only logical, because the compression algorithm is pretty old and there should be room for improvement.


But avis said "Data compressed using most compression algos", not "Data compressed using best possible compression algos with maximum settings".

In my language, "most" means "all, with a few special exceptions", but I'm not a native English speaker.
            This is really needless pedantry.

If you go to any storage vendor and ask whether you should turn on dedup and compression for a large stack of compressed video files, they will emphatically say no, because it will probably provide negligible benefit, if not an outright size increase, at non-trivial computational cost. And if you asked that question for a huge array of OOXML documents (which are essentially zlib-compressed XML files), again, you're looking at a size increase if the system actually tries to compress them.

            "Compressed data is incompressible" is not some new concept-- it's a pretty widely understood rule of thumb. That does not mean there are no circumstances where the data can be compressed a bit more, but as I said-- in those cases, if you're planning to process those files, you're far better off transcoding them into the better compression algorithm.

The problem is that lossy-compressed files which might have some headroom will only remain usable if you use a video codec (rather than an archive / tarball), and you're probably only going to see worthwhile gains with another, more advanced lossy compression algorithm, which will further compromise video quality. And lossless compression is often going to be applied to highly compressible data like text, where there is very little gain in going to e.g. 7z over zlib -- and if there are gains, you should just extract and recompress and avoid the double compression, and you'll get both better decompression performance *and* better compression.

There are just very few practical applications for double compression, other than the narrow use case of a very large amount of truly archival data that cannot be transcoded and that you do not need to access, where you might benefit from packing all of your video files into a bzip2 archive at maximum settings.
            Last edited by ll1025; 05 May 2023, 09:04 AM.



            • #46
              Originally posted by ll1025 View Post
              ... you're far better off transcoding them into the better compression algorithm.
That's what I said earlier. But I think I've said everything, and the topic is not important enough to endlessly fight over nuances.

              An even more interesting thing on the compression front:
              Modern games store material data in multiple files. NVIDIA says, let's just mash it all together and have it look better in the process.


And I think they're also working on a video codec with similar technology.



              • #47
                Originally posted by ilikerackmounts View Post

So, I might be wrong about this, but I think the PTS is meant to test hardware against a shared set of benchmarks, not to compare different versions of software. I contribute code to zlib-ng and would be happy to provide some benchmarks/comparisons if someone provided me with a means of submitting that data so it conveys which particular fork of zlib was used.

I'll say that for the average case (and on x86) you're usually twice as fast or more for both decompression and compression, but it's all very data dependent. Apart from aarch64, for which we have a potential enhancement from someone else, we might actually have the fastest adler32 checksum implementation of anything I've seen out there.
It can still be used to benchmark different software versions. You just need to run the different tests on the same hardware, as if you were testing the hardware. But PTS might be a bit overkill.

