New Compression Codecs Risk Making Zlib Obsolete

  • #21
    Originally posted by curaga View Post
    I use zlib in a lot of memory-constrained places. Brotli with its huge dictionary is completely out of the question; I'll reserve judgement on BitKnit until I can see the source. But so far it looks like zlib won't be obsolete any time soon.
    I know! I was pleasantly surprised by its performance. Granted, lz4 is faster to compress, but the rather significant size difference may mean that, for a typical user system, the performance is a wash... which justifies the btrfs devs' decision not to include lz4, yet.
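    For what it's worth, here is a minimal sketch of what a reduced-footprint zlib decode path can look like. The function and buffer names are mine, and it assumes the stream was deflated with a matching windowBits <= 9:

    /* Sketch: shrinking zlib's inflate footprint on a constrained target.
     * Assumes the data was produced with a matching reduced window
     * (deflateInit2 with windowBits <= 9). */
    #include <string.h>
    #include <zlib.h>

    int tiny_inflate(const unsigned char *src, unsigned int src_len,
                     unsigned char *dst, unsigned int dst_cap)
    {
        z_stream strm;
        memset(&strm, 0, sizeof(strm));

        /* windowBits = 9 -> 512-byte sliding window instead of the default 32 KiB */
        if (inflateInit2(&strm, 9) != Z_OK)
            return -1;

        strm.next_in   = (unsigned char *)src;
        strm.avail_in  = src_len;
        strm.next_out  = dst;
        strm.avail_out = dst_cap;

        int ret = inflate(&strm, Z_FINISH);
        inflateEnd(&strm);
        return ret == Z_STREAM_END ? (int)(dst_cap - strm.avail_out) : -1;
    }

    The compression side has to cooperate (deflateInit2 with the same small windowBits), otherwise inflate will reject the stream.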

    Comment


    • #22
      Originally posted by tillschaefer View Post
      Well, decompression throughput might be important in many situations like package distribution, i.e. compress once, decompress many times. However, compression speed is not irrelevant for me. E.g. I am regularly compressing data from computer experiments. In this situation I compress more often than I decompress, as I generate a lot of data that is not always needed in the end.

      -> Talking about zlib being obsolete without giving compression speeds (which vary a lot more than decompression speeds) seems a bit over the top.
      That's a poor example. Compress once/decompress many times is exactly when you want to maximize compression. Decompression throughput matters more for something like Linux's transcendent memory cache.

      Comment


      • #23
        Originally posted by Imroy View Post

        Almost every file/stream type that uses a lot of bandwidth is already compressed, e.g. audio, video, archives.
        All, save the last (well, maybe... it depends on the context), are lossy. If you allow lossy compression, then maybe, even for documents, we should use it when the loss is something humans, or machine learning, can easily recover (dropping vowels, in many cases, can produce unambiguous, but still lossy, results).

        Comment


        • #24
          Originally posted by SystemCrasher
          @curaga btw, which constrained environments do you target? Some algos do not even need memory to decompress, you only need room for source and destination. This can work even in ultra-low-memory cases like microcontrollers. But that doesn't apply to zlib, it's more bloated...
          Microcontrollers and old consoles. I've tried many algos, and when you can afford ~1kb ram, zlib is the best. When you can't, LZO seems best.
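          To illustrate the "no work memory to decompress" point with something real, here is a minimal miniLZO sketch (names and buffer sizes are placeholders): lzo1x decompression takes a wrkmem argument but ignores it, so only the compression side needs the LZO1X_1_MEM_COMPRESS scratch buffer.

          /* Decode-only use of miniLZO: no scratch memory beyond src/dst. */
          #include "minilzo.h"

          int decode_blob(const unsigned char *comp, lzo_uint comp_len,
                          unsigned char *out, lzo_uint out_cap)
          {
              if (lzo_init() != LZO_E_OK)      /* one-time library init */
                  return -1;

              lzo_uint out_len = out_cap;      /* in: capacity, out: actual size */
              if (lzo1x_decompress_safe(comp, comp_len, out, &out_len, NULL) != LZO_E_OK)
                  return -1;
              return (int)out_len;
          }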

          Comment


          • #25
            Originally posted by liam View Post
            All, save the last (well, maybe... it depends on the context), are lossy. If you allow lossy compression, then maybe, even for documents, we should use it when the loss is something humans, or machine learning, can easily recover (dropping vowels, in many cases, can produce unambiguous, but still lossy, results).
            If it's lossy, you cannot recover it. You can just do without. But I don't see how that's relevant with respect to "you cannot meaningfully compress audio and video streams further".

            Comment


            • #26
              Originally posted by erendorn View Post
              If it's lossy, you cannot recover it. You can just do without. But I don't see how that's relevant with respect to "you cannot meaningfully compress audio and video streams further".
              Lossy just means you can't mathematically recover all the info with complete confidence. Humans are pretty good at filling in blanks IN CERTAIN SITUATIONS. That was the only point I was trying to make there.
              I only mentioned this because the poster tossed in two fundamentally different types of compression methods. If you are looking at lossy compression, there's no agreed-upon floor as to how low you can take it. Lossless encoding, however, is a well-defined field.

              Comment


              • #27
                It looks like Facebook is interested in Brotli. Most of my roommate's mobile data usage comes from Facebook, so hopefully this helps.
                Last edited by My8th; 21 January 2016, 03:35 AM.

                Comment


                • #28
                  Originally posted by curaga View Post
                  Microcontrollers and old consoles. I've tried many algos, and when you can afford ~1kb ram, zlib is the best. When you can't, LZO seems best.
                  Ahh, I like these use cases. I can point to a couple of things I've found that are handy in this regard.

                  1) LZ5. It's just nice: a reworked LZ4 with a smarter bitstream and a larger window, plus more aggressive match finding on the high levels (the -dev branch has a hidden surprise, aka even better compression ratios on the highest levels). On most data its highest levels get close to zlib/UCL ratios, and on large chunks it can even beat zlib in ratio thanks to the larger window. Decompression "takes no memory", in the sense that you do not need anything except source and destination; the rest can be register math (see the decoder sketch after this post). It both compresses better than LZO and tends to decompress faster, so you get UCL-like ratios at 2-3x the speed. If you're a real maniac you can even code the decompressor in assembly, since the data format is quite simple. And it is FAST to decompress: there is no Huffman, almost no state, etc., so the C version beats zlib by something like 2-3x in speed. If you understand what that means: it is a byte-aligned LZ capable of beating many bit-aligned LZs and even sometimes taking on zlib.

                  Caveat: zlib would do better on e.g. real-world, photo-like graphics, because plain LZ sucks on that kind of data without preprocessing, while entropy coding like Huffman can still squeeze out a little bit thanks to the non-uniform distribution. But on most redundant data it reaches similar ratios, is much faster to decompress, and doesn't really need any state except the destination buffer.

                  2) Want even more ratio, while still caring about trivial/fast decompression? Ask GitHub about LZOMA. As the name suggests, it gets close to LZMA ratios while staying not that far from LZO speed in decompression. It even has an assembly decompressor. Preying on zlib is boring: this thing beat Brotli level 8 when I compressed the Linux kernel, and it lacks entropy coding, unlike Brotli. Not to mention the much smaller decompressor; IIRC it also requires no memory except source and destination.

                  Caveat: it is hardcore WIP, subject to change, and very experimental. But the sight of a plain LZ with a little cheating in the bitstream beating not just zlib but most Brotli levels looks impressive to those who understand what that means. The compressor is also slow and a memory hog, due to very thorough match finding. Still, it's amazing to see the old LZ idea reaching new heights here and there.

                  ...these are the two things I've spotted while caring about somewhat similar use cases.
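                  To make the "needs nothing but source and destination" point concrete, here is a toy byte-aligned LZ decoder. The format is made up for illustration (it is NOT the actual LZ5 or LZ4 bitstream): a command byte with the top bit clear means "copy cmd+1 literal bytes"; with the top bit set it means "copy (cmd & 0x7F) + 3 bytes from the output at the 16-bit little-endian offset that follows". The whole decode state fits in a few pointers:

                  #include <stddef.h>

                  /* Toy byte-aligned LZ decoder -- illustrative format only. */
                  size_t toy_lz_decode(const unsigned char *src, size_t src_len,
                                       unsigned char *dst)
                  {
                      const unsigned char *end = src + src_len;
                      unsigned char *out = dst;

                      while (src < end) {
                          unsigned cmd = *src++;
                          if (!(cmd & 0x80)) {              /* literal run */
                              size_t n = (size_t)cmd + 1;
                              while (n--) *out++ = *src++;
                          } else {                          /* back-reference into dst */
                              size_t n   = (size_t)(cmd & 0x7F) + 3;
                              size_t off = (size_t)src[0] | ((size_t)src[1] << 8);
                              src += 2;
                              const unsigned char *from = out - off;
                              while (n--) *out++ = *from++; /* byte copy handles overlap */
                          }
                      }
                      return (size_t)(out - dst);           /* decompressed size */
                  }

                  The real LZ4/LZ5 token layouts are more involved, but the shape of the decode loop is the same.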

                  Comment


                  • #29
                    Thanks for the tips, will check lzoma out. About LZ5, I actually tested LZ5HC on a NES this morning. The compression ratio was worse than LZO or zlib, and it decompressed slower than anything else I've tried. Not too impressed, sadly.

                    Comment


                    • #30
                      Originally posted by curaga View Post
                      Thanks for the tips, will check lzoma out. About LZ5, I actually tested LZ5HC on a NES this morning. The compression ratio was worse than LZO or zlib, and it decompressed slower than anything else I've tried. Not too impressed, sadly.
                      NP, that can happen: different algos perform in different ways, and depending on the nature of the data and/or the properties of your target system an algo can perform surprisingly well or surprisingly badly, or just give you "strange" results. E.g. I've managed to "invert" zlib's levels, getting the best ratio on level 1, the worst on level 9 and mid-range on 6, so level 1 both compressed better and faster. Sounds unusual, eh? You can take a look at the lzbench tool on GitHub, by the same author as LZ5. It lets you throw a bunch of LZ-based algos at YOUR data at once and see how they perform. Some of the algos are experimental/uncommon, but they can be fun to look at for comparison against other things. Also, in microcontrollers RAM is usually really low, and wasting 1K on compression state doesn't sound very inspiring to me. Though maybe you're using something with external RAM, or high-end parts with plenty of RAM. Can you describe the hardware?
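                      If you want to reproduce that kind of level comparison on your own data, a throwaway check is easy with zlib's one-shot API (the input file name is just a placeholder):

                      /* Compress the same buffer at zlib levels 1, 6 and 9 and compare sizes. */
                      #include <stdio.h>
                      #include <stdlib.h>
                      #include <zlib.h>

                      int main(void)
                      {
                          FILE *f = fopen("mydata.bin", "rb");   /* placeholder input */
                          if (!f) return 1;
                          fseek(f, 0, SEEK_END);
                          long n = ftell(f);
                          rewind(f);

                          unsigned char *in = malloc((size_t)n);
                          if (fread(in, 1, (size_t)n, f) != (size_t)n) return 1;
                          fclose(f);

                          uLong cap = compressBound((uLong)n);
                          unsigned char *out = malloc(cap);

                          int levels[3] = { 1, 6, 9 };
                          for (int i = 0; i < 3; i++) {
                              uLongf out_len = cap;
                              if (compress2(out, &out_len, in, (uLong)n, levels[i]) == Z_OK)
                                  printf("level %d: %ld -> %lu bytes\n",
                                         levels[i], n, (unsigned long)out_len);
                          }
                          free(in);
                          free(out);
                          return 0;
                      }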

                      I'm also curious to take a look at the data where LZ5 manages to lose to LZO in ratio and/or speed, can you give an example? Because... uhm, I've fed quite assorted kinds of data into it, ranging from bitmaps to TTF fonts, ARM machine code and many other strange or fancy things, and LZO lost in virtually all cases. But I've probably used different settings and/or data, etc. E.g. which block size were you using / are you technically capable of? Sounds like it could be fun to try compressing such data, it seems I could get yet more unusual results.

                      As for LZOMA, it is WIP, but it can put on a show in terms of ratio, especially if you can afford some headroom for block size. It uses a smart trick to represent specific kinds of matches in a compact way, and its match finding is pushed to the limits. On most data it has proven to be the best "entropyless" LZ I've ever seen; e.g. when compressing the Linux kernel it beat Brotli 8, while the decoder is nowhere close to Brotli in size or memory demands. And in cases like this one HAS to add the decoder size to the data size to keep things fair. And if the decompressor lives in e.g. a boot loader, there are quite some technical limits on total size and/or memory operations. Uhm, well, at the end of the day I don't mind having fun in some "strange" environments.

                      Comment
