New Compression Codecs Risk Making Zlib Obsolete

  • #11
    Originally posted by bug77 View Post
    Just look at the assault of data caps. Devices will pretty much have to compress before sending, if ISPs have their way. Thus the ability to compress fast will not go away.
    Do you seed torrents from your device? Most traffic in the world is inbound.


    • #12
      Originally posted by pal666 View Post
      Do you seed torrents from your device? Most traffic in the world is inbound.
      Not when every kid can upload a video of themselves crossing the street. Well, technically the video can only be uploaded once and can be seen several times, but you get my drift.


      • #13
        Originally posted by tillschaefer View Post
        Well, decompression throughput might be important for many situations like package distribution, i.e. compress once, decompress many times. However, compression speed does not seem irrelevant to me. E.g. I am regularly compressing data from computer experiments. In this situation I compress more often than I decompress, as I generate a lot of data that is not always needed in the end.

        -> talking about obsolete zlib without giving the compression speeds (which vary a lot more than decompression speeds) seems a bit over the top.
        You might find zstd (zstandard) interesting, as it seems to offer high compression and decompression speed while offering decent compression ratios (zlib level).


        • #14
          Originally posted by bug77 View Post
          Just look at the assault of data caps. Devices will pretty much have to compress before sending, if ISPs have their way. Thus the ability to compress fast will not go away.
          Almost every file/stream type that uses a lot of bandwidth is already compressed, e.g. audio, video, archives.


          • #15
            Originally posted by bug77 View Post
            Just look at the assault of data caps. Devices will pretty much have to compress before sending, if ISPs have their way. Thus the ability to compress fast will not go away.
            Just about every large file type or high-bandwidth stream type is already compressed, e.g. audio, video, archives. They'll always improve their compression and/or encoding (for lossy types), but incrementally. You're not going to be able to suddenly double your effective cap, because data that is already compressed doesn't tend to compress much further on a second/third/etc. try.
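The point about already-compressed data can be checked with a small experiment. This is only a sketch using Python's standard `zlib` module; the helper names `compress_twice` and `sample_text` and the made-up vocabulary are assumptions for illustration, not anything from the thread.

```python
import random
import zlib


def sample_text(n_words: int = 20000, seed: int = 0) -> bytes:
    """Build repetitive, text-like input from a small made-up vocabulary."""
    rng = random.Random(seed)
    vocab = ["zlib", "brotli", "zstd", "lz4", "ratio", "speed", "window", "codec"]
    return " ".join(rng.choice(vocab) for _ in range(n_words)).encode()


def compress_twice(data: bytes, level: int = 6) -> tuple:
    """Return (original size, size after one pass, size after a second pass)."""
    once = zlib.compress(data, level)
    # The second pass sees the high-entropy output of the first pass,
    # so there is very little redundancy left to remove.
    twice = zlib.compress(once, level)
    return len(data), len(once), len(twice)


if __name__ == "__main__":
    raw, once, twice = compress_twice(sample_text())
    print(f"raw={raw} bytes, one pass={once} bytes, two passes={twice} bytes")
```

On text-like input the first pass should shrink the data a lot, while the second pass should leave the size roughly unchanged.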


            • #16
              Originally posted by milkylainen View Post
              I don't think zlib will be obsolete anytime soon either.
              It is well established, implemented and vetted code. Gains are still minor and not without trade-offs.
              I think the author forgets why something becomes a standard.
              ...
              It is hard to appreciate just how ubiquitous zlib really is.
              A lot of protocols, data compressions, boot compressions, stream compressions etc. Just about everything, everywhere relies on zlib. It is not going away anytime soon.

              Technically, brotli is already part of at least one standard, it's part of webfonts.

              Algorithms go obsolete? What does that mean? That people stop using it or people stop understanding it or what? There are hardware implementations of deflate as well as some IETF specifications, it's not going to stop being used for a long while, probably never.

              There are trade-offs, but what has changed with some recent innovations is that the outright performance of competitors is beating zlib/deflate. ZStd is doing the same thing, dramatically faster than zlib/deflate (I think it's aiming at deflate levels 3 to 6) and better compression.

              gzip -9 compresses the Linux 4.4 source tree to 127M in 30s on my workstation
              brotli level 6 compresses it to 104M in 23s
              brotli level 7 compresses it to 101M in 30s. Of course brotli uses more memory, that's the trade-off, but in many applications memory is fairly plentiful and this is text, which Brotli is optimized for. These are interesting trade-offs though, not sure I'd call them minor gains. The time seems to go up pretty dramatically with levels 10 and 11 and the payback is pretty incremental, but the decompression times are all similar. ZStd has a pretty small memory footprint though.

              What is fascinating, as someone who's been a compression dork for a very long time, is that there seems to be more interest and more actual engineering effort being put in now than ever before. Apple has LZFSE, Google has a family of "bread" algorithms; those are corporations that have paid people on staff working on compression. That people even know LZMA, LZHAM, etc. is incredible to me. It really seems to go to Yann Collet's work with FSE.


              • #17
                Originally posted by Nelson View Post
                It really seems to go to Yann Collet's work with FSE
                And to Jarek Duda for inventing ANS.


                • #18
                  Originally posted by tillschaefer View Post
                  However, the compression speed seems not irrelevant for me.
                  Compression speed would matter when what you are compressing is "realtime", e.g. a protocol. This is what LZ4 was designed for. Even LZ4HC has significantly better compression time than zlib. It has its uses (but is of course not as good at compressing).

                  https://github.com/Cyan4973/lz4

                  You would switch to something like LZ4 if you really need to compress and decompress as fast as possible and you are willing to give up some compression in return. When performance matters and you only have a few CPU cycles to go around, you often make such compromises.
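As a rough sketch of what "realtime" protocol compression looks like, the example below compresses one message at a time over a shared stream. It uses Python's standard `zlib` at level 1 as a stand-in, since LZ4 bindings are not in the standard library; `make_stream_pair` is a made-up helper name and the messages in the usage note are invented.

```python
import zlib


def make_stream_pair(level: int = 1):
    """Return (send, receive) functions that share one compression stream."""
    compressor = zlib.compressobj(level)
    decompressor = zlib.decompressobj()

    def send(message: bytes) -> bytes:
        # Z_SYNC_FLUSH pushes out everything buffered so far, so the peer
        # can decode this message immediately instead of waiting for more.
        return compressor.compress(message) + compressor.flush(zlib.Z_SYNC_FLUSH)

    def receive(chunk: bytes) -> bytes:
        # The decompressor keeps its history window between calls.
        return decompressor.decompress(chunk)

    return send, receive


if __name__ == "__main__":
    send, receive = make_stream_pair()
    for msg in [b"GET /index.html HTTP/1.1\r\n", b"GET /index.html HTTP/1.1\r\n"]:
        wire = send(msg)
        print(len(msg), "->", len(wire), receive(wire) == msg)
```

Every message costs one compress call on the sender, which is why compression speed matters more here than in the compress-once, decompress-many case.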


                  • #19
                    Sorry, but replacing zlib is not as easy as it sounds...

                    1) Licensing. Zlib comes with a very liberal license, so browsers, servers and virtually everyone can use it, for any purpose. Some proprietary stuff from RAD Game Tools would not do. It can be good, etc., but as long as it is proprietary, it doesn't have a snowball's chance in hell of being seen everywhere. Who would put a proprietary algo on a server? Or in a browser? That's where the plan to take over the world comes to a halt.

                    2) Zlib has long been obsolete. It uses a small 32 KiB dictionary. But overall it has been optimized a lot, so the scheme provides quite reasonable compression/decompression speed at not-so-bad ratios. And the data format is quite flexible. E.g. SLZ achieves faster compression with a clever trick that avoids doing Huffman at all. That costs some ratio but offloads the server. And e.g. Brotli is much slower at compression. On ARM, LZHAM and Brotli are much slower than zlib to decompress, so even the extra ratio is little fun: you lose 2x or so in decompression speed. Doh!
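The effect of the 32 KiB limit can be seen with a small experiment. This is a sketch only: it builds an incompressible block followed by an exact copy of itself, then compares standard-library `zlib` with `lzma`, which is used here purely as an example of a codec with a much larger window. The helper names are made up for illustration.

```python
import lzma
import random
import zlib


def repeated_block(block_size: int, seed: int = 0) -> bytes:
    """Return an incompressible block followed by an exact copy of itself."""
    rng = random.Random(seed)
    block = bytes(rng.getrandbits(8) for _ in range(block_size))
    return block + block


def compare(block_size: int) -> dict:
    """Compressed sizes for a block repeated at a distance of block_size bytes."""
    data = repeated_block(block_size)
    return {
        "input": len(data),
        "zlib": len(zlib.compress(data, 9)),  # back-references reach at most 32 KiB
        "lzma": len(lzma.compress(data)),     # default preset uses a multi-MiB dictionary
    }


if __name__ == "__main__":
    for size in (16 * 1024, 64 * 1024):
        print(size, compare(size))
```

With a 16 KiB block the copy is within reach and zlib output should be close to half of the input. With a 64 KiB block the copy lies beyond the window, so zlib should gain almost nothing, while the larger-window codec still finds the repeat.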

                    3) Reasonable API and implementation. Sorry, but LZHAM comes as an awful C++ monster; it is only OK for gamedevs, but not OK to wire up in each and every program, e.g. web servers. Similar issue for Brotli, which has got a C decompressor, but the compressor is ... hard to wire into other programs and libs we may care about.

                    4) On some data zlib provides a good balance and is hard to beat by a margin noticeable enough to justify the compatibility loss & the remake of all software.

                    So there are some codecs, and they are better in some domains. But we're yet to see a new enjoyable balance. Actually, zlib still performs quite reasonably on most data and sometimes does surprisingly well, probably because it has been optimized over the years. And e.g. brand new zstd? OK, zstd HC compression at level 20 is, uhm, very slow. Zlib 9 is much faster. Zstd's ratio may or may not beat zlib, even at HC 20, depending on the data. And on ARM it is barely 20% faster than zlib to decompress. It is faster on x86, but x86s are fast on their own and zlib speed is hardly an issue there. At the end of the day you get more or less the zlib ratio and hardly a 20% speed-up. Is it worth breaking compatibility all over the place? Unlikely. Brotli and so on offer more compression. But on ARM they lose a lot of speed. And the huge dictionary makes it a really niche thing. It is OK to wire into a browser, but useless if you want to compress e.g. an image in a libpng-future instead of using zlib. So you just get a crapload of useless dictionary and can't get rid of it. FAIL. Good luck wiring LZHAM into something like libpng-next, ha-ha. So from a practical standpoint there is a heck of a lot of room for improvement.

                    curaga, btw, which constrained environments do you target? Some algos do not even need extra memory to decompress; you only need room for the source and the destination. This can work even in ultra-low-memory cases like microcontrollers. But that's not about zlib, it's more bloated...
                    Last edited by SystemCrasher; 19 January 2016, 01:27 AM.
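The remark about decompressors that need only source and destination buffers can be illustrated with a toy byte-oriented LZ decoder. The stream layout below is invented for this sketch (it is not LZ4's or any other real format): each sequence is a literal count, the literals, a match length, and a 2-byte little-endian offset when the match length is non-zero.

```python
def toy_lz_decode(src: bytes, dst: bytearray) -> int:
    """Decode a made-up LZ stream from src into dst; return bytes written.

    No hash tables or window buffers are allocated: back-references are
    resolved by reading bytes already written to dst.
    """
    i = 0  # read position in src
    o = 0  # write position in dst
    while i < len(src):
        lit_len = src[i]
        i += 1
        if i + lit_len > len(src) or o + lit_len > len(dst):
            raise ValueError("literal run out of bounds")
        dst[o:o + lit_len] = src[i:i + lit_len]
        i += lit_len
        o += lit_len
        if i >= len(src):
            break  # stream ended after a literal run
        match_len = src[i]
        i += 1
        if match_len:
            if i + 2 > len(src):
                raise ValueError("truncated offset")
            offset = src[i] | (src[i + 1] << 8)
            i += 2
            if offset == 0 or offset > o or o + match_len > len(dst):
                raise ValueError("bad back-reference")
            # Copy byte by byte so overlapping matches (offset < length) work.
            for _ in range(match_len):
                dst[o] = dst[o - offset]
                o += 1
    return o


if __name__ == "__main__":
    stream = bytes([2]) + b"ab" + bytes([6, 2, 0])  # "ab", then copy 6 bytes from 2 back
    out = bytearray(8)
    n = toy_lz_decode(stream, out)
    print(n, bytes(out[:n]))
```

The decoder's working state is two indices and a few small integers, which is what makes this style of format attractive on microcontrollers.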


                    • #20
                      Originally posted by Nelson View Post
                      Technically, brotli is already part of at least one standard, it's part of webfonts.
                      And demanding a huge dictionary just to be able to decompress binary data is pretty lame. But it's Google, they do strange things like half of the time. E.g. snappy was written in C++ for some reason. It killed its adoption. Yet LZ4 usually compresses better and faster on most kinds of data, and also comes as an easy-to-use C lib. Needless to say, LZ4 secured a major victory in this league. It even got wired into the Linux kernel.
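The dictionary trade-off mentioned above can be sketched with zlib's own preset-dictionary support, exposed in Python through the `zdict` parameter. This is not Brotli's built-in dictionary; `SHARED_DICT` and the sample page are made up for illustration, and the point is only that both sides must carry the same dictionary bytes.

```python
import zlib

# Hypothetical dictionary that both the compressor and the decompressor must ship.
SHARED_DICT = b"<html><head><title></title></head><body><p></p></body></html>"


def compress_with_dict(data: bytes, zdict: bytes) -> bytes:
    """Compress data so that matches may point into the preset dictionary."""
    c = zlib.compressobj(level=9, zdict=zdict)
    return c.compress(data) + c.flush()


def decompress_with_dict(blob: bytes, zdict: bytes) -> bytes:
    """Decompress a stream that was produced with the same preset dictionary."""
    d = zlib.decompressobj(zdict=zdict)
    return d.decompress(blob) + d.flush()


if __name__ == "__main__":
    page = b"<html><head><title>Hello</title></head><body><p>Hi</p></body></html>"
    plain = zlib.compress(page, 9)
    primed = compress_with_dict(page, SHARED_DICT)
    print(len(page), len(plain), len(primed))
    print(decompress_with_dict(primed, SHARED_DICT) == page)
```

A dictionary tuned for markup helps short web payloads, but it is dead weight for data it does not resemble, such as image rows.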

                      ZStd is doing the same thing, dramatically faster than zlib/deflate (I think it's aiming at deflate levels 3 to 6) and better compression.
                      Not to be taken for granted. The compression ratio is generally slightly better than zlib's, but it can be better or worse; the overall gain is low. Speed? Compression speed isn't huge, HC is slow. Decompression is only fast on x86, and only slightly faster on e.g. ARM. So the yells about speed could be overrated.

                      gzip -9 compresses the Linux 4.4 source tree to 127M in 30s on my workstation
                      brotli level 6 compresses it to 104M in 23s
                      brotli level 7 compresses it to 101M in 30s.
                      Cool. Now try to beat gzip (zlib) 1..3 speeds using whatever setup of Brotli. Servers usually can't afford zlib -9, dammit. It is way too costly in terms of CPU. And even level 1 of Brotli is quite slow. Sounds like a problem for e.g. web servers.
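For anyone who wants to measure the level trade-off on their own data, here is a minimal timing harness using only Python's standard `zlib`. It is a sketch: `bench_levels` is a made-up helper, the sample input is invented, and timings depend entirely on the machine and the data. Brotli or zstd could be timed the same way through their own bindings, which are not shown here.

```python
import time
import zlib


def bench_levels(data: bytes, levels=(1, 6, 9)) -> list:
    """Time zlib compression of data at each level and record the output size."""
    results = []
    for level in levels:
        start = time.perf_counter()
        blob = zlib.compress(data, level)
        elapsed = time.perf_counter() - start
        if zlib.decompress(blob) != data:
            raise RuntimeError("round trip failed")
        results.append({"level": level, "bytes": len(blob), "seconds": elapsed})
    return results


if __name__ == "__main__":
    # Invented, highly repetitive sample; real source trees behave differently.
    sample = b"static int foo(struct bar *b) { return b->baz; }\n" * 50000
    for row in bench_levels(sample):
        print(row)
```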

                      Of course brotli uses more memory, that's the trade-off, but in many applications memory is fairly plentiful and this is text, which Brotli is optimized for.
                      ...but when a server tries to serve 10 000 clients and needs to keep 10 000 compression states, RAM usage would explode to unbearable levels. Take a look at SLZ, they did a good job dealing with problems like this.
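To put a number on the per-connection cost, the sketch below uses the deflate memory formula given in the comments of zlib's `zconf.h`: `(1 << (windowBits + 2)) + (1 << (memLevel + 9))` bytes, plus a few kilobytes of small objects that are ignored here. This covers the zlib baseline only; Brotli's per-state figures are not given in the thread and are not estimated. The function names are made up.

```python
def deflate_state_bytes(window_bits: int = 15, mem_level: int = 8) -> int:
    """Approximate zlib deflate state size, per the formula in zconf.h."""
    return (1 << (window_bits + 2)) + (1 << (mem_level + 9))


def total_gib(connections: int, per_state_bytes: int) -> float:
    """Total memory in GiB for one compression state per connection."""
    return connections * per_state_bytes / 2**30


if __name__ == "__main__":
    per_state = deflate_state_bytes()      # 128 KiB + 128 KiB with default settings
    print(per_state, "bytes per stream")
    print(round(total_gib(10_000, per_state), 2), "GiB for 10 000 streams")
```

With the defaults this comes to 256 KiB per stream, or about 2.44 GiB for 10 000 concurrent streams, so any codec with a larger per-stream state scales that figure up accordingly.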

                      These are interesting trade-offs though, not sure I'd call them minor gains. The time seems to go up pretty dramatically with levels 10 and 11 and the payback is pretty incremental, but the decompression times are all similar. ZStd has a pretty small memory footprint though.
                      For me ZStd has been a mixed bag. It is generally a bit better than zlib, but hard to call an epic win. On some data it can actually perform worse than zlib.

                      What is fascinating, as someone who's been a compression dork for a very long time, is that there seems to be more interest and more actual engineering effort being put in now than ever before. Apple has LZFSE, Google has a family of "bread" algorithms; those are corporations that have paid people on staff working on compression. That people even know LZMA, LZHAM, etc. is incredible to me. It really seems to go to Yann Collet's work with FSE.
                      Still, I can admit there are plenty of new tradeoffs, and these sound cool. E.g. LZ5 is a mod of LZ4 which attempts to retain most (but not all) of LZ4's decompression speed, while its best levels get close to zlib in ratio. LZ4 can never afford that; its tradeoffs are all about speed. It beats zstd in decompression speed. On ARM it literally pwns it. So it is like LZO1X-999 but with an even better compression ratio and better decompression speed. I think it is another nice tradeoff when one needs FAST decompression but wants a ratio beyond LZ4's.
                      Last edited by SystemCrasher; 19 January 2016, 01:45 AM.
