New Compression Codecs Risk Making Zlib Obsolete

  • name99
    replied
    Originally posted by milkylainen View Post
    I don't think zlib will be obsolete anytime soon either.
    It is well established, implemented and vetted code. Gains are still minor and not without trade-offs.
    I think the author forgets why something becomes a standard.
    CDs, mp3s etc are not the pinnacle of technology by any means yet they managed to become their respective technology standard for a very long time.
    Every time a competitor came along it was shot down because the standard had become ubiquitous and the competing ones did not offer any noticeable advantages to the average user on the receiving end.
    In this case it is the developers that are mostly on the receiving end. Do most developers really care about the compression algorithm, as long as it is well rounded and has decent performance? It is just another library with another function. Of course, some will. But calling zlib soon to be obsolete...

    It is hard to appreciate just how ubiquitous zlib really is.
    A lot of protocols, data compression, boot compression, stream compression, etc. Just about everything, everywhere, relies on zlib. It is not going away anytime soon.
    zlib is used in lots of places, and many of those places are quite open to replacing it.
    Google is obviously pushing brotli to replace it in the web space. MS and Apple will probably go along because, WTF not?
    Within a particular app's private usage (e.g. compressing assets for the use of a program), replacement depends simply on how easy the replacement is. Apple has provided libraries for iOS/OSX that offer the exact same API as zlib, but give either faster performance at zlib compression levels, or better compression at zlib performance levels (and they're likely to hardware accelerate this at some point). So it makes sense for those Apple developers to replace zlib. MS may well do the same, and so on.
    Standards that rely on network effects can persist forever; but if someone who (substantially) controls the network wants to force a change (Google; Apple) AND if the change is basically cost free, the situation is extremely different from something like mp3s. (And even there, of course, a substantial fraction of the world is using a newer spec, namely AAC.)
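
    For what it's worth, "the exact same API" means the classic zlib entry points stay untouched, so switching libraries is a relink rather than a rewrite. Here is a minimal sketch of the one-shot compress2()/uncompress() calls that any such drop-in replacement would have to mirror (the accelerated Apple library is as described in the post above, not verified here):

    #include <stdio.h>
    #include <string.h>
    #include <zlib.h>

    int main(void)
    {
        const unsigned char src[] = "hello hello hello hello hello hello";
        uLong srcLen = (uLong)sizeof(src);

        unsigned char comp[128];
        uLong compLen = (uLong)sizeof(comp);

        /* Level 6 is zlib's default speed/ratio trade-off. */
        if (compress2(comp, &compLen, src, srcLen, 6) != Z_OK)
            return 1;

        unsigned char out[128];
        uLong outLen = (uLong)sizeof(out);
        if (uncompress(out, &outLen, comp, compLen) != Z_OK)
            return 1;

        printf("%lu -> %lu -> %lu bytes, round trip %s\n",
               (unsigned long)srcLen, (unsigned long)compLen, (unsigned long)outLen,
               memcmp(src, out, (size_t)srcLen) == 0 ? "ok" : "FAILED");
        return 0;
    }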

    Leave a comment:


  • curaga
    replied
    My test data was plain English text, 768 bytes of it. Its small size probably contributes. I used the highest levels of each, lzo1x-999 and lz5hc-16. For my results, see https://github.com/clbr/nes/blob/mas...ession/results

    Leave a comment:


  • SystemCrasher
    replied
    Originally posted by curaga View Post
    Thanks for the tips, will check lzoma out. About LZ5, I actually tested LZ5HC on a NES this morning. The compression ratio was worse than LZO or zlib, and it decompressed slower than anything else I've tried. Not too impressed, sadly.
    NP, it can happen: different algos perform in different ways, and depending on the nature of the data and/or the properties of your target system, an algo can do surprisingly well or surprisingly badly, or just give you "strange" results. E.g. I've managed to "invert" zlib levels, getting the best ratio on level 1, the worst on level 9, and mid-range on level 6, so level 1 compressed both better and faster. Sounds unusual, eh? You can take a look at the lzbench tool on GitHub, by the same author as LZ5. It lets you try a bunch of LZ-based algos at once and see how they perform on YOUR data. Some of the algos are experimental/uncommon, but they can be fun to look at for comparison against other things. Also, on microcontrollers RAM is usually really low, and wasting 1K on compression state doesn't sound very inspiring to me. Though maybe you're using something with external RAM, or a high-end part with plenty of RAM. Can you describe the hardware?
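
    A quick way to check for that kind of level inversion on your own data is to sweep all nine zlib levels over one buffer and compare the output sizes. A minimal sketch against the stock zlib one-shot API (the sample buffer is a stand-in; feed it your real data):

    #include <stdio.h>
    #include <stdlib.h>
    #include <zlib.h>

    /* Compress one buffer at every zlib level and print the sizes. */
    static void sweep_levels(const unsigned char *data, uLong dataLen)
    {
        uLong cap = compressBound(dataLen);          /* worst-case output size */
        unsigned char *buf = malloc(cap);
        if (!buf)
            return;

        for (int level = 1; level <= 9; level++) {
            uLong outLen = cap;
            if (compress2(buf, &outLen, data, dataLen, level) == Z_OK)
                printf("level %d: %lu -> %lu bytes\n",
                       level, (unsigned long)dataLen, (unsigned long)outLen);
        }
        free(buf);
    }

    int main(void)
    {
        static unsigned char data[4096];             /* stand-in, not real test data */
        for (size_t i = 0; i < sizeof(data); i++)
            data[i] = (unsigned char)("abcabcabd"[i % 9]);

        sweep_levels(data, (uLong)sizeof(data));
        return 0;
    }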

    I'm also curious to take a look at data where LZ5 manages to lose to LZO in terms of ratio and/or speed; can you give an example? Because... uhm, I've fed quite assorted kinds of data into it, ranging from bitmaps to TTF fonts to ARM machine code and many other strange or fancy things, and LZO lost in virtually all cases. But I've probably used different settings and/or data. E.g. which block size were you using, or technically capable of? Sounds like it could be fun to try to compress such data; it seems I can get yet another unusual result.

    As for LZOMA, this thing is WIP, but it can put on a show in terms of ratio, especially if you can afford some headroom on block size. It uses a smartass trick to represent specific kinds of matches in a compact way, and its match finding is pushed to the limits. On most data it has proven to be the best "entropyless" LZ I've ever seen. E.g. when compressing the Linux kernel it pwned Brotli 8, while the decoder is nowhere close to Brotli in terms of size or memory demands. And in cases like this, one HAS to add the decoder size to the data size to keep things fair. If the decompressor lives in e.g. a boot loader, there are quite some technical limits on total size and/or memory operations. Uhm, well, at the end of the day I don't mind having fun in some "strange" environments.

    Leave a comment:


  • curaga
    replied
    Thanks for the tips, will check lzoma out. About LZ5, I actually tested LZ5HC on a NES this morning. The compression ratio was worse than LZO or zlib, and it decompressed slower than anything else I've tried. Not too impressed, sadly.

    Leave a comment:


  • SystemCrasher
    replied
    Originally posted by curaga View Post
    Microcontrollers and old consoles. I've tried many algos, and when you can afford ~1kb ram, zlib is the best. When you can't, LZO seems best.
    Ahh, I like these use cases. I can point at a couple of things I've found that are fancy in this regard.

    1) LZ5. It's just nice: a reworked LZ4 with a smarter bitstream, a larger window, and more aggressive match finding on the high levels (the -dev branch has a hidden surprise, aka even better compression ratios on the highest levels). On most data you can get close to zlib/UCL ratios on its highest levels, and it can even beat zlib in ratio on large chunks thanks to the larger window. Decompression "takes no memory", in the sense that you do not need anything except the source and destination; the rest can be register math (see the toy decompressor sketch after this post). It both compresses better than LZO and tends to decompress faster: you get UCL-like ratios at 2-3x the speed. If you're a real maniac you can even code the decompressor in assembly, since the data format is quite simple. And it is FAST to decompress: no Huffman, almost no state, etc. Much faster than zlib; the C version beats zlib like 2-3x in speed. If you understand what that means: it is a byte-aligned LZ capable of beating many bit-aligned LZs and even sometimes taking on zlib.

    Caveat: zlib would do better on e.g. real-world photo-like graphics, because LZ sucks on that kind of data without preprocessing, while entropy coding like Huffman can still squeeze out a little thanks to the non-uniform distribution. But on most redundant data... it reaches similar ratios, is much faster to decompress, and doesn't really need any state except the destination buffer.

    2) Want even more ratio, while still caring about trivial/fast decompression? Ask GitHub about LZOMA. As its name suggests, it can get close to LZMA ratios while staying not so far from LZO speed in decompression. It even has an assembly decompressor. Preying on zlib is boring: this thing pwned Brotli level 8 when I compressed the Linux kernel, and it lacks entropy coding, unlike Bro. Not to mention a much smaller decompressor; IIRC it also doesn't require any memory except the source and destination.

    Caveat: hardcore WIP / subject to change / very experimental. But the sight of a plain LZ plus a little cheat in the bitstream beating not just zlib but most Brotli levels looks impressive to anyone who understands what that means. The compressor is also slow and a memory hog, due to very thorough match finding. Yet it's amazing to see the old LZ idea reaching new heights here and there.

    ...these are two things I've spotted while caring about somewhat similar use cases.
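
    To make the "takes no memory" point concrete, here is a toy byte-aligned LZ decompressor. This is NOT the real LZ5 (or LZ4) bitstream; the token layout is invented purely for illustration. It shows why such decoders need nothing beyond the source and destination buffers: matches are copied from bytes already written to the output.

    #include <stddef.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Invented token format for this sketch:
     *   control byte < 0x80 : (control + 1) literal bytes follow
     *   control byte >= 0x80: match of length (control & 0x7F) + 3, followed by
     *                         a 2-byte little-endian offset counted back from
     *                         the current output position */
    static size_t toy_lz_decompress(const uint8_t *src, size_t srcLen,
                                    uint8_t *dst, size_t dstCap)
    {
        size_t si = 0, di = 0;

        while (si < srcLen) {
            uint8_t ctrl = src[si++];

            if (ctrl < 0x80) {                       /* literal run */
                size_t run = (size_t)ctrl + 1;
                if (si + run > srcLen || di + run > dstCap) return 0;
                for (size_t i = 0; i < run; i++)
                    dst[di++] = src[si++];
            } else {                                 /* back-reference */
                size_t len = (size_t)(ctrl & 0x7F) + 3;
                if (si + 2 > srcLen) return 0;
                size_t off = (size_t)src[si] | ((size_t)src[si + 1] << 8);
                si += 2;
                if (off == 0 || off > di || di + len > dstCap) return 0;
                for (size_t i = 0; i < len; i++) {   /* may overlap: copy byte by byte */
                    dst[di] = dst[di - off];
                    di++;
                }
            }
        }
        return di;                                   /* bytes written, 0 on bad input */
    }

    int main(void)
    {
        /* "abc" as literals, then a length-6 copy from 3 bytes back => "abcabcabc" */
        const uint8_t packed[] = { 0x02, 'a', 'b', 'c', 0x83, 0x03, 0x00 };
        uint8_t out[16];
        size_t n = toy_lz_decompress(packed, sizeof(packed), out, sizeof(out));
        printf("%zu bytes: %.*s\n", n, (int)n, (const char *)out);
        return 0;
    }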

    Leave a comment:


  • My8th
    replied
    It looks like Facebook is interested in Brotli. Most of my roommate's mobile data usage comes from Facebook, so hopefully this helps.
    Last edited by My8th; 21 January 2016, 03:35 AM.

    Leave a comment:


  • liam
    replied
    Originally posted by erendorn View Post
    If it's lossy, you cannot recover it. You can just do without. But I don't see how that's relevant with respect to "you cannot meaningfully compress further audio and video streams".
    Lossy just means you can't mathematically recover all the info with complete confidence. Humans are pretty good at filling in blanks IN CERTAIN SITUATIONS. That was the only point I was trying to make there.
    I only mentioned this because the poster tossed in two fundamentally different types of compression methods. If you are looking at lossy compression, there's no agreed-upon floor as to how low you can take it. Lossless encoding, however, is a well-defined field.

    Leave a comment:


  • erendorn
    replied
    Originally posted by liam View Post
    All, save the last (well, maybe... it depends on the context), are lossy. If you allow lossy, then maybe even for documents we should use it, if the loss is something humans, or machine learning, can easily recover (dropping vowels, in many cases, can result in unambiguous, but still lossy, results).
    If it's lossy, you cannot recover it. You can just do without. But I don't see how that's relevant with respect to "you cannot meaningfully compress further audio and video streams".

    Leave a comment:


  • curaga
    replied
    Originally posted by SystemCrasher
    @curaga btw, which constrained environments do you target? Some algos do not even need memory to decompress; you only need room for the source and destination. This can work even in ultra-low-memory cases like microcontrollers. But that's not about zlib, which is more bloated...
    Microcontrollers and old consoles. I've tried many algos, and when you can afford ~1kb ram, zlib is the best. When you can't, LZO seems best.
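
    To make the RAM point concrete: with LZO, only the compression side needs scratch memory; decompression works with nothing but the source and destination buffers, which is what makes it attractive when even ~1kb is too much. A minimal sketch against the stock LZO API (lzo1x_1 here; the lzo1x-999 level mentioned earlier follows the same pattern via lzo1x_999_compress with a larger LZO1X_999_MEM_COMPRESS scratch buffer):

    #include <stdio.h>
    #include <lzo/lzo1x.h>

    int main(void)
    {
        /* Worst-case compressed size per the LZO docs: len + len/16 + 64 + 3. */
        static unsigned char in[1024], comp[1024 + 1024 / 16 + 64 + 3], out[1024];
        static unsigned char wrkmem[LZO1X_1_MEM_COMPRESS];  /* needed only to compress */

        if (lzo_init() != LZO_E_OK)
            return 1;

        for (unsigned i = 0; i < sizeof(in); i++)
            in[i] = (unsigned char)(i % 37);                /* stand-in test data */

        lzo_uint compLen = sizeof(comp);
        if (lzo1x_1_compress(in, sizeof(in), comp, &compLen, wrkmem) != LZO_E_OK)
            return 1;

        lzo_uint outLen = sizeof(out);
        /* No work memory on the decompression side: the wrkmem argument is unused. */
        if (lzo1x_decompress_safe(comp, compLen, out, &outLen, NULL) != LZO_E_OK)
            return 1;

        printf("%u -> %lu -> %lu bytes\n", (unsigned)sizeof(in),
               (unsigned long)compLen, (unsigned long)outLen);
        return 0;
    }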

    Leave a comment:


  • liam
    replied
    Originally posted by Imroy View Post

    Almost every file/stream type that uses a lot of bandwidth is already compressed, e.g. audio, video, archives.
    All, save the last (well, maybe... it depends on the context), are lossy. If you allow lossy, then maybe even for documents we should use it, if the loss is something humans, or machine learning, can easily recover (dropping vowels, in many cases, can result in unambiguous, but still lossy, results).

    Leave a comment:
