Ubuntu 13.04 To Look At XZ-Compressed Packages

  • #21
    Originally posted by chithanh View Post
    xz (lzma) is an ok choice, but it is by far not the best compressor. The only advantage is that it is already installed by default almost everywhere. If decompression speed is important, then e.g. lzham would be more suitable.

    http://mattmahoney.net/dc/text.html
    Decompression speed is indeed important, but so is compression ratio. Any choice in this matter will have to weigh all the factors, and I would guess that for package distribution, compression speed is the least important one, as it is not something that affects end users. Compression ratio, on the other hand, affects the time it takes to receive the packages, and decompression speed affects the time it takes to install them once they are downloaded.

    So a solution which provides great compression and great decompression speed is likely a prime candidate. On my machines the packages I get from the Arch repos (xz compressed) unpack and install very quickly, but then again I have a Core i5 and a Core i7, so it's hard for me to judge how effective it is overall.

    Still, lzma has arguably proved itself as striking a good balance between compression/decompression speed and compression ratio, given that it is used in so many compression tools.
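    If someone wants to weigh those factors on their own hardware, a rough sketch along these lines gives the basic numbers. It only uses compressors from the Python standard library (so no lzham), and "payload.tar" is just a placeholder for an uncompressed package archive you already have locally:

    Code:
    # Rough benchmark sketch: compressed size vs. compression/decompression time.
    # Assumption: "payload.tar" stands in for an uncompressed package archive.
    import bz2, gzip, lzma, time

    data = open("payload.tar", "rb").read()

    for name, comp, decomp in [
        ("gzip", lambda d: gzip.compress(d, 9), gzip.decompress),
        ("bzip2", lambda d: bz2.compress(d, 9), bz2.decompress),
        ("xz", lambda d: lzma.compress(d, preset=6), lzma.decompress),
    ]:
        t0 = time.perf_counter()
        blob = comp(data)
        t1 = time.perf_counter()
        decomp(blob)
        t2 = time.perf_counter()
        print(f"{name:6s} ratio={len(blob)/len(data):.3f} "
              f"compress={t1 - t0:.2f}s decompress={t2 - t1:.2f}s")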

    Comment


    • #22
      lzham offers much faster decompression than lzma (9 vs. 36 seconds in enwik9), at a very modest increase in archive size on average. It is even faster than gzip in decompression.

      Comment


      • #23
        Originally posted by chithanh View Post
        lzham offers much faster decompression than lzma (9 vs. 36 seconds in enwik9), at a very modest increase in archive size on average. It is even faster than gzip in decompression.
        Well, lzma (as in the utils, not the overall compression algorithm) is no longer being developed in favour of xz, so any worthwhile comparison should be made against the xz utils rather than the lzma utils, as I assume there have been improvements since development switched to xz.

        Comment


        • #24
          Originally posted by chithanh View Post
          lzham offers much faster decompression than lzma (9 vs. 36 seconds in enwik9), at a very modest increase in archive size on average. It is even faster than gzip in decompression.
          Is lzham proven? Is it installed on nearly every Linux box?
          Because if you have even the slightest error compressing/decompressing, it literally means thousands of broken installations after one update. And you'd wish you had spent a little more time decompressing.

          Comment


          • #25
            Yes, I was referring to the lzma algorithm, not the software package with the same name.

            The lzham algorithm itself is correct. The implementation could contain bugs: as lzham is not in widespread use, one might be less confident in the code. But even if you insert an extra checksumming and verification step, you would still come out ahead of xz (lzma) on decompression time.
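            For what that extra step could look like, here is a minimal sketch, assuming the expected SHA-256 of the uncompressed payload is shipped in the package metadata (the file name and digest below are placeholders):

            Code:
            # Verify-after-decompress sketch. Assumption: the package metadata
            # carries a SHA-256 digest of the uncompressed payload.
            import hashlib, lzma

            EXPECTED_SHA256 = "..."  # placeholder, taken from package metadata

            with open("data.tar.xz", "rb") as f:
                payload = lzma.decompress(f.read())

            if hashlib.sha256(payload).hexdigest() != EXPECTED_SHA256:
                raise RuntimeError("payload checksum mismatch, refusing to install")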

            Comment


            • #26
              Originally posted by bug77 View Post
              Is lzham proven? Is it installed on nearly every Linux box?
              It is not installed everywhere; it would be the distribution's installer and package manager that have to ensure it is available before extracting lzham-compressed packages.
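              In other words, something along these lines would have to run in the package manager first (a sketch only; "unlzham" is a hypothetical binary name used purely for illustration):

              Code:
              # Refuse to unpack if the required decompressor is missing.
              # "unlzham" is a hypothetical tool name, for illustration only.
              import shutil, sys

              tool = "unlzham"
              if shutil.which(tool) is None:
                  sys.exit(f"{tool} not found; it must be installed before "
                           "unpacking packages in this format")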

              Comment


              • #27
                Originally posted by chithanh View Post
                Yes, I was referring to the lzma algorithm, not the software package with the same name.
                There's your problem. When choosing a solution for something with such widespread use, it has to be quasi-ubiquitous and rock solid in the first place. Who knows, a few years down the road maybe that algorithm will prove itself, see widespread adoption and replace xz. But it has to happen in that order.

                Comment


                • #28
                  Originally posted by Ibidem View Post
                  4. Delta-debs were proposed and rejected on the grounds that a lot of people skip at least one package release, and also you can't rely on the debs being present locally if they clean the package archive. If you want them, explain to the Ubuntu developers why those aren't a problem once you've read the relevant Brainstorm pages.
                  Hmm. That's not the way I remember the discussion at UDS-O. I think I found the right Brainstorm link. But I don't see it as being rejected. What I found was marked "Being Implemented"... although obviously it stalled after that.

                  What I do remember is that it was a blueprint for Oneiric but ended up getting blocked/postponed during that development cycle. I'm not sure how intrusive it really would be, but I suspect it was then a little too radical for an LTS. After that the interest seemed to die out. But I'm not positive what the whole story was. In any case, there wasn't a discussion at UDS-Q.

                  I remember the UDS debdelta session a few cycles ago. It sounded a little ambitious (not nearly as easy as many seem to think) but doable. It is a bummer that it didn't end up landing. So far it's not on the schedule for UDS-R, but perhaps it will be brought up in the dpkg-xz session in a couple of weeks.

                  Comment


                  • #29
                    Originally posted by chithanh View Post
                    xz (lzma) is an ok choice, but it is by far not the best compressor. The only advantage is that it is already installed by default almost everywhere. If decompression speed is important, then e.g. lzham would be more suitable.

                    http://mattmahoney.net/dc/text.html
                    That benchmark is not very useful on its own, as it only tests compression of text files (and a text file with very specific characteristics at that), and it only tests the compressors at their highest available compression level. Most .deb packages don't consist of XML dumps of Wikipedia, and it's unlikely that the currently used compressors are run at their highest compression level (because that often costs too much in speed or memory).

                    Edit: note that I'm not saying that lzham does badly on binaries, just that that page is not useful as a test for .deb compression (unless it's a .deb that contains mostly text files maybe).
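                    For what it's worth, a quick way to see how much the level matters is to compress a local archive (the file name below is a placeholder for a real package's uncompressed contents) at different xz presets using Python's standard library and compare size and time:

                    Code:
                    # Sketch: compare xz presets on a local archive instead of a text corpus.
                    # "payload.tar" is a placeholder for a real package's uncompressed contents.
                    import lzma, time

                    data = open("payload.tar", "rb").read()

                    for label, preset in [("1", 1), ("6", 6), ("9", 9),
                                          ("9e", 9 | lzma.PRESET_EXTREME)]:
                        t0 = time.perf_counter()
                        blob = lzma.compress(data, preset=preset)
                        t1 = time.perf_counter()
                        print(f"xz -{label}: {len(blob)} bytes in {t1 - t0:.2f}s")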
                    Last edited by JanC; 18 October 2012, 02:40 PM.

                    Comment


                    • #30
                      Be aware that I only referred to decompression speed, not compression ratio. For meaningful results on compression ratio, a more diverse benchmark would indeed be necessary.

                      One can, however, say with some confidence that the relative decompression speed will not change across different types of data.

                      Comment
