Ubuntu Still Unsure On Using XZ Packages


  • Ubuntu Still Unsure On Using XZ Packages

    Phoronix: Ubuntu Still Unsure On Using XZ Packages

    While Fedora has been using XZ-compressed packages for its RPMs for a while now, gaining a greater compression ratio than Gzip, Ubuntu developers remain unsure about switching to XZ compression for the Ubuntu 13.04 release...

    http://www.phoronix.com/vr.php?view=MTIxOTY

  • #2
    Right now, just from watching the Debian mailing lists, it looks like Debian is going to use XZ compression on a very small number of packages, for the sole purpose of cramming more content onto installation media targeted at desktops.

    It was discussed at DebConf that unpacking XZ files eats a lot of system memory (RAM), and if you don't have the RAM, unpacking can take several times longer than expected. That may make XZ compression a bad idea for tablets and cell phones, which are important to Ubuntu right now (see the videos of Ubuntu running on cell phones / tablets on Ubuntu's UDS YouTube feed).

    If you have the RAM (2GBs or more) then you're golden for XZ compression as it really isn't noticeably slower than the alternative.

    Debian recognizes that a lot of hardware out there still has less than 2 GB of RAM, which means switching entirely to XZ compression could be a *very* bad idea. So they're not going to do it for the Wheezy release for the overwhelming majority of packages, and likely won't do it for Wheezy+1 either, but it's still up in the air.

    Using XZ everywhere would mean bumping the RAM requirements of Debian from 64MB to 2GB... That's one heck of a leap.
    Last edited by Sidicas; 11-01-2012, 12:06 PM.

    • #3
      You're not going to run a bloaty distribution like ubuntu on something with low ram... I hope.

      Meanwhile archlinux has used xz packages for a while and is still unsure about using lrzip.

      • #4
        The memory usage and performance problems may be tackled simply by reducing the compression setting. xz levels 0 or 1 use very little memory and should decompress much faster than e.g. bzip2, with a much better compression ratio than old gzip.
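
        A rough way to see that trade-off is Python's stdlib lzma module (liblzma, the same library xz uses); the input path is just a placeholder, and this is an illustration rather than a benchmark:

        Code:
        # Sketch: compare xz presets against gzip on some sample data.
        # Not a benchmark; the input file is only an example.
        import gzip
        import lzma

        data = open("/usr/share/dict/words", "rb").read()  # any sizeable file will do

        for preset in (0, 1, 6, 9):
            packed = lzma.compress(data, preset=preset)
            print("xz -%d: %d -> %d bytes" % (preset, len(data), len(packed)))

        print("gzip  : %d -> %d bytes" % (len(data), len(gzip.compress(data))))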

        • #5
          I'm not sure about RAM usage, but as far as decompression speed on x86 and x86-64 goes, LZMA/XZ are second in speed only to LZOP/LZ4. Well, for files filled with zeroes at least.
          Here's some research I did back in the day, improving on some wildly inaccurate benchmarks I found in Wikipedia: http://www.webupd8.org/2011/07/free-...utilities.html
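
          If anyone wants to redo that kind of comparison on their own hardware, a crude timing sketch with Python's stdlib codecs (gzip/bz2/lzma) looks roughly like this; it is only a rough illustration, not a proper benchmark:

          Code:
          # Crude decompression timing sketch using only the stdlib.
          import bz2
          import gzip
          import lzma
          import time

          data = open("/usr/share/dict/words", "rb").read()   # placeholder input

          for name, mod in (("gzip", gzip), ("bzip2", bz2), ("xz", lzma)):
              blob = mod.compress(data)
              start = time.perf_counter()
              for _ in range(20):          # repeat so the time is measurable
                  mod.decompress(blob)
              print("%-5s %.3fs" % (name, time.perf_counter() - start))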

          • #6
            If Ubuntu/Canonical actually cared about bandwidth, they'd fix their servers. It took me several attempts (and several regressions) to get them to reliably set ETags. They were using the Apache defaults, which include the inode number in the ETag, so each server gave a different ETag, which prevented any kind of caching since each server was essentially claiming the same file was different. (DNS round robin meant you would hit different servers when downloading packages.) They eventually fixed this.

            I then tried to get them to set correct HTTP caching headers. Package files can't change, so they could be marked cacheable forever. However, the Canonical people refused to set such headers, which means that even if there are HTTP caches in the path, they have to keep contacting the servers to double-check that the package files are still valid.
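
            The inconsistency is easy to spot from the client side. Here's a small sketch that issues a few HEAD requests and compares the ETag and Cache-Control headers; the URL is just a placeholder, not a claim about any particular mirror:

            Code:
            # Sketch: request the same URL several times and compare caching headers.
            # With DNS round robin, differing ETags mean shared caches can't reuse the file.
            import urllib.request

            URL = "http://archive.ubuntu.com/ubuntu/dists/raring/Release"  # example only

            seen = set()
            for i in range(5):
                req = urllib.request.Request(URL, method="HEAD")
                with urllib.request.urlopen(req) as resp:
                    etag = resp.headers.get("ETag")
                    print(i, etag, resp.headers.get("Cache-Control"))
                    seen.add(etag)

            print("consistent ETag" if len(seen) == 1 else "ETags differ between responses")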

            • #7
              How does XZ compare to 7zip or bzip2?

              I've never liked tar/Gzip archives. Too slow to open large archives because they have to be completely decompressed first. I can get a directory listing of the contents of an equivalent 7z archive much faster. The deb files are created with GNU ar then Gzip, also requiring a two-stage extraction process FWICT.

              • #8
                Originally posted by ChrisXY View Post
                You're not going to run a bloaty distribution like ubuntu on something with low ram... I hope.

                Meanwhile archlinux has used xz packages for a while and is still unsure about using lrzip.
                lrzip for packages? Dear God, unpacking one of those can take 8gb of ram. What the heck is their target audience?

                • #9
                  Originally posted by jhansonxi View Post
                  How does XZ compare to 7zip or bzip2?

                  I've never liked tar/Gzip archives. Too slow to open large archives because they have to be completely decompressed first. I can get a directory listing of the contents of an equivalent 7z archive much faster. The deb files are created with GNU ar then Gzip, also requiring a two-stage extraction process FWICT.
                  xz is just compression, in the same way as gzip - you generally still use tar as the archiver (or whatever archiver you want).

                  Where exactly are the memory concerns coming from? The xz manpage says:

                  Code:
                  The following table summarises the features of the presets:
                  
                                       Preset   DictSize   CompCPU   CompMem   DecMem
                                         -0     256 KiB       0        3 MiB    1 MiB
                                         -1       1 MiB       1        9 MiB    2 MiB
                                         -2       2 MiB       2       17 MiB    3 MiB
                                         -3       4 MiB       3       32 MiB    5 MiB
                                         -4       4 MiB       4       48 MiB    5 MiB
                                         -5       8 MiB       5       94 MiB    9 MiB
                                         -6       8 MiB       6       94 MiB    9 MiB
                                         -7      16 MiB       6      186 MiB   17 MiB
                                         -8      32 MiB       6      370 MiB   33 MiB
                                         -9      64 MiB       6      674 MiB   65 MiB
                  So using the default preset, -6, you only need 9 MiB of RAM to decompress.

                  As for speed, I'm sure they're right to look further into it, but pacman uses xz and is arguably the fastest package manager (and certainly the fastest I've ever used).
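
                  To make the "xz is just the compressor, tar is the archiver" point concrete, here's a small sketch using Python's tarfile, which drives liblzma and accepts the same 0-9 preset as the table above; the file names are placeholders:

                  Code:
                  # Sketch: tar does the archiving, xz (liblzma) does the compression.
                  import tarfile

                  # Write an archive at the default preset (modest dictionary, small DecMem).
                  with tarfile.open("example.tar.xz", "w:xz", preset=6) as tar:
                      tar.add("/etc/hostname")        # placeholder member

                  # Listing the contents still means streaming through the compressed data.
                  with tarfile.open("example.tar.xz", "r:xz") as tar:
                      for member in tar.getmembers():
                          print(member.name, member.size)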

                  • #10
                    Originally posted by curaga View Post
                    lrzip for packages? Dear God, unpacking one of those can take 8gb of ram. What the heck is their target audience?
                    https://bbs.archlinux.org/viewtopic.php?id=137918&p=1

                    It doesn't seem like it's really being considered at all for building packages yet (see the responses there from the devs), just talk of possibly adding it as another supported format in pacman through libarchive, once the work in the library is done.
                    Last edited by strcat; 11-01-2012, 03:33 PM.

                    • #11
                      Originally posted by Sidicas View Post
                      If you have the RAM (2GBs or more) then you're golden for XZ compression as it really isn't noticeably slower than the alternative.
                      I have run Fedora on machines with considerably less than 1 GB of RAM and never had any problems with its XZ compression.

                      • #12
                        Originally posted by J___ View Post
                        Code:
                        The following table summarises the features of the presets:
                        
                                             Preset   DictSize   CompCPU   CompMem   DecMem
                                               -9      64 MiB       6      674 MiB   65 MiB
                        I moved a bunch of data over to xz a while back and I looked into memory usage. I have a number of embedded machines in my house and one of the tiny ones--a Seagate Dockstar ARM device with 128MB of memory--had no problems decompressing all of the xz files I threw at it.

                        A 64MB machine (like the minimum one specced for Debian) would have an issue with this, but we don't need to move the minimum up to 2GB to fix that problem like one poster suggested. 128MB would be quite sufficient. After all, it's not like you can cache much of anything while installing. Trying to do so would just be wasting memory--most I/O is 'read once' or 'write and forget'. If the kernel and the install program take up more than 64MB, then someone's doing it wrong. (And the current value of 64MB for a minimum Debian system would already be broken.)

                        I used to run Debian on a 64MB Geode system with a 300MHz processor. I ran out of CPU before I ran out of memory. Tasks that we now take for granted (like logging in with ssh using public key crypto or running 'aptitude') took quite a while.
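
                        For anyone wanting to sanity-check the decompressor's memory needs on a small box, Python's lzma module exposes liblzma's memory limit; a rough sketch (the data and the limits are just examples):

                        Code:
                        # Sketch: decompress with an explicit memory cap, as a 128MB device might.
                        # LZMADecompressor raises LZMAError if the stream needs more than memlimit.
                        import lzma

                        blob = lzma.compress(b"hello world" * 100000, preset=6)   # -6 needs ~9 MiB to unpack

                        for limit_mib in (32, 4):
                            try:
                                out = lzma.LZMADecompressor(memlimit=limit_mib * 1024 * 1024).decompress(blob)
                                print("%d MiB limit: ok, %d bytes out" % (limit_mib, len(out)))
                            except lzma.LZMAError as err:
                                print("%d MiB limit: too small (%s)" % (limit_mib, err))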

                        • #13
                          does xz do multithreaded decompression yet? last i saw it was "coming sometime".

                          i've always found ubuntu kind of slow with package installs, even on multicore cpus.

                          although the time to update package lists is unrelated to the decompression, i wish that would be improved too.

                          • #14
                            Originally posted by willmore View Post
                            I moved a bunch of data over to xz a while back and I looked into memory usage. I have a number of embedded machines in my house and one of the tiny ones--a Seagate Dockstar ARM device with 128MB of memory--had no problems decompressing all of the xz files I threw at it.

                            A 64MB machine (like the minimum one specced for Debian) would have an issue with this, but we don't need to move the minimum up to 2GB to fix that problem like one poster suggested. 128MB would be quite sufficient. After all, it's not like you can cache much of anything while installing. Trying to do so would just be wasting memory--most I/O is 'read once' or 'write and forget'. If the kernel and the install program take up more than 64MB, then someone's doing it wrong. (And the current value of 64MB for a minimum Debian system would already be broken.)

                            I used to run Debian on a 64MB Geode system with a 300MHz processor. I ran out of CPU before I ran out of memory. Tasks that we now take for granted (like logging in with ssh using public key crypto or running 'aptitude') took quite a while.
                            heh i tried debian on 64mb of ram, apt-get is very bloated! ssh seemed fine though? you could try dropbear, and/or a uclibc based distribution.

                            • #15
                              Originally posted by mercutio View Post
                              i've always found ubuntu kind of slow with package installs, even on multicore cpus.
                              That is because the install speed has nothing to do with package formats or anything else discussed here.

                              The install process for a package (or group of packages) is written so that it can fail safely at any point in time. To get that, each file operation (e.g. putting a new file onto the filesystem, renaming the new file over the old one) is followed by an fsync (or two). This ultimately adds up to several fsyncs per file. (Use strace on dpkg to see this going on.)

                              fsyncs are the slowest possible operation on a filesystem, since they only return once the data is on the media. This involves the kernel flushing outstanding data (and on some filesystems like ext it will often flush all data for the filesystem, not just the file), issuing a barrier, and then waiting for the media to confirm the writes have landed. A 7200 rpm drive is 120 rps, so under perfect circumstances you would get 120 fsyncs per second, but in practice there will be time waiting for the rotation, and possibly more than one write. For SSDs it is also slow, since they like to buffer writes up and do big blocks at a time; tiny little writes can be a lot slower.

                              You can disable fsync by using 'eatmydata', which stubs it out - e.g. 'sudo eatmydata apt-get install ....' - and see the actual install speed ignoring the media. I do this for Ubuntu distro upgrades and it makes them orders of magnitude faster.

                              The better solution to this sort of thing is to have filesystem transactions which either succeed as a group of operations or all fail/rollback. You start a transaction, do all the operations and commit at the end. It should be noted NTFS now has this functionality and that btrfs can sort of fake it.
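
                              The write-then-fsync-then-rename dance described above looks roughly like this; it is a sketch of the general idiom, not dpkg's actual code:

                              Code:
                              # Sketch of the crash-safe file replacement pattern described above.
                              import os

                              def install_file(path, content):
                                  tmp = path + ".new"                  # temp name in the same directory
                                  fd = os.open(tmp, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
                                  try:
                                      os.write(fd, content)
                                      os.fsync(fd)                     # data on media before the rename
                                  finally:
                                      os.close(fd)
                                  os.rename(tmp, path)                 # atomic replace of the old file
                                  dirfd = os.open(os.path.dirname(path) or ".", os.O_DIRECTORY)
                                  try:
                                      os.fsync(dirfd)                  # persist the rename itself
                                  finally:
                                      os.close(dirfd)

                              # eatmydata makes installs fast by turning those fsync() calls into no-ops.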
