
EROFS File-System Adding DEFLATE Compression Support


  • EROFS File-System Adding DEFLATE Compression Support

    Phoronix: EROFS File-System Adding DEFLATE Compression Support

    While the EROFS Linux read-only file-system already supports LZ4 and microLZMA, Zlib DEFLATE support is also being worked on and could be introduced in the next Linux kernel cycle...


  • #2
    Isn't deflate a piece of crap compared to zstd?



    • #3
      Originally posted by caligula View Post
      Isn't deflate a piece of crap compared to zstd?
      Pretty much, but the reason they added this is that existing hardware can decompress deflate.



      • #4
        Originally posted by caligula View Post
        Isn't deflate a piece of crap compared to zstd?
        Like LZ4, deflate is sufficient for the task on older/embedded processors. Since it's 40+ years old and ubiquitous, most processors have enough silicon to handle it in reasonable time; Zstd is a non-starter on them. Best tool for the job at hand.

        Zstd isn't always the best choice for general filesystem storage compression even if you have a state-of-the-art processor. It can slow down I/O, since you have to finish the stream before you know whether the data is compressible. LZ4 can at least evaluate its effectiveness on the fly and bail out if the savings don't clear a threshold. Deflate, being an old standard, is also easy on the CPU, but I'm not sure whether it has that particular capability. You just run your benchmarks and pick the compression algorithm, if any, that best fits your stored data and your processor's capabilities.
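The "bail out when compression doesn't pay" strategy described above can be sketched in a few lines. This is an illustrative Python sketch with stdlib zlib standing in for LZ4, not EROFS code; unlike real LZ4, which can abort mid-stream, it compresses the whole block and then compares sizes:

```python
import os
import zlib

def pack_block(block: bytes, min_saving: float = 0.10):
    """Compress a block, but keep it raw when the saving is too small.

    A minimal sketch of the 'store uncompressed if compression is not
    effective' idea, with zlib standing in for LZ4. Returns a tuple of
    (compressed?, bytes-to-store); the threshold is illustrative.
    """
    packed = zlib.compress(block, level=1)  # fast setting, LZ4-like role
    if len(packed) <= len(block) * (1 - min_saving):
        return True, packed    # worth storing compressed
    return False, block        # incompressible: store raw, skip decode cost

print(pack_block(b"A" * 4096)[0])       # True  (repetitive data compresses)
print(pack_block(os.urandom(4096))[0])  # False (random data is stored raw)
```

Storing incompressible blocks raw also means reads of those blocks pay no decompression cost at all.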



        • #5
          Originally posted by stormcrow View Post

          Zstd isn't always the best choice for general filesystem storage compression even if you have a state-of-the-art processor. It can slow down I/O, since you have to finish the stream before you know whether the data is compressible.
          Well, this is a read-only filesystem, so in pretty much any case zstd would be the better algorithm choice.
          Some SoCs have hardware for deflate compression, which is why this got added. I'm not sure this will be useful in many cases, as it adds overhead for accessing, and possibly sharing, that hardware.



          • #6
            Originally posted by discordian View Post

            Well, this is a read-only filesystem, so in pretty much any case zstd would be the better algorithm choice.
            LZ4 is still the default option for EROFS because of real-time, high-performance decompression in Android scenarios. Zstd doesn't perform as well: it gives lower throughput and higher latency than LZ4 (especially on high-performance UFS/NVMe storage), so system/app boot performance would be greatly impacted and drive smartphone end-users away.

            Anyway, LZMA/DEFLATE (and later Zstd) are just alternative compression algorithms for EROFS to choose from; they don't change the main targeted scenario (the lowest decompression latency).
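The throughput/latency trade-off above can be seen with a quick stand-in micro-benchmark. Python's stdlib ships no LZ4/zstd bindings, so zlib and lzma stand in for the fast and slow ends here; the point is the shape of the gap, not the exact numbers:

```python
import lzma
import time
import zlib

# Illustrative micro-benchmark: a lighter codec decodes far faster than a
# heavier one, even when the heavier one compresses better. zlib and lzma
# are stand-ins for LZ4 and heavier codecs; absolute numbers vary by machine.
payload = b"some moderately repetitive app/system image data " * 2000

light = zlib.compress(payload, 6)
heavy = lzma.compress(payload, preset=6)

def decode_time(decompress, blob, rounds=50):
    """Average wall-clock seconds per decompression of `blob`."""
    start = time.perf_counter()
    for _ in range(rounds):
        decompress(blob)
    return (time.perf_counter() - start) / rounds

print(f"zlib: {len(light):>6} bytes, {decode_time(zlib.decompress, light) * 1e6:.0f} us/decode")
print(f"lzma: {len(heavy):>6} bytes, {decode_time(lzma.decompress, heavy) * 1e6:.0f} us/decode")
```

On a latency-critical read path (app launch, page-in), the per-block decode time is what the user feels, which is why the default stays with the fastest codec.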

            Originally posted by discordian View Post
            Some SoCs have hardware for deflate compression, which is why this got added. I'm not sure this will be useful in many cases, as it adds overhead for accessing, and possibly sharing, that hardware.
            The Intel® In-Memory Analytics Accelerator (Intel® IAA) is a hardware accelerator that provides very high throughput compression and decompression combined with primitive analytic funct...

            In terms of performance, IAA decompression shows a 9%~15% improvement over ZSTD, is very close to snappy, and shows 21% (SSB) and 79% (TPCH) improvements over gzip.


            and zswap/zram are also working on Intel IAA:


            Algorithm          compress   decompress
            ----------------------------------------
            iaa sync              3,177        2,235
            iaa async irq         6,847        5,840
            software deflate    108,978       14,485
            In short, it provides better performance than ZSTD on Intel Xeon IAA instances while consuming less CPU. In other words, it will benefit all cloud workloads on new Intel Xeon instances starting from 2023, so why not utilize it now?

            Even for software decompression, zlib itself is too old a baseline (its author hasn't done performance work for modern processors in years), but if you compare zstd against a modern DEFLATE implementation such as libdeflate instead of zlib, they are fairly comparable; see https://github.com/inikep/lzbench
            Last edited by hsiangkao; 25 July 2023, 11:26 PM.



            • #7
              Originally posted by stormcrow View Post
              Since it's 40+ years old and ubiquitous, most processors have enough silicon to handle it in reasonable time; Zstd is a non-starter on them. Best tool for the job at hand. Zstd isn't always the best choice for general filesystem storage compression even if you have a state-of-the-art processor.
              LZMA/XZ is the one that's too heavy on older/embedded processors (the Markov-model predictor step that gives it the "MA" in its name is the killer).

              ZSTD has the same kind of architecture as GZ (a dictionary-based compressor feeding a fast entropy coder) and similar requirements. Most general-purpose CPUs that can run GZ can also run ZSTD; there's no complicated statistical modelling going on under the hood.

              ZSTD outperforms GZ simply due to advances in the field during the four decades between them:
              - it leverages the slightly more modern (faster and better-performing) dictionary searches that Yann Collet first implemented in LZ4;
              - it uses the modern tANS (table-based asymmetric numeral system), which flat-out beats Huffman trees (higher speed without compromising compression relative to a perfect arithmetic or range entropy coder);
              - it is coded to reduce the number of branches (which are costly on older processors, and cause pipeline stalls on misprediction on modern ones).

              You can safely replace your use cases of GZ with ZSTD to gain speed and save space, unless you're bound by legacy...
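The Huffman limitation that ANS-family coders remove can be shown with a back-of-the-envelope calculation (the symbol probabilities below are purely illustrative):

```python
import math

# Huffman must assign a whole number of bits per symbol, so on skewed
# distributions it overshoots the Shannon entropy, while tANS can get
# arbitrarily close to it. Illustrative probabilities, not real data.
probs = {"a": 0.90, "b": 0.05, "c": 0.05}
entropy = -sum(p * math.log2(p) for p in probs.values())

# The optimal Huffman code for these weights: a -> 0 (1 bit), b -> 10, c -> 11.
huffman_bits = 0.90 * 1 + 0.05 * 2 + 0.05 * 2

print(f"entropy: {entropy:.3f} bits/symbol")       # ~0.569
print(f"huffman: {huffman_bits:.3f} bits/symbol")  # 1.100
```

Here Huffman spends almost twice the theoretical minimum; an ANS coder can approach the 0.569 bits/symbol bound while staying table-driven and fast.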

              Originally posted by discordian View Post
              Pretty much, but the reason they added this is that existing hardware can decompress deflate.


              ...and hardware-accelerated deflate is an example where you can't just throw ZSTD in: the Jarek Duda papers on ANS hadn't been written yet when that hardware was designed, so you'd have to fall back to a much slower general-purpose CPU.



              • #8
                Originally posted by DrYak View Post

                - it uses the modern tANS (table-based asymmetric numeral system), which flat-out beats Huffman trees (higher speed without compromising compression relative to a perfect arithmetic or range entropy coder);
                Nope, zstd still uses Huffman trees to compress literals (tANS/FSE is used for the match sequences).

                Originally posted by DrYak View Post

                ...and hardware-accelerated deflate is an example where you can't throw ZSTD at (the Jarek Duda papers on ANS haven't been written yet back when that hardware was designed), you'd need to fall back to a much slower general purpose CPU.
                Many modern accelerators include deflate support in practice (including Intel Xeon processors on the market in 2023, specifically IAA [1] and QAT [2]), since several popular formats of the past decades are based on deflate.
                Even if deflate is the only algorithm an accelerator supports, accelerators for other algorithms wouldn't benefit formats like zip, gzip or png, which have been popular across different platforms (even non-UNIX ones) for decades.
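The "several popular formats are based on deflate" point is easy to demonstrate: zip, gzip and png all carry a raw DEFLATE stream and differ mainly in the container around it, which is why a single deflate accelerator can serve all of them. A sketch using Python's stdlib zlib, whose `wbits` parameter selects the framing:

```python
import zlib

# zip members, zlib streams (as in png IDAT) and gzip files wrap the same
# raw DEFLATE payload in different containers. Peel the containers off and
# the compressed bytes match.
data = b"the same DEFLATE bitstream under three different wrappers" * 100

def deflate(payload: bytes, wbits: int) -> bytes:
    comp = zlib.compressobj(level=6, wbits=wbits)
    return comp.compress(payload) + comp.flush()

raw_stream  = deflate(data, -15)  # raw DEFLATE, as stored inside zip members
zlib_stream = deflate(data, 15)   # zlib wrapper (2-byte header + Adler-32), as in png
gzip_stream = deflate(data, 31)   # gzip wrapper (10-byte header + CRC-32/size trailer)

# Strip the containers and the same compressed payload remains.
print(zlib_stream[2:-4] == raw_stream, gzip_stream[10:-8] == raw_stream)
```

The header/trailer offsets above assume the default headers zlib emits (no gzip filename or extra fields).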

                [1] https://www.intel.com/content/www/us...intel-iaa.html
                [2] https://www.intel.com/content/www/us...-overview.html

                Anyway, deflate support doesn't conflict with adding zstd support as a next step.
                Last edited by hsiangkao; 26 July 2023, 08:23 AM.

