ZRAM Will See Greater Performance On Linux 5.1 - It Changed Its Default Compressor


  • #21
    Originally posted by Weasel View Post
    But ZRAM is only for RAM though?

    And zswap also, last I heard, writes uncompressed stuff to disk, for some idiotic reason (which can easily be denial of service'd).
    Yes. ZRAM and ZSWAP are closely related kernel features that both hook into virtual memory management. ZRAM is for when you have no swap drive: it creates an in-memory swap device and stores compressed pages in system RAM. ZSWAP is for when you do have a dedicated swap drive: it keeps a compressed pool of pages in RAM in front of the HDD or SSD swap, writing pages out to disk only when the pool fills.
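A minimal sketch of the two setups described above (standard /sys interface paths; the 2G size, lzo-rle choice and 20% pool are example values, and lzo-rle requires 5.1+):

```shell
# zram: create a compressed swap device that lives entirely in RAM
modprobe zram
echo lzo-rle > /sys/block/zram0/comp_algorithm  # set before disksize
echo 2G > /sys/block/zram0/disksize
mkswap /dev/zram0
swapon -p 100 /dev/zram0

# zswap: keep the existing disk swap, but cache compressed pages in RAM first
echo 1 > /sys/module/zswap/parameters/enabled
echo 20 > /sys/module/zswap/parameters/max_pool_percent
```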
    Last edited by sdack; 16 March 2019, 07:17 PM.

    Comment


    • #22
      I've tried to use ZRAM on my first-gen RPi 1 with 256MB; as you'd imagine, stuff fills RAM pretty fast, and when it works, it works.

      But in the end I had to disable it since, even with a small size, it would end up oopsing the kernel out of the blue.

      Hope they have improved it since 4.14.

      Comment


      • #23
        Originally posted by mcloud View Post
        Wish one could use zram for more things than just swap, pretty-much like windows 10
        You can use zram for far more than just swap; unfortunately the main distro utilities for configuring it suck badly.
        Both Ubuntu's and Debian's zram config tools should really be called zram_make_some_illogical_swaps_whilst_overwriting_existing_zram_devices.

        Have a look at https://github.com/StuartIanNaylor/zram-config or my other repos; I made some examples out of frustration with the widespread ignorance of https://www.kernel.org/doc/Documenta...ckdev/zram.txt
        The same bad script has been emulated and copied for about eight years now, and not one of its copiers has bothered to actually read the kernel documentation, which has been valid since 3.15.

        I have been using LZ4 with the Raspberry Pi and the results are a lot closer than benchmarks suggest, with little difference from LZO on ARM; LZO-RLE may be even faster, which I presume is why it has been made the default.

        I have been working off this list. For swap the candidates are LZO, LZ4, deflate or zstd; deflate and zstd give excellent ratios, up to 200% of the LZ family's on text, but zstd is not in /proc/crypto here, so currently the only real choice is LZ or deflate (zlib), as the others are inferior or not in the kernel.

        | Compressor name  | Ratio | Compression | Decompress. |
        |------------------|-------|-------------|-------------|
        | zstd 1.3.4 -1    | 2.877 | 470 MB/s    | 1380 MB/s   |
        | zlib 1.2.11 -1   | 2.743 | 110 MB/s    | 400 MB/s    |
        | brotli 1.0.2 -0  | 2.701 | 410 MB/s    | 430 MB/s    |
        | quicklz 1.5.0 -1 | 2.238 | 550 MB/s    | 710 MB/s    |
        | lzo1x 2.09 -1    | 2.108 | 650 MB/s    | 830 MB/s    |
        | lz4 1.8.1        | 2.101 | 750 MB/s    | 3700 MB/s   |
        | snappy 1.1.4     | 2.091 | 530 MB/s    | 1800 MB/s   |
        | lzf 3.6 -1       | 2.077 | 400 MB/s    | 860 MB/s    |
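One way to check which of these the running kernel can actually use for zram (output varies by kernel build; the zram module must be loaded for the sysfs file to exist):

```shell
# Names accepted by a zram device; brackets mark the active compressor,
# e.g.: lzo [lzo-rle] lz4 lz4hc 842 zstd
cat /sys/block/zram0/comp_algorithm

# zstd only shows up if the kernel has the crypto driver built in
grep -A2 'zstd' /proc/crypto
```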

        It would be great if someone could benchmark system load under zram. I have been using my zram-config and just commenting out the swap line of ztab (#swap) to run without zram.
        When you switch to zram you drop the HDD-swap assumptions the static defaults are tuned for: swappiness can be approximately 80-100 rather than the default 60.
        page-cluster should also be set to zero, as dropping the HDD-tuned readahead of 8 pages per swap-in greatly reduces latency.
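The two sysctls discussed above; the values are the ones suggested in the post, not universal defaults:

```shell
# RAM-backed swap: swap aggressively, drop the disk-oriented readahead
sysctl vm.swappiness=100   # stock default is 60; 80-100 suits zram
sysctl vm.page-cluster=0   # 2^0 = 1 page per swap read instead of 2^3 = 8

# Persist across reboots (filename is just a convention)
printf 'vm.swappiness=100\nvm.page-cluster=0\n' > /etc/sysctl.d/99-zram.conf
```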

        There is a LibreOffice spreadsheet with runs of 15 mins logged every 2secs of /proc/loadavg with no zram, zram-sw80-pc3, zram-sw80-pc0, zram-sw100-pc3, zram-sw100-pc0 in https://github.com/StuartIanNaylor/z...ap-performance

        If your load is medium or lower then zram can be pushed to swappiness 100, and the Pi 3B+ is generally up to most tasks with zram maximised, even if the boot process queue takes a hit from the extra zram overhead.
        It's a shame swappiness isn't dynamic; I did an extremely crude swappiness load balancer in https://github.com/StuartIanNaylor/z...-load-balancer because that is the problem with static set points: swappiness often has to be reduced for the intense boot/startup period, which increases the process queue under zram.
        Once past that period, with moderately normal load, zram is unnoticeable and the extra RAM of swappiness=100 comes into play, but you end up with a compromise somewhere between 80 and 100 depending on how well the CPU copes with load.
        You can try this on a Pi Zero, where the effect is pretty drastic, but you can see the same curve on later models.

        Comment


        • #24
          Originally posted by Licaon View Post
          I've tried to use ZRAM on my first-gen RPi 1 with 256MB; as you'd imagine, stuff fills RAM pretty fast, and when it works, it works.

          But in the end I had to disable it since, even with a small size, it would end up oopsing the kernel out of the blue.

          Hope they have improved it since 4.14.
          It's highly likely it was the poor zram-config script you were using, as nothing kernel-wise has changed much since 3.14.

          There has been this zram-config script in the wild for over seven years that for some reason has been emulated and copied blindly, with no reference to the kernel docs and, it would seem, no thought.
          It pointlessly takes half the currently free RAM and partitions it so that the number of zram swaps equals the CPU core count, reducing the maximum page writes per device to a core-count fraction of 50% of memory.
          It's pure voodoo, and bad voodoo, because since 3.15 zram has been multi-stream, as zramctl will display. They end up with as many devices as cores, each already supporting core-count streams, with each swap's size divided by the core count.

          That is only the start of how poor the Ubuntu zram-config script is, and it's absolutely amazing that it seems to have been emulated and copied into practically every distro.
          Try my rough hacks at https://github.com/StuartIanNaylor/ with zram. Yeah, my scripting ain't up to much, but at least I bothered to read the kernel docs.
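A sketch of the single-device setup the kernel docs describe (the 1G size and lz4 choice are examples): one zram device is already multi-threaded on its own, so there is no need for one device per core.

```shell
modprobe zram num_devices=1
echo lz4 > /sys/block/zram0/comp_algorithm   # must be set before disksize
echo 1G > /sys/block/zram0/disksize          # example size
mkswap /dev/zram0
swapon -p 100 /dev/zram0
zramctl   # shows one device; compression streams are handled per-CPU
```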

          Comment


          • #25
            Originally posted by StuartIanNaylor View Post

             It's highly likely it was the poor zram-config script you were using, as nothing kernel-wise has changed much since 3.14.

             There has been this zram-config script in the wild for over seven years that for some reason has been emulated and copied blindly, with no reference to the kernel docs and, it would seem, no thought.
             It pointlessly takes half the currently free RAM and partitions it so that the number of zram swaps equals the CPU core count, reducing the maximum page writes per device to a core-count fraction of 50% of memory.
             It's pure voodoo, and bad voodoo, because since 3.15 zram has been multi-stream, as zramctl will display. They end up with as many devices as cores, each already supporting core-count streams, with each swap's size divided by the core count.

             That is only the start of how poor the Ubuntu zram-config script is, and it's absolutely amazing that it seems to have been emulated and copied into practically every distro.
             Try my rough hacks at https://github.com/StuartIanNaylor/ with zram. Yeah, my scripting ain't up to much, but at least I bothered to read the kernel docs.
             Whoops, that should say 3.15, as streams / mem_limit and various updates were added then; since then it has been relatively static, but the scripts haven't changed since 3.14.

            Comment


            • #26
              Originally posted by andreano View Post

              I'm amazed to see that zstd in its fastest setting almost keeps up with these special-purpose fast compressors, and actually manages to beat regular lzo in decompression! We don't have the numbers for lzo-rle, and it's hard to extrapolate 30% from regular lzo, since we don't know how much comes from compression and decompression, but assuming it's a pure decompression speedup (since that's what you get by making the algorithm more complex), that would be upwards of 60%, and a close race between lzo-rle and zstd on the decompression side. However, nothing that would dethrone lz4 as the bilateral speed king. Of course, the result will depend a bit on your test data.
              Actually the perf benefits were split between compression and decompression (it's much faster to detect a run of zeros than run through the lzo compression loop). I don't have the data to hand, but roundtrip perf ends up being a win over lz4 if I remember rightly, as the benefits to improving the slowest part (compression) have more impact than the fastest part (decompression) - because we spend more time on the slowest part. Zram does about 2.25x more compression than decompression (some pages are never decompressed again), so this also skews the importance to compression.
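A quick sketch of that weighting, using the 2.25:1 compress:decompress ratio above and the lzo1x/lz4 figures from the table earlier in the thread (illustrative arithmetic on published numbers, not a benchmark):

```shell
# Effective round-trip throughput when compressions outnumber
# decompressions 2.25:1, given compress (c) and decompress (d) MB/s.
weighted() {
  awk -v c="$1" -v d="$2" 'BEGIN {
    w = 2.25                      # compressions per decompression
    t = w / c + 1 / d             # time to compress w MB + decompress 1 MB
    printf "%.0f\n", (w + 1) / t  # effective MB/s over the mixed workload
  }'
}
weighted 650 830    # lzo1x: prints 696
weighted 750 3700   # lz4:   prints 994
```

The gap between the two shrinks versus the raw decompression numbers because the slow side (compression) dominates the weighted time, which is the effect described above.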

              Dave

              Comment


              • #27
                It would be really great if Phoronix could do a compression and zram test.
                Probably the best way would be to create and mount a zram drive and use various disk benchmarks as well as swap/compression tests.
                Compare at least lzo, lz4, zstd and zlib?

                Comment


                • #28
                  Maybe this is why my performance has tanked: once I start transferring files at over 1GB/sec the PC grinds to a halt; it rarely sees 2GB/sec and it used to move 4x the data.

                  2x Avago 9302-16e 12Gb/s PCIe 3.0 with multipathing: with any more than ~20 drives being read/written it becomes very unresponsive now.

                  Wish there were a way to force all web-browsing content to be compressed in memory and leave everything else alone (Chrome eats 30GB of RAM very quickly).

                  Comment


                  • #29
                    Originally posted by MasterCATZ View Post
                    Maybe this is why my performance has tanked: once I start transferring files at over 1GB/sec the PC grinds to a halt; it rarely sees 2GB/sec and it used to move 4x the data.

                    2x Avago 9302-16e 12Gb/s PCIe 3.0 with multipathing: with any more than ~20 drives being read/written it becomes very unresponsive now.

                    Wish there were a way to force all web-browsing content to be compressed in memory and leave everything else alone (Chrome eats 30GB of RAM very quickly).
                    There is. Put Chrome in its own cgroup, then limit the total memory for the cgroup. memory.soft_limit_in_bytes is where it will start swapping; memory.limit_in_bytes is the hard limit, and it won't go over that (it will trip the OOM killer if needed).
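A hypothetical sketch with cgroup v1, using the two files named above; the "browser" group name and the 8G/10G limits are examples, not recommendations:

```shell
mkdir -p /sys/fs/cgroup/memory/browser
echo $((8 * 1024**3))  > /sys/fs/cgroup/memory/browser/memory.soft_limit_in_bytes
echo $((10 * 1024**3)) > /sys/fs/cgroup/memory/browser/memory.limit_in_bytes

# Start the browser inside the group (cgexec is in the cgroup-tools package)
cgexec -g memory:browser google-chrome
```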

                    Comment


                    • #30
                      thanks for the tip :P

                      Comment
