Linux 5.4 vs. Liquorix Kernel Benchmarks For AMD Ryzen + Radeon Gaming On Ubuntu


  • #11
    Originally posted by Volta

    Take a look at total frame time then. CFS low latency is just awesome except if you're some awkward Google 'developer'.
    But even total frame time should be a pure throughput problem and not a latency problem. Now, this of course depends on how the game logic is written, with modern games having several threads all contributing to the frame time, but as a general rule the time it takes to create a single frame should increase with a true realtime kernel.

    edit: I think that I misread your post. Sorry about that.
    Last edited by F.Ultra; 04 January 2020, 03:03 PM.



    • #12
      Originally posted by perpetually high

      Aye aye! I have a few more I use, but yes these are musts.

      Code:
      #https://github.com/graysky2/kernel_gcc_patch
      echo "*** Copying and applying graysky's GCC 9 tweaks for Intel.. ✓"
      if [ "${KERNEL_BASE_VER}" = "5.4" ]; then
          cp "$PATCH_PATH/graysky-gcc9-5.4.patch" .
          patch -p1 < ./graysky-gcc9-5.4.patch
      else
          cp "$PATCH_PATH/graysky-gcc9-5.5+.patch" .
          patch -p1 < ./graysky-gcc9-5.5+.patch
      fi
      
      #https://github.com/sirlucjan/kernel-patches/blob/master/5.4/cpu-patches
      echo "*** Copying and applying O3 patches.. ✓"
      cp "$PATCH_PATH/0001-cpu-patches.patch" .
      patch -p1 < ./0001-cpu-patches.patch
      
      #https://cchalpha.blogspot.com/
      echo "*** Copying and applying BMQ Scheduler patch.. ✓"
      cp "$PATCH_PATH/bmq_v5.4-r1.patch" .
      patch -p1 < ./bmq_v5.4-r1.patch
      
      #https://lkml.org/lkml/2019/7/30/1398 and https://lkml.org/lkml/2019/7/30/1399
      #https://gitlab.collabora.com/krisman/linux/commit/368eb7d8c86bd2a5db200567e82c477efc1a9502
      #https://github.com/sirlucjan/kernel-patches/tree/master/5.4/futex-patches-sep
      echo "*** Copying and applying Valve fsync/futex patches.. ✓"
      cp "$PATCH_PATH"/futex-patches-sep/*.patch .
      patch -p1 < ./0001-futex-Split-key-setup-from-key-queue-locking-and-rea.patch
      patch -p1 < ./0002-futex-Implement-mechanism-to-wait-on-any-of-several-.patch
      patch -p1 < ./0003-futex-Change-WAIT_MULTIPLE-opcode-to-31.patch
      
      #https://github.com/sirlucjan/kernel-patches/tree/master/5.4/clearlinux-patches-v6-sep
      echo "*** Copying and applying Clear Linux patches.. ✓"
      cp "$PATCH_PATH/clearlinux-patches-v6/0001-clearlinux-patches.patch" .
      patch -p1 < ./0001-clearlinux-patches.patch
      
      #https://github.com/sirlucjan/kernel-patches/tree/master/5.4/bfq-patches-sep
      echo "*** Copying and applying BFQ patches.. ✓"
      cp "$PATCH_PATH/bfq/$KERNEL_BASE_VER/0001-blkcg-Make-bfq-disable-iocost-when-enabled.patch" .
      cp "$PATCH_PATH/bfq/$KERNEL_BASE_VER/0002-block-bfq-present-a-double-cgroups-interface.patch" .
      cp "$PATCH_PATH/bfq/$KERNEL_BASE_VER/0003-block-bfq-Skip-tracing-hooks-if-possible.patch" .
      patch -p1 < ./0001-blkcg-Make-bfq-disable-iocost-when-enabled.patch
      patch -p1 < ./0002-block-bfq-present-a-double-cgroups-interface.patch
      patch -p1 < ./0003-block-bfq-Skip-tracing-hooks-if-possible.patch
      (big shout out to sirlucjan for his excellent github repository and maintaining the patches!)
      Thanks for the tip

      PS: I'm using BMQ patch, not the PDS-mq patch.
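The script above prints its check mark before each patch runs, so a patch that fails to apply can go unnoticed. A minimal sketch of a guard, assuming GNU patch (for its `--dry-run` option) and that it is run from the root of the kernel source tree. The function name `apply_patch` is hypothetical and not part of the quoted script:

```shell
#!/bin/sh
# Hypothetical helper: apply one patch only if it exists and applies cleanly.
# Assumes GNU patch, which supports --dry-run.
apply_patch() {
    patch_file=$1

    # Refuse early if the patch file is missing.
    if [ ! -f "$patch_file" ]; then
        echo "*** Missing patch: $patch_file" >&2
        return 1
    fi

    # Test the patch without touching the tree.
    if ! patch -p1 --dry-run < "$patch_file" >/dev/null 2>&1; then
        echo "*** Does not apply cleanly: $patch_file" >&2
        return 1
    fi

    # Apply for real, and only report success afterwards.
    patch -p1 < "$patch_file" && echo "*** Applied $(basename "$patch_file") ✓"
}
```

A call such as `apply_patch "$PATCH_PATH/bmq_v5.4-r1.patch" || exit 1` would then stop the build at the first patch that no longer matches the tree.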



      • #13
        If someone wants to use it on Arch, it's called linux-zen there.
        I just use it for my gaming desktop.



        • #14
          Michael, it would be nice to have an Arch linux vs. linux-lts vs. linux-zen vs. linux-hardened test.



          • #15
            Wow! Very enlightening benchmarks.

            In "Deus Ex: Mankind Divided" specifically, it looks like a particular configuration is causing this anomaly: frame rates drop much further at an already low FPS. What could the engine be doing that penalizes Liquorix so much?

            Three things come to mind:

            1) Liquorix enforces a reschedule on yield() invocations. Games that use this aggressively when they actually want a different locking mechanism will probably perform badly here. I switched Liquorix to this yield type while trying to solve RPCS3 audio stutter and other problems in some games I was playing. Most likely, to get performance up in games that need yield, we'll need to ignore invocations of it entirely (yield_type => 0). The side effect is that games will take more CPU resources to run, but given the last benchmark on spinlocks (https://probablydance.com/2019/12/30...ler-really-is/) and how well MuQSS handles traditional spinlock implementations, maybe this is the right choice.

            2) Switching from SMT to MC runqueues probably reduced throughput further. This change was made to get Liquorix closer to upstream, since MC is considered the best choice in MuQSS for keeping deadlines under control with reasonable throughput. If testing reveals a huge performance regression, though, I'll switch it back.

            3) And finally, CPU vulnerabilities penalize context switching very heavily. Just look at Phoronix's last article where a large variety of benchmarks were tested with mitigations on/off: https://www.phoronix.com/scan.php?pa...igations&num=5. The 9900K took 4x longer to context switch with mitigations enabled - yikes!

            I'll do some testing (especially with the most egregious benchmarks here), and see if I can improve the worst case scenarios with some basic tuning.
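The first and third points can be checked on a running system. A minimal sketch, assuming a MuQSS-based kernel exposes its yield behavior at /proc/sys/kernel/yield_type (the file is absent on CFS kernels) and that the kernel reports mitigation status under /sys/devices/system/cpu/vulnerabilities. The function names and the optional root-prefix argument are hypothetical, added only for illustration and testing:

```shell
#!/bin/sh
# Hypothetical helpers; these names do not come from any existing tool.

# Print the MuQSS yield_type value, or "n/a" when the sysctl file is absent
# (for example on a CFS kernel). $1 is an optional root prefix for testing.
show_yield_type() {
    root=${1:-}
    f="$root/proc/sys/kernel/yield_type"
    if [ -r "$f" ]; then
        cat "$f"
    else
        echo "n/a"
    fi
}

# Print one "name: status" line per CPU vulnerability the kernel reports.
show_mitigations() {
    root=${1:-}
    for f in "$root"/sys/devices/system/cpu/vulnerabilities/*; do
        # Skip the unexpanded glob when the directory is missing or empty.
        [ -r "$f" ] || continue
        printf '%s: %s\n' "$(basename "$f")" "$(cat "$f")"
    done
}

show_yield_type
show_mitigations
```

Comparing this output between the stock and Liquorix kernels before benchmarking would show whether the two runs differ in yield handling or in enabled mitigations.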



            • #16
              Thank you, Michael, for this nice comparison.



              • #17
                Originally posted by damentz
                Wow! Very enlightening benchmarks..
                I came across Linus's response today where he addresses yield() in depth and responds to the scheduling article about spinlocks/mutexes:

                Linus's response: https://www.realworldtech.com/forum/...rpostid=189723
                Author's response: https://www.realworldtech.com/forum/...rpostid=189747
                Linus follow up: https://www.realworldtech.com/forum/...rpostid=189755
                Linus additional follow up: https://www.realworldtech.com/forum/...rpostid=189759

                They were all really great reads that may help your tuning. It's always refreshing to have Linus drop knowledge like that, and I'm glad all this discussion has been popping up lately. It can only lead to better software in the future.



                • #18
                  Are all the tested games based on a 64-bit architecture?



                  • #19
                    The 3900X, with 12 cores and 24 threads, is just better with CFS than with MuQSS.
                    I haven't found a single game that ran better with MuQSS than with CFS on my 3900X, no matter what MuQSS settings I used.
                    PDS is about the same, sometimes slightly faster and sometimes a bit slower; BMQ is in between MuQSS and PDS.

                    Most were at best as fast as with CFS, but some, like Guild Wars 2 with fsync enabled, are abysmal with MuQSS (better with esync, but still not even close to CFS).

                    As for fsync in Wine games, in most games I tried I have seen worse FPS with fsync than with esync on PDS/BMQ/MuQSS.
                    CFS, on the other hand, is often slightly faster with fsync than with esync.
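For anyone wanting to repeat this kind of esync/fsync comparison, a minimal sketch follows. It assumes a Wine build carrying the esync and fsync patches, which read the `WINEESYNC` and `WINEFSYNC` environment variables, and a kernel with the futex patches for fsync to take effect. The helper name `sync_env` and the mode names are hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: print the environment assignments for one sync mode,
# so each benchmark run uses exactly one synchronization backend.
# Assumes a Wine build that honors WINEESYNC and WINEFSYNC.
sync_env() {
    case "$1" in
        esync) echo "WINEESYNC=1 WINEFSYNC=0" ;;
        fsync) echo "WINEESYNC=0 WINEFSYNC=1" ;;
        none)  echo "WINEESYNC=0 WINEFSYNC=0" ;;
        *)
            echo "usage: sync_env esync|fsync|none" >&2
            return 1
            ;;
    esac
}
```

A run could then look like `env $(sync_env fsync) wine game.exe`, where `game.exe` stands in for the title under test. Proton users would instead use Proton's own toggles rather than these variables.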
                    Last edited by ObiWan; 04 January 2020, 11:12 PM.



                    • #20
                      Originally posted by lumks
                      Michael, it would be nice to have an Arch linux vs. linux-lts vs. linux-zen vs. linux-hardened test.
                      Totally!

