New Linux Scheduler Patches Can Improve AMD Zen Performance For Some Workloads

  • #21
    Apologies for blowing up the thread. Got some benchmarks with these new Ryzen scheduler patches. (available via my build script, see profile if interested).

    I also added some GRUB changes, sysctl conf updates, and slightly tightened up the memory timings (but no other BIOS changes).

    Just hit 8.0 geometric mean on pts/osbench. Not too shabby. All tests below are with mitigations=off. The 5.879 score was what I got on a default, fresh Ubuntu 21.10 installation with mitigations=off, the 5.13.0-22 generic kernel, the none I/O scheduler, and schedutil as the governor.

    All the changes are kernel/software based except for the tighter memory timings, which are just a small part of it (see the Memory Allocations test for the improvement there). Get to compiling, everyone. Lots of untapped performance out there. For free.

    https://openbenchmarking.org/result/...27&sgm=1&ppt=D
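    For reference, a geometric mean is the n-th root of the product of n scores, so no single test dominates the composite. A minimal Python sketch of that calculation follows. The three per-test scores are made up for illustration; only 5.879 and 8.0 come from the runs above, and the sketch assumes higher is better for the composite, as the comparison above implies.

```python
import math


def geometric_mean(scores):
    """Return the n-th root of the product of n positive scores."""
    if not scores or any(s <= 0 for s in scores):
        raise ValueError("scores must be a non-empty sequence of positive numbers")
    # Summing logarithms avoids overflow when many results are multiplied.
    return math.exp(sum(math.log(s) for s in scores) / len(scores))


def percent_gain(baseline, tuned):
    """Relative improvement of tuned over baseline, in percent."""
    return (tuned - baseline) / baseline * 100.0


# Hypothetical per-test scores, only to show the calculation.
example_scores = [2.0, 8.0, 4.0]
print(round(geometric_mean(example_scores), 3))  # 4.0

# The two composite scores quoted in the post.
print(round(percent_gain(5.879, 8.0), 1))  # 36.1
```

    So going from 5.879 to 8.0 works out to roughly a 36% higher composite score.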



    GRUB changes:

    GRUB_CMDLINE_LINUX_DEFAULT="quiet mitigations=off amdgpu.freesync_video=1 nosoftlockup pkill_on_warn=1 psi=0 audit=0 mce=ignore_ce ipv6.disable=1 hugepagesz=1G hugepages=1 default_hugepagesz=1G"
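    After updating GRUB and rebooting, it can be worth confirming that the kernel actually picked the parameters up. Here is a rough Python sketch, not part of my build script: the helper names are made up for illustration, and it only assumes that the booted command line is readable from /proc/cmdline.

```python
import shlex


def parse_kernel_cmdline(cmdline):
    """Split a kernel command line into {name: value}; bare flags map to None."""
    params = {}
    for token in shlex.split(cmdline):
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


# The value of GRUB_CMDLINE_LINUX_DEFAULT from the post.
GRUB_DEFAULT = (
    "quiet mitigations=off amdgpu.freesync_video=1 nosoftlockup pkill_on_warn=1 "
    "psi=0 audit=0 mce=ignore_ce ipv6.disable=1 hugepagesz=1G hugepages=1 "
    "default_hugepagesz=1G"
)

wanted = parse_kernel_cmdline(GRUB_DEFAULT)
print(wanted["mitigations"])  # off
print(wanted["quiet"])        # None


def missing_from_running_kernel(wanted, proc_cmdline="/proc/cmdline"):
    """Return the wanted parameters that the booted kernel does not report."""
    with open(proc_cmdline, encoding="utf-8") as fh:
        active = parse_kernel_cmdline(fh.read())
    # object() is a sentinel that never equals a real value, so absent keys count as missing.
    return {k: v for k, v in wanted.items() if active.get(k, object()) != v}
```

    Calling missing_from_running_kernel(wanted) on the tuned machine should return an empty dict if everything took effect. Note that a dict drops ordering, which matters for a few parameters, so treat this as a presence check only.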
    99-sysctl.conf additions:

    vm.stat_interval = 10
    vm.dirty_ratio = 10
    vm.dirty_background_ratio = 3
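    To check whether those sysctl values are live, each key maps to a file under /proc/sys with the dots turned into slashes. A small Python sketch under that assumption (the function names are illustrative, not from any tool):

```python
from pathlib import Path

# The three additions from the post.
SYSCTL_ADDITIONS = """\
vm.stat_interval = 10
vm.dirty_ratio = 10
vm.dirty_background_ratio = 3
"""


def parse_sysctl_conf(text):
    """Parse 'key = value' lines, skipping blank lines and '#' or ';' comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def proc_path(key, root="/proc/sys"):
    """Map a sysctl key such as vm.dirty_ratio to its file under /proc/sys."""
    return Path(root) / key.replace(".", "/")


def report(settings, root="/proc/sys"):
    """Yield (key, wanted, current); current is None if the file is unreadable."""
    for key, wanted in settings.items():
        try:
            current = proc_path(key, root).read_text().strip()
        except OSError:
            current = None
        yield key, wanted, current


if __name__ == "__main__":
    for key, wanted, current in report(parse_sysctl_conf(SYSCTL_ADDITIONS)):
        print(f"{key}: wanted {wanted}, current {current}")
```

    Any line where wanted and current differ means the conf file has not been applied yet.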
    Environment variables (via .pam_environment):

    XZ_DEFAULTS="-T 16"
    CONCURRENCY_LEVEL=16
    ZINK_DESCRIPTORS=lazy
    AMD_VULKAN_ICD=RADV
    RADV_PERFTEST=rt
    RADEONSI_ENABLE_NIR=true
    MESA_DISK_CACHE_SINGLE_FILE=1
    TEARFREE_DISCARD=true
    THREAD_SUBMIT=true
    WINEFSYNC=1
    WINEFSYNC_SPINCOUNT=100
    WINEDEBUG=-all,fixme-all
    WINE_FULLSCREEN_INTEGER_SCALING=1
    CFFLAGS="-O3 -march=znver3 -mtune=znver3 -pipe -feliminate-unused-debug-types -fexceptions -fstack-protector --param=ssp-buffer-size=32 -m64 -fasynchronous-unwind-tables -ftree-loop-distribute-patterns -malign-data=abi -fno-semantic-interposition -ftree-vectorize -fno-tree-loop-vectorize"
    FFLAGS="-O3 -march=znver3 -mtune=znver3 -pipe -feliminate-unused-debug-types -fexceptions -fstack-protector --param=ssp-buffer-size=32 -m64 -fasynchronous-unwind-tables -ftree-loop-distribute-patterns -malign-data=abi -fno-semantic-interposition -ftree-vectorize -fno-tree-loop-vectorize"
    CXXFLAGS="-O3 -march=znver3 -mtune=znver3 -pipe -feliminate-unused-debug-types -fexceptions -fstack-protector --param=ssp-buffer-size=32 -m64 -fasynchronous-unwind-tables -ftree-loop-distribute-patterns -fno-semantic-interposition -ffat-lto-objects -fno-trapping-math -fvisibility-inlines-hidden -fno-tree-loop-vectorize"
    CFLAGS="-O3 -march=znver3 -mtune=znver3 -pipe -feliminate-unused-debug-types -fexceptions -fstack-protector --param=ssp-buffer-size=32 -m64 -fasynchronous-unwind-tables -ftree-loop-distribute-patterns -fno-semantic-interposition -ffat-lto-objects -fno-trapping-math -fno-tree-loop-vectorize"
    Small note: I previously had RADV_DEBUG=llvm set, and that cost me 20 FPS in Shadow of the Tomb Raider benchmarks. So I definitely recommend removing it if you have it set.

    Please also review the above, don't just copy and paste, but I think there's some good stuff there. Feel free to share anything you've got also. Who cares what thread it's in. More information the better. Cheers.
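    One way to review those long flag strings instead of eyeballing them is to diff them as sets. A quick Python sketch, with the strings copied from the variables above and a made-up helper name:

```python
import shlex

# Options shared by every *FLAGS variable in the post.
BASE = (
    "-O3 -march=znver3 -mtune=znver3 -pipe -feliminate-unused-debug-types "
    "-fexceptions -fstack-protector --param=ssp-buffer-size=32 -m64 "
    "-fasynchronous-unwind-tables -ftree-loop-distribute-patterns"
)
CFLAGS = BASE + (
    " -fno-semantic-interposition -ffat-lto-objects -fno-trapping-math"
    " -fno-tree-loop-vectorize"
)
CXXFLAGS = BASE + (
    " -fno-semantic-interposition -ffat-lto-objects -fno-trapping-math"
    " -fvisibility-inlines-hidden -fno-tree-loop-vectorize"
)
FFLAGS = BASE + (
    " -malign-data=abi -fno-semantic-interposition -ftree-vectorize"
    " -fno-tree-loop-vectorize"
)


def compare_flags(a, b):
    """Return (only_in_a, only_in_b) as sorted lists of options."""
    set_a, set_b = set(shlex.split(a)), set(shlex.split(b))
    return sorted(set_a - set_b), sorted(set_b - set_a)


print(compare_flags(CFLAGS, CXXFLAGS))
# ([], ['-fvisibility-inlines-hidden'])
print(compare_flags(CFLAGS, FFLAGS))
# (['-ffat-lto-objects', '-fno-trapping-math'], ['-ftree-vectorize', '-malign-data=abi'])
```

    Sets ignore ordering, and with GCC a later option can override an earlier one, so this only shows which options differ, not which one wins.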


    • #22
      So for those with the X570S Gigabyte boards, I was able to get the Active OC Tuner working in the Gigabyte BIOS. It basically allows for PBO to handle single-core (I saw a peak of 4.9 GHz on Windows) and then manual OC to kick in for maximum multi-core performance.

      So on my 5800X I was able to get 4.7 GHz @ 1.368V. Any higher and I was reaching into thermal throttling and instability and performance regressions.

      (FYI: Ryzen Master also tells you which two cores are the fastest, pretty cool. My second and last cores were the fastest. This information could be useful if you want to set a per-core curve.)

      Not sure how much of this info you can leverage with your motherboard, but I figure more info the better. Love this damn chip.

      Some great links I found:

      https://hothardware.com/reviews/amd-...er-guide-zen-3
      https://www.youtube.com/watch?v=OAeWVKEYufE& (does a great job explaining Gigabyte's Active OC Tuner)

      ---

      edit: some more updates below

      CPU-Z:


      CPU-X on Linux:

      PBO + ActiveTuner (single-core takes a small hit)



      PBO Auto, No ActiveTuner (slightly better single-core)


      Cinebench R23

      Single Core: 1577
      Multi Core: 15800

      4.85 GHz max single-core boost
      4.7 GHz all-core boost



      Gigabyte BIOS Settings:
      (edit: Instead of PBO Scalar 10x, go with 2x for better thermals. Play around with 2x, 10x, and Auto. For now I'm using 2x with great results. I also found out my Core 2 (so not Core 0 or Core 1) was my weakest core; Windows Error was showing error codes for that core. So I changed the Curve Optimizer to Negative 25 for that core; the rest are the same as in that screenshot. That lets me scale back that core a little more so that it doesn't bring down the whole system. Now I can reach 4.7 GHz @ 1.368V without crashing. Still doing more testing, but that's what I'm finding.)

      Conclusion:

      PBO + Gigabyte's Active OC Tuner does result in slightly slower single-core but much faster multi-core, which likely outweighs it.

      The pts/ctx-clock got slower and went from 152 to 185 with PBO+ActiveTuner. But every test in pts/osbench was about equal or faster. So this is definitely worth doing if your BIOS supports this feature. Otherwise, just load optimized defaults and set PBO to Auto everything and call it a day.

      Here's the kernel compiling at 4.7 GHz all-core, and I was also seeing boosts earlier to 4.825 GHz single-core. So it looks like all is good in Linux land for getting the best of both worlds out of Ryzen 5000. Undervolting is definitely the way to go, combined with all of the above. Much better than previously in this thread, where I was at 4.45 GHz.
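      If you want to watch the per-core clocks on Linux without a GUI tool, the cpufreq driver exposes them in sysfs. A minimal Python sketch, assuming the standard /sys/devices/system/cpu/cpuN/cpufreq/scaling_cur_freq files (values in kHz) are present on your kernel; the function names are just for illustration:

```python
from pathlib import Path


def read_core_clocks_mhz(root="/sys/devices/system/cpu"):
    """Return {cpu_index: MHz} read from each cpuN/cpufreq/scaling_cur_freq."""
    clocks = {}
    for freq_file in Path(root).glob("cpu[0-9]*/cpufreq/scaling_cur_freq"):
        # parts[-3] is the "cpuN" directory name; strip the "cpu" prefix.
        cpu_index = int(freq_file.parts[-3][3:])
        try:
            khz = int(freq_file.read_text().strip())
        except (OSError, ValueError):
            continue  # offline core or unreadable file
        clocks[cpu_index] = khz / 1000.0
    return clocks


def summarize(clocks):
    """Return (highest MHz on any core, lowest MHz on any core)."""
    if not clocks:
        raise ValueError("no cpufreq data found")
    return max(clocks.values()), min(clocks.values())


if __name__ == "__main__":
    clocks = read_core_clocks_mhz()
    if clocks:
        fastest, slowest = summarize(clocks)
        print(f"fastest core: {fastest:.0f} MHz, slowest core: {slowest:.0f} MHz")
    else:
        print("no cpufreq data available on this system")
```

      Run it in a loop during a kernel compile to see the all-core clock, and during a light single-threaded load to see the boost clock.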

      Another perk: it runs cooler *and* faster. That's because out of the box, unless your motherboard supports it, you can't combine PBO with a manual O/C; it's one or the other. So most people, for good reason, opt for PBO for max single-core and so-so multi-core. With PBO alone, while compiling the kernel, it was settling at 4.45 GHz and reaching higher voltages, say 1.45V. Now 1.368V is the cap, and I can get 4.7 GHz all-core and still hit 4.825 - 4.9 GHz single-core. Awesome chip.

      Geekbench 5 on Linux:

      Single-Core Score: 1841
      Multi-Core Score: 12305

      Geekbench 5 on Windows 11:

      Single-Core Score: 1625
      Multi-Core Score: 11254

      Wow. Linux for the win!! And I didn't try and gimp the Windows benchmark by any means. Gave it its best. Linux is supreme... that custom kernel, baby! I think I'm in the Top 5 for the 5800X on Geekbench 5. Love this chip.
      Last edited by perpetually high; 08 December 2021, 10:37 AM.


      • #23
        perpetually high

        Nice! I had a 6800XT Red Devil but it croaked on me a week after getting it, was an insane card. I'm thinking the fan controller or some other IC failed on it, or maybe it got damaged in shipping only to fail later. It was absolutely massive, the RTX card I have now isn't a whole lot smaller.
        From what I understand NVIDIA is going to get absolutely obliterated by AMD in 2022-2023, provided the world doesn't get trashed worse than it already is.

        All AMD needs to do is up their game with their software concerning GPUs.
        Last edited by creative; 08 December 2021, 11:55 AM.


        • #24
          Hey creative, I was editing my post and it tripped up the system because of all the links so it disappeared right when you posted. Check it out for sure, some good info on there. I don't see a 5800X/X570S even close to me on Geekbench.

          I wonder if your PSU was the culprit of the 6800XT failing? That Red Devil is a beautiful card and can probably be tough on a PSU. Someone previously told me that and I dismissed it, but now I see what they meant. PSU quality is not to be f'd with. (I picked up a Seasonic PRIME 850W on sale, absolutely loving it.)

          Really glad I splurged and treated myself on this machine. My last build was 2014, I feel like I deserved this lol. My Haswell does 4 cores @ 4.3GHz. My new Zen 3 does 16 threads @ 4.7GHz. #Winning
          Last edited by perpetually high; 08 December 2021, 03:00 PM.


          • #25
            Originally posted by perpetually high
            Hey creative, I was editing my post and it tripped up the system because of all the links so it disappeared right when you posted. Check it out for sure, some good info on there. I don't see a 5800X/X570S even close to me on Geekbench.

            I wonder if your PSU was the culprit of the 6800XT failing? That Red Devil is a beautiful card and can probably be tough on a PSU. Someone previously told me that and I dismissed it, but now I see what they meant. PSU quality is not to be f'd with. (I picked up a Seasonic PRIME 850W on sale, absolutely loving it.)

            Really glad I splurged and treated myself on this machine. My last build was 2014, I feel like I deserved this lol. My Haswell does 4 cores @ 4.3GHz. My new Zen 3 does 16 threads @ 4.7GHz. #Winning
            I have a gold-rated Corsair 850-watt power supply; the RM 850X should be one of the best out there, with a 10-year warranty. From what I understand, it's manufactured by Seasonic but carries the Corsair branding.

            I am not familiar with geekbench. I am usually not a huge benchmarker, except for games just for seeing if video drivers are better.

            It's very nice to have a fast computer; it's for sure a luxury, and not many people are aware of how good a thing it can be. Most people, given the chance, would spend the money for a really nice computer on a smartphone. I have the same smartphone from like 2016. Not a big fan of smartphones, but they are handy. I always strip all the apps off first thing.
            Last edited by creative; 08 December 2021, 05:01 PM.


            • #26
              Originally posted by creative

              I have a gold-rated Corsair 850-watt power supply; the RM 850X should be one of the best out there, with a 10-year warranty. From what I understand, it's manufactured by Seasonic but carries the Corsair branding.

              I am not familiar with geekbench. I am usually not a huge benchmarker, except for games just for seeing if video drivers are better.

              It's very nice to have a fast computer; it's for sure a luxury, and not many people are aware of how good a thing it can be. Most people, given the chance, would spend the money for a really nice computer on a smartphone. I have the same smartphone from like 2016. Not a big fan of smartphones, but they are handy. I always strip all the apps off first thing.
              Good point. And yeah that’s a great PSU. Forgot you mentioned it earlier.

              Agreed, I’d much rather spend that money on a PC. The new smartphones have some incredible cameras though. That’s the only thing. Besides that, very little reason to keep updating. I do like my iPhone 13 Pro.

              Last photo and I’m signing off for the thread. Good stuff again, creative. Thanks for your input.


              • #27
                Man, I'm learning a lot about this chip. It's very flexible, and like creative said, I think we all have a golden sample.

                I'm getting 4.925 GHz single-core now (1.45V or so, whatever PBO decides), and 4.65 [email protected] multi-core.

                Turns out 4.7 GHz wasn't stable and I had to give it way more juice than it needed, so I decided to dial back to 46.50. Needs *way* less voltage. Only 1.26V; I was giving it 1.368V before because I jumped too high to 4.7 GHz all-core and didn't realize. If I didn't want 4.9 GHz single-core, I could even leave it at 1.26V and cripple the single-core a little bit if I needed lower temperatures, but it doesn't even hit 80 degrees now. Incredible.
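                To put rough numbers on why that small step back helps so much: the 46.50 multiplier times the base clock gives the core clock, and dynamic power scales roughly with voltage squared times frequency. A back-of-the-envelope Python sketch, assuming the default 100 MHz BCLK and the first-order CMOS approximation P ~ V^2 * f (it ignores leakage, so treat the result as a ballpark, not a measurement):

```python
def core_clock_ghz(multiplier, bclk_mhz=100.0):
    """Core clock in GHz for a CPU ratio; the 100 MHz BCLK is an assumed default."""
    return multiplier * bclk_mhz / 1000.0


def relative_dynamic_power(v_new, f_new, v_old, f_old):
    """Ratio of new to old dynamic power under the P ~ V^2 * f approximation."""
    return (v_new / v_old) ** 2 * (f_new / f_old)


print(core_clock_ghz(46.50))  # 4.65

# 4.65 GHz @ 1.26V compared with 4.70 GHz @ 1.368V, the figures from the post.
ratio = relative_dynamic_power(1.26, 4.65, 1.368, 4.70)
print(f"{(1 - ratio) * 100:.0f}% lower")  # 16% lower
```

                So giving up about 1% of clock speed buys something like a 16% drop in dynamic power under this model, which lines up with the lower temperatures.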

                This Active OC Tuner feature is very helpful. Highly recommend this Gigabyte X570S AORUS Master motherboard to anyone looking for a very high-end motherboard for Zen 3 with literally all the bells and whistles. Couldn't be happier.

                More benchmarks here in this imgur album. Will post the new Gigabyte BIOS settings soon as they're slightly different than the ones above.

                edit: Gigabyte BIOS Settings. These are pretty much safe to plug in. My Core 1 is the fastest (Ryzen Master on Windows will tell you), so it has the highest negative offset. The rest are at negative 5. If I do all core negative 10, I get crashes, meaning one of my cores is weak. Since I know Core 1 is strong, I can give that the highest offset to get the highest single core, then the rest I can just leave at negative 5, set it to 46.50, set 1.281 voltage, and bam! Best of both worlds.
                Last edited by perpetually high; 10 December 2021, 01:11 AM.


                • #28
                  perpetually high I favor blunt stability over fine-tuned performance. Over time, all it will take is a BIOS update to shear off that performance edge. Trust me, I know.

                  Tweaking can be fun though, there is much merit in the entertainment of it.

                  The 5800X is a space heater under full load without question; however, it's a high-quality one that has temperature settings provided by the motherboard.

                  The reason I said golden sample is due to how low these processors idle on average, which is usually indicative of high-grade silicon. Also, most hit well above their advertised boost frequency: single-core boost regularly reaches 4.85 GHz, 150 MHz more than box specifications, at AUTO default settings.

                  Even on AUTO you are properly pushing this processor, since you have good cooling. Pushing Zen 3 means providing it with good cooling, then just forgetting it and leaving it on auto.

                  I also would not even begin to take an Intel approach to these 'tweaks'. 12th gen is a clear victor, and the i7 12700K absolutely wipes the floor with this chip in multi-threaded workloads.

                  I hate Intel though, so they can feck off anyway.

                  The fact that I could have had an 8700K on my old Z270 board, but they did not take the time and care for their customers, made me even more pro AMD.

                  NVIDIA is the greediest out of all of them though. The only reason I have an RTX 3070 is convenience, at the price of access in a critically damaged market, and yes, I got lucky. $999 for an RTX 3070 in 2021? I watched like a hawk. Prices fluctuate so much I was not able to get another 6800XT.

                  I hope I don't sound like a dickhead, if I do I feel burned out concerning the latest and greatest, especially considering that I have not seen the release of a single good game since 2017, save Shadow Of The Tomb Raider being worth a play-through.
                  Last edited by creative; 11 December 2021, 10:23 AM.


                  • #29
                    New update! 12/16/2021



                    Single core is now at 4.925 GHz and multi-core is 4.65 GHz.


                    PPT 200, TDC 88, EDC 120 (this was the game changer. Someone recommended it, and bam. Fixed all my issues)

                    Along with: PBO Advanced, Negative 30 offset on all cores, +75 MHz boost

                    creative, do give that a whirl. Nothing to lose. Start with Optimized Defaults, then XMP on, then plug in the above. Cheers, and good luck!


                    • #30
                      Originally posted by perpetually high
                      New update! 12/16/2021



                      Single core is now at 4.925 GHz and multi-core is 4.65 GHz.


                      PPT 200, TDC 88, EDC 120 (this was the game changer. Someone recommended it, and bam. Fixed all my issues)

                      Along with: PBO Advanced, Negative 30 offset on all cores, +75 MHz boost

                      creative, do give that a whirl. Nothing to lose. Start with Optimized Defaults, then XMP on, then plug in the above. Cheers, and good luck!
                      I went ahead and did an R20 benchmark on auto and hit 6135, on AGESA V2 PI 1.2.0.3 Patch C. I can hit well above those frequencies but am not getting the same performance as on auto. My setup is a bit different though, and I am using an inferior board compared to yours, yet it's still pretty good, with a strong VRM setup.

                      90c does not bother me; it's just a brute processor that gets hot at normal operation. It's not uncommon for someone to have even a laptop that hits those temps on a daily basis and has lasted for close to twenty years.

                      Almost everyone likes to see low temperatures. Coming back to this machine, I saw that it idled at 24c last night; bit chilly out though. When it's really cold, it idles at 22c even while the room temperature is 75 Fahrenheit, because cool air collects at the side of the room where the system sits.
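                      Since those numbers mix Celsius and Fahrenheit, here is the conversion as a tiny Python sketch, just to put the room temperature on the same scale as the idle readings:

```python
def fahrenheit_to_celsius(deg_f):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (deg_f - 32.0) * 5.0 / 9.0


room_c = fahrenheit_to_celsius(75)
print(round(room_c, 1))  # 23.9
```

                      So a 75 F room is about 23.9c, which puts the 22c idle reading slightly below the nominal room temperature and fits with the cooler air near the machine.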
                      Last edited by creative; 18 December 2021, 02:13 PM.
