100+ Linux Benchmarks Of Intel Arrow Lake With New BIOS / 0x114 CPU Microcode

  • phoronix
    Administrator
    • Jan 2007
    • 67371

    100+ Linux Benchmarks Of Intel Arrow Lake With New BIOS / 0x114 CPU Microcode

    Phoronix: 100+ Linux Benchmarks Of Intel Arrow Lake With New BIOS / 0x114 CPU Microcode

    This past week Intel published an Intel Core Ultra 200S Series "Arrow Lake" performance status update, following mixed launch reviews in which Arrow Lake gaming performance was inconsistent with Intel's internal findings. Among the findings detailed in the report were some new BIOS performance optimizations, some misconfigured performance settings in early/reviewer BIOSes, and some Windows 11 updates being pushed out to help with different performance issues. ASUS has already started releasing new BIOSes that incorporate the 0x114 Arrow Lake microcode intended to help the situation. While it's been a Windows-focused issue, I couldn't help but run Intel Arrow Lake performance comparison benchmarks on Linux with the new microcode / BIOS.

  • pWe00Iri3e7Z9lHOX2Qx
    Senior Member
    • Jul 2020
    • 1591

    #2
    On the Windows side, it looks like the updates generally fix the atrocious performance regressions with Windows 24H2, but that basically gets you back to around Windows 23H2 performance. CPU launches have been rocky lately with the increasing complexity of the designs. Arrow Lake was a mess. Ryzen 9900X / 9950X had core parking issues impacting gaming performance. The joys of heterogeneous cores and multiple CCDs, and the resulting scheduling challenges.


    • geerge
      Senior Member
      • Aug 2023
      • 362

      #3
      What an incredible performance leap, surely that 285K is worth the £564 now.


      • Espionage724
        Senior Member
        • Sep 2024
        • 381

        #4
        Originally posted by pWe00Iri3e7Z9lHOX2Qx
        On the Windows side, it looks like the updates generally fix the atrocious performance regressions with Windows 24H2, but that basically gets you back to around Windows 23H2 performance. CPU launches have been rocky lately with the increasing complexity of the designs. Arrow Lake was a mess. Ryzen 9900X / 9950X had core parking issues impacting gaming performance. The joys of heterogeneous cores and multiple CCDs, and the resulting scheduling challenges.
        I feel as if this is caused by new CPUs trying to do wacky stuff. Gimme real cores (no HT/SMT), give em direct L caches (or however that works that isn't whatever Bulldozer started), and let the OS schedule processes to real cores, no-nonsense, without any newly-invented guess-work algorithms. As boring as that is, it seemingly worked.

        I was affected on Ryzen back when Win10 LTSC was stuck at 1809 while the Ryzen scheduler "improvements" only landed in 1903+ non-LTSC (a newer LTSC wasn't available until 21H2). MS being MS didn't backport it, and it was black-box enough that the fix was never really described, but luckily I didn't notice a difference anyway (disabled SMT and stuck with 1809, no problems with PCVR).


        • smitty3268
          Senior Member
          • Oct 2008
          • 6963

          #5
          The slide from Intel clearly states you need the updated firmware kit version before microcode 114 improvements apply, and that hasn't been released yet.

          Most of the rest came down to missing power management settings in Windows, which wouldn't apply to Linux. And apparently a few BIOS default settings are getting updated, which is highly dependent on your motherboard and when you tested.


          • coder
            Senior Member
            • Nov 2014
            • 8952

            #6
            Some of the really weird performance results from the initial 285K testing seemed due to threads getting scheduled on E-cores. It's obvious when this happens, because perf is at the bottom, but perf/W is at the top, as in this case

            [Charts: Performance and Perf/W results]
            Source: https://www.phoronix.com/review/inte...a-9-285k-linux

            I doubt a BIOS or microcode fix will help with that, but maybe it could influence ThreadDirector behavior enough to have an impact. More likely, it's going to need a kernel fix to tweak the scheduler behavior.
            Last edited by coder; 23 December 2024, 12:54 AM.
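            For anyone who wants to test the E-core theory from the post above on their own box, here is a minimal sketch of restricting a process to P-cores on Linux. It assumes the kernel exposes the hybrid PMU file /sys/devices/cpu_core/cpus (present on recent kernels with Intel hybrid CPUs, absent elsewhere); the helper names parse_cpu_list and p_core_cpus are made up for illustration.

```python
import os
from pathlib import Path


def parse_cpu_list(text: str) -> set[int]:
    """Parse a kernel CPU list such as '0-7,16' into a set of CPU ids."""
    cpus: set[int] = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def p_core_cpus(path: str = "/sys/devices/cpu_core/cpus") -> set[int]:
    """Return the P-core CPU ids, or an empty set if the file is absent.

    Assumption: the path only exists on hybrid Intel systems.
    """
    p = Path(path)
    return parse_cpu_list(p.read_text()) if p.exists() else set()


if __name__ == "__main__":
    wanted = p_core_cpus()
    if wanted and hasattr(os, "sched_setaffinity"):
        # Only keep CPUs this process is already allowed to run on.
        allowed = wanted & os.sched_getaffinity(0)
        if allowed:
            os.sched_setaffinity(0, allowed)
            print("pinned to P-cores:", sorted(allowed))
    else:
        print("no hybrid topology info found; affinity unchanged")
```

            If a benchmark's result changes a lot with and without the pinning, that would at least be consistent with a scheduling issue rather than a microcode one. It is not proof either way.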


            • coder
              Senior Member
              • Nov 2014
              • 8952

              #7
              Originally posted by Espionage724
              Gimme real cores (no HT/SMT),
              Wish granted. Arrow Lake has no HT/SMT.

              Originally posted by Espionage724
              give em direct L caches
              Huh?


              • Espionage724
                Senior Member
                • Sep 2024
                • 381

                #8
                Originally posted by coder
                Huh?
                I heard AMD Bulldozer did something different from the previous arch: multiple cores shared an L2 (or L3) cache, which could lead to scheduling conflicts or lower overall performance if software tried using all the cores as-is (traditionally) without being "Bulldozer"-optimized.


                • coder
                  Senior Member
                  • Nov 2014
                  • 8952

                  #9
                  Originally posted by Espionage724
                  I heard AMD Bulldozer did something different than the previous arch where multiple cores used a shared L2 (or L3) cache across all the cores
                  Intel's hybrid CPUs organize E-cores in clusters of 4*, with a shared L2 cache per cluster. Each P-core gets its own L2. Arrow Lake continues this tradition.

                  Chips&Cheese tested Skymont's L2 cache bandwidth in both single-threaded and multi-threaded read scenarios and found that it appears to scale almost linearly (402 GBps / 107 GBps = 3.76x).
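
                  To put the 3.76x figure in context, here is the arithmetic as a small sketch. The two bandwidth numbers are the ones quoted above; the thread count of 4 is an assumption based on the 4-core E-core cluster size, not something stated with the measurement.

```python
def scaling(multi_gbps: float, single_gbps: float, threads: int) -> tuple[float, float]:
    """Return (speedup, efficiency) where efficiency = speedup / threads."""
    speedup = multi_gbps / single_gbps
    return speedup, speedup / threads


# 402 and 107 GBps are quoted in the post; threads=4 is an assumption.
speedup, efficiency = scaling(402.0, 107.0, 4)
print(f"{speedup:.2f}x speedup, {efficiency:.0%} of ideal")
# prints "3.76x speedup, 94% of ideal"
```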

                  Intel CPUs have a global L3 domain. This trades usable size for scalability. Perhaps in medium-threaded scenarios it's a winner, but Intel lags AMD in both single-threaded and all-thread L3 bandwidth.

                  In contrast, chiplet-based AMD CPUs have a dedicated L3 cache per chiplet. Their hybrid, monolithic CPUs similarly have dedicated L3 for the P-core and C-core clusters, respectively.


                  * Meteor Lake's LPE cores are in a cluster of 2, because there are only 2 of them.
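
                  The L2 sharing described above can be checked on a Linux machine via sysfs. A rough sketch, assuming the standard /sys/devices/system/cpu/cpuN/cache/indexM/{level,shared_cpu_list} files are present; the function names are hypothetical and the sample CPU numbering in the usage note is invented for illustration, not taken from a real Arrow Lake system.

```python
from collections import defaultdict
from pathlib import Path


def group_by_shared_list(shared: dict[int, str]) -> dict[str, list[int]]:
    """Group CPU ids by their shared_cpu_list string (one group per cache)."""
    groups: dict[str, list[int]] = defaultdict(list)
    for cpu, cpu_list in sorted(shared.items()):
        groups[cpu_list.strip()].append(cpu)
    return dict(groups)


def read_l2_sharing(root: str = "/sys/devices/system/cpu") -> dict[int, str]:
    """Map each CPU id to the shared_cpu_list of its level-2 cache."""
    shared: dict[int, str] = {}
    for cache in Path(root).glob("cpu[0-9]*/cache/index*"):
        level = cache / "level"
        cpu_list = cache / "shared_cpu_list"
        if level.exists() and cpu_list.exists() and level.read_text().strip() == "2":
            cpu = int(cache.parent.parent.name[3:])  # "cpu12" -> 12
            shared[cpu] = cpu_list.read_text()
    return shared


if __name__ == "__main__":
    for cpus_sharing, members in group_by_shared_list(read_l2_sharing()).items():
        print(f"L2 shared by CPUs {cpus_sharing}: {members}")
```

                  With an invented layout such as {0: "0", 1: "1", 2: "2-5", 3: "2-5", 4: "2-5", 5: "2-5"}, the grouping gives two private L2s and one 4-CPU cluster, which is the shape you would expect to see for P-cores versus an E-core cluster.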


                  • Slartifartblast
                    Senior Member
                    • Nov 2013
                    • 873

                    #10
                    Originally posted by coder
                    Some of the really weird performance results from the initial 285K testing seemed due to threads getting scheduled on E-cores.
                    Precisely why I went AMD where all cores are equal, no software wackiness with unequal cores.
