Exploring The Zen 5 SMT Performance With The AMD EPYC 9755 "Turin" CPU


  • Exploring The Zen 5 SMT Performance With The AMD EPYC 9755 "Turin" CPU

    Phoronix: Exploring The Zen 5 SMT Performance With The AMD EPYC 9755 "Turin" CPU

    Continuing on with the testing around the AMD EPYC 9005 series "Turin" processors, today is a look at the Simultaneous Multi-Threading (SMT) performance impact for Turin while using the AMD EPYC 9755 as the highest-end "Turin Classic" processor with 128 cores / 256 threads. Similar SMT on/off tests for "Turin Dense" with the EPYC 9965 192-core / 384-thread will also be coming in a future benchmarking comparison on Phoronix. These tests are mainly intended for reference purposes for those curious about the SMT benefits at such high core counts and what workloads may or may not still benefit from SMT especially when having so many threads while using 12-channel DDR5-6000 memory.


  • #2
    AMD SMT is incredible - it increases performance AND lowers power consumption!!!



    • #3
      Simultaneous Multithreading exposes 2 virtual CPUs (hardware threads) per core, so that a second task can execute instead of being blocked/queued while the core's primary thread is occupied or stalled. This used to massively improve performance in the past, on CPUs with low core counts and lower single-thread (single-core) performance (e.g. i7-4770K).

      However, today many games and heterogeneous tasks actually run faster with SMT disabled, because modern CPUs have much faster execution (higher single-core performance and more cores). In other words, SMT can hurt performance if the resources of a core are shared with tasks that are low-threaded (or single-threaded). Nonetheless, take this with a pinch of salt, since multi-threaded, homogeneous tasks (compilation, rendering, encoding/decoding, compression/decompression) still benefit from SMT.

      I'm curious how big a difference we'd observe with a Ryzen or even a Threadripper CPU, which have much higher base and boost clocks:

      CPU                  Cores / Threads   Base Clock   Boost Clock
      EPYC 9755            128 / 256         2.7 GHz      4.1 GHz
      Threadripper 7970X   32 / 64           4.0 GHz      5.3 GHz
      Ryzen 7950X          16 / 32           4.5 GHz      5.7 GHz

      Edit: fixed minor logic error
      Last edited by Kjell; 21 October 2024, 07:23 AM.



      • #4
        This is somewhat confusing because it mixes the SMT effect with scalability problems -- especially because trivially parallel stuff seems to give SMT a clear win. The proper way to test this would be to re-run each benchmark for thread counts from, say, 8 (ideally from 1, but this takes a lot of time for little gain) up to the maximum and present the result as a curve. If some code's speed saturates at around 12 threads, it would be massacred by a two-fold increase in thread count, with or without SMT.
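
A rough sketch of the sweep described in this post, in Python. It is only an illustration under assumptions: the benchmark is taken to be some command that accepts a thread count (the `./bench --threads N` invocation in the comment is hypothetical), and the helper names `sweep`, `scaling_curve` and `saturation_point` are made up here. It times the workload at each thread count, then reports speedup and parallel efficiency relative to the smallest count, so a saturation point shows up separately from any SMT effect.

```python
import subprocess
import time


def time_command(argv):
    """Run a command once and return its wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(argv, check=True, stdout=subprocess.DEVNULL)
    return time.perf_counter() - start


def sweep(make_argv, thread_counts, timer=time_command):
    """Time the workload at each thread count.

    make_argv(n) must return the command line for n threads, e.g.
    lambda n: ["./bench", "--threads", str(n)]  (hypothetical benchmark).
    """
    return {n: timer(make_argv(n)) for n in thread_counts}


def scaling_curve(timings):
    """Return (threads, speedup, efficiency) tuples relative to the
    smallest thread count in the sweep."""
    base_n = min(timings)
    base_t = timings[base_n]
    curve = []
    for n in sorted(timings):
        speedup = base_t / timings[n]
        efficiency = speedup / (n / base_n)
        curve.append((n, speedup, efficiency))
    return curve


def saturation_point(timings, min_gain=1.05):
    """Return the thread count after which the next step up in the sweep
    no longer makes the run at least min_gain times faster."""
    counts = sorted(timings)
    for prev, nxt in zip(counts, counts[1:]):
        if timings[prev] / timings[nxt] < min_gain:
            return prev
    return counts[-1]
```

Running the same sweep once with SMT on and once with SMT off would give two curves to compare. A workload whose curve flattens well below the core count says more about its own scalability than about SMT.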



        • #5
          Apart from pinning cores, is there any utility/scheduler for assigning Physical Cores to a program to disable SMT (but not system-wide)?
          Last edited by Kjell; 17 October 2024, 05:25 PM.



          • #6
            Originally posted by Kjell
            However, today many games/heterogeneous-tasks actually run faster with SMT disabled due to modern CPUs having much higher single-core performance and core count.
            Proper scheduling completely eliminates this issue.



            • #7
              Originally posted by Kjell
              Apart from pinning cores, is there any utility/scheduler for assigning Physical Cores to a program to disable SMT (but not system-wide)?
              I could be wrong here, but AFAIU you can use "taskset -c 0,1 ./your_program" to pin a task to specific CPU cores. As for disabling SMT, as Michael said, you can check the current state with "cat /sys/devices/system/cpu/smt/control" and, as root, write "on" or "off" to that file (e.g. "echo off > /sys/devices/system/cpu/smt/control") to enable/disable SMT at runtime.
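
For the "physical cores only, but not system-wide" question, here is a minimal Python sketch. It assumes a Linux system that exposes `thread_siblings_list` under `/sys/devices/system/cpu/cpuN/topology/`, and the helper names are invented for this example. It picks one logical CPU from each physical core and restricts the calling process to that set with `os.sched_setaffinity`.

```python
import os
from pathlib import Path


def parse_cpu_list(text):
    """Parse a kernel CPU list such as '0,128' or '0-3,8' into sorted ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def one_cpu_per_core(sibling_lists, allowed=None):
    """Keep the lowest-numbered (and, if given, allowed) CPU of each core."""
    chosen = set()
    for text in sibling_lists:
        siblings = parse_cpu_list(text)
        if allowed is not None:
            siblings = [c for c in siblings if c in allowed]
        if siblings:
            chosen.add(siblings[0])
    return sorted(chosen)


def read_sibling_lists(root="/sys/devices/system/cpu"):
    """Read the SMT sibling list of every online logical CPU from sysfs."""
    paths = Path(root).glob("cpu[0-9]*/topology/thread_siblings_list")
    return [p.read_text() for p in paths]


def pin_to_physical_cores(pid=0):
    """Restrict a process (0 = the caller) to one hardware thread per core."""
    allowed = os.sched_getaffinity(pid)
    cpus = one_cpu_per_core(read_sibling_lists(), allowed)
    os.sched_setaffinity(pid, cpus)
    return cpus
```

Calling `pin_to_physical_cores()` before spawning worker processes means the children inherit the mask. Note that this only keeps the program's own threads off sibling hardware threads; other tasks on the system can still be scheduled onto those siblings.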
              Last edited by carguello2; 23 November 2024, 06:44 PM. Reason: Typo.



              • #8
                Originally posted by carguello2
                "taskset -c 0,1 ./your_program" to pin CPU cores to a specific task

                "cat /sys/devices/system/cpu/smt/control" you can "echo on/off"
                Disabling the sibling threads (virtual cores / SMT) and pinning isn't effective, since you sacrifice multi-threaded performance whenever the pinned cores are idle.

                Ideally the scheduler should dynamically reserve the full core (without SMT) when the specified application requests CPU time, to avoid sharing the core and its cache with other tasks.
                Last edited by Kjell; 17 October 2024, 08:41 PM.



                • #9
                  Originally posted by Daktyl198
                  Proper scheduling completely eliminates this issue.
                  I mentioned this in the original edit but decided to remove it, since there doesn't seem to be any scheduler doing it yet AFAIK.
                  Last edited by Kjell; 17 October 2024, 08:23 PM.



                  • #10
                    Originally posted by Kjell
                    However, today many games/heterogeneous-tasks actually run faster with SMT disabled due to modern CPUs having much higher single-core performance and core count. In other words, SMT can actually hurt your performance if you share the resources of a core with tasks that are low-threaded (or single-threaded). Nonetheless, take this with a pinch of salt since multi-threaded/homogeneous-tasks (compilation, rendering, encoding/decoding, compression/decompression) still benefit from SMT
                    Games have suffered from poor utilization of cores/threads as far as I can tell. The developers still don't know how to write engines. Many other tasks are not that demanding in terms of CPU load.

