AMD Ryzen 9 3900X SMT Linux Performance Benchmarks


  • #11
    Originally posted by xnor

    None of that helps with the issue at hand here.

    Optimizing for SMT has always been in the hands of the application developers, but few ever did it.

    atomsymbol's prediction makes a lot of sense. Since application developers fail to optimize their applications for different CPU architectures with SMT, it would make sense to have tools with per-application profiles.
    Actually, such tools already exist.
    I've been doing this manually for over a year now on my 1700X, since the security of SMT in general came into question. With SMT left enabled in the BIOS, I wrote a script that takes the extra logical cores offline at OS startup. Then I can toggle them on or off as needed by my workload. Gaming? 8 cores, no SMT. Compiling? All 16 threads.

    I'd link you to my repo but my account's still fresh, so here it is raw:

    Code:
    #!/bin/bash
    # Try your best to control SMT on original Ryzen CPUs (and maybe others?)
    # Usage: ./smt.sh (enable|disable|status)
    # enable/disable write to /sys, so they need to run as root.
    
    # Total logical CPUs, online or not: the plain "CPU(s):" line of lscpu.
    num_cores=$(lscpu | grep "CPU(s):" | grep -v "," | grep -v -e "-" | cut -d: -f2 | awk '{print $1}')
    # Logical CPUs currently available to this process.
    num_procs=$(nproc)
    
    coreLoop() {
        # Start at cpu1 and step by 2: this assumes the odd-numbered
        # logical CPUs are the SMT siblings.
        i=1
        while [ "${i}" -lt "${num_cores}" ]; do
            echo "$1" > "/sys/devices/system/cpu/cpu${i}/online"
            i=$((i + 2))
        done
        echo "Done."
        return 0
    }
    
    disable() {
        if [ "${num_procs}" -lt "${num_cores}" ]; then
            echo "SMT has already been disabled!"
            return 1
        else
            echo -n "Disabling SMT: "
            coreLoop 0
        fi
    }
    
    enable() {
        if [ "${num_procs}" -eq "${num_cores}" ]; then
            echo "SMT has already been enabled!"
            return 1
        else
            echo -n "Enabling SMT: "
            coreLoop 1
        fi
    }
    
    status() {
        grep -E "processor|physical id|core id" /proc/cpuinfo | sed 's/^processor/\nprocessor/g'
    }
    
    case "$1" in
        disable) disable;;
        enable) enable;;
        status) status;;
        *) echo "   Usage: ./smt.sh (disable | enable | status)";;
    esac
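
    The script above assumes the odd-numbered logical CPUs are the SMT siblings. As a minimal sketch (not part of the original script), the pairing can instead be read from the kernel's topology files. The function name secondary_siblings and its optional cpu_root argument are made up for illustration; the argument only exists so the function can be tried against a copy of the sysfs tree. It should be run while all CPUs are still online, since offline CPUs may not expose their topology.

```shell
#!/bin/bash
# Sketch: print the logical CPUs that are NOT the lowest-numbered thread of
# their physical core, i.e. the candidates to take offline when disabling SMT.
# cpu_root is a hypothetical parameter; it defaults to the real sysfs path.

secondary_siblings() {
    local cpu_root="${1:-/sys/devices/system/cpu}"
    local f cpu first
    for f in "${cpu_root}"/cpu[0-9]*/topology/thread_siblings_list; do
        # Skip when the glob matched nothing or the file is unreadable.
        [ -r "${f}" ] || continue
        cpu="${f#"${cpu_root}"/cpu}"   # drop the leading path
        cpu="${cpu%%/*}"               # drop the trailing path, leaving the number
        # The file holds e.g. "0,8" or "0-1"; the first number is the lowest sibling.
        first="$(cut -d, -f1 "${f}" | cut -d- -f1)"
        if [ "${cpu}" != "${first}" ]; then
            echo "${cpu}"
        fi
    done | sort -n
}
```

    The output could then drive the same kind of loop as coreLoop, writing 0 to /sys/devices/system/cpu/cpuN/online for each listed N (as root). Newer kernels also expose /sys/devices/system/cpu/smt/control, which accepts on and off and may make a script like this unnecessary.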



    • #12
      Originally posted by HenryM
      One thing I'd like to know is the power consumption differences. In my limited SMT/Hyper-Threading experience (first-gen i7...) it causes a pretty substantial increase in thermal output, and therefore wattage.
      Actually, SMT can significantly increase efficiency. This is one of the main reasons why we have SMT in the first place.

      Of course, in an absolute sense, it will increase power and heat for a single core, but efficiency improves as long as throughput rises by more than power draw does.



      • #13
        Originally posted by Qaridarium
        Very interesting test. Other scaling tests show that the overhead of hyperthreading is higher than the benefit if you have more than 32-64 cores.

        This means the technology was good for dual-core CPUs or 4-24 core CPUs, but with 32 cores and up it makes no sense whatsoever.

        AMD and Intel should agree to disable or completely remove hyperthreading on CPUs that have more than 32 cores.
        First of all, it's called SMT. Hyper-Threading is just Intel's name for their SMT implementation.

        Secondly, I don't see how those statements make any sense. Two threads that communicate with each other have less overhead if they run on the same physical core (through SMT) instead of on separate cores.
        What you saw was probably just bad application design (an application that simply doesn't scale beyond 64-128 cores), or the CPU caches and memory subsystem failing to keep up?

        Or maybe I'm missing something?
        Last edited by xnor; 02 August 2019, 09:43 AM.



        • #14
          Originally posted by xnor
          What you saw was probably just bad application design (an application that simply doesn't scale beyond 64-128 cores), or the CPU caches and memory subsystem failing to keep up?
          I think you and Q are saying the same thing: for a given cache/memory subsystem (e.g. "a platform like AM4 with a dual-channel DDR4 RAM controller") there is a practical limit to core count x performance for typical workloads. That is true, and it is one of the reasons we still make lower core count parts on the same platform.

          That said, each new generation typically brings slightly faster RAM and larger caches, which helps a bit, but there are still practical limits.



          • #15
            Originally posted by atomsymbol

            Just a note: num_cores=$(nproc --all) is shorter to write.
            Thanks! I knew there had to be a more elegant way to find that info. It was a working kludge at least.
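
            For anyone following along, a small sketch of how the two counts relate: nproc --all reports the installed processors, while plain nproc reports the ones available to the current process, so the two differ once siblings are offline. The function name smt_state is made up for illustration, and its optional arguments exist only so it can be tried with fixed numbers. Plain nproc can also be lowered by CPU affinity or OMP_NUM_THREADS, so a mismatch does not always mean SMT is off.

```shell
#!/bin/bash
# Sketch: compare installed vs. currently available logical CPUs.
# smt_state and its arguments are hypothetical, for illustration only.

smt_state() {
    local total="${1:-$(nproc --all)}"   # installed logical CPUs
    local online="${2:-$(nproc)}"        # logical CPUs available right now
    if [ "${online}" -lt "${total}" ]; then
        echo "${online} of ${total} logical CPUs online (some are disabled)"
    else
        echo "all ${total} logical CPUs online"
    fi
}
```

            Called without arguments, smt_state uses the live values from nproc.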

