
AMD EPYC 7773X Performance Continues To Impress With Tremendous Opportunity For Large-Cache Server CPUs


  • AMD EPYC 7773X Performance Continues To Impress With Tremendous Opportunity For Large-Cache Server CPUs

    Phoronix: AMD EPYC 7773X Performance Continues To Impress With Tremendous Opportunity For Large-Cache Server CPUs

    Back in March when AMD Milan-X rolled out I published a number of EPYC 7773X benchmarks as well as Milan-X benchmarks in the cloud. Since then there have been new Linux kernel improvements and other changes in the ever-advancing open-source world. Plus simply more time to conduct additional tests over the summer. Here is the latest round of my AMD EPYC 7773X 1P and 2P benchmarking compared to the Milan EPYC 7713/7763 SKUs as well as Intel's Xeon Platinum 8380 "Ice Lake" competition.


  • #2
    I can't help but wonder whether you could benchmark 128 containers each running a DB server, or 64 VMs running Python pandas doing data-table extracts (because every tricycle rider likes Python).



    • #3
      One set of results stuck out a little: relational databases (MariaDB, PostgreSQL) didn't scale from 1P to 2P. The results were dire for AMD, but even Intel slowed down going to 2P, and the article didn't comment on it. Does anyone have an idea why 2P performance was so brutally worse than 1P?
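One way to quantify the regression being asked about is a simple scaling-efficiency ratio over the published scores. This is a generic sketch, not something from the article, and the example scores below are made up for illustration:

```python
# Scaling efficiency for a higher-is-better benchmark score:
# 1.0 means perfect linear scaling across sockets; anything below 0.5
# means the 2P system actually scored worse than a single socket.
def scaling_efficiency(score_1p: float, score_2p: float, sockets: int = 2) -> float:
    return (score_2p / score_1p) / sockets

# Hypothetical scores: a render-like workload vs. a lock-bound database.
print(scaling_efficiency(100.0, 190.0))  # 0.95: near-linear scaling
print(scaling_efficiency(100.0, 80.0))   # 0.4: a 2P regression like the one described
```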



      • #4
        Originally posted by jochendemuth View Post
        One set of results stuck out a little: relational databases (MariaDB, PostgreSQL) didn't scale from 1P to 2P. The results were dire for AMD, but even Intel slowed down going to 2P, and the article didn't comment on it. Does anyone have an idea why 2P performance was so brutally worse than 1P?
        That is not an uncommon scenario. Generally it comes down to certain operations being latency-sensitive: fetching data that lives on a completely different CPU takes far longer. In the case of something like SQL, I could also imagine excessive page locking slowing things down.

        SQL is basically ordering memory around to do some stuff; it is not particularly "computing"-heavy in the literal sense.

        This is also why benchmarking with something like rendering (which can be perfectly parallelized across tons of CPUs) is stupid. SQL will suffer from locking, and video encoding has to be limited so slices aren't too big.

        Generally, if your workload achieves close to perfect scalability with 2 processor nodes, that means you are using the wrong tool for the job; that tool should be a GPU.
        Last edited by piotrj3; 12 July 2022, 01:48 PM.
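The locking argument above can be made concrete with Amdahl's law, a textbook model rather than anything measured here; the parallel fractions below are purely illustrative:

```python
# Amdahl's law: the speedup on n processors when a fraction p of the
# work is parallelizable is 1 / ((1 - p) + p / n).
def amdahl_speedup(p: float, n: int) -> float:
    return 1.0 / ((1.0 - p) + p / n)

# Illustrative (made-up) fractions: a lock-bound database vs. a
# render-like, nearly perfectly parallel workload, both on 2 sockets.
print(round(amdahl_speedup(0.70, 2), 2))  # 1.54x: locking caps the gain
print(round(amdahl_speedup(0.99, 2), 2))  # 1.98x: near-linear scaling
```

On top of this cap, cross-socket (NUMA) memory latency can push the 2P result below the 1P result entirely, which the model alone does not capture.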



        • #5
          Michael

          Thanks for using the performance governor across the board, since this eliminates the inefficiencies of both intel_pstate powersave & acpi-cpufreq schedutil from the equation.

          Now just try to imagine what AMD-PSTATE performance could do on the Steam Deck:
          Remember how on the HP Dev One Clear Linux managed to come out on top while having both the lowest temperature & power draw on average?
          (For the doubters: https://www.phoronix.com/scan.php?pa...ne-linux&num=8)

          That was with amd-pstate performance, whereas all the other distros were defaulting to the schedutil CPU governor, which is inferior by design, even though Ubuntu was also making use of amd-pstate.

          Actually, don't just try to imagine the improvements to the Steam Deck, benchmark them, please...



          • #6
            Originally posted by Linuxxx View Post
            Michael

            Thanks for using the performance governor across the board, since this eliminates the inefficiencies of both intel_pstate powersave & acpi-cpufreq schedutil from the equation.

            Now just try to imagine what AMD-PSTATE performance could do on the Steam Deck:
            Remember how on the HP Dev One Clear Linux managed to come out on top while having both the lowest temperature & power draw on average?
            amd_pstate vs acpi_cpufreq are both cpufreq drivers, so it should make no difference which one is used with the performance governor, because the performance governor always says "use the highest possible frequency", and amd_pstate and acpi_cpufreq both expose the same maximum frequency. It is only when using a governor that actually governs that you should see a difference.
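For anyone who wants to check what their own box is actually using, both the scaling driver and the governor are exposed through standard Linux sysfs files. A minimal sketch; the `sysfs_root` parameter is an assumption added here only so the function can be pointed at a test directory tree:

```python
from pathlib import Path

# Read the active cpufreq scaling driver and governor for one CPU from
# the standard Linux sysfs locations. On a system using amd-pstate,
# "driver" would read "amd-pstate".
def cpufreq_info(cpu: int, sysfs_root: str = "/sys/devices/system/cpu") -> dict:
    base = Path(sysfs_root) / f"cpu{cpu}" / "cpufreq"
    return {
        "driver": (base / "scaling_driver").read_text().strip(),
        "governor": (base / "scaling_governor").read_text().strip(),
    }
```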



            • #7
              Originally posted by piotrj3 View Post

              Video encoding has to be limited so slices aren't too big.
              Technically, a chunked splitter like av1an is very scalable, and TBH it is what most people should be using for non-realtime encoding if they have the RAM.

              Live re-encoding is very different, yeah.
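The chunked approach av1an takes can be sketched roughly like this. An illustrative simplification, not av1an's actual code: real splitters cut at scene changes rather than fixed frame counts:

```python
# Split a frame range into independent chunks that separate encoder
# processes can work on in parallel; RAM usage grows with the number
# of concurrent encoder instances, which is the trade-off noted above.
def split_chunks(total_frames: int, chunk_size: int) -> list[tuple[int, int]]:
    return [(start, min(start + chunk_size, total_frames))
            for start in range(0, total_frames, chunk_size)]

print(split_chunks(1000, 300))  # [(0, 300), (300, 600), (600, 900), (900, 1000)]
```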



              • #8
                Michael

                Page 4: "%8775 USD" should be "$8775 USD"

                Page 9: "wit hHBM2e" should be "with HBM2e"



                • #9
                  Originally posted by yump View Post

                  amd_pstate vs acpi_cpufreq are both cpufreq drivers, so it should make no difference which one is used with the performance governor, because the performance governor always says "use the highest possible frequency", and amd_pstate and acpi_cpufreq both expose the same maximum frequency. It is only when using a governor that actually governs that you should see a difference.
                  Both intel_pstate & intel_cpufreq are also just cpufreq drivers, yet there is a significant difference between the two even when both of them are using the performance governor (see the linked Phoronix benchmarks).

                  Care to explain where that difference comes from?



                  • #10
                    Originally posted by piotrj3 View Post
                    This is also why benchmarking with something like rendering (which can be perfectly parallelized across tons of CPUs) is stupid.
                    It's not stupid because:
                    1. It's a workload some people care about.
                    2. The benchmarks show how effective the additional cache is at boosting performance.
                    3. It can expose weaknesses in a CPU's interconnect fabric.
                    4. In 2P configurations, it can expose weaknesses in inter-processor communication.
                    5. It can expose software that scales poorly on NUMA architectures.
                    Originally posted by piotrj3 View Post
                    Generally, if your workload achieves close to perfect scalability with 2 processor nodes, that means you are using the wrong tool for the job; that tool should be a GPU.
                    First, CPUs support much more memory -- GPUs' limited memory capacity is a deal-breaker for some workloads. Second, look at things like the NAS Parallel Benchmarks and timed kernel compilation. Both scale well to 2P, but would run like garbage on a GPU (if you could even run them there).
                    Last edited by coder; 13 July 2022, 04:33 AM.

