AMD Ryzen 9 7900X / Ryzen 9 7950X Benchmarks Show Impressive Zen 4 Linux Performance


  • #61
    Thanks for the benchmarks!

    It would be interesting to see benchmarks showing performance and power efficiency when configured at the 65 W and 105 W TDPs vs. the high 170 W default.
    According to AnandTech, these limits can be configured in the BIOS (values in mW and mA; quick sanity check below):
    • 170 W = 230,000 PPT, 160,000 TDC, and 225,000 EDC
    • 105 W = 142,000 PPT, 110,000 TDC, and 170,000 EDC
    • 65 W = 88,000 PPT, 75,000 TDC, and 150,000 EDC
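
    A quick sanity check of those figures: AMD's stock socket power limit has historically followed PPT ≈ 1.35 × TDP, which matches all three tiers (assuming, as noted above, that the BIOS values are in mW). A minimal C sketch:

        #include <stdio.h>

        /* Assumed relation: PPT (socket power limit, W) ~= 1.35 * TDP (W).
           Matches the tiers quoted above: 170 -> 230, 105 -> 142, 65 -> 88. */
        int main(void) {
            const int tdp_watts[] = { 170, 105, 65 };
            for (int i = 0; i < 3; i++) {
                int ppt = (int)(1.35 * tdp_watts[i] + 0.5); /* round to nearest W */
                printf("TDP %3d W -> PPT ~%3d W (%6d mW in BIOS units)\n",
                       tdp_watts[i], ppt, ppt * 1000);
            }
            return 0;
        }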

    Comment


    • #62
      Originally posted by alexheretic View Post
      Thanks for the benchmarks!

      It would be interesting to see benchmarks showing performance and power efficiency when configured at the 65 W and 105 W TDPs vs. the high 170 W default.
      According to AnandTech, these limits can be configured in the BIOS (values in mW and mA):
      • 170 W = 230,000 PPT, 160,000 TDC, and 225,000 EDC
      • 105 W = 142,000 PPT, 110,000 TDC, and 170,000 EDC
      • 65 W = 88,000 PPT, 75,000 TDC, and 150,000 EDC
      See Ars Technica's review here.....

      Comment


      • #63
        The 5950X runs with too high a power target as well. I'm running it at 120 W PPT (~88 W TDP) and still get over 17k points in Geekbench on air cooling in a poorly ventilated case (case fans stay off until the GPU kicks in).
        Benchmark results for a Gigabyte Technology Co., Ltd. B550 AORUS PRO AX with an AMD Ryzen 9 5950X processor.
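
        For anyone wanting to verify the actual package draw on Linux while tuning PPT, a minimal sketch along these lines works, assuming a hwmon driver that exposes package power (e.g. the out-of-tree zenpower module; the "zenpower" name check is an assumption, adjust for your driver). The hwmon sysfs ABI reports power*_input in microwatts:

            #include <stdio.h>
            #include <string.h>

            /* Scan /sys/class/hwmon for a sensor named "zenpower" (assumed)
               and print its power1_input attribute (microwatts per the
               hwmon sysfs ABI) as watts. */
            int main(void) {
                char path[128], name[64];
                for (int i = 0; i < 16; i++) {
                    snprintf(path, sizeof path, "/sys/class/hwmon/hwmon%d/name", i);
                    FILE *f = fopen(path, "r");
                    if (!f) continue;
                    if (fscanf(f, "%63s", name) != 1) name[0] = '\0';
                    fclose(f);
                    if (strcmp(name, "zenpower") != 0) continue;
                    snprintf(path, sizeof path,
                             "/sys/class/hwmon/hwmon%d/power1_input", i);
                    f = fopen(path, "r");
                    if (!f) continue;
                    long uw;
                    if (fscanf(f, "%ld", &uw) == 1)
                        printf("package power: %.1f W\n", uw / 1e6);
                    fclose(f);
                }
                return 0;
            }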

        And this is with 128 GB of cheap DDR4-3600 CL18 memory (1733 MHz Infinity Fabric, 2T; sadly the memory controller on my chip can't do any better with four sticks).

        With higher clocks, a much faster memory subsystem, and a 2x larger L2 cache, the 7950X is definitely better, but it's nothing magical.
        Last edited by sobrus; 27 September 2022, 09:29 AM.

        Comment


        • #64
          OK, it was my mistake to call it efficiency, since that implies work units per power unit.

          My problem is the trend of shipping these huge power limits out of the box and letting the CPU sit at those high temperatures constantly, because that becomes the reference point once people accept it.

          I would have no problem at all if the CPU came out of the box at 105 W and you could later unlock the 170 W tier, but doing it the other way around sets a dangerous trend. Motherboards, for one, will get more expensive, since they have to be designed for 170 W (because it is the default). An AIO can become an implicit requirement, because by default the CPU can reach 95°C, and while AMD claims that is safe for the silicon, that is not necessarily true for other parts of the hardware, especially the power-delivery components.

          If people accept this as normal, we can reach the same level of craziness we are seeing on GPUs (I guess the RTX 50 series will require a dedicated substation). In fact, I wouldn't rule out future generations requiring a 12-pin CPU power connector just to keep the PSU cables from overheating, and to make room for when the new "efficient" normal is 200+ W.

          Comment


          • #65
            Originally posted by jrch2k8 View Post
            I would have no problem at all if the CPU came out of the box at 105 W and you could later unlock the 170 W tier, but doing it the other way around sets a dangerous trend. Motherboards, for one, will get more expensive, since they have to be designed for 170 W (because it is the default). An AIO can become an implicit requirement, because by default the CPU can reach 95°C, and while AMD claims that is safe for the silicon, that is not necessarily true for other parts of the hardware, especially the power-delivery components.
            I know people are going to hate this, but maybe the best solution is some form of regulation. That's often what it takes to stop an industry "race-to-the-bottom" scenario, like the power consumption race that's heated up between Intel and AMD.

            It doesn't have to be in the form of an outright ban (which I'm not in favor of), but it could be as simple as labeling requirements, or as extreme as an extra tax levied on inefficient computers. Similar techniques have previously been adopted for household appliances and even automobiles.

            It's not ideal, because I don't trust regulators to get it 100% right, and you know that manufacturers will try to game whatever they come up with, but I think it's better than nothing.

            Comment


            • #66
              Originally posted by coder View Post
              I know people are going to hate this, but maybe the best solution is some form of regulation. That's often what it takes to stop an industry "race-to-the-bottom" scenario, like the power consumption race that's heated up between Intel and AMD.

              It doesn't have to be in the form of an outright ban (which I'm not in favor of), but it could be as simple as labeling requirements, or as extreme as an extra tax levied on inefficient computers. Similar techniques have previously been adopted for household appliances and even automobiles.
              Or you have to accept that those who want lower TDPs are a small minority and most people don't care. Just buy the 65 W versions; there are always plenty of them. Or use cTDP.

              Comment


              • #67
                Originally posted by Anux View Post
                Or you have to accept that those who want lower TDPs are a small minority and most people don't care. Just buy the 65 W versions; there are always plenty of them. Or use cTDP.
                As jrch2k8 correctly notes, the issue is the out-of-the-box configuration. That's the main thing reviewers test (though they often explore other settings as well) and how most users are likely to run the chip.

                I'd just want to see something like an efficiency-labeling scheme that reflects the out-of-the-box config. Then, if the minority who want high performance at all costs choose to, they can change the defaults and get that extra 10%-15% of performance for 2x the power.
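
                To put rough numbers on that trade-off (illustrative arithmetic only, using the 10%-15%-for-2x figures above, not measured data):

                    #include <stdio.h>

                    /* Perf-per-watt relative to the lower-power baseline:
                       +10..15% performance at 2x the power leaves you with
                       roughly 55-58% of the baseline efficiency. */
                    int main(void) {
                        const double gains[] = { 0.10, 0.15 };
                        for (int i = 0; i < 2; i++) {
                            double rel_eff = (1.0 + gains[i]) / 2.0;
                            printf("+%.0f%% perf at 2x power -> %.1f%% of baseline perf/W\n",
                                   gains[i] * 100.0, rel_eff * 100.0);
                        }
                        return 0;
                    }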

                Without clear & consistent labeling, even most well-intentioned buyers won't be sophisticated enough to know how much power the computer could actually use.

                Comment


                • #68
                  Originally posted by coder View Post
                  I know people are going to hate this, but maybe the best solution is some form of regulation. That's often what it takes to stop an industry "race-to-the-bottom" scenario, like the power consumption race that's heated up between Intel and AMD.

                  It doesn't have to be in the form of an outright ban (which I'm not in favor of), but it could be as simple as labeling requirements, or as extreme as an extra tax levied on inefficient computers. Similar techniques have previously been adopted for household appliances and even automobiles.

                  It's not ideal, because I don't trust regulators to get it 100% right, and you know that manufacturers will try to game whatever they come up with, but I think it's better than nothing.
                  The pathetic and embarrassing slow-walking of ATX12VO by motherboard and PSU manufacturers has a far greater effect on desktop efficiency than peak CPU power. People are worried about the difference between a 9 minute compile job that takes 230 W vs a 10 minute compile job that takes 142 W, when the computer is idling at 50+W for the many hours they spend reading, thinking, and typing between compile jobs.
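
                  The arithmetic behind that point, using the example numbers above:

                      #include <stdio.h>

                      /* Energy of the two hypothetical compile jobs above vs. a
                         workday of 50 W idle. */
                      int main(void) {
                          double fast = 230.0 * 9.0 / 60.0;   /* Wh: 9 min at 230 W  */
                          double slow = 142.0 * 10.0 / 60.0;  /* Wh: 10 min at 142 W */
                          double idle = 50.0 * 8.0;           /* Wh: 8 h at 50 W     */
                          printf("fast job %.1f Wh, slow job %.1f Wh, delta %.1f Wh\n",
                                 fast, slow, fast - slow);
                          printf("8 h idle: %.0f Wh (~%.0fx the delta)\n",
                                 idle, idle / (fast - slow));
                          return 0;
                      }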

                  On that note, AMD is claiming big idle power optimizations on the I/O die, but I haven't seen any reviews yet that put serious focus on wall power at idle (although I haven't looked that hard). PCWorld shows it on their graphs in a few places, and it doesn't look great. But the reviewer motherboards are likely to be super high end with all kinds of energy-wasting RGB nonsense, and with Zen 3, memory speeds above 3200 MT/s caused a significant increase in idle power.

                  Comment


                  • #69
                    Originally posted by yump View Post

                    The pathetic and embarrassing slow-walking of ATX12VO by motherboard and PSU manufacturers has a far greater effect on desktop efficiency than peak CPU power. People are worried about the difference between a 9 minute compile job that takes 230 W vs a 10 minute compile job that takes 142 W, when the computer is idling at 50+W for the many hours they spend reading, thinking, and typing between compile jobs.

                    On that note, AMD is claiming big idle power optimizations on the I/O die, but I haven't seen any reviews yet that put serious focus on wall power at idle (although I haven't looked that hard). PCWorld shows it on their graphs in a few places, and it doesn't look great. But the reviewer motherboards are likely to be super high end with all kinds of energy-wasting RGB nonsense, and with Zen 3, memory speeds above 3200 MT/s caused a significant increase in idle power.
                    You have a point there, but my main concern is that the default power draw sets the standard for components, and that drives costs up and reliability down.

                    A higher default power draw forces manufacturers (motherboard, PSU, etc.) to step up their materials and costs, because sustaining that amperage with the precision the CPU needs requires a better tier of components, cables, etc.; otherwise there are real risks of fires, shorts, and overheating, and in places where the electricity supply is anything but stable it can also severely shorten component life.

                    For reference, back in the day I saw FX-9590s catch fire (the 8-pin CPU connector overheated and shorted) and blow VRMs on anything but the very top tier of motherboards.

                    But if this trend continues, don't be surprised to see $100 motherboards creep into the $200+ tier and entry-level PSUs move to $200 and beyond, simply because CPUs and GPUs go lazy and just up the power draw every new generation (if they can save money by drawing more power, they will). I mean, a Raptor Lake i7 plus a 4070-class card sold as a "4080" so they can charge more for it could probably trip the over-current protection on a decent-ish 600 W Gold PSU when pushing both CPU and GPU, in just one generation.

                    Comment


                    • #70
                      Originally posted by atomsymbol
                      That is only partially true, because if a compiler writes to a particular CPU register (such as: %rax) twice per clock cycle then the limit isn't the number of ISA (user-visible) registers but it is the number of physical registers in the CPU and the size of the ROB (reorder buffer).
                      I understand how register renaming works, and that once you overwrite a register's contents, it effectively becomes a new register from a data-flow perspective.

                      Where you get burned by a limited number of ISA registers is if the compiler runs out of registers to hold all of the intermediate state needed to compute a result, and then has to resort to spilling stuff to memory.
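
                      A contrived sketch of that failure mode (x86-64 has 16 general-purpose ISA registers; whether and where the compiler spills depends on the compiler, flags, and vectorization):

                          /* 20 accumulators stay live across every loop iteration;
                             together with i, n, and a that exceeds the 16 x86-64
                             GPRs, so a scalar build has to spill some of them to
                             the stack (check the -O2 assembly for loads/stores
                             inside the loop). */
                          long many_accumulators(const long *a, long n) {
                              long s0 = 0, s1 = 0, s2 = 0, s3 = 0, s4 = 0,
                                   s5 = 0, s6 = 0, s7 = 0, s8 = 0, s9 = 0,
                                   s10 = 0, s11 = 0, s12 = 0, s13 = 0, s14 = 0,
                                   s15 = 0, s16 = 0, s17 = 0, s18 = 0, s19 = 0;
                              for (long i = 0; i + 19 < n; i += 20) {
                                  s0  += a[i];      s1  += a[i + 1];
                                  s2  += a[i + 2];  s3  += a[i + 3];
                                  s4  += a[i + 4];  s5  += a[i + 5];
                                  s6  += a[i + 6];  s7  += a[i + 7];
                                  s8  += a[i + 8];  s9  += a[i + 9];
                                  s10 += a[i + 10]; s11 += a[i + 11];
                                  s12 += a[i + 12]; s13 += a[i + 13];
                                  s14 += a[i + 14]; s15 += a[i + 15];
                                  s16 += a[i + 16]; s17 += a[i + 17];
                                  s18 += a[i + 18]; s19 += a[i + 19];
                              }
                              return s0 ^ s1 ^ s2 ^ s3 ^ s4 ^ s5 ^ s6 ^ s7 ^ s8 ^ s9
                                   ^ s10 ^ s11 ^ s12 ^ s13 ^ s14 ^ s15 ^ s16 ^ s17
                                   ^ s18 ^ s19;
                          }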

                      Originally posted by atomsymbol
                      Zen 4 has an integer register file of 224 [64-bit?] entries, floating-point register file of 192 [256-bit?] entries, and a reorder buffer of 320 entries.
                      Where did you find these stats?

                      And yes, the physical registers referred to would be the 64-bit ones. I'm surprised about the number of physical FPU registers, but if those are indeed 256-bit, then it's not too much more than the 128 you'd need to support 2 threads (AVX-512 has 32 x 512-bit ISA registers = 64 x 256-bit per thread).

                      Originally posted by atomsymbol
                      IPC gains in the benchmarked applications range from 1% to 36% (the average being 13%), and are NOT a coincidence.
                      Yes, but I'm saying the IPC gains would be lower, if they'd kept the same DDR4 memory and L2 cache size.

                      Originally posted by atomsymbol
                      That said, Intel Alder Lake's P-core has a ROB of 512 entries, which means that Alder Lake is a less-balanced architecture compared to Zen 4, which has a much smaller reorder buffer. The utilization (effectiveness) of Alder Lake's P-core ROB is approximately 25% smaller than the utilization of Zen 4's ROB while these CPUs are running the same application.
                      Whether a microarchitecture is balanced probably has more to do with the back end. And how do you know what actual utilization is like, for Alder Lake's Golden Cove cores?

                      Originally posted by atomsymbol
                      Zen 4 can execute only 2 stores per cycle. With such a small number of stores per cycle, it is almost pointless to attempt memory renaming.
                      You'd think it would be the other way around, right? I think the point of memory renaming is to nullify spills by eliminating the corresponding loads & stores!

                      Originally posted by atomsymbol
                      A limiting factor to more stores per cycle are page table lookups (paged virtual memory). Memory segmentation would enable a higher number of stores per cycle, but (1) operating system developers (such as: Linus Torvalds) are against memory segmentation because it substantially increases the complexity of the operating system and (2) programming languages without segmentation-friendly type-systems, such as C/C++/Go/Rust/etc, are the prevalent programming languages today.
                      This is why I think directly-addressable scratchpad memory deserves a second look in general-purpose CPUs. GPUs and DSPs continue to use it to deliver very strong performance per watt. CPUs need to find a way to do likewise.

                      For instance, you could give each thread 1 page of scratchpad memory. You could implement it by locking a corresponding block of L1 cache to a page of physical RAM, with exclusive semantics. That should avoid most of the normal cache & TLB overhead associated with memory accesses to it.

                      Originally posted by atomsymbol
                      It depends on the ratio of [the number of memory stalls] to [the number of concurrent memory requests in flight].
                      What I meant was roughly the amount of time the CPU core was starved by memory latency/bandwidth. In spite of the fancy OoO execution and prefetchers, there are still times these cores are underutilized, simply because most of the work they need to do relies on pending loads. Perhaps a less likely scenario is that they're log-jammed with pending stores.
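
                      A classic way to observe that starvation (a sketch; actual stall behavior varies by machine) is a dependent-load chain, where each load's address comes from the previous load, so neither the out-of-order window nor the prefetchers can overlap anything:

                          #include <stddef.h>

                          struct node { struct node *next; };

                          /* Every iteration's load address depends on the result
                             of the previous load; with nodes scattered randomly
                             through a large buffer, the core spends most cycles
                             waiting on memory rather than executing. */
                          struct node *chase(struct node *p, size_t steps) {
                              while (steps--)
                                  p = p->next;
                              return p;
                          }
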
                      Last edited by coder; 29 September 2022, 10:36 AM.

                      Comment
