CPUs From 2004 Against AMD's New 64-Core Threadripper 3990X + Tests Against FX-9590


  • #31
    The real crazy thing is the FX-9590 uses just barely less power.

    Comment


    • #32
      Originally posted by tg-- View Post


      Funny, you mention benchmarks designed for that purpose and in the next breath mention a benchmark designed for that purpose to make the opposite point.
      The prime calculator likely uses a special code path for Intel's AVX, which the 3770 supports, and falls back to a generic, slow code path on AMD (even though Ryzens now support the same AVX the 3770 did).
      The Threadripper will outperform the old 3770 in any metric at any time, even with Intel-optimized AVX. It just can't outperform it if it can't use its accelerated paths, which is a decision of the software developer.

      Look here for the "special Intel optimized code" that no AMD processor ever made can run faster than a 10-year-old i7-3770...

      https://www.phoronix.com/forums/foru...29#post1158629
      Last edited by Raka555; 02-09-2020, 04:51 AM.

      Comment


      • #33
        Originally posted by birdie View Post

        What are your numbers? Here on Ryzen 7 3700X:

        Code:
        $ time ./prime_gcc
        664580
        
        real 0m8.030s
        user 0m8.030s
        sys 0m0.000s
        If you mean what I get when timing it:

        r7-3700x:
        real 0m8.138s

        i7-3770:
        real 0m4.083s

        i7-4600u:
        real 0m4.883s

        Comment


        • #34
          Originally posted by Raka555 View Post
          vs Raspberry PI ?

          It is actually not that impressive that it is only 4x faster with compiling the kernel and only 2x faster encoding an mp3 than the AMD FX-9590 with its 64/128 against 8/8...

          For me lots of cores only looks good in benchmarks designed for that purpose.
          In the "real world" you still get bad diminishing returns ...

          Yesterday I ran a program that calculates prime numbers and I was not impressed that my 2012 model i7-3770 (3.4GHz/3.9GHz) did it in 4s versus my "shiny new" Ryzen 7 3700X (3.6GHz/4.4GHz), which only managed 8s....

          Running bloatware is where the ryzens shine with all their cache, but pure calculations intel seems to still be far ahead...
          Too bad the vast majority of cpu operations are not calculations but memory access and data movement. You see, modern cpus aren't hampered by calculation performance, it has been adequate for decades, especially calculations like pi which don't need RAM access at all. The main issue with cpus is memory latency, that is what hinders performance. It doesn't matter how fast you can calculate stuff if you can't get to the data fast enough.

          It is obvious that you are quite ignorant of how cpus work, and you think calculating pi is somehow a good indicator of cpu performance. If you think that way, then please by all means, donate your Ryzen7-3700x somewhere and upgrade your cpu performance to the i7-3770. Moron.

          Comment


          • #35
            Originally posted by Raka555 View Post
            vs Raspberry PI ?

            It is actually not that impressive that it is only 4x faster with compiling the kernel and only 2x faster encoding an mp3 than the AMD FX-9590 with its 64/128 against 8/8...

            For me lots of cores only looks good in benchmarks designed for that purpose.
            In the "real world" you still get bad diminishing returns ...

            Yesterday I ran a program that calculates prime numbers and I was not impressed that my 2012 model i7-3770 (3.4GHz/3.9GHz) did it in 4s versus my "shiny new" Ryzen 7 3700X (3.6GHz/4.4GHz), which only managed 8s....

            Running bloatware is where the ryzens shine with all their cache, but pure calculations intel seems to still be far ahead...
            An actual Pi 3 B+ on 20.04 arm64 gives this result:
            real 0m18.248s
            user 0m18.193s
            sys 0m0.005s
            That way neither 3770 nor 3700x looks impressive.
            This isn't a meaningful benchmark at all.

            Comment


            • #36
              Originally posted by muncrief View Post
              I held on to my old FX 6300/990FX system all the way up until Zen2, when I purchased my current R7-3700X/X570 system.

              Needless to say the performance difference is astounding

              I'm so glad I stuck with AMD even though I had to live with lesser performance for quite a while. But I knew that if AMD went under we'd be at the mercy of the predatory Intel corporation forever.

              As an embedded systems engineer who designed a few simple custom microprocessors and microcontrollers back in the day, I realized what a monumental error Bulldozer was. And a year or so after its release I also sadly realized that the architecture couldn't be salvaged, and that it would be a while before a new one could be developed. I didn't know it would be quite this long, but still the wait was worth it.
              Actually, as an embedded systems engineer you are a disgrace and you don't even know what you are talking about.... I expect more from people who supposedly know about CPU design. Not that simple microcontrollers are a feat, at university they design them in the first year these days, but still....

              Bulldozer was a great architecture and was a step towards Fusion. AMD's grand plan was to eventually eliminate the FPU and SIMD units from the CPU cores completely and move those calculations to the iGPU. This makes a metric ton of sense, since CPU cores only rarely do floating-point math. And those calculations are better suited to GPGPU, which these days is hindered only by PCIe latency. AMD Fusion was the best idea for CPUs in two decades. But AMD didn't have the software and marketing grunt to push for such a change, and Intel, realising they would lose if AMD went down that road, doubled down on AVX and floating-point performance, especially per thread.

              These days on 7nm, CPU cores, even with all those SIMD parts, are TINY. It would have made a lot more sense to have even tinier CPU cores by removing the floating-point units (which cost a LOT of silicon), adding tons of cache and a beefy iGPU, and moving those calculations there. It would have performed far better. It would allow the CPU cores to stop bothering with things they are not best at, and let the iGPU do what it is best suited for... But this failed to evolve because idiots thought Bulldozer was a failure just because video games still relied on single and dual cores and, as we all know, gaming is the most important thing in computing.... Even today Intel sells a ton of CPUs because it has slightly better per-core performance and this matters to gaming. People are cretins. Now all AMD is doing is copying Intel's design but selling it at a far lower profit margin.... Yay.

              Comment


              • #37
                Originally posted by mlau View Post

                Yes, this performs over twice as fast on intel hardware. Run it with perf on Zen:

                Code:
                # perf stat ./rand
                664580
                
                Performance counter stats for './rand':
                
                      7.479,07 msec task-clock:u              # 0,999 CPUs utilized
                             0      context-switches:u        # 0,000 K/sec
                             0      cpu-migrations:u          # 0,000 K/sec
                            52      page-faults:u             # 0,007 K/sec
                35.198.688.120      cycles:u                  # 4,706 GHz
                    15.034.444      stalled-cycles-frontend:u # 0,04% frontend cycles idle
                33.164.471.465      stalled-cycles-backend:u  # 94,22% backend cycles idle
                17.446.305.050      instructions:u            # 0,50 insn per cycle
                                                              # 1,90 stalled cycles per insn
                 3.508.141.911      branches:u                # 469,061 M/sec
                     4.877.261      branch-misses:u           # 0,14% of all branches
                I guess integer division is not a strong point of Zen.
                How do you tell it is integer division that is so low ?

                Comment


                • #38
                  Originally posted by TemplarGR View Post
                  Bulldozer was a great architecture and was a step towards Fusion. AMD's grand plan was to eliminate FPU and SIMD from the cpu cores completely, eventually, and move those calculations to the iGPU.
                  The fun fact is that this is what is happening just now. Intel is slowly embedding every sort of coprocessor in its CPUs (starting with FPGAs). They are working to build an API to supersede OpenCL, so software can take advantage of iGPUs instead of relying on AVX.

                  But I don't see a real point in stripping the vector processors from CPU cores. Yes, you can, but you just put them into the iGPU. AMD APUs with HSA were just a bunch of CPU cores + iGPU, with the floating-point part developed much more on the iGPU than on the CPUs. This simplifies CPU design, of course. And iGPUs must implement vector processors nonetheless. So, at least, you don't waste transistors and energy on powering SIMD processors in the CPU.

                  Originally posted by TemplarGR View Post
                  Even today intel sells a ton of cpus because it has slightly better per core performance and this matters to gaming.
                  Not even this. In the benchmarks, Ryzens are very often ahead of Intel CPUs in single-threaded performance.

                  Comment


                  • #39
                    Originally posted by Raka555 View Post

                    How do you tell it is integer division that is so low ?
                    Run the program with "perf record", and look at the data with "perf annotate".
                    The test for the remainder being zero (test edx, edx) takes up 90% of the time spent,
                    at least on my system. The code generated is identical for Haswell and Zen.
                    Maybe it's also a code scheduling issue in GCC? AMD is far behind Intel in compiler optimizations.

                    Comment


                    • #40
                      Good old FX doesn't look half bad in these comparisons, with only 4 full cores (each with two halves) on a chip so old against the cutting-edge, newest-generation, over-the-top model. Excavator wasn't that bad for tasks that could be spread amongst those threads and run on optimized code...

                      Comment
