CPUs From 2004 Against AMD's New 64-Core Threadripper 3990X + Tests Against FX-9590


  • #21
    Originally posted by atomsymbol
    Maybe the compiler should be more clever and use floating-point division if it produces an exact result when dividing integers.
    Merge requests accepted.

    Comment


    • #22
      Originally posted by Zucca View Post
      Ouch.
      My FX-8350 now seems quite a slow CPU... :|
      It's not, actually. Remember that you paid about 200 USD for that CPU. If you spend the same amount of money today, you get a CPU that is about 80-90% faster on single-threaded tasks and about 120-130% faster on multi-threaded ones. So after 7-8 years, the same money doesn't even get you a CPU that is twice as fast on most tasks.

      Comment


      • #23
        Originally posted by atomsymbol

        Some notes:
        • The multiplication almost never overflows in this case because the loop terminates due to (i*i <= test), where "test" is an int and "i" increases by 1 per loop iteration
        • The C/C++ type int is 32-bit even on 64-bit systems
        Automatic integer promotion rules happen, I think. I'm almost sure I've seen the multiply happen in 64-bits and then compare against a 64 bit register value even if it's an int. Because otherwise the compiler would have to output lame assembly to truncate the result, and there's a reason int overflow is undefined. Especially on uarch like PowerPC. I'll have to compile this one and look at the machine code I guess.

        Comment


        • #24
          Originally posted by atomsymbol

          Some notes:
          • The multiplication almost never overflows in this case because the loop terminates due to (i*i <= test), where "test" is an int and "i" increases by 1 per loop iteration
          • The C/C++ type int is 32-bit even on 64-bit systems
          Oh and sorry, you are correct about the comparison. It will exit the loop before it overflows. But that multiply did jump out at me.

          Comment


          • #25
            Originally posted by Raka555 View Post

            I repeated it several times.
            Even my 4600u intel laptop is faster than the 3700x with this.

            You can check for yourself:
            Compile with "gcc -O3 prime.c -o prime_gcc"
            It's worth pointing out that such a simple calculation could be parallelized trivially, so AMD's core-count lead over Intel should be considered when looking at these results.

            However, it's safe to say that integer division isn't a strong suit for AMD's current architecture. I think most non-trivial number crunching applications are a lot smarter about how they do that kind of thing and would be using AVX(2/512) instead.

            Comment


            • #26
              Originally posted by Zan Lynx View Post

              Automatic integer promotion rules happen, I think. I'm almost sure I've seen the multiply happen in 64-bits and then compare against a 64 bit register value even if it's an int. Because otherwise the compiler would have to output lame assembly to truncate the result, and there's a reason int overflow is undefined. Especially on uarch like PowerPC. I'll have to compile this one and look at the machine code I guess.
              Oh well, I was wrong. I guess the C Committee at some point decided integer promotion had gone on long enough and it was a bad idea, so it stops at int. Everything gets promoted to int but not to long. And multiplication is truncated to its operand types even though it really ought to promote to the double-length type if available. No wonder this area is such a minefield of security problems. I just always assume it's doing something weird.

              Comment


              • #27
                The real crazy thing is the FX-9590 uses just barely less power.

                Comment


                • #28
                  Originally posted by tg-- View Post


                  Funny, you mention benchmarks designed for that purpose and in the next breath mention a benchmark designed for that purpose to make the opposite point.
                  The prime calculator likely uses a special code path for Intel's AVX which the 3770 supports, and will fall back to a generic, slow, codepath for AMD (where the Ryzens now would support the same AVX that 3770 did).
                  The threadripper will outperform the old 3770 in any metric at any time, even intel-optimized AVX. It just can't outperform it if it can't use its accelerated paths, which is a decision of the software developer.

                  Look here for the "special Intel-optimized code" that no AMD processor ever made can run faster than a 10-year-old i7-3770...

                  https://www.phoronix.com/forums/foru...29#post1158629
                  Last edited by Raka555; 09 February 2020, 04:51 AM.

                  Comment


                  • #29
                    Originally posted by birdie View Post

                    What are your numbers? Here on Ryzen 7 3700X:

                    Code:
                    $ time ./prime_gcc
                    664580
                    
                    real 0m8.030s
                    user 0m8.030s
                    sys 0m0.000s
                    If you mean what I get when timing it:

                    r7-3700x:
                    real 0m8.138s

                    i7-3770:
                    real 0m4.083s

                    i7-4600u:
                    real 0m4.883s

                    Comment


                    • #30
                      Originally posted by Raka555 View Post
                      vs Raspberry PI ?

                      It is actually not that impressive that it is only 4x faster with compiling the kernel and only 2x faster encoding an mp3 than the AMD FX-9590 with its 64/128 against 8/8...

                      For me lots of cores only looks good in benchmarks designed for that purpose.
                      In the "real world" you still get bad diminishing returns ...

                      Yesterday I ran a program that calculates prime numbers, and I was not impressed that my 2012 model i7-3770 (3.4GHz/3.9GHz) did it in 4s versus my "shiny new" Ryzen 7 3700X (3.6GHz/4.4GHz), which only managed 8s....

                      Running bloatware is where the Ryzens shine with all their cache, but in pure calculations Intel still seems to be far ahead...
                      Too bad the vast majority of CPU operations are not calculations but memory access and data movement. You see, modern CPUs aren't hampered by calculation performance; it has been adequate for decades, especially for calculations like pi which don't need RAM access at all. The main issue with CPUs is memory latency, and that is what hinders performance. It doesn't matter how fast you can calculate stuff if you can't get to the data fast enough.

                      It is obvious that you are quite ignorant of how cpus work, and you think calculating pi is somehow a good indicator of cpu performance. If you think that way, then please by all means, donate your Ryzen7-3700x somewhere and upgrade your cpu performance to the i7-3770. Moron.

                      Comment
