Intel Talks Up Xeon Cascade Lake Performance - Up To 48 Cores, 12 DDR4 Channels


  • #21
I am still waiting for my 5 GHz, 28-core workstation CPU/MB

    Comment


    • #22
      Originally posted by pegasus View Post
Correction: anything that uses AVX512. And those apps are still rare. Plus they incur noticeable downclocking because of the thermal issues of AVX512.
      Nope. Correction of your correction.

AVX 256 code runs at about half the speed on my 8-core Ryzen 2700X compared to my Haswell 4770. It's pretty consistent with both OpenBLAS and MKL.
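
If you want to check this yourself, here's a minimal sketch (nothing fancy; np.show_config() just confirms which BLAS NumPy was built against, and the matrix size is an arbitrary choice):

import time
import numpy as np

np.show_config()  # confirm whether NumPy is linked against OpenBLAS or MKL

a = np.random.rand(2000, 2000)
b = np.random.rand(2000, 2000)

start = time.perf_counter()
np.dot(a, b)  # dense matmul, dominated by the AVX/FMA throughput of the BLAS backend
print(f"matmul took {time.perf_counter() - start:.3f} s")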
      Last edited by vegabook; 06 November 2018, 09:36 AM.

      Comment


      • #23
Really? Interesting. Our guys benchmarked equal performance for their AVX2 application on a 2.0 GHz Epyc vs a 2.5 GHz Haswell.

        Comment


        • #24
          Originally posted by Weasel View Post
          "Typical workloads" don't always optimize for the CPU properly, in fact most the time they aren't even optimized much at all. It's not the job of a CPU to cater to badly optimized software.
          But at the end of the day, the thing that I'm most likely to do on my computer is to run my (badly optimized) workloads.

To me it won't matter if the latest, meanest "Intel Wonderland Lake Xeon Mu plus 9997 (tm)(c)(r)" can perform a few percent better in some obscure BlablaMark 2019. What matters to me is that the fucking old buggy code I need to run for work runs at a decent enough speed (even if the original devs never thought about optimizing for the upcoming "AVX2048"), and that there aren't yet again three new Spectre-variant vulnerabilities announced this week threatening to insta-pwn my computer before I've even plugged in the network cable.

To take the ray-tracing example you're giving: GPUs with special RTX features (like Nvidia's) are a nice feature, but if I were a graphics designer and my workload didn't benefit from them and ran just the same (e.g. I think it's not available in Blender yet), then it's only a gimmick and I'd have no interest in buying them.

There's no point in buying some bleeding-edge future tech that doesn't help your current use case, hoping and relying on some future software improvement to suddenly make it relevant (see Itanic's hope for some future magical VLIW-optimizing compiler).

          Comment


          • #25
            Originally posted by DrYak View Post
            But at the end of the day, the thing that I'm most likely to do on my computer is to run my (badly optimized) workloads.

To me it won't matter if the latest, meanest "Intel Wonderland Lake Xeon Mu plus 9997 (tm)(c)(r)" can perform a few percent better in some obscure BlablaMark 2019. What matters to me is that the fucking old buggy code I need to run for work runs at a decent enough speed (even if the original devs never thought about optimizing for the upcoming "AVX2048"), and that there aren't yet again three new Spectre-variant vulnerabilities announced this week threatening to insta-pwn my computer before I've even plugged in the network cable.

To take the ray-tracing example you're giving: GPUs with special RTX features (like Nvidia's) are a nice feature, but if I were a graphics designer and my workload didn't benefit from them and ran just the same (e.g. I think it's not available in Blender yet), then it's only a gimmick and I'd have no interest in buying them.

There's no point in buying some bleeding-edge future tech that doesn't help your current use case, hoping and relying on some future software improvement to suddenly make it relevant (see Itanic's hope for some future magical VLIW-optimizing compiler).
I think you're missing the point of the benchmarks. They were made to benchmark the potential of the CPU itself, not specific workloads, even if they're "typical".

            Personally I find it cool to see how much they can push it, even if it needs special software designed for it.

            Comment


            • #26
              Originally posted by Weasel View Post
I think you're missing the point of the benchmarks. They were made to benchmark the potential of the CPU itself, not specific workloads, even if they're "typical".

              Personally I find it cool to see how much they can push it, even if it needs special software designed for it.
And what is it that you think "potential" is? I can answer that for you: it's the -potential- that the actual programs, in the actual configuration you're actually gonna use, should exhibit...

              Nothing else that you could imagine could possibly -be- potential.
potential: existing in possibility : capable of development into actuality; expressing possibility; specifically : of, relating to, or constituting a verb phrase expressing possibility, liberty, or power by the use of an auxiliary with the infinitive of the verb (as in 'it may rain')


              I know you refuse to read definitions I link to, but you should read this one because potential is directly linked to actuality.

              Comment


              • #27
                Originally posted by duby229 View Post
And what is it that you think "potential" is? I can answer that for you: it's the -potential- that the actual programs, in the actual configuration you're actually gonna use, should exhibit...
                Allow me to let you in on a little secret.

Intel doesn't give a shit about YOUR programs, configuration, or workload. Neither do I. What you consider "typical", some others don't use and don't care about at all.

"Typical workload" is not a universal fact, a law of physics, or an objective measure. Maybe I don't use your software or your "typical workload" and am interested in seeing the true potential of this CPU if I optimize for it. Yeah, some of us do code, so we care about stuff like that.

                If you don't, then go look for community-driven benchmarks for "typical workloads" for the average dumb Joe.

                You're not the fucking center of the world, and neither is the majority of humanity.

And lastly, even your linked definition says "something that CAN BECOME", not "something that IS", which is what you'd want for a "typical workload". The CPU *can* become as fast as in the benchmarks, if software is written for it.

                Comment


                • #28
                  Originally posted by Weasel View Post
                  Allow me to let you in on a little secret.

Intel doesn't give a shit about YOUR programs, configuration, or workload. Neither do I. What you consider "typical", some others don't use and don't care about at all.

"Typical workload" is not a universal fact, a law of physics, or an objective measure. Maybe I don't use your software or your "typical workload" and am interested in seeing the true potential of this CPU if I optimize for it. Yeah, some of us do code, so we care about stuff like that.

                  If you don't, then go look for community-driven benchmarks for "typical workloads" for the average dumb Joe.

                  You're not the fucking center of the world, and neither is the majority of humanity.

And lastly, even your linked definition says "something that CAN BECOME", not "something that IS", which is what you'd want for a "typical workload". The CPU *can* become as fast as in the benchmarks, if software is written for it.
So much is wrong here that I'm not sure where to start. I think I'm gonna have to make a numbered list to keep your fallacies straight in my own mind.

1: Fuck typical, think actual. What are the configurations that people -actually- use? Those are what's most important.
2: So I'm not the center of the world and most of humanity is not the center of the world, so that must mean that you are? I don't get it. It seems too conceited of you to be true.
3: A synthetic benchmark that doesn't identify a specific bottleneck -CAN'T EVER- become something that would benefit actual programs.

EDIT: Oh yeah, and 4: You think Intel doesn't care about optimizing their processors for the workloads that people actually use? You'd better think again about that.
                  Last edited by duby229; 07 November 2018, 09:38 AM.

                  Comment


                  • #29
                    Originally posted by torsionbar28 View Post
I assume you're referring exclusively to AVX512? AMD has had performant 128- and 256-bit AVX extensions for years now. Which real-world applications (i.e. not synthetic benchmarks) did you have in mind that use AVX512? After all, AVX512 is just wasted silicon for workloads that don't utilize it.

Not to mention that even Intel's own implementation of AVX512 kind of sucks, and delivers maybe ~10% performance improvement over AVX256, at the cost of greatly increased power consumption, high die temperatures, and heavy clock throttling. I.e. not really worth it, even if you have an app that can use it. Plus the Xeon Platinum and i9 are the only available chips with AVX512 support, so there's not enough of the market for any developers to care, even if there were a benefit to it.
Try this, using either OpenBLAS or MKL, in IPython:

import numpy as np
xx = np.random.rand(1000000).reshape(1000, 1000)  # 1000x1000 dense matrix
%timeit np.linalg.eig(xx)  # full eigendecomposition, handled by the BLAS/LAPACK backend


This is very real-world: in finance, we need to know the principal components of multiple securities. AMD's AVX256 is about 40% slower on my Ryzen 2700X than on my Haswell i7.

I get about 1.5 seconds on the 2700X and about 0.85 seconds on the Haswell. No clue how much better this would get using AVX512. And by the way, using a non-AVX-enabled BLAS library, this takes 5+ seconds. So I hear you on heat etc., but in the real world, Intel's AVX absolutely rocks. And I'm an AMD guy.

(I should add that on CUDA, with CuPy, this is more like 0.1 seconds using a rubbish 1050 Ti, using cupy.eigh.)
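
Roughly what the GPU version looks like, if anyone wants to try it (a minimal sketch, assuming you have cupy installed for your CUDA toolkit; eigh wants a symmetric matrix, so I build a covariance-style one here rather than the random matrix above):

import time
import cupy as cp

x = cp.random.rand(1000, 1000)
cov = x @ x.T  # symmetric matrix, so eigh applies

cp.cuda.Device().synchronize()  # let setup work finish before timing
start = time.perf_counter()
w, v = cp.linalg.eigh(cov)  # eigendecomposition on the GPU
cp.cuda.Device().synchronize()  # wait for the kernel to actually complete
print(f"eigh took {time.perf_counter() - start:.3f} s")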
                    Last edited by vegabook; 07 November 2018, 04:51 PM.

                    Comment


                    • #30
                      Originally posted by duby229 View Post
1: Fuck typical, think actual. What are the configurations that people -actually- use? Those are what's most important.
Who's "people"? Is that some scientifically measurable fact? I'm a human; does that mean I'm part of "people"? If so, then you lost already, since I'm interested in the max potential of the CPU, and I'm "people", so...

                      Originally posted by duby229 View Post
2: So I'm not the center of the world and most of humanity is not the center of the world, so that must mean that you are? I don't get it. It seems too conceited of you to be true.
                      You're the only one who said they need to cater the benchmarks to someone's needs, so no, it doesn't mean I am. I never said they have to cater the benchmarks to my use case.

                      I said I want them to show the true max potential of this CPU, given the proper software for it. Even software that doesn't exist yet (because it can be written, by me or someone else). So the benchmarks are fine.

                      Originally posted by duby229 View Post
                      3: A synthetic benchmark that doesn't identify a specific bottleneck -CAN'T EVER- become something that would benefit actual programs.
Just imagine how retarded such a way of thinking is. By that logic, any new instruction is completely pointless because it can't benefit "actual existing programs".

                      Newsflash: new programs can be written or optimized for a CPU. Before you spend effort optimizing it, though, you need to know WHAT to optimize, and the true potential of a CPU. So benchmarks like these are awesome for that.


Example: they release AVX-1024. No program in existence uses it (obviously), but someone is very interested in finding out the potential of these CPUs using AVX-1024 before they rewrite their code for it. WTF is so hard to get?

Another example: you know that in synthetic benchmarks the CPU can reach X FLOPS. You optimize your app and it falls short, at about 80% of the true potential. That means there's still room to optimize it. You'd never know this without the REALLY USEFUL synthetic benchmarks, since there's no other app that comes even close. A rough sketch of that kind of check is below.
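
Something like this (a minimal sketch; the peak_gflops number is a placeholder for whatever your synthetic benchmark reports, not a real figure):

import time
import numpy as np

n = 4000
a = np.random.rand(n, n)
b = np.random.rand(n, n)

start = time.perf_counter()
np.dot(a, b)
elapsed = time.perf_counter() - start

achieved_gflops = 2 * n**3 / elapsed / 1e9  # a matmul does roughly 2*n^3 floating-point ops
peak_gflops = 500.0  # placeholder: the peak your synthetic benchmark measured
print(f"achieved {achieved_gflops:.1f} GFLOPS, "
      f"{achieved_gflops / peak_gflops:.0%} of the synthetic peak")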

                      Dummy.
                      Last edited by Weasel; 08 November 2018, 12:19 PM.

                      Comment
