Intel Contributes AVX-512 Optimizations To Numpy, Yields Massive Speedups

  • #11
    Originally posted by coder View Post
    If Zen 4 has AVX-512 as rumored, AMD should send Intel a "thank you" cake. Considering Alder Lake removed AVX-512, this is a double-win for AMD.
    AFAIK, there is runtime dispatch in *BLAS libs which explicitly checks for Intel as the vendor (rather than for the instruction set).
    Last edited by RedEyed; 12 October 2021, 04:09 PM.
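To make the distinction in the post above concrete, here is a minimal sketch of vendor-based versus feature-based dispatch. It is not code from any BLAS library; `CpuInfo`, `dispatch_by_vendor`, `dispatch_by_feature` and the kernel names are all hypothetical. The vendor strings and flag names follow what CPUID and /proc/cpuinfo report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CpuInfo:
    """Hypothetical stand-in for what a library would read from CPUID."""
    vendor: str        # e.g. "GenuineIntel" or "AuthenticAMD"
    flags: frozenset   # e.g. {"avx2", "avx512f"}


def dispatch_by_vendor(cpu: CpuInfo) -> str:
    # Vendor check: the fast path is only taken on Intel CPUs,
    # even if another vendor's CPU supports the same instructions.
    if cpu.vendor == "GenuineIntel" and "avx512f" in cpu.flags:
        return "avx512"
    return "generic"


def dispatch_by_feature(cpu: CpuInfo) -> str:
    # Feature check: any CPU advertising the instruction set gets the fast path.
    if "avx512f" in cpu.flags:
        return "avx512"
    if "avx2" in cpu.flags:
        return "avx2"
    return "generic"


# An assumed AMD CPU that advertises AVX-512, as rumored for Zen 4.
amd = CpuInfo("AuthenticAMD", frozenset({"avx2", "avx512f"}))
print(dispatch_by_vendor(amd))   # falls back to the generic kernel
print(dispatch_by_feature(amd))  # picks the AVX-512 kernel
```

With a vendor check, a non-Intel CPU with AVX-512 would never reach the optimized kernel, which is the concern being raised.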



    • #12
      Originally posted by numacross View Post

      Don't you mean Nvidia?
      Those who buy hw for compute buy Intel + NVIDIA rather than AMD + NVIDIA.



      • #13
        Originally posted by RedEyed View Post

        Those who buy hw for compute buy Intel + NVIDIA rather than AMD + NVIDIA.
        EPYC is more performant (apart from AVX-512), has more PCIe lanes, was first to market with PCIe 4.0 and is cheaper than Xeon. All the deployments I know of went for AMD, and only a few ended up with Intel because of supply issues.



        • #14
          Originally posted by numacross View Post

          EPYC is more performant (apart from AVX-512), has more PCIe lanes, was first to market with PCIe 4.0 and is cheaper than Xeon. All the deployments I know of went for AMD, and only a few ended up with Intel because of supply issues.
          Yeah, AMD really has something to offer. I'm just complaining about the lack of SIMD.



          • #15
            Originally posted by jabl View Post

            It's about using SVML, which provides very fast, vectorized (including AVX-512, it appears) versions of math functions like sin(), cos(), log(), gamma(), etc.

            Unfortunately SVML isn't open source, so you're not gonna see this performance with the out of the box numpy on your Linux distro.
            I think @clownstown meant that sin, cos don't seem complicated enough to need extra CPU acceleration, right?



            • #16
              Originally posted by smitty3268 View Post

              It looks like the code this is using is BSD licensed, available here: https://github.com/numpy/svml
              Ooh, interesting. I stand corrected.

              Anyway, I believe ctlansdown is right and numpy must not have been using vectorized instructions at all on certain operations in order to get this kind of speedup.
              Very likely. The GNU world has libmvec (https://sourceware.org/glibc/wiki/libmvec), but AFAIK it's not very widely supported yet.



              • #17
                Originally posted by coder View Post
                If Zen 4 has AVX-512 as rumored, AMD should send Intel a "thank you" cake. Considering Alder Lake removed AVX-512, this is a double-win for AMD.
                That could lead to some weird benchmark charts. I thought Ryzen might not get AVX-512, but Zen 4 Ryzen and Epyc are probably using the same chiplets...



                • #18
                  Originally posted by cl333r View Post

                  I think @clownstown meant that sin, cos don't seem complicated enough to need extra CPU acceleration, right?
                  I don't know whether that's the intention of @clownstown, but that is completely incorrect. Making a fast AND accurate math library is VERY hard. Even more so if you want it vectorized.

                  And no: FPUs since the x87 haven't had hardware implementations of math functions (and in many respects even the x87 HW implementations were/are shit), so libraries have to implement them using various numerical algorithms based on +, -, *, /.
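As a rough illustration of what "numerical algorithms based on +, -, *, /" means, here is a minimal scalar sketch of sin() built from range reduction plus a polynomial. It is an assumption-laden toy, not how SVML, libmvec or glibc implement it: `sin_poly` is a hypothetical name, the coefficients are plain Taylor terms rather than tuned ones, and the range reduction loses accuracy for large arguments.

```python
import math


def sin_poly(x: float) -> float:
    """Approximate sin(x) with arithmetic only (plus one rounding step)."""
    # Range reduction: write x = k*pi + r with r in [-pi/2, pi/2].
    k = round(x / math.pi)
    r = x - k * math.pi
    # Odd Taylor polynomial for sin(r) up to r^13, in Horner form over r^2.
    r2 = r * r
    p = r * (1.0 + r2 * (-1.0 / 6 + r2 * (1.0 / 120 + r2 * (-1.0 / 5040
        + r2 * (1.0 / 362880 + r2 * (-1.0 / 39916800
        + r2 * (1.0 / 6227020800)))))))
    # sin(k*pi + r) = (-1)^k * sin(r)
    return -p if k % 2 else p


# Compare against the C library's sin() on a few moderate inputs.
for x in (0.0, 0.5, 2.0, -3.0, 10.0):
    print(x, sin_poly(x), math.sin(x))
```

Even this toy shows where the difficulty lies: staying accurate to the last bit, handling huge arguments, and doing all of it branch-free across SIMD lanes is where the real work goes.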



                  • #19
                    Originally posted by coder View Post
                    If Zen 4 has AVX-512 as rumored, AMD should send Intel a "thank you" cake. Considering Alder Lake removed AVX-512, this is a double-win for AMD.
                    You realize this innovation is meant for server chips, like the future Sapphire Rapids? Not for your typical gamer rig...



                    • #20
                      But how much more power does it use?
