Intel Publishes Blazing Fast AVX-512 Sorting Library, Numpy Switching To It For 10~17x Faster Sorts


  • #21
    Originally posted by jayN
    wikichip shows dual avx512 units per core on some of the server chips... but, probably right .. just dual the 512 bits of FMA operations.
    https://en.wikichip.org/wiki/intel/m...s/cascade_lake
    ark.intel.com actually has a field that indicates single vs. dual FMA for their large-die 14 nm CPUs. The label is literally "# of AVX-512 FMA Units" and here are two Cascade Lake models with differing numbers:



    • #22
      Originally posted by onlyLinuxLuvUBack
      The graveyard of intel grows: itanic64, pentium4, optane, vroc, and now avx512
      Why Pentium 4? You mean Netburst? According to this analysis, it actually had some useful ideas that reappeared in later architectures:

      In the world of today’s high performance CPUs, major architectural changes don’t happen often. Iterating off a proven base is safer, cheaper, and faster than attempting to massively rework the basi…


      And I read they actually brought back VROC, based on customer demand.

      Of course, AVX-512 is very much not dead at Intel. Only their E-cores (and therefore Gen 12+ consumer CPUs) lack it. See:

      Phoronix, Linux Hardware Reviews, Linux hardware benchmarks, Linux server benchmarks, Linux benchmarking, Desktop Linux, Linux performance, Open Source graphics, Linux How To, Ubuntu benchmarks, Ubuntu hardware, Phoronix Test Suite



      • #23
        Originally posted by NeoMorpheus
        Soooo, reading your text wall of intel anticonsumer crap, you still want to give them money?

        I sincerely hope they die and soon. They have done enough damage to the industry as it is (RIP my beloved Alpha).
        They've done (and continue to do) a lot for open source, including most of the work on library, OS, and toolchain support for AVX-512 that exists to date. They're trying to give us a 3rd choice for GPUs. And they're an active (if smaller) contributor to the RISC-V community.

        None of these companies is a saint. There's a track record of anti-competitive behavior, for sure. Still, what they're doing with their independent foundry business is a very positive development, IMO -- I've wanted Intel to be broken up and its fabs split out for more than a decade. And having Intel as a CPU competitor sharpens everyone else.

        I don't love Intel, but I think they're an important player and I still have plans to use their CPUs and GPUs, among others'. If you want AVX-512, you can either buy a Zen 4-based CPU or Intel has some Xeon W (workstation) or Xeon Scalable processors they'd like to sell you.



        • #24
          This is bullshit because it compares it to a non-vectorized crap implementation. Now compare it to the best AVX2 implementations and then see.



          • #25
            This is the type of stuff that Intel needs to do a lot more of to truly show how powerful AVX-512 is for real-world use cases. String/XML/JSON processing is another area ripe for lots of improvement.



            • #26
              Originally posted by chuckula
              This is the type of stuff that Intel needs to do a lot more of to truly show how powerful AVX-512 is for real-world use cases. String/XML/JSON processing is another area ripe for lots of improvement.
              I'm with Linus that using AVX-512 for light-weight stuff is a bad move, at least until we get a bit further past some of the early AVX-512 CPUs where doing so would trigger clock-throttling that can actually hurt overall performance.

              While I was writing the post comparing the new Qualcomm server chip, Centriq, to our current stock of Intel Skylake-based Xeons, I noticed a disturbing phenomena.


              BTW, the library they used isn't AVX-512 specific, even though that's the case cited in the headline numpy performance numbers.



              • #27
                Originally posted by coder
                They've done (and continue to do) a lot for open source, including most of the work on library, OS, and toolchain support for AVX-512 that exists to date. They're trying to give us a 3rd choice for GPUs. And they're an active (if smaller) contributor to the RISC-V community.

                None of these companies is a saint. There's a track record of anti-competitive behavior, for sure. Still, what they're doing with their independent foundry business is a very positive development, IMO -- I've wanted Intel to be broken up and its fabs split out for more than a decade. And having Intel as a CPU competitor sharpens everyone else.

                I don't love Intel, but I think they're an important player and I still have plans to use their CPUs and GPUs, among others'. If you want AVX-512, you can either buy a Zen 4-based CPU or Intel has some Xeon W (workstation) or Xeon Scalable processors they'd like to sell you.
                Nah.

                They are acting all friendly and shit because they are being beaten down and badly.

                I know very well what Intel on top does to us the consumer and the industry and ignoring that is being very naive.

                Sorry for the colorful language, but fuck'em and they can take their equally horrible friends at nvidia to the same shallow grave.

                I prefer someone new to r@pe me without lube than someone that already did, because at least that way, it was unexpected or as they say, "Once, shame on you. Twice?, shame on me."



                • #28
                  Originally posted by NeoMorpheus

                  Nah.

                  They are acting all friendly and shit because they are being beaten down and badly.

                  I know very well what Intel on top does to us the consumer and the industry and ignoring that is being very naive.

                  Sorry for the colorful language, but fuck'em and they can take their equally horrible friends at nvidia to the same shallow grave.

                  I prefer someone new to r@pe me without lube than someone that already did, because at least that way, it was unexpected or as they say, "Once, shame on you. Twice?, shame on me."
                  I totally get your point when it comes to Intel's milking, but isn't AMD just as bad nowadays? AM5 is disgustingly expensive as a platform, and so are their GPUs. I miss the times when there was a price/performance champion giving us more for less.



                  • #29
                    Originally posted by NeoMorpheus
                    Nah.

                    They are acting all friendly and shit because they are being beaten down and badly.

                    I know very well what Intel on top does to us the consumer and the industry and ignoring that is being very naive.
                    FWIW, Intel seems to do its best when they're down. After Pentium 4, we got Core 2. And after Ice Lake & Rocket Lake, we got Alder Lake.

                    I'm eager to see how much they can catch up to TSMC, with Intel 4 and their 18A nodes. Sierra Forest should be very interesting, as well. Doing some scaling analysis on it could be fascinating.



                    • #30
                      Originally posted by coder
                      I'm with Linus that using AVX-512 for light-weight stuff is a bad move, at least until we get a bit further past some of the early AVX-512 CPUs where doing so would trigger clock-throttling that can actually hurt overall performance.

                      While I was writing the post comparing the new Qualcomm server chip, Centriq, to our current stock of Intel Skylake-based Xeons, I noticed a disturbing phenomena.


                      BTW, the library they used isn't AVX-512 specific, even though that's the case cited in the headline numpy performance numbers.
                      Cloudflare is a web hosting service, and they are not experts in writing software that targets specific hardware, unlike the developers who posted these results. As for Torvalds, his last foray into hardware was over 20 years ago, when Transmeta did... not much commercially before folding. If AVX-512 is so terrible for Linux, he should also stop most GPU driver development, because it takes literally millions of lines of often buggy kernel code just to get a different model of GPU to turn on. Compared to that, AVX-512 is a trivial point-patch. You might as well say that Zen 4 is a failure because Bulldozer was a crap architecture.

                      Incidentally, a bunch of the "AVX-512 is slow" noise comes from FMA operations on the earliest AVX-512 implementations, which did result in downclocking but were still faster than AVX2 FMAs. However, most AVX-512 instructions for non-HPC workloads cause little or no clock speed reduction. So basically: don't mix small batches of HPC FMAs into your non-HPC workstreams on old AVX-512 implementations. But that's totally irrelevant to these results.

                      In the real world, Daniel Lemire has been showing massive improvements using AVX-512 on a wide range of workloads for a long time, and there just needs to be more software support for it.
