Linus Torvalds: "I Hope AVX512 Dies A Painful Death"


  • #11
    Originally posted by Anarchy View Post
    a noob question: how is avx512 implemented in the cores, is it per core or one for all?
    another noob question: is avx512 something that can be "emulated" by some kind of a tensor-coprocessor?
    AVX-512 is per-core - each core gets more/wider FP registers and the instruction decoder recognizes additional instructions.

    Could potentially be emulated (by taking an "illegal instruction" trap) but the emulation overhead would probably be pretty high. Emulating on CPU would be a lot faster than emulating on a co-processor, and both would be slower than using FP instructions that were supported by the CPU.
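Since support is a per-core property, one way to see it on Linux is to look at the flags each logical CPU reports. The following is a minimal sketch, assuming the x86 `/proc/cpuinfo` layout with `processor` and `flags` lines; the function name and the sample text are made up for illustration and are not output from a real machine.

```python
def avx512_flags_by_core(cpuinfo_text):
    """Map each logical CPU number to the sorted AVX-512 flags it reports."""
    result = {}
    current = None
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator lines between CPU entries
        key = key.strip()
        if key == "processor":
            current = int(value.strip())
            result[current] = []
        elif key == "flags" and current is not None:
            result[current] = sorted(
                flag for flag in value.split() if flag.startswith("avx512")
            )
    return result


# Made-up sample in the /proc/cpuinfo layout (hypothetical, not real output).
SAMPLE = """\
processor\t: 0
flags\t\t: fpu sse2 avx2 avx512f avx512vbmi gfni

processor\t: 1
flags\t\t: fpu sse2 avx2
"""

print(avx512_flags_by_core(SAMPLE))
```

On a Linux x86 machine the same function could be fed `open("/proc/cpuinfo").read()` instead of the sample string.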
    Test signature

    Comment


    • #12
      Originally posted by bridgman View Post

      AVX-512 is per-core - each core gets more/wider FP registers and the instruction decoder recognizes additional instructions.

      Could potentially be emulated (by taking an "illegal instruction" trap) but the emulation overhead would probably be pretty high. Emulating on CPU would be a lot faster than emulating on a co-processor, and both would be slower than using FP instructions that were supported by the CPU.
      Right, thanks for the answer. I've suspected for a long time that AMD is perhaps planning to introduce some kind of TPU per CCX, or something similar, that could be better suited to implementing these instructions plus a lot more on the CPU. But maybe that's not my smartest idea today.

      Comment


      • #13
        Just because you don't use something doesn't mean it's useless.
        Kernel developers don't use FP mathematics, so some of them think AVX512 is useless.

        Comment


        • #14
          This rant right here tells you everything you need to know about why Linux is an also-ran on the desktop. Intel concentrates on the HPC market because that's where the big bucks are, just ask NVIDIA.

          And AVX-512 is a godsend on HPC workloads; in some cases using AVX-512 is vastly faster than using GPU acceleration.

          The reality is that Intel doesn't care about the desktop market; that's not their bread and butter. People buying a $200-$300 CPU are not going to make or break Intel. Customers that buy $10000 CPUs by the hundreds to thousands to build huge supercomputers for AI, simulations, inference, etc. are what make or break Intel.

          Lastly, floating point is very important in many applications, like scientific computing and video. If you want the highest quality you don't use integer, you use floating point; the only reason to use int is that it's much faster due to the lower precision of the calculations.
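The quality point about integer versus floating point can be illustrated with a toy example. This is a sketch under stated assumptions: the two "filters" (halve the brightness, then double it) and the function names are hypothetical, chosen only to show how truncating to an integer after each step loses information that a floating-point pipeline keeps.

```python
def halve_then_double_int(pixel):
    """Apply both steps on integers, truncating after the halving step."""
    halved = pixel // 2   # the fractional part is discarded here
    return halved * 2


def halve_then_double_float(pixel):
    """Apply both steps in floating point, with no intermediate truncation."""
    return (pixel * 0.5) * 2.0


for value in (100, 101, 255):
    print(value, halve_then_double_int(value), halve_then_double_float(value))
```

Odd input values come back one step darker from the integer path, while the floating-point path returns the original value. Real video filters are far more involved; the sketch only shows where the rounding loss comes from.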

          It's really a shame to see a guy like this jackass making statements like this one.

          Comment


          • #15
            Originally posted by dxin View Post
            Same goes for Intel GPUs. Why waste transistors on something nobody cares about?
            I care. Many others care, including big business. Not everyone requires the performance of a discrete GPU. Without the iGPU we'd need to waste financial, thermal, acoustic and spatial budgets installing one.

            Comment


            • #16
              Originally posted by sophisticles View Post
              This rant right here tells you everything you need to know about why Linux is an also-ran on the desktop. Intel concentrates on the HPC market because that's where the big bucks are, just ask NVIDIA.

              And AVX-512 is a godsend on HPC workloads; in some cases using AVX-512 is vastly faster than using GPU acceleration.

              The reality is that Intel doesn't care about the desktop market; that's not their bread and butter. People buying a $200-$300 CPU are not going to make or break Intel. Customers that buy $10000 CPUs by the hundreds to thousands to build huge supercomputers for AI, simulations, inference, etc. are what make or break Intel.

              Lastly, floating point is very important in many applications, like scientific computing and video. If you want the highest quality you don't use integer, you use floating point; the only reason to use int is that it's much faster due to the lower precision of the calculations.

              It's really a shame to see a guy like this jackass making statements like this one.
              And yet it's generally Linux that's used in the HPC workloads. Linus disliking AVX512 is not why Linux isn't used more frequently on desktop machines.

              Comment


              • #17
                Originally posted by chuckula View Post
                Torvalds ought to look into the GFNI and VBMI as at least two major examples of where AVX-512 can massively improve the performance in critical areas of kernel code that have nothing to do with floating point.

                Here's an article that should be of great interest: https://branchfree.org/2019/05/29/wh...s-perspective/

                Here's a paper showing AVX-512 making base64 encoding basically as fast as the memcpy command can move the data: https://arxiv.org/pdf/1910.05109.pdf Of course, while base64 isn't a kernel-specific algorithm, it's the same type of non-floating point byte-processing that happens all the time in the kernel and that could be massively improved with the correct use of the architecture.
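For readers unfamiliar with why base64 counts as "non-floating point byte-processing", here is a plain scalar sketch of the core mapping: every 3 input bytes are split into four 6-bit indices into a 64-character alphabet. It is not the vectorized method from the linked paper, which is not reproduced here; the function names are made up for illustration, and the sketch skips padding by only accepting whole 3-byte blocks.

```python
import base64

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def encode_block(b0, b1, b2):
    """Encode one 3-byte block as 4 base64 characters."""
    word = (b0 << 16) | (b1 << 8) | b2  # pack 24 bits
    indices = [(word >> shift) & 0x3F for shift in (18, 12, 6, 0)]
    return "".join(ALPHABET[i] for i in indices)


def encode(data):
    """Scalar base64 encode for input whose length is a multiple of 3."""
    if len(data) % 3:
        raise ValueError("sketch handles only whole 3-byte blocks")
    return "".join(
        encode_block(*data[i:i + 3]) for i in range(0, len(data), 3)
    )


sample = b"kernel bytes"  # 12 bytes, so no padding is needed
print(encode(sample))
print(encode(sample) == base64.b64encode(sample).decode("ascii"))
```

The work is all shifts, masks and table lookups on bytes, which is the kind of operation that byte-permute instructions can apply to many bytes at once.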

                Maybe he's just venting after having to politically-correct the kernel source all day.
                Depends how you measure improvement. Advanced instructions usually give a 20-25% performance gain at the cost of +60% power consumption. Example: Ryzen 3600, gaming full load = 50 W, GIMP full load = 80 W. That is why processors can reach higher clocks in games: thermals. So wouldn't +15% clock for +25% power consumption on GIMP, using simpler instructions, give nearly the same performance at lower consumption? So a Ryzen 3650 with SSE6 @ 4.9 GHz = 65 W, and more cores in the same space. Instructions that draw more power than the gaming load are trash, everyone knows it, and this goes for the majority of AVX instructions, not only AVX512. Even VIA, if it gets serious, can destroy them both this round. x86 trash architecture haha.
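The trade-off in this post can be put into numbers. The sketch below takes the figures exactly as quoted (they are the poster's estimates, not measurements) and assumes that performance scales linearly with clock speed, which is a simplification. The function name is made up for illustration.

```python
def perf_per_watt_ratio(perf_gain, power_gain):
    """Relative performance per watt after fractional gains (0.25 means +25%)."""
    return (1 + perf_gain) / (1 + power_gain)


# Going from 50 W to 80 W is the quoted +60% power increase.
power_increase = 80 / 50 - 1

# Upper end of the quoted 20-25% gain, at +60% power.
wide_instructions = perf_per_watt_ratio(0.25, 0.60)

# Quoted alternative: +15% clock for +25% power (assumes perf tracks clock).
higher_clock = perf_per_watt_ratio(0.15, 0.25)

print(round(power_increase, 2))
print(round(wide_instructions, 3))
print(round(higher_clock, 3))
```

Under these assumptions both options lower performance per watt (a ratio below 1.0), but the higher-clock option loses less efficiency while also delivering less absolute performance.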
                Last edited by artivision; 11 July 2020, 11:37 PM.

                Comment


                • #18
                  Originally posted by gigaplex View Post
                  I care. Many others care, including big business. Not everyone requires the performance of a discrete GPU. Without the iGPU we'd need to waste financial, thermal, acoustic and spatial budgets installing one.
                  Not to mention that sometimes an iGPU is faster than a dGPU. For instance, I have a number of PCs: the fastest is an R5 1600 with 16 GB DDR4 and a GTX 1050, and the slowest is an i3 7100 with 16 GB DDR4 and no dGPU (it uses the iGPU).

                  I do a lot of video editing and routinely need to render out a file that has a bunch of filters applied (I use Shotcut). If I use the first PC with a 50-minute source, it takes over 9 hours to finish the encode with software filters. If I enable GPU filters, the R5 + 1050 combo cuts that time down to 5.5 to 6 hours. If I do the same encode on the i3 and use GPU filters with the iGPU, the time is down to just over 3 hours.

                  This is repeatable with other test files. Near as I can tell the iGPU cuts the time so much because it doesn't suffer from memory copy performance penalties (from system ram to gpu ram) that the other system has to perform.
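As a rough check of what those render times mean as speedups, here is a small sketch. The hour values are rounded from the post ("over 9 hours" is taken as 9.0 and "just over 3 hours" as 3.0), so the results are approximate, and the function name is made up for illustration.

```python
def speedup(baseline_hours, accelerated_hours):
    """How many times faster the accelerated run is than the baseline."""
    return baseline_hours / accelerated_hours


SOFTWARE_HOURS = 9.0       # R5 1600, software filters (rounded down)
DGPU_HOURS = (5.5, 6.0)    # R5 1600 + GTX 1050, GPU filters
IGPU_HOURS = 3.0           # i3 7100 iGPU, GPU filters (rounded down)

for hours in DGPU_HOURS:
    print("dGPU", round(speedup(SOFTWARE_HOURS, hours), 2))
print("iGPU", round(speedup(SOFTWARE_HOURS, IGPU_HOURS), 2))
```

With these rounded inputs the dGPU run is roughly 1.5x to 1.6x faster than software filters, and the iGPU run is about 3x faster.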

                  I'm looking forward to Rocket Lake, that Gen 12 Xe iGPU should be awesome for the work I do.

                  Comment


                  • #19
                    Anyway, x86 is total garbage... so get rid of the whole thing, yeah.

                    Comment


                    • #20
                      Originally posted by dragorth View Post
                      Games do care about AVX512, though,
                      Which games are using it? As opposed to just AVX2.

                      Comment
