Intel Prepares GCC Compiler Support For BFloat16


  • Intel Prepares GCC Compiler Support For BFloat16

    Phoronix: Intel Prepares GCC Compiler Support For BFloat16

    Intel developers continue prepping the Linux support for next-generation Intel Xeon "Cooper Lake" processors, particularly around its addition of the new BFloat16 instruction...

  • #2
    7 P1 bugs still remain: three are Arm-related, one is related to Mac OS X, one looks like a Fortran issue, and two are ICEs on valid code.

    • #3
      Originally posted by sdack
      7 P1 bugs still remain: three are Arm-related, one is related to Mac OS X, one looks like a Fortran issue, and two are ICEs on valid code.
      Sounds like only the last two bugs are important.

      • #4
        Does anybody else get the feeling Intel is making a big mistake here? That is, using the AVX engine instead of coming up with a new execution unit for AI-style code? We already have examples of custom execution units in ARM hardware (Apple's A12 comes to mind, but there are others). The idea here is that you end up with less complex hardware, running at lower power, that can evolve independently of all other hardware.

        It isn't like this keeps me up at night, because I'm likely to be history before this machine learning hardware and the associated software becomes as advanced as I believe it will. In fact, if the rumors about an Apple ARM-based chip are true, I suspect the motivation will be advanced ML/AI hardware, not so much the ARM cores themselves. In any event, the fear here is that the complex logic needed to expand the capabilities of the AVX unit means high power and a slow evolution of the hardware.

        • #5
          Originally posted by wizard69
          Does anybody else get the feeling Intel is making a big mistake here? ...
          They're actually being smart. Not everyone interested in AI programming has specialized hardware readily available to them. Making tomorrow's standard PCs capable of handling such code more efficiently lets more developers gain their first experience with AI software without additional cost. The alternative would have been to do nothing about it at all. It gives the next generation of PCs more relevance, which means more money in Intel's pocket, but it also benefits AI programming itself. Every bit helps.

          And before AIs become as advanced as you think they will, mankind first needs to work hard to get there. It will take many small steps and won't happen overnight with a revolution or a sudden uprising of AIs.
          Last edited by sdack; 13 April 2019, 01:42 PM.

          • #6
            Originally posted by wizard69
            Does anybody else get the feeling Intel is making a big mistake here? That is, using the AVX engine instead of coming up with a new execution unit for AI-style code? We already have examples of custom execution units in ARM hardware (Apple's A12 comes to mind, but there are others). The idea here is that you end up with less complex hardware, running at lower power, that can evolve independently of all other hardware.

            It isn't like this keeps me up at night, because I'm likely to be history before this machine learning hardware and the associated software becomes as advanced as I believe it will.
            So they're doing it right: don't burn resources as long as you can have it the easy and cheap way.

            • #7
              Originally posted by wizard69
              Does anybody else get the feeling Intel is making a big mistake here? That is, using the AVX engine instead of coming up with a new execution unit for AI-style code? We already have examples of custom execution units in ARM hardware (Apple's A12 comes to mind, but there are others). The idea here is that you end up with less complex hardware, running at lower power, that can evolve independently of all other hardware.
              Yeah, this is a weird move. Intel is making dGPUs, enlarging their iGPUs, and has spent billions on Nervana and Movidius. Why they feel they also need to beef up AVX is mystifying, since it still won't touch GPUs or dedicated ASICs in perf/Watt and certainly won't beat them on ops/$. The only way this makes any business sense is as a stop-gap measure for the less than a year between when this ships and when they can really ramp up those products.

              You could look at the conversion instructions as improving interoperability with their GPUs and Nervana ASICs, but it's a little silly given that one of the main benefits of BFloat16 is how trivially it can be converted via existing instructions.
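
              To make that concrete: bfloat16 is just the top 16 bits of an IEEE-754 float32, so conversion can already be written with ordinary integer operations. A minimal C sketch, with made-up helper names and no NaN special-casing (real hardware, like the new VCVT* instructions, rounds to nearest-even, which the second helper approximates):

              #include <stdint.h>
              #include <string.h>

              static uint16_t f32_to_bf16_truncate(float f)
              {
                  uint32_t bits;
                  memcpy(&bits, &f, sizeof bits);   /* reinterpret the float's bit pattern */
                  return (uint16_t)(bits >> 16);    /* keep sign, exponent, top 7 mantissa bits */
              }

              static uint16_t f32_to_bf16_rne(float f)
              {
                  uint32_t bits;
                  memcpy(&bits, &f, sizeof bits);
                  uint32_t round = 0x7FFFu + ((bits >> 16) & 1u); /* round to nearest, ties to even */
                  return (uint16_t)((bits + round) >> 16);
              }

              static float bf16_to_f32(uint16_t h)
              {
                  uint32_t bits = (uint32_t)h << 16; /* widen back; low mantissa bits become zero */
                  float f;
                  memcpy(&f, &bits, sizeof f);
                  return f;
              }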

              • #8
                Originally posted by sdack
                They're actually being smart. Not everyone interested in AI programming has specialized hardware readily available to them.
                Their iGPUs pack more raw compute than these things, and are more efficient as well. The typical laptop or desktop PC already has a lot more horsepower in its iGPU than if they bolted BFloat16-enabled AVX-512 units onto those CPU cores. This move isn't really defensible. I can only imagine some marketing dweeb heard that BFloat16 was the next big thing and decided they needed to have it everywhere, for some reason.

                • #9
                  VCVTNE2PS2BF16, VCVTNEPS2BF16, and VDPBF16PS
                  I too like my CPU instructions to look like I died on my keyboard.
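
                  For what it's worth, the names decode roughly as: VCVTNE2PS2BF16 packs two float32 vectors into one bfloat16 vector, VCVTNEPS2BF16 converts a single float32 vector, and VDPBF16PS accumulates dot products of bfloat16 pairs into float32. A minimal sketch of how they might be reached from C, assuming a compiler with AVX-512 BF16 support (something like gcc -mavx512bf16) and treating the exact intrinsic names and signatures below as an assumption to verify against your compiler's <immintrin.h>:

                  #include <immintrin.h>

                  __m512 bf16_dot_step(__m512 acc,
                                       __m512 a_lo, __m512 a_hi,
                                       __m512 b_lo, __m512 b_hi)
                  {
                      /* VCVTNE2PS2BF16: pack two float32 vectors into one
                         bfloat16 vector, rounding to nearest-even.
                         (VCVTNEPS2BF16 would be _mm512_cvtneps_pbh(), converting
                         a single float32 vector to a half-width bfloat16 vector.) */
                      __m512bh a = _mm512_cvtne2ps_pbh(a_hi, a_lo);
                      __m512bh b = _mm512_cvtne2ps_pbh(b_hi, b_lo);

                      /* VDPBF16PS: multiply bfloat16 pairs and accumulate
                         the products into the float32 accumulator. */
                      return _mm512_dpbf16_ps(acc, a, b);
                  }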

                  • #10
                    Originally posted by coder
                    This move isn't really defensible.
                    But it is. All the suggestions you're making are more complex and focus only on maximum compute power, yet it is complexity which hampers development. Adding a simple instruction works as a stepping stone into AI programming. You can then already get far more efficient solutions, and Intel would only be competing with those. Not every problem then requires a big AI solution; some will only need a little one. With a single CPU instruction one can solve many little AI problems and leave the bigger ones to dedicated hardware units.
