Intel Publishes Whitepaper On New BFloat16 Floating-Point Format For Future CPUs


  • #11
    Originally posted by microcode

    Well, if they do that you can easily sue them on any shipment they represent that way. It is not FP32 precision, it is FP32 range, it's a specialized format. I highly doubt that this will be misrepresented egregiously and I don't really see why people think this is such a big deal. This is incredibly simple to do in hardware, and it has major benefits for these and some other workloads.
    What it is also great for is marketing: "Look, our next generation has made a large jump in FP32(******) performance!" (******): in very select workloads where precision doesn't matter, blabla, another 5 lines with clarifications in an even smaller font size.

    Intel is after all a company driven by the marketing department.

    I don't doubt that it's a nifty format for certain uses. And if it's easy to implement in HW and SW, all the better.
    Last edited by mlau; 16 November 2018, 03:02 AM.



    • #12
      Originally posted by wizard69
      This seems to be a long way off. What I’d like to know is why neither Intel nor AMD has defined a specialized processor core for these workloads, as Apple and other ARM developers have done with specialized ML accelerators.
      Because they both make GPUs, which are kinda that.

      Specialized hardware blocks make more sense in cell phone SoCs because power-efficiency is more valuable than die area. Qualcomm took a slightly different approach of enhancing its existing DSP block to run machine learning (although they can also employ the Adreno GPU and CPU blocks).

      Both Intel and AMD added support for IEEE 754 half-precision floats years ago. I think Intel added it in Gen 8 (Broadwell; 2014) HD Graphics and AMD added it in Vega.

      Intel is adding "DL-boost" to their upcoming 10 nm CPU cores, which is probably the context of this article. I think it's basically some subset of AVX-512 vector extensions that utilize this BFloat16 format.

      AMD keeps adding more deep learning instructions to GCN, but not yet anything that can compare with Nvidia's Tensor cores.



      • #13
        Originally posted by microcode
        I don't really see why people think this is such a big deal. This is incredibly simple to do in hardware, and it has major benefits for these and some other workloads.
        As I said, I think the half-floats specified in IEEE 754 are more generally useful. That's what GPUs have used, to date.

        To wit:

        Q: When is 256 + 1 = 256?

        A: when you're using BFloat16.
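
For anyone who wants to check the arithmetic, here is a minimal Python sketch. It assumes BFloat16 is simply the top 16 bits of an IEEE 754 single-precision float, with round-to-nearest-even applied to the discarded bits. The helpers `to_bfloat16` and `bf16_add` are illustrative names, not part of any Intel API, and NaN handling is skipped.

```python
import struct


def to_bfloat16(x: float) -> float:
    """Round a finite float to the nearest BFloat16 value (ties to even).

    Assumption: BFloat16 == upper 16 bits of an IEEE 754 float32.
    NaN inputs are not handled in this sketch.
    """
    # Reinterpret the float32 encoding of x as a 32-bit unsigned integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Add a bias so that dropping the low 16 bits rounds to nearest, ties to even.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    bits = (bits + rounding_bias) & 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def bf16_add(a: float, b: float) -> float:
    """Add two numbers, rounding the operands and the result to BFloat16."""
    return to_bfloat16(to_bfloat16(a) + to_bfloat16(b))


def to_half(x: float) -> float:
    """Round a float to IEEE 754 half precision using struct's 'e' format."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


print(bf16_add(256.0, 1.0))   # 256.0: the spacing between BFloat16 values at 256 is 2
print(to_half(256.0 + 1.0))   # 257.0: half precision has 11 significand bits
```

With 8 significand bits (7 stored plus the implicit leading 1), 257 falls exactly halfway between 256 and 258, and the tie rounds to the even neighbor, 256.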



        • #14
          Originally posted by coder
          As I said, I think the half-floats specified in IEEE 754 are more generally useful. That's what GPUs have used, to date.

          To wit:

          Q: When is 256 + 1 = 256?

          A: when you're using BFloat16.
          Saturating addition is a good choice for some use cases.



          • #15
            Originally posted by microcode
            Saturating addition is a good choice for some use cases.
            I don't disagree with that statement, but that doesn't really speak to my point. I was just trying to illustrate what low precision this format has.

            IEEE 754 half-precision was created as a balance between range and precision, whereas BFloat16 is all about range. IMO, that limits its potential for a great many uses. It's fine for deep learning, but not a whole lot else. I'd rather the industry stuck with existing half-precision.
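
To put rough numbers on that trade-off, here is a small Python sketch that derives the largest finite value and the spacing just above 1.0 from each format's bit layout. The `FloatFormat` class is a hypothetical helper written for this post; the bit counts used (5 exponent and 10 fraction bits for half precision, 8 and 7 for BFloat16, 8 and 23 for FP32) are the standard layouts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FloatFormat:
    """Describes a binary floating-point layout (hypothetical helper)."""
    name: str
    exponent_bits: int
    fraction_bits: int  # stored fraction bits, excluding the implicit leading 1

    @property
    def epsilon(self) -> float:
        # Gap between 1.0 and the next representable value.
        return 2.0 ** -self.fraction_bits

    @property
    def max_finite(self) -> float:
        # Largest finite value: all-ones fraction at the largest normal exponent.
        bias = 2 ** (self.exponent_bits - 1) - 1
        return (2.0 - self.epsilon) * 2.0 ** bias


HALF = FloatFormat("IEEE half", exponent_bits=5, fraction_bits=10)
BFLOAT16 = FloatFormat("BFloat16", exponent_bits=8, fraction_bits=7)
FP32 = FloatFormat("FP32", exponent_bits=8, fraction_bits=23)

for fmt in (HALF, BFLOAT16, FP32):
    print(f"{fmt.name}: max finite = {fmt.max_finite:.4g}, epsilon = {fmt.epsilon:.3g}")
```

Under these layouts, BFloat16 reaches almost the same maximum as FP32 but with a relative step of 2^-7 (roughly two to three decimal digits), while half precision tops out at 65504 with a step of 2^-10.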

