Intel Prepares GCC Compiler Support For BFloat16
Intel developers continue prepping Linux support for the next-generation Intel Xeon "Cooper Lake" processors, particularly for their new BFloat16 instructions.
BFloat16 is a new floating-point format optimized for machine learning workloads. Besides being found in next-gen Cooper Lake processors, BF16 is also found within Intel's Nervana neural network processors and FPGAs.
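For readers unfamiliar with the format: BF16 keeps the sign bit and all 8 exponent bits of an IEEE 754 single-precision float but only 7 mantissa bits, so a BF16 value is effectively the upper 16 bits of a float32. The Python sketch below illustrates that layout with a round-to-nearest-even conversion. The function names are made up for this illustration, and it is not meant to mirror the exact hardware behavior (for example, how the instructions treat denormals).

```python
import struct


def float_to_bf16_bits(x: float) -> int:
    """Round x to float32, then to a 16-bit bfloat16 pattern (nearest-even).

    Illustrative helper only; the name is hypothetical.
    """
    # Reinterpret the float32 encoding of x as an unsigned 32-bit integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]

    # NaN: exponent all ones and a non-zero mantissa. Truncate and force a
    # mantissa bit so the result is still a NaN.
    if (bits & 0x7F800000) == 0x7F800000 and (bits & 0x007FFFFF):
        return ((bits >> 16) | 0x0040) & 0xFFFF

    # Round to nearest, ties to even: add 0x7FFF plus the lowest kept bit,
    # then drop the low 16 bits.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding_bias) >> 16) & 0xFFFF


def bf16_bits_to_float(h: int) -> float:
    """Expand a 16-bit bfloat16 pattern back to a Python float."""
    # Widening is exact: the bfloat16 bits become the top half of a float32.
    return struct.unpack("<f", struct.pack("<I", (h & 0xFFFF) << 16))[0]
```

With only 7 mantissa bits, a value such as pi comes back as 3.140625 after a round trip, which shows the trade-off: float32-like range at much lower precision.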
Earlier this month Intel developers added BFloat16 support to the GNU Assembler (Gas), and now they have sent out their latest patch enabling BFloat16 support within the GNU Compiler Collection (GCC).
The patch handles the compiler-side work for the new BFloat16 instructions: VCVTNE2PS2BF16, VCVTNEPS2BF16, and VDPBF16PS. Respectively, these AVX512BF16 instructions convert two packed single-precision vectors into one packed BF16 vector, convert one packed single-precision vector into packed BF16, and compute a dot product of BF16 pairs accumulated into packed single precision.
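To make the dot-product description concrete, here is a rough scalar model in Python of what "a dot product of BF16 pairs accumulated into packed single precision" means: each single-precision output lane takes its accumulator value and adds the products of two adjacent BF16 pairs. This is a sketch under assumptions, not the instruction's exact semantics: the function names are hypothetical, the arithmetic is done in Python doubles, and hardware details such as intermediate rounding, denormal handling, and masking are ignored.

```python
import struct


def bf16_to_float(h: int) -> float:
    """Expand a 16-bit bfloat16 pattern to a Python float (hypothetical helper)."""
    return struct.unpack("<f", struct.pack("<I", (h & 0xFFFF) << 16))[0]


def dot_bf16_pairs(acc, a, b):
    """Scalar model of a pairwise BF16 dot product with accumulation.

    acc: list of floats, one per output lane.
    a, b: lists of 16-bit bfloat16 patterns, two elements per output lane.
    Returns a new list where lane i is
        acc[i] + a[2i]*b[2i] + a[2i+1]*b[2i+1].
    """
    if len(a) != len(b) or len(a) != 2 * len(acc):
        raise ValueError("a and b need exactly two BF16 elements per lane")

    out = []
    for i, lane in enumerate(acc):
        lane += bf16_to_float(a[2 * i]) * bf16_to_float(b[2 * i])
        lane += bf16_to_float(a[2 * i + 1]) * bf16_to_float(b[2 * i + 1])
        out.append(lane)
    return out
```

For example, with `a` encoding 1.0, 2.0, 3.0, 4.0 and `b` encoding 0.5, 1.0, 2.0, -1.0, two accumulator lanes starting at 10.0 and 0.0 end up as 12.5 and 2.0.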
The patch is now out for review. With the GCC 9.1 release imminent, we'll see whether it manages to slide into trunk for GCC 9 or has to wait until next year's GCC 10 compiler release.