Tensor LLVM Extensions Proposed For Targeting AI Accelerators, Emerging Hardware
Intel, Amazon AWS, IBM, Qualcomm, and UIUC researchers have been collaborating on a proposed "Tensor LLVM Extensions" (TLX) to make this open-source compiler infrastructure more suitable for targeting AI accelerators and other emerging classes of hardware.
The proposed Tensor LLVM Extensions would make the widely-used LLVM compiler stack better able to deal with tensor cores and similar hardware for today's increasing AI/ML workloads and related fields. LLVM is already the dominant player when it comes to supporting CPUs and often GPUs, while Tensor LLVM Extensions would help it on the new frontier of handling hardware such as Intel Advanced Matrix Extensions (AMX), NVIDIA tensor cores, AMD matrix cores, Qualcomm HVX, Amazon Inferentia/Trainium, and other accelerators. Right now most of the compiler stacks for such accelerators are closed-source, with no universal solution for sharing optimizations and other compiler features in the way LLVM could provide.
This proposal would make it easier for vendors to create optimizing compiler back-ends for such hardware, leverage existing LLVM front-ends for various programming languages to more easily exploit such tensor hardware, allow MLIR to integrate with this proposed framework, and more. Simply put, they want to extend LLVM IR with a set of common tensor operations that would work across hardware back-ends and to better optimize LLVM for tensor code generation.
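For a rough sense of what target-independent tensor operations in LLVM IR look like, here is a minimal C++ sketch of how a front-end can already emit LLVM's existing llvm.matrix.multiply intrinsic via the IRBuilder API. This is not code from the TLX proposal; the helper name emitMatMul4x4 is purely illustrative. TLX would add analogous but more general tensor intrinsics that back-ends for AMX, tensor cores, HVX, and the like could each lower to native instructions.

```cpp
// Minimal sketch (assumption: illustrative only, not from the TLX RFC).
// Emits C = A * B for 4x4 float matrices, with A and B held as <16 x float>
// vectors, using LLVM's existing llvm.matrix.multiply intrinsic.
#include "llvm/IR/DerivedTypes.h"
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/Intrinsics.h"
#include "llvm/IR/Module.h"

using namespace llvm;

Value *emitMatMul4x4(IRBuilder<> &Builder, Module *M, Value *A, Value *B) {
  // Both operands and the result are flattened 4x4 float matrices.
  auto *VecTy = FixedVectorType::get(Builder.getFloatTy(), 16);

  // llvm.matrix.multiply is overloaded on the result and operand vector types.
  Function *MatMul = Intrinsic::getDeclaration(
      M, Intrinsic::matrix_multiply, {VecTy, VecTy, VecTy});

  // Trailing i32 arguments encode the shapes: rows(A), cols(A)==rows(B), cols(B).
  return Builder.CreateCall(
      MatMul,
      {A, B, Builder.getInt32(4), Builder.getInt32(4), Builder.getInt32(4)});
}
```

The appeal of this style of interface is that the shapes and element types travel with the operation rather than being baked into target-specific builtins, which is what lets a single front-end lowering serve many accelerator back-ends.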
At the moment, those involved are still bringing up a prototype implementation for their own platforms. With existing LLVM front-ends for Rust, C/C++, Intel DPC++, and Julia, this would surely be an interesting effort for allowing more languages to target the growing presence of AI hardware, and to do so in a common manner thanks to LLVM.
Ultimately they want to upstream all of this work into mainline LLVM. This weekend those involved sent out a "request for comments" letter providing a lengthy look at their proposal for LLVM. It will be very interesting to see where this Tensor LLVM Extensions work leads and how widely adopted it will be by the industry for better code sharing in this hotly competitive space.
"Tensor LLVM Extensions" were publicly proposed this weekend.