
The POCL project lets OpenCL 1.2~2.0 run on CPU back-ends, as well as on NVIDIA GPUs over CUDA, on AMD GPUs via HSA, and on other accelerator targets that have LLVM back-end coverage.
POCL 1.5 adds support for the newly released LLVM/Clang 10, a refactoring of the convert_T() OpenCL functions, tracing/profiling improvements, and "a lot" of fixes. The convert_T work in POCL 1.5 fits better with LLVM's auto-vectorization criteria and, according to the documentation, can lead to better SIMD ISA use on CPUs such as Arm, where up to a ~5.5x improvement can be seen in tight loops.
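To give a rough idea of what "fitting the auto-vectorizer's criteria" can mean for a conversion function, here is a minimal sketch in plain C. It is not POCL's code: the names convert_uchar_sat_f32 and convert_buffer are made up for illustration, and the function only mimics the documented behavior of OpenCL's convert_uchar_sat() on a float (clamp to 0..255, truncate toward zero, NaN becomes 0). The point is that the conversion is written as straight-line selects with no early returns, which is the kind of loop body a vectorizer generally handles well.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar stand-in for OpenCL's convert_uchar_sat(float).
   Clamps to [0, 255] and then truncates toward zero. It uses selects
   rather than branches with early returns, so the loop below has a
   simple, uniform body. */
static inline uint8_t convert_uchar_sat_f32(float x)
{
    float lo = x > 0.0f ? x : 0.0f;        /* NaN compares false, so NaN -> 0 */
    float hi = lo < 255.0f ? lo : 255.0f;  /* clamp the upper bound */
    return (uint8_t)hi;                    /* in range, so the cast is well defined */
}

/* Tight conversion loop. The restrict qualifiers promise the compiler
   that src and dst do not overlap, removing one common obstacle to
   vectorization. */
void convert_buffer(const float *restrict src, uint8_t *restrict dst, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] = convert_uchar_sat_f32(src[i]);
}
```

Building this with Clang at -O3 and adding -Rpass=loop-vectorize should report whether the loop was vectorized on a given target; the result and any speedup depend on the compiler version and CPU.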
POCL 1.5 can be downloaded from PortableCL.org.