POCL 1.5 Released With Performance Improvements, Fixes For OpenCL On CPUs
POCL 1.5, the latest release of the "Portable CL" implementation for running OpenCL on CPUs and other devices with LLVM back-ends, is now available.
The POCL project lets OpenCL 1.2~2.0 run on CPU back-ends as well as on NVIDIA GPUs via CUDA, on AMD GPUs via HSA, and on other accelerator targets that have LLVM back-end coverage.
POCL 1.5 adds support for the newly released LLVM/Clang 10, a refactoring of the convert_T() OpenCL functions, tracing/profiling improvements, and "a lot" of fixes. The refactored convert_T code jibes better with LLVM's auto-vectorization criteria and, according to the documentation, can lead to better SIMD ISA use on CPUs such as Arm, where up to a ~5.5x improvement can be seen in tight loops.
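To give a rough idea of why the shape of a conversion routine matters to an auto-vectorizer, here is a minimal sketch in plain C. It is not POCL's code: the names convert_uchar_sat_sketch and convert_buffer are hypothetical, and the function only approximates what OpenCL C's convert_uchar_sat() does for a single float (clamp to 0..255, round toward zero). The clamp is written as two selects instead of early returns, which is the kind of straight-line loop body that compilers can usually turn into SIMD min/max instructions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not taken from POCL: saturating float -> uint8
 * conversion for one element, rounding toward zero. */
static inline uint8_t convert_uchar_sat_sketch(float x)
{
    /* Two selects instead of branches with early returns. A NaN input
     * compares false in the first test and therefore maps to 0. */
    float lo = x > 0.0f ? x : 0.0f;
    float clamped = lo < 255.0f ? lo : 255.0f;
    return (uint8_t)clamped; /* the C cast truncates toward zero */
}

/* Tight loop over a buffer. `restrict` promises that the two buffers do
 * not overlap, which removes one common obstacle to vectorization. */
void convert_buffer(const float *restrict in, uint8_t *restrict out, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        out[i] = convert_uchar_sat_sketch(in[i]);
}
```

Whether a given compiler actually vectorizes this loop depends on the target and optimization flags; with Clang, the -Rpass=loop-vectorize remark option is one way to check.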
POCL 1.5 can be downloaded from PortableCL.org.