TensorFlow Lite Now Supports Tapping OpenCL For Much Faster GPU Inference
TensorFlow Lite, the framework for AI inference on mobile devices, now supports making use of OpenCL on Android. With the new back-end, TFLite inference is around 2x faster than with the existing OpenGL back-end.
To little surprise, the TensorFlow developers are finding their new OpenCL back-end for TFLite to be much faster than their OpenGL back-end for mobile inference. Thanks to better performance profiling abilities, native FP16 support, constant memory, and OpenCL being designed for compute from the start rather than bolted on as with OpenGL ES compute shaders, TFLite performance is much improved -- and especially so compared to doing inference on the mobile SoC's CPU cores.
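For those wanting to try it out, the OpenCL back-end is not selected directly but rather through TFLite's existing GPU delegate, which will use OpenCL when available and fall back to OpenGL ES otherwise. Below is a minimal sketch using the Java API, assuming a model already memory-mapped into a MappedByteBuffer; the output shape here is just a placeholder and is model-specific:

    import java.nio.MappedByteBuffer;

    import org.tensorflow.lite.Interpreter;
    import org.tensorflow.lite.gpu.GpuDelegate;

    public class GpuInferenceExample {
        // modelBuffer: a TFLite model previously memory-mapped from app assets
        static float[][] runOnGpu(MappedByteBuffer modelBuffer, float[][] input) {
            // The GPU delegate transparently picks the OpenCL back-end when
            // available and falls back to OpenGL ES otherwise.
            GpuDelegate delegate = new GpuDelegate();
            Interpreter.Options options = new Interpreter.Options().addDelegate(delegate);

            float[][] output = new float[1][1001]; // placeholder shape, model-specific
            try (Interpreter interpreter = new Interpreter(modelBuffer, options)) {
                interpreter.run(input, output);
            } finally {
                delegate.close(); // release GPU resources held by the delegate
            }
            return output;
        }
    }

Since the delegate handles back-end selection internally, applications already using the GPU delegate should pick up the OpenCL speed-up without code changes once running against the updated runtime.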
More insight on the new OpenCL back-end for TensorFlow Lite is available via the TensorFlow.org blog: "Our new OpenCL backend is roughly twice as fast as the OpenGL backend, but does particularly better on Adreno devices (annotated with SD), as we have tuned the workgroup sizes with Adreno's performance profilers mentioned earlier."