Running OpenCL On The CPU With POCL 1.0, Xeon & EPYC Testing
-
Michael, thanks for this test. I'm very interested in POCL as a way to drop a direct dependency on CUDA for a small part of an application.
One question:
When you say "how ... EPYC POCL OpenCL performance compares to GPUs running OpenCL", did the GPUs in your benchmark run OpenCL directly, or through POCL using POCL's GPU backends?
-
Originally posted by nh2_
Michael, thanks for this test. I'm very interested in POCL as a way to drop a direct dependency on CUDA for a small part of an application.
One question:
When you say "how ... EPYC POCL OpenCL performance compares to GPUs running OpenCL", did the GPUs in your benchmark run OpenCL directly, or through POCL using POCL's GPU backends?
Michael Larabel
https://www.michaellarabel.com/
-
Thanks for the nice article!
The Core i7 numbers were actually surprisingly good given how little we have focused on optimizing pocl on CPUs lately. However, please keep in mind that when you compile a GPU-optimized kernel for a CPU device, the performance is expected to suffer in comparison to a CPU-optimized implementation. I'm not familiar with Blender's OpenCL implementation, but typically OpenCL kernels are GPU-optimized. In practice this means a lot of work-items, heavy use of local memory, and kernels that can have barriers in tricky locations, all of which can heavily hinder CPU optimizations such as autovectorization.
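To make the barrier point concrete, here is a minimal hand-written sketch in plain C of how a CPU runtime can execute one work-group of a GPU-style kernel. It illustrates the general loop-splitting idea only and is not pocl's actual generated code; the kernel `neighbor_sum`, the `WG_SIZE` value, and the function names are all hypothetical and chosen for the illustration.

```c
#include <assert.h>
#include <stddef.h>

#define WG_SIZE 8  /* hypothetical work-group size, chosen for the illustration */

/*
 * The OpenCL C kernel being emulated (for reference only, not compiled here):
 *
 *   __kernel void neighbor_sum(__global const float *in, __global float *out) {
 *       __local float tile[WG_SIZE];
 *       size_t lid = get_local_id(0);
 *       size_t gid = get_global_id(0);
 *       tile[lid] = in[gid] * 2.0f;
 *       barrier(CLK_LOCAL_MEM_FENCE);
 *       out[gid] = tile[lid] + tile[(lid + 1) % WG_SIZE];
 *   }
 */

/* Runs one whole work-group on a single CPU thread. */
static void run_work_group(const float *in, float *out, size_t group_id)
{
    float tile[WG_SIZE];  /* "local memory" is ordinary stack memory on a CPU */
    size_t base = group_id * WG_SIZE;

    /* Code before the barrier: one loop over all work-items.
       Iterations are independent, so a compiler can vectorize this loop. */
    for (size_t lid = 0; lid < WG_SIZE; ++lid)
        tile[lid] = in[base + lid] * 2.0f;

    /* The barrier becomes the boundary between the two loops: every
       work-item must finish the first region before any starts the second. */

    /* Code after the barrier: a second loop over all work-items. */
    for (size_t lid = 0; lid < WG_SIZE; ++lid)
        out[base + lid] = tile[lid] + tile[(lid + 1) % WG_SIZE];
}

/* Runs the whole NDRange by iterating over work-groups; n must be a
   multiple of WG_SIZE in this simplified sketch. */
void neighbor_sum_cpu(const float *in, float *out, size_t n)
{
    assert(n % WG_SIZE == 0);
    for (size_t g = 0; g < n / WG_SIZE; ++g)
        run_work_group(in, out, g);
}
```

With a single barrier at the top level of the kernel, the split is straightforward. When a barrier sits inside a loop or a conditional, the regions around it are much harder to turn into clean, vectorizable loops, which is one way "tricky locations" can hurt on a CPU.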
Having said that, I have a list of things that could be done to improve the "performance portability" of GPU-optimized OpenCL kernels to CPU/SIMD ISAs, but unfortunately too little time lately. If someone cares enough about OpenCL GPU-to-CPU performance portability to contribute to the project, just let us know and we will be happy to point you in the right direction. The same goes for thread (WG and multi-kernel) scheduling to improve multicore/multithread scalability: there is a lot of known low-hanging fruit there for someone to pick.
Thanks again and keep up the good work on Phoronix!
Pekka,
the lead developer of pocl
P.S. Another interesting comparison would be how pocl 1.0rc1 compares to NVIDIA's OpenCL 1.2 via the CUDA backend.
-
Originally posted by Spooktra
How funny is it that a $100 GTX 1050 is capable of smoking high-end Intel and AMD based servers costing thousands of dollars.
-
Originally posted by Pekka
Thanks for the nice article!
Another interesting comparison would be how pocl 1.0rc1 compares to NVIDIA's OpenCL 1.2 via the CUDA backend.
For me, the key interesting thing about POCL is being able to write recent OpenCL that works cross-platform.