Why are there ROCm and PAL OpenCL implementations? Why would I use one over the other?
Radeon Software 18.20 Preview Offers Early Support For Ubuntu 18.04 LTS & RHEL 7.5
-
Originally posted by lostdistance
I was able to run clinfo against amdgpu-pro-18.10 on an AMD Radeon HD 8570 Oland (GCN SI):
It seems like there is some strangeness related to caching of built kernels and/or some other race condition surrounding clEnqueueNDRangeKernel. Running the exact same kernel twice with the exact same arguments, however, works and returns the correct result in the output buffer! Assuming the kernel args (including input buffer memory contents) are generated deterministically like in the program linked above, running the same program twice (thereby calling clEnqueueNDRangeKernel again) works as well. For non-deterministic arguments (e.g. random numbers as input), a simple copy-paste duplication of the clEnqueueNDRangeKernel call before calling clEnqueueReadBuffer seems to do the trick.
As a complete beginner in OpenCL programming, I've no idea why this behaviour occurs or why the above workaround resolves it. Assuming it isn't some bizarre oversight like CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE being set by default for clCreateCommandQueue, I can only assume there's something synchronization-related affecting clEnqueueNDRangeKernel.
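On the out-of-order question: in OpenCL a command queue is in-order unless the CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE bit is set in the properties argument passed to clCreateCommandQueue, so passing 0 requests an in-order queue. A minimal Python sketch of that check follows, assuming the bit values from the Khronos CL/cl.h header. The helper describe_queue_properties is hypothetical and is not part of any OpenCL API; it only decodes the bitfield and does not talk to a driver.

```python
# Bit values as defined in the Khronos OpenCL header CL/cl.h.
CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE = 1 << 0
CL_QUEUE_PROFILING_ENABLE = 1 << 1


def describe_queue_properties(properties: int) -> dict:
    """Decode a cl_command_queue_properties bitfield.

    Hypothetical helper for illustration only: it inspects the integer
    that a host program would pass to clCreateCommandQueue.
    """
    return {
        "out_of_order": bool(properties & CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE),
        "profiling": bool(properties & CL_QUEUE_PROFILING_ENABLE),
    }


if __name__ == "__main__":
    # A properties value of 0 requests an in-order queue without profiling.
    print(describe_queue_properties(0))
    # Out-of-order execution has to be requested explicitly.
    print(describe_queue_properties(CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE))
```

If the host program passes 0 here, out-of-order execution is not the explanation, and a kernel followed by a blocking clEnqueueReadBuffer on the same queue should already be ordered by the queue itself.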
-
Originally posted by StillStuckOnSI
Apologies for all the questions, but just to clarify: Is GCN 1.0/SI completely unsupported (under Orca) until experimental support is flipped on in the PAL, or is it actually supported at present under the "legacy" implementation and any buggy OpenCL behaviour should be considered as such?
Originally posted by Tomin
Why are there ROCm and PAL OpenCL implementations? Why would I use one over the other?
-
Originally posted by Qaridarium
I have 2 Threadripper systems with 3 Vega 64 cards per PC. On the one with the ASRock mainboard, the "sudo ./amdgpu-install --opencl=pal --headless" solution works perfectly,
but on the other one, with the MSI TR4 mainboard, "sudo ./amdgpu-install --opencl=pal --headless" results in very loud fan spin.
Even after upgrading to the 4.17-rc3 kernel it is very loud. Sure, I have to check; maybe there is dirt/dust inside the cards that blocks the air... I will check this later.
There were also a few requests to force higher clock and fan speeds on dedicated compute rigs (to get best performance with bursty workloads) - not sure if that got implemented on the AMDGPU/PRO stack releases but if it was then it might be tied to headless installs.
Last edited by bridgman; 06 May 2018, 03:54 PM.
-
Originally posted by bridgman
Quick answer is that the ROCm stack can run a bit faster (since it makes use of HSA hardware features) but the PAL stack can run on all our hardware, not just parts explicitly designed with full HSA/ROCm hardware support. In general you will see the ROCm paths tested more heavily on ROCm stack releases while testing for the amdgpu/pro packaged releases will focus on PAL paths.
I think the only problem that I have then is that there is no TensorFlow for PAL, only for ROCm, which is quite strict about the platforms that it supports. Anyway, there never was TensorFlow support for AMD before ROCm, so it's not worse now than it used to be.
-
Originally posted by Tomin
I think the only problem that I have then is that there is no TensorFlow for PAL, only for ROCm, which is quite strict about the platforms that it supports. Anyway, there never was TensorFlow support for AMD before ROCm, so it's not worse now than it used to be.
-
Originally posted by bridgman
Other than Tahiti (HD 79xx), my impression was that we had ROCm support on all of the parts which were sufficiently powerful (and had sufficient memory) to be worth running TensorFlow on - where do you see the gaps?
Is it possible to use it with AMD processors like Phenom II paired with some AMD card?
Last edited by Tomin; 06 May 2018, 04:19 PM. Reason: any -> some, sorry but English is not my first language
-
Originally posted by Tomin
Probably just on your documentation: https://rocm.github.io/hardware.html
Is it possible to use it with AMD processors like Phenom II paired with some AMD card?
-
Originally posted by bridgman
Whoops, you're right - you said "platform" not "GPU". Even worse, English IS my first language
I noticed that MIOpen also has an OpenCL version. Does that work on OpenCL stacks other than ROCm OpenCL? It seems interesting, and I wonder why people porting TensorFlow to OpenCL haven't mentioned it.