AMD's Background On The ROCm OpenCL Stack


  • #11
    Originally posted by Meteorhead
    Then how in the blazes did Vega appear with an AMDGPU-PRO driver (that SOLELY supports Vega and nothing more)??!!
    They have been using the GCN architecture since the 7XXX (really late 6XXX) series cards; Vega is the first new architecture, which they have dubbed NCU. Raja has publicly stated multiple times that they have had more challenges than expected developing drivers to take advantage of the new architecture. I'm pretty sure this has been the top priority and other things have been on the side (and delayed) while this was the main focus, hence so many other things falling behind schedule. They still seem to be behind on Vega drivers, and it seems some things are not yet enabled in the drivers, hence the weirdness with the Vega FE launch.



    • #12
      Originally posted by bridgman

      Your list includes both driver stacks and driver components; that is why it seems confusing. There are three "released" stacks (where we release matched sets of components together) and one we call "all-open" which is currently the set of open source components maintained in upstream projects.

      Released stacks - Catalyst Linux (aka fglrx) / AMDGPU-PRO / ROCm -- with AMDGPU-PRO replacing Catalyst Linux as you say. Strictly speaking "fglrx" is the kernel driver of the Catalyst Linux stack but a lot of people used fglrx as a name for the full stack.

      All-open stack - radeon or amdgpu kernel driver (newer HW uses amdgpu) plus libdrm plus Mesa plus an X driver (radeon/amdgpu/modesetting), plus OpenCL/Vulkan userspace drivers as we open source them and upstream the associated kernel code - assembled by distro packagers for each distro release



      The radeonsi and radv drivers plug into Mesa - the Mesa project includes code for things like OpenGL API support which is used by all HW platforms, in conjunction with lower level HW-specific drivers like radeonsi.



      amdgpu is a kernel driver; radeonsi is a userspace driver which plugs into Mesa. The two drivers (amdgpu and radeonsi) work together, although there are other smaller components connecting them including "winsys" and libdrm-amdgpu.

      The radv driver also plugs into Mesa and runs over the amdgpu kernel driver.



      The two stacks came out a couple of days apart IIRC - June 27 for AMDGPU-PRO 17.20 and June 29 for ROCm 1.6. Just different release dates - the 17.20 stack was the one which had to align with Vega FE board launch.



      Um... no. There is no connection whatsoever between amdkfd and DAL/DC.

      The amdkfd code for dGPU could not be upstreamed initially because it allowed pinning from userspace (pinning on allocation to be precise) in order to align with existing programming models. We recently implemented a working "eviction" mechanism for amdkfd which allowed us to provide the illusion of pinning from userspace for amdkfd userspace code but ensured that regular graphics userspace drivers & apps would not run out of physical memory as a consequence.

      Now that the eviction code is in place we are going ahead with upstreaming the latest amdkfd code for dGPU. The amdkfd code for APU has been upstream for years.



      DAL/DC (aka "display") is the only kernel driver code shared with other OSes/platforms, although we do share userspace driver code between OSes/platforms in the case of the closed source OpenGL and Vulkan drivers.

      The HSA/ROCm stack was initially developed for both Windows and Linux but there was essentially no perceived customer interest in the Windows implementation and as a result we focused primarily on Linux... IOW our customers seemed to feel Windows was a second-tier computing/development platform long before ROCm came along.

      We will probably end up having to support the ROCm stack on Windows anyways but in most cases it seems to be a "checkbox" requirement rather than something that customers plan to use for large-scale computing.



      I haven't had a chance to watch any Brazilian soap operas (although in fairness I don't think I have watched any non-Brazilian soap operas either) so I can't really comment, but if you think about the stacks in the following way things will probably make more sense:

      AMDGPU-PRO stack - aimed at traditional CAD workstation market, replacing Catalyst Linux

      ROCm stack - aimed at HPC, generally headless server systems

      all-open stack - covers pretty much everything else including desktop/gaming etc.

      The AMDGPU-PRO and ROCm stacks are growing together over time - both of them will continue to include some code which is considered non-upstreamable, eg the Kernel Compatibility Layer code which allows a single kernel driver source tree to build and run against a variety of kernel versions.
      What would be the best way to use OpenCL in Ubuntu 16.04 with an HD 7970? Does AMDGPU-PRO properly support OpenCL with an HD 7970 on Ubuntu 16.04?

      The HD 7970 offers good OpenCL performance under fglrx in Ubuntu 14.04 - we have a machine with such a setup at our research lab. We would like to upgrade to Ubuntu 16.04 so the machine is brought in line with the rest of our cluster, but the current driver landscape is confusing. We fear that upgrading will result in poor or no OpenCL support. Thanks in advance.



      • #13
        Originally posted by fakenmc

        What would be the best way to use OpenCL in Ubuntu 16.04 with an HD 7970? Does AMDGPU-PRO properly support OpenCL with an HD 7970 on Ubuntu 16.04?

        The HD 7970 offers good OpenCL performance under fglrx in Ubuntu 14.04 - we have a machine with such a setup at our research lab. We would like to upgrade to Ubuntu 16.04 so the machine is brought in line with the rest of our cluster, but the current driver landscape is confusing. We fear that upgrading will result in poor or no OpenCL support. Thanks in advance.
        I have a more recent scenario: when will we be able to actually use our RX 480/5xx GPGPUs with OpenCL 1.2/2.x on Debian and other distros that are not CentOS- or Ubuntu-based?



        • #14
          Originally posted by bridgman
          The AMDGPU-PRO and ROCm stacks are growing together over time - both of them will continue to include some code which is considered non-upstreamable, eg the Kernel Compatibility Layer code which allows a single kernel driver source tree to build and run against a variety of kernel versions.
          Hmm, I didn't realize ROCm was intended to remain proprietary; I thought that was the way you were planning to open source the OpenCL code.

          Can you clarify how that will happen? Is there just going to be an open source version of ROCm public somewhere, like the amdgpu kernel driver upstream compared to the -pro driver version? Or is the CL driver not going to use ROCm at all?



          • #15
            Originally posted by smitty3268

            Hmm, I didn't realize ROCm was intended to remain proprietary; I thought that was the way you were planning to open source the OpenCL code.
            It is open source already. It was released a month or so ago. However, at any given time there may be features that are not upstream yet or not upstreamable in their current form. I would even clarify what John said a bit more. We are moving to a unified driver stack for packaged delivery. This includes pro, rocm, and all-open. That stack may contain stuff that is not upstream for various reasons and kernel compatibility layers to support building the driver against various enterprise kernels. That way we can deliver pro, rocm, or the open stack the same way depending on the particular use case.



            • #16
              We have a large deployment of Fedora boxes at work. A fully working OpenCL stack is the only thing keeping us from switching to AMD GPUs. When will we see OpenCL fully open and upstreamed so it can be included in distros like Fedora?



              • #17
                Originally posted by smitty3268
                Hmm, I didn't realize ROCm was intended to remain proprietary; I thought that was the way you were planning to open source the OpenCL code.
                Not only is ROCm already open source (other than the HSAIL shader compiler which has been largely replaced with the LLVM-based direct-to-ISA compiler) but the portions of the AMDGPU-PRO stack it is being aligned with are already open source as well.

                As agd5f said, the things we are bringing closer together are code versions (the AMDGPU-PRO kernel driver now includes recent amdkfd with dGPU support for example) and delivery mechanisms.
                Last edited by bridgman; 05 July 2017, 07:44 PM.
                Test signature



                • #18
                  Originally posted by agd5f

                  It is open source already. It was released a month or so ago. However, at any given time there may be features that are not upstream yet or not upstreamable in their current form. I would even clarify what John said a bit more. We are moving to a unified driver stack for packaged delivery. This includes pro, rocm, and all-open. That stack may contain stuff that is not upstream for various reasons and kernel compatibility layers to support building the driver against various enterprise kernels. That way we can deliver pro, rocm, or the open stack the same way depending on the particular use case.
                  Ok, thanks - that all makes sense.



                  • #19
                    Is there open source OpenCL support for the Kaveri APU?



                    • #20
                      Not exactly. We have released open source OpenCL running over ROCm, and we did do the early HSA/ROCm development on Kaveri (and pushed the work upstream) so most of the big pieces are there. That said, the OpenCL-over-ROCm path has not been tested on Kaveri recently AFAIK and so would probably require a non-trivial amount of effort to get it up to production quality.
                      Test signature

