AMD To Open-Source Its Linux Execution & Compilation Stack


  • #16
    HSA is AMD's vision of the future of integrating the CPU and GPU, so that apps can use the GPU seamlessly for compute.

    It includes both hardware and software elements.

    On the software side, they are creating a generic IL that is not vendor specific, and allowing drivers to compile that down to the hardware. Actually, it sounds kind of like Gallium/TGSI. So languages would get compiled down to HSAIL, and then drivers would take that and run it on the GPU.

    AMD is addressing this via HSA. HSA addresses these fundamental points by introducing an intermediate layer (HSAIL) that insulates software stacks from the individual ISAs. This is a fundamental enabler to the convergence of SW stacks on top of HC.

    Unless the install base is large enough, the investment to port *all* standard languages across to an ISA is prohibitively large. Individual companies like AMD are motivated but can only target a few languages at a time. And the software community is not motivated if the install base is fragmented. HSA breaks this deadlock by providing a "virtual ISA" in the form of HSAIL that unifies the view of HW platforms for SW developers. It is important to note that this is not just about functionality but preserves performance sufficiently to make the SW stack truly portable across HSA platforms.
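The "virtual ISA" idea can be sketched with a toy lowering pass: one vendor-neutral intermediate form (standing in for HSAIL) is translated into two different fictitious target ISAs. All instruction names and target strings below are invented for illustration; a real HSAIL finalizer is far more involved.

```cpp
#include <sstream>
#include <string>
#include <vector>

// A toy intermediate-language instruction (stand-in for HSAIL; invented here).
struct ILInst {
    std::string op;   // "add" or "mul"
    int dst, a, b;    // virtual register numbers
};

// Lower the vendor-neutral IL to a hypothetical target's assembly text.
// Each back-end maps the same IL onto its own mnemonics, the way an
// HSA finalizer would map HSAIL onto a specific GPU ISA.
std::string lower(const std::vector<ILInst>& il, const std::string& target) {
    std::ostringstream out;
    for (const auto& i : il) {
        if (target == "gpu_a")           // fictitious ISA #1
            out << (i.op == "add" ? "V_ADD " : "V_MUL ")
                << "v" << i.dst << ", v" << i.a << ", v" << i.b << "\n";
        else                             // fictitious ISA #2
            out << (i.op == "add" ? "fadd " : "fmul ")
                << "%r" << i.dst << ", %r" << i.a << ", %r" << i.b << "\n";
    }
    return out.str();
}
```

The same IL program yields two different lowerings, which is the portability argument in miniature: languages target the IL once, and each vendor's back-end owns the last step.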
    Hardware is supposed to be out in 2014, with certain elements done earlier:
    Existing APIs for GPGPU are not the easiest to use and have not had widespread adoption by mainstream programmers. In HSA we have taken a look at all the issues in programming GPUs that have hindered mainstream adoption of heterogeneous compute and changed the hardware architecture to address those. In fact the goal of HSA is to make the GPU in the APU a first class programmable processor as easy to program as today's CPUs. In particular, HSA incorporates critical hardware features which accomplish the following:

    1. GPU Compute C++ support: This gives heterogeneous compute access to many of the programming constructs that only CPU programmers can use today.

    2. HSA Memory Management Unit: This allows all system memory to be accessed by either the CPU or the GPU, depending on need. In today's world, only a subset of system memory can be used by the GPU.

    3. Unified Address Space for CPU and GPU: The unified address space provides ease of programming for developers to create applications. By not requiring separate memory pointers for CPU and GPU, libraries can simplify their interfaces.

    4. GPU uses pageable system memory via CPU pointers: This is the first time the GPU can take advantage of the CPU virtual address space. With pageable system memory, the GPU can reference the data directly in the CPU domain. In all prior generations, data had to be copied between the two spaces or page-locked prior to use.

    5. Fully coherent memory between CPU & GPU: This allows for data to be cached in the CPU or the GPU, and referenced by either. In all previous generations, GPU caches had to be flushed at command buffer boundaries prior to CPU access. And unlike discrete GPUs, the CPU and GPU share a high-speed coherent bus.

    6. GPU compute context switch and GPU graphics pre-emption: GPU tasks can be context switched, making the GPU in the APU a multi-tasker. Context switching means faster application, graphics and compute interoperation. Users get a snappier, more interactive experience. As UIs become increasingly touch-focused, it is critical for applications trying to respond to touch input to get access to the GPU with the lowest latency possible, to give users immediate feedback on their interactions. With context switching and pre-emption, time criticality is added to the tasks assigned to the processors. Direct access to the hardware for multiple users or multiple applications is either prioritized or equalized.
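Points 2 through 5 above boil down to replacing explicit copies with shared pointers. The contrast can be sketched in plain C++ as a CPU-only simulation; no HSA API appears here, and the "kernel" is just a loop standing in for GPU work.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Pre-HSA model (simulated): the "GPU" has its own buffer, so data must be
// copied in, processed, and copied back across the two memory spaces.
std::vector<float> scale_with_copies(const std::vector<float>& host, float k) {
    std::vector<float> device(host.size());
    std::memcpy(device.data(), host.data(), host.size() * sizeof(float)); // copy in
    for (auto& x : device) x *= k;                                        // "kernel"
    std::vector<float> result(device.size());
    std::memcpy(result.data(), device.data(), device.size() * sizeof(float)); // copy out
    return result;
}

// HSA model (simulated): the "kernel" dereferences the CPU's pointer
// directly; with a unified, coherent address space there is nothing to copy.
void scale_in_place(float* data, std::size_t n, float k) {
    for (std::size_t i = 0; i < n; ++i) data[i] *= k;
}
```

Both functions compute the same result; the point is that the second has no copy-in/copy-out steps, which is exactly the overhead the unified address space and pageable-memory features eliminate.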

    As a result, HSA is a purpose designed architecture to enable the software ecosystem to combine and exploit the complementary capabilities of CPUs (sequential programming) and GPUs (parallel processing) to deliver new capabilities to users that go beyond the traditional usage scenarios. It may be the first time a processor company has made such significant investment primarily to improve ease of programming!

    In addition, on an HSA architecture the application codes to the hardware, which enables user-mode queueing, hardware scheduling, much lower dispatch times, and reduced memory operations. We eliminate memory copies, reduce dispatch overhead, eliminate unnecessary driver code, eliminate cache flushes, and enable the GPU to be applied to new workloads. We have done extensive analysis on several workloads and have obtained significant performance-per-joule savings for workloads such as face detection, image stabilization, gesture recognition, etc.
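The queueing and dispatch-priority ideas above can be sketched as a toy task queue in which time-critical work (such as touch feedback) runs before batch compute. This is an invented model for illustration only, not an HSA interface.

```cpp
#include <queue>
#include <string>
#include <vector>

// Toy model of prioritized dispatch: with pre-emption and priorities, the
// "GPU" picks the most urgent task next rather than running submissions
// strictly in arrival order. Higher priority = more time-critical.
struct Task {
    int priority;
    std::string name;
    bool operator<(const Task& o) const { return priority < o.priority; }
};

// Drain a set of submitted tasks in priority order, returning the names
// in the order the "hardware scheduler" would run them.
std::vector<std::string> run_order(std::vector<Task> submitted) {
    std::priority_queue<Task> q(submitted.begin(), submitted.end());
    std::vector<std::string> order;
    while (!q.empty()) {
        order.push_back(q.top().name);
        q.pop();
    }
    return order;
}
```

A UI task submitted last but with high priority still runs first, which is the "lowest latency possible for touch input" behavior the post describes.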

    Finally, AMD has stated from the beginning that our intention is to make HSA an open standard, and we have been working with several industry partners who share our vision for the industry and share our commitment to making this easy form of heterogeneous computing become prevalent in the industry. While I can't get into specifics at this time, expect to hear more about this in a few weeks at the AMD Fusion Developer Summit (AFDS).

    So you see why HSA is different and why we are excited
    Quotes are from AMD, via an Anandtech article: http://www.anandtech.com/show/5847/a...ds-manju-hegde
    Last edited by smitty3268; 06-20-2012, 02:50 PM.



    • #17
      Certainly powerful stuff, and the future. But I would rather see a proper spec for the new AMD GPUs any day of the week.



      • #18
        Originally posted by jvillain View Post
        Certainly powerful stuff, and the future. But I would rather see a proper spec for the new AMD GPUs any day of the week.
        [Slightly offtopic]

        Good point.

        Are the documents found at http://www.x.org/docs/AMD/ complete, in terms of released docs?
        If so - that means there's no specific documentation available for Evergreen (HD 5000) and newer. I wasn't aware of that. :/
        Otherwise, can someone in charge please dump the newer stuff in there?
        Is there another location for the docs that I'm overlooking?
        Last edited by entropy; 06-20-2012, 04:24 PM.



        • #19
          http://developer.amd.com/SDKS/AMDAPP...s/default.aspx

          Originally posted by entropy View Post
          [Slightly offtopic]

          Good point.

          Are the documents found at http://www.x.org/docs/AMD/ complete, in terms of released docs?
          If so - that means there's no specific documentation available for Evergreen (HD 5000) and newer. I wasn't aware of that. :/
          Otherwise, can someone in charge please dump the newer stuff in there?
          Is there another location for the docs that I'm overlooking?



          • #20
            Thanks!

            Still, information seems a bit scattered all over the place.



            • #21
              Originally posted by rrohbeck
              Now if they could explain what they plan in a way that a simple computer scientist like myself understands what they're going to do...
              GPGPU + CPU + Possibly other stuff == New multicore computer architecture.

              So AMD is releasing software that will help people develop compilers and libraries to take advantage of it.



              • #22
                Originally posted by entropy View Post
                Thanks! Still, information seems a bit scattered all over the place.
                We post info to the xorg wiki first then update amd.com later. Looks like we need another sync there.

                The 3D programming docco on the xorg wiki covers from 6xx through NI pretty well since they all use the same core architecture. It wouldn't hurt to add delta docs for things like attribute interpolation in the shaders, although the ISA docs and driver code also cover the changes. SI needs an all new docco set and that should probably be highest priority.



                • #23
                  Well, CISC is not RISC and does not decode anything into RISC (that would be recompiling); decoding is another thing. RISC means relative instructions (one relates to another, or continues another); CISC is the opposite (complex relations between instructions). A CISC cannot execute many instructions directly, so it executes them as micro-ops (frames in space or in time); that's the only way for a CISC, because 20-30 different units is not possible, whereas 7 units are good, while a RISC can have only 1 (vector registers). Micro-ops are not RISC because they are not relative and are not instructions. RISC is 20 times smaller for the same general processing power and 40 times smaller for stream processing like gaming, and that's because you can't find any game with statically compiled graphics that would use the complexity.



                  • #24
                    Originally posted by bridgman View Post
                    We post info to the xorg wiki first then update amd.com later. Looks like we need another sync there.

                    The 3D programming docco on the xorg wiki covers from 6xx through NI pretty well since they all use the same core architecture. It wouldn't hurt to add delta docs for things like attribute interpolation in the shaders, although the ISA docs and driver code also cover the changes. SI needs an all new docco set and that should probably be highest priority.
                    A sync would be nice but as I see now it's all properly linked at
                    http://www.x.org/wiki/RadeonFeature#Documentation

                    My bad!



                    • #25
                      Whoops, my bad too... I completely forgot about that. We actually had docco in a third place -- the Stream (now APP) SDK web pages, so agd5f built a list to wrap them all. Apologies if it was someone other than agd5f.



                      • #26
                        Originally posted by ldesnogu View Post
                        They probably used what is considered in the industry as the best C++ front-end: Edison Design Group C++. It's used by almost all larger companies that produce their own compilers (TI, Intel to name a few).
                        Note that this opinion has not been re-evaluated in recent years. That is, everyone uses EDG because everyone uses EDG and there weren't any alternatives (GCC's frontend is not only GPL'd, it's also a ****ing nightmare to integrate into anything else, GPL'd or not). Clang sprang into existence only a few years ago and is already at the same level of C++11 support as GCC. One can expect that in the coming years, a lot of tools may well migrate from EDG to Clang. The ones that don't will mostly be those monstrous clunking behemoths for which replacing the frontend would be far too much work, or those that have invested too heavily in modifying the EDG-based frontend (like Microsoft's compiler, which lags in C++11 because it's essentially a fork of an ancient version of EDG, but all of the crazy Microsoft extensions like C++/CLI and C++/CX only exist in that frontend).
