AMD Publishes Open-Source Linux HSA Kernel Driver


  • #51
    Originally posted by chithanh View Post
    Last I checked, AMD no longer lists Kabini as HSA compatible[1]. So I would avoid buying Kabini for HSA. The Kabini/Temash successors Beema/Mullins however seem to support HSA in some form.
    Full HSA support requires an APU/CPU/chipset with an IOMMU that supports ATS/PASID. Kabini does not have a compatible IOMMU.
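
    For anyone who wants to check a given box, here is a minimal sketch in C. It assumes the amdkfd driver creates /dev/kfd when loaded, and that lspci -vv prints "Address Translation Service (ATS)" and "Process Address Space ID (PASID)" capability lines for devices that expose them; the exact strings depend on the pciutils version, and showing capabilities may require root.

    /* Rough HSA-prerequisite check: look for the /dev/kfd node created by
     * the amdkfd driver, and scan lspci -vv output for ATS/PASID capability
     * lines. The capability strings are assumptions about lspci's output
     * format; run as root if lspci hides capabilities otherwise. */
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    int main(void)
    {
        char line[512];
        int ats = 0, pasid = 0;
        FILE *pci;

        printf("/dev/kfd present:      %s\n",
               access("/dev/kfd", F_OK) == 0 ? "yes" : "no");

        pci = popen("lspci -vv 2>/dev/null", "r");
        if (!pci) {
            perror("popen");
            return 1;
        }
        while (fgets(line, sizeof(line), pci)) {
            if (strstr(line, "Address Translation Service"))
                ats = 1;
            if (strstr(line, "Process Address Space ID"))
                pasid = 1;
        }
        pclose(pci);

        printf("ATS capability seen:   %s\n", ats ? "yes" : "no");
        printf("PASID capability seen: %s\n", pasid ? "yes" : "no");
        return 0;
    }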

    Comment


    • #52
      Originally posted by dungeon View Post
      Is that the new A10-7800 APU in the picture - the 65W/45W part? I like the lower TDP on that one; 512 shaders at 45W sounds cool.
      You're thinking of the A8-7600 which is available NOWHERE at the moment. We should see it by the end of the year, meaning it was a total paper release. Sad.

      Comment


      • #53
        Originally posted by molecule-eye View Post
        You're thinking of the A8-7600 which is available NOWHERE at the moment. We should see it by the end of the year, meaning it was a total paper release. Sad.
        No, I am thinking of what I asked. After reading this announcement 9 days ago:

        The AMD A10-7800 APU will be available for purchase in Japan starting today, with worldwide availability at the end of July.

        Comment


        • #54
          As I see it, they have it in hand in Japan, and they even advertise a cTDP of 45W.


          Comment


          • #55
            Originally posted by molecule-eye View Post
            You're thinking of the A8-7600 which is available NOWHERE at the moment. We should see it by the end of the year, meaning it was a total paper release. Sad.
            At first glance, it appears that we might have been able to meet the original A8-7600 power levels without having to fuse off any GPU cores, and so decided to sell the part as an A10-7800 instead of an A8-7600. That is not an official statement, just my impression of what probably happened.

            Comment


            • #56
              When Kaveri (desktop) was first presented in January, the A8-7600 was among the models announced, though only the two flavors of A10 have come to market so far. But they presented specs for the A8-7600 both in the default 65W configuration and at the 45W configurable TDP; see the first slide here.

              The numbers are for 65 / 45 Watts:
              Default CPU freq: 3.3 / 3.1 GHz
              Max Turbo Core: 3.8 / 3.3 GHz
              GPU Frequency: 720 / 720 MHz
              CPU Cores: 4 / 4
              GPU Cores: 6 / 6 (384 shaders)

              So defined frequencies for the 45W cTDP were published; for the A10-7800, though, they didn't publish these numbers. At the least, the frequencies can be expected to be a little higher than those of the A8-7600. CPU and GPU core counts are the same as in the 65W config (4 CPU cores, 8 GPU cores aka 512 shaders), of course. The nice part is that the GPU doesn't seem to need downclocking, though in practice, on average, it might run a little slower than in 65W mode.
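
              To make the shader-count bookkeeping explicit: each GCN compute unit contains 64 stream processors, which is where the 384 and 512 figures come from. A trivial sketch of the arithmetic:

              /* GCN packs 64 stream processors ("shaders") into each compute unit,
               * so 6 CUs -> 384 shaders (A8-7600) and 8 CUs -> 512 shaders (A10-7800). */
              #include <stdio.h>

              int main(void)
              {
                  const int shaders_per_cu = 64; /* GCN compute unit width */

                  printf("A8-7600:  6 CUs -> %d shaders\n", 6 * shaders_per_cu);
                  printf("A10-7800: 8 CUs -> %d shaders\n", 8 * shaders_per_cu);
                  return 0;
              }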

              Comment


              • #57
                There are some published specs - click on the "specs" tab at:



                It doesn't seem to say whether the clock frequencies are turbo or not, but from the numbers I imagine they are. The A10-7800 and A8-7600 are both there; I'm guessing that the fabbed parts yielded more A10-7800s than A8-7600s.

                Comment


                • #58
                  The upper (and higher) ones are the turbo clocks;
                  the lower ones are the standard non-turbo clocks.

                  Comment


                  • #59
                    Originally posted by kaprikawn View Post
                    So if I understand this correctly, it means that both the CPU and GPU portions of an APU can both access the same memory (like they've been banging on about for the PS4 and Xbone 180)?

                    Does that mean that before, if you had an APU, some of your RAM was allocated to GPU tasks at startup, and when the CPU needed the GPU to do something then it had to transfer the data from the memory addresses used by the CPU to the parts used by the GPU (even if that was on the same physical stick of RAM)?

                    If my understanding is correct, I'm guessing it has no benefit for users with a CPU and a dedicated GPU where, obviously, the GPU has its own RAM on the card?
                    The point is that, for historical reasons, the GPU has been treated as a kind of weird peripheral, not as a kind of CPU that just happens to use a different ISA from the main CPU. Suppose you have a SoC (i.e. the GPU on the same chip as the CPU) and imagine that you ditched all that historical baggage. How would you do things? The obvious model is that the GPU would be treated by the OS as just another "CPU". (Depending on the details, "the" GPU might in fact be treated by the OS as four or six or eight GPU "cores".) CPU and GPU cores would share the same virtual address space in a coherent fashion. The OS would schedule code on the GPU, just like it does on the CPU. The GPU would use the same page table structure (with the same permissions, and the same ability to fault code in or have it paged out). The GPU would support at least a small subset of interrupts (for example, an interrupt which would allow for context switching).

                    Obviously, for certain purposes you would arrange things for optimal performance (just like you arrange things for optimal audio performance on a CPU). If the task demands it, you would wire down certain pages being accessed by the GPU so that they don't have to fault, with the glitch that implies. You'd run certain GPU threads at real-time priority so they aren't interrupted by less important threads, etc. But the basic model is to have the OS controlling memory management and time scheduling for the GPU cores just like it does for CPUs. The value of this is most obvious when you imagine, for example, that you want a large compute job to run on your GPU, but you want to time-slice it with something realtime like video decoding or game playing, or just the UI. The OS can, on demand, submit bundles of code representing UI updates to the real-time queue and have them immediately executed, but while that's not happening, in any free time, the compute job can do what it does, which might include (for very large jobs) occasionally page faulting to bring in new memory. Compute jobs will no longer have to be written like it's the 80s, manually handling segmentation to swap memory in and out, and manually trying to reduce their outer loop to something that lasts less than a 30th of a second, a la co-operative multi-tasking.

                    But all this is based on the idea that the CPU and GPU cores share a NoC, a common ultra-high-speed communication system, along with a shared address space and a high-performance coherency mechanism (e.g. a common L3 cache). That's not the case for existing discrete GPUs, and it's not clear (at least to me) whether it could be made to work fast enough to be useful over existing PCIe. Basically, this is a model based on the idea that the future of interest is the GPU integrated onto the CPU (or, if necessary, communicating with it over the sort of inter-socket pathways you see on multi-socket Xeon motherboards). This makes gamers scream in fury, because it is very obvious that they are being left behind by it. Well, that's life: gaming just isn't very important compared to mainstream desktop computing and mobile, the worlds that don't use and don't care about discrete GPUs.
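
                    To make the contrast concrete, here is a minimal sketch in C. The gpu_* calls are hypothetical stand-ins, stubbed so the program compiles and runs entirely on the CPU; a real stack would go through an HSA runtime or something like OpenCL 2.0 shared virtual memory. The only point is the shape of the two models: the legacy path stages data into a separate GPU allocation and copies it back, while the shared-address-space path hands the GPU the very same pointer the CPU uses.

                    /* Conceptual sketch only: the gpu_* functions are hypothetical
                     * stubs simulated on the CPU, not a real driver or runtime API. */
                    #include <stdio.h>
                    #include <stdlib.h>
                    #include <string.h>

                    static void *gpu_alloc(size_t n)                                { return malloc(n); }
                    static void  gpu_copy_in(void *dst, const void *src, size_t n)  { memcpy(dst, src, n); }
                    static void  gpu_copy_out(void *dst, const void *src, size_t n) { memcpy(dst, src, n); }
                    static void  gpu_run_square(float *data, size_t n) /* pretend this is a GPU kernel */
                    {
                        for (size_t i = 0; i < n; i++)
                            data[i] *= data[i];
                    }

                    int main(void)
                    {
                        enum { N = 4 };
                        float a[N] = { 1, 2, 3, 4 };
                        float b[N] = { 1, 2, 3, 4 };

                        /* Legacy model: the GPU lives in its own address space, so the
                         * data has to be staged into a separate allocation and copied
                         * back when the kernel finishes. */
                        float *staged = gpu_alloc(sizeof a);
                        gpu_copy_in(staged, a, sizeof a);
                        gpu_run_square(staged, N);
                        gpu_copy_out(a, staged, sizeof a);
                        free(staged);

                        /* HSA model: CPU and GPU cores share one coherent virtual
                         * address space, so the kernel is handed the original pointer;
                         * no staging copies, and (with full HSA) the pages could even
                         * be faulted in on demand. */
                        gpu_run_square(b, N);

                        printf("legacy: %g %g %g %g\n", a[0], a[1], a[2], a[3]);
                        printf("shared: %g %g %g %g\n", b[0], b[1], b[2], b[3]);
                        return 0;
                    }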

                    Comment


                    • #60
                      Originally posted by ObiWan View Post
                      The upper (and higher) ones are the turbo clocks;
                      the lower ones are the standard non-turbo clocks.
                      That made sense for parts with a single power dissipation rating, but for parts with two ratings (e.g. 65W/45W), where I *think* the clock speeds are different at the different ratings, it's less clear how to interpret the numbers. I guess for now I'll stick with the cynic's view that the numbers represent the highest power rating.

                      Comment
