AMD Publishes Open-Source Linux HSA Kernel Driver
Not expecting benefits to the kernel directly -- the kernel driver exposes functionality to userspace that toolchains can call on to let apps run faster more or less invisibly. Whether an app needs to be rebuilt to take advantage of the new functionality depends on how JIT-based the language and APIs already in use are. For example, a C++ AMP app would need a rebuild, while an OpenCL app probably would not.
HSA requires an application update??
Will HSA provide benefits to the kernel itself? In other words, will I see any benefit if I use existing applications? Or is this an API where applications must be modified to see _any_ benefit? I understand that modified applications can take advantage of HSA. The question is whether kernel-level operations are helped even without application modification.
So, will earlier AMD APU generations (Bobcat and whatnot) be able to make use of any of this? They probably don't have the Youmu, err, IOMMU to make full use of it, but maybe there could be some minor gains anyway?
The A10-7800 has a 3500 MHz base and 3900 MHz max turbo frequency at 65 watts; that was published when it was introduced on the 3rd of July.
On Planet 3DNow! you'll find all the important information for AMD users: news, downloads, support, tests.
As opposed to when the A8-7600 was introduced in January, this time they decided not to communicate the clock speeds for the 45-watt mode. Same for the desktop "Pro" models, some of which come at 65/45 W and 65/35 W TDP, with only the speeds for the higher TDP having been communicated. Sadly, AMD's marketing is often a little questionable, to put it mildly.
And almost nobody seems to have covered that news. Even more sadly.
Originally posted by ObiWan: The upper (and higher) clocks are the turbo clocks; the lower ones are the standard non-turbo clocks.
Originally posted by kaprikawn: So if I understand this correctly, it means that both the CPU and GPU portions of an APU can access the same memory (like they've been banging on about for the PS4 and Xbone 180)?
Does that mean that before, if you had an APU, some of your RAM was allocated to GPU tasks at startup, and when the CPU needed the GPU to do something it had to transfer the data from the memory addresses used by the CPU to the parts used by the GPU (even if that was on the same physical stick of RAM)?
If my understanding is correct, I'm guessing it has no benefit for users with a CPU and a dedicated GPU where, obviously, the GPU has its own RAM on the card?
Obviously, for certain purposes you would arrange things for optimal performance (just as you arrange things for optimal audio performance on a CPU). If the task demands it, you would wire down certain pages being accessed by the GPU so that they don't have to fault, with the glitch that implies. You'd run certain GPU threads at real-time priority so they aren't interrupted by less important threads, and so on. But the basic model is to have the OS controlling memory management and time scheduling for the GPU cores just like for CPUs.

The value of this is most obvious when you imagine, for example, that you want a large compute job to run on your GPU, but you want to time-slice it with something real-time like video decoding, game playing, or just the UI. The OS can, on demand, submit bundles of code representing UI updates to the real-time queue and have them executed immediately; while that isn't happening, in any free time, the compute job can do what it does, which might include (for very large jobs) occasionally page faulting to bring in new memory. Compute jobs will no longer have to be written like it's the 80s: manually handling segmentation to swap memory in and out, manually trying to reduce their outer loop to something that lasts less than a 30th of a second, à la co-operative multitasking.
But all this is based on the idea that the CPU and GPU cores share a NoC (a common ultra-high-speed communication system) along with a shared address space and a high-performance coherency mechanism (e.g. a common L3 cache). That's not the case for existing discrete GPUs, and it's not clear (at least to me) whether it could be made to work fast enough to be useful over existing PCIe. Basically, this is a model based on the idea that the future of interest is the GPU integrated onto the CPU (or, if necessary, communicating with it via the sort of inter-socket communication pathways you see on multi-socket Xeon motherboards). This fact makes gamers scream in fury because it is very obvious that they are being left behind. Well, that's life -- gaming just isn't very important compared to mainstream desktop computing and mobile, the worlds that don't use and don't care about discrete GPUs.