AMD Announces Milan-X 3D V-Cache CPUs, Azure Prepares For Great Upgrade
AMD also teased that ROCm 5.0 is on the way... Besides adding MI200/Aldebaran support there, their open-source compute stack will hopefully see official RDNA/RDNA2 GPU support with ROCm 5.0.
At this point, I'll believe it not only when I see it, but when I have successfully installed it on an RDNA2 GPU system and run code.
MI200 feels odd. It is a dual-GPU card, and it is being compared against the single-chip A100, which on top of that has some unique properties like the ability to split into multiple instances. Beating an already-aging, single-chip A100 by only 1.4x to at most 2.4x in applications, on a newer process node and with still-subpar CUDA-equivalent support, is... questionable.
Also, I don't see any information about power draw, which is very important for this kind of deployment; for example, 9 out of the top 10 systems on the Green500 are NVIDIA GPU driven.
560W I believe.
It's certainly an interesting product. It's not quite 1 GPU or 2 GPUs.
AMD doesn't really have any choice but to compare to the A100. They don't have access to NVidia's next-gen chip.
Last edited by smitty3268; 08 November 2021, 11:36 PM.
Not me. Intel still sacrifices security for performance and several cloud providers say Intel runs too hot and have opted out of Intel for now.
Please show me where Intel still sacrifices security for performance?!
On the other hand, you seem to forget about this:
We discover timing and power variations of the prefetch instruction that can be observed from unprivileged user space. In contrast to previous work on prefetch attacks on Intel, we show that the prefetch instruction on AMD leaks even more information. We demonstrate the significance of this side channel with multiple case studies in real-world scenarios. We demonstrate the first microarchitectural break of (fine-grained) KASLR on AMD CPUs. We monitor kernel activity, e.g., if audio is played over Bluetooth, and establish a covert channel. Finally, we even leak kernel memory with 52.85 B/s with simple Spectre gadgets in the Linux kernel. We show that stronger page table isolation should be activated on AMD CPUs by default to mitigate our presented attacks successfully.
At least get your facts straight when spreading FUD...
It's not a dual GPU. It's dual chiplets, sort of, like 1st-gen Zen. The connection between the chips is coherent.
That is not what matters. What matters in a GPU is whether there is one memory controller for the two chips or two separate ones, i.e. does one chiplet have very fast access to all of the GPU's memory? That goes way beyond what Infinity Fabric handles in Ryzen: there IF deals with relatively slow RAM, while here it would have to carry HBM-class bandwidth, which is far beyond RAM in the Ryzen case. That is a serious challenge. If AMD pulled that off, it is very impressive.
AFAIK, they are still not there yet, so moving data between chips has to go through 4 IF 3.0 links, which gives 200 GB/s of bidirectional bandwidth. That is really impressive, and on par with regular GDDR, but just a fraction of the insane HBM2E bandwidth those chips have.