AMD Lands A Number Of RadeonSI RDNA NGG Fixes Ahead Of RDNA3 Enabling

  • paulocoghi
    replied
    Michael, unfortunately WCCFTech is now publishing some articles copied from Phoronix without giving a proper source link, and also without setting the canonical URL in the page source to point to your original article.

    Look at the following article:
    https://wccftech.com/amd-receives-se...u-architecture

    They need to be notified. They are wrong.
    Last edited by paulocoghi; 13 June 2022, 06:46 PM.


  • qarium
    replied
    Originally posted by brucethemoose
    Another thing I want to ask the AMD driver people: are there any changes specifically for the upcoming multi-chip GPUs?
    If you look at the first multi-chip CPUs, such as the Threadripper 1950X, it can emulate a CPU without a chiplet design, and it can also run in NUMA node mode.
    The Threadripper 1950X has 2 RAM channels per chiplet and cross links between the chiplets.
    The 2990WX and 2970WX can only run in NUMA node mode, because 2 chiplets each have 2 64-bit RAM channels and the other 2 only sit on the cross links without any RAM of their own.

    For RDNA3, I used to think they would make a big IO crossbar chiplet with 2 GPU chips sharing the VRAM on the IO crossbar... but this will not be the case, because syncing the complete data between these 2 GPU cores would be too much overhead in the IO crossbar.

    No, the RDNA3 design will be more like the Threadripper 1950X design, but instead of only cross links between 2 GPU dies they will have an IO crossbar in the middle with Infinity Cache, and every GPU will have its own GDDR6 RAM channel controller. This way every GPU only needs to sync some and not all data over the IO crossbar, and some data can still be shared over all RAM channels to use the full VRAM of all chiplets.

    In the past, if you had a dual GPU with 16GB VRAM, you could only use 8GB VRAM. With the new design, if you have 16GB VRAM you can use the full 16GB for stuff you don't need to sync and 8GB+ for stuff you need to sync.

    Also remember this: the first RDNA3 card will not have 2 GPU chiplets + IO die; instead, the Radeon 7700XT will only have 1 GPU chiplet and the IO die.

    They will release the 7700 first because they need 6 months or more to implement the driver support to utilize the full 2 GPU chiplets plus IO die, which will not be trivial...

    But who cares: the 7700 will have 6950XT-level performance with only 12GB VRAM, and the full 7900XT will have 24GB VRAM.


  • marek
    replied
    Originally posted by skeevy420

    What about driver-level FSR for AMDGPU or generically into Mesa?
    Not at the moment. It would be possible since it's open source, though the implementation would differ between Mesa GL and Mesa Vulkan.

    Originally posted by brucethemoose
    Another thing I want to ask the AMD driver people: are there any changes specifically for the upcoming multi-chip GPUs?
    No comment regarding future chips.

    Originally posted by Steffo
    Can someone explain what NGG is? I couldn't really find anything on the internet. All links pointed to Phoronix, which doesn't really explain what it is.
    NGG was a rewrite of the geometry pipeline in hw. NGG stands for next generation graphics, referring to mesh shaders and the rewrite. It's talked about only because the hw has (had) both the old and new pipeline and driver developers constantly talk about which one to use, but being aware of it is not useful to the general public.


  • Steffo
    replied
    Can someone explain what NGG is? I couldn't really find anything on the internet. All links pointed to Phoronix, which doesn't really explain what it is.


  • Venemo
    replied
    NGG stream-out for RADV was challenging too but seems to have stabilized as well since its wiring up three years ago.
    This is inaccurate. We removed NGG streamout functionality from RADV last year, for two reasons:
    • It was never stable enough to be enabled by default
    • It was LLVM only; we'll need to port it to NIR instead


  • brucethemoose
    replied
    Another thing I want to ask the AMD driver people: are there any changes specifically for the upcoming multi-chip GPUs?


    This seems like a natural fit for tiled rendering, especially with AMD's cache-heavy setup... but that's not how the current GPU drivers work, right?

    Or do they already work that way?


  • skeevy420
    replied
    Originally posted by marek

    No, it's totally different and it would be slower.
    What about driver-level FSR for AMDGPU or generically into Mesa?

    That's the one thing I've been wanting to ask someone who are so wise in the way of graphics.


  • marek
    replied
    Originally posted by ms178
    Is there still a chance to see NGG working (with reasonable benefits) on Vega?
    No, it's totally different and it would be slower.


  • kiffmet
    replied
    ms178

    Asking myself the same, I briefly read the AMDVLK PAL code a few years back. IIRC and according to the code comments, there are several hardware bugs prohibiting this. The issues range from GPU hangs, to signals for synchronization not being delivered, work items being skipped and some low level memory management related stuff.

    There was also an interface for special GDS allocations missing in the kernel module and libdrm; that might still be the case for GFX9. It is likely that the implementation has also changed with RDNA 1/2/3, and the HW registers regarding NGG on Vega aren't really documented, so it's not even guaranteed that the code can be backported to Vega hardware.

    I dunno if it's even possible to work around all of that, but it seems like there would be so much overhead associated with mitigating it that the performance gains would be negligible, or even negative in some cases.

    Just push core and memory clocks as fast as you can while staying within a reasonable power target. It helps fixed function performance (i.e. geometry throughput) quite a bit.
    Last edited by kiffmet; 12 June 2022, 07:55 AM.


  • ms178
    replied
    Is there still a chance to see NGG working (with reasonable benefits) on Vega?
