AMD RDNA3 GPUs Can Have A Lot More Vector Registers Than RDNA2


  • #11
    Hope they will "support" GFX11 in the ROCm stack from the start; not as a first-class citizen, but at least with support compiled by default into their released packages.

    I guess / hope AMD will have a good advantage over Nvidia in this generation. I guess they will, because of:
    - Better energy efficiency
    - Better yields due to the chiplet design
    - Unsure: No large matrix cores with BF16/TF32 support, which eat a lot of chip area but aren't really used. -> Really unfortunate for people wanting to use ML on consumer cards. (not sure regarding WMMA)

    Personally I am using a 5700XT right now. I bought a 3090 recently to do some machine learning, but switched back to the 5700XT as a lot of things don't really work that smoothly on the Linux desktop with the Nvidia card / driver stack.

    I just built a second PC for the 3090, which I turn on when I want to play around with CUDA / machine learning.

    Having ROCm working for ML on GFX11 would be a reason for me to buy an RDNA3 card.

    One big reason I switched back to the 5700XT on my main PC: the Nvidia card draws ~50W when idling on the desktop, while the AMD card only draws ~11W.



    • #12
      Originally posted by Spacefish
      Hope they will "support" GFX11 in the ROCm stack from the start; not as a first-class citizen, but at least with support compiled by default into their released packages.
      I guess / hope AMD will have a good advantage over Nvidia in this generation. I guess they will, because of:
      - Better energy efficiency
      - Better yields due to the chiplet design
      - Unsure: No large matrix cores with BF16/TF32 support, which eat a lot of chip area but aren't really used. -> Really unfortunate for people wanting to use ML on consumer cards. (not sure regarding WMMA)
      Personally I am using a 5700XT right now. I bought a 3090 recently to do some machine learning, but switched back to the 5700XT as a lot of things don't really work that smoothly on the Linux desktop with the Nvidia card / driver stack.
      I just built a second PC for the 3090, which I turn on when I want to play around with CUDA / machine learning.
      Having ROCm working for ML on GFX11 would be a reason for me to buy an RDNA3 card.
      One big reason I switched back to the 5700XT on my main PC: the Nvidia card draws ~50W when idling on the desktop, while the AMD card only draws ~11W.
      "Better yields due to the chiplet design"

      It's not a chiplet design in the sense of multiple GPU dies.

      The 24GB VRAM RDNA3 card has 3D-stacked cache on top of the GPU, like the 5800X3D CPU.
      This card also has 256MB of Infinity Cache. The Infinity Cache is on the GPU die, and there is 756MB of 3D-stacked cache on top of the GPU die. The 5800X3D shows that this makes the design even more energy efficient, because hitting in the 3D cache is more efficient than waiting for data to move from RAM or VRAM.
      The 24GB VRAM card has a 384-bit GDDR6 memory interface. AMD competes with GDDR6X on the Nvidia side using Infinity Cache and 3D-stacked cache.

      Your 5700XT is not optimal if you want to use ROCm/HIP. For example, my Vega 64 is supported in Blender and the 5700 is not yet.
      RDNA2 cards are also supported.

      I am sure AMD will try to provide good driver support for RDNA3 from the start, but on Linux there is never a guarantee of this.

      Phantom circuit Sequence Reducer Dyslexia



      • #13
        Originally posted by SquidHM3

        Yes sir, quite true - I didn't really mention it in my last post because I didn't want to sound snobbish - my current rig is an i9-10980XE with an RTX 3090 (non-Ti) and 256GB of system RAM. I run Windows on that machine, and run a very, very heavily modded Skyrim (among other workloads). 24GB of VRAM might seem excessive, but I've seen it use as much as 92% of that memory!! I've thought about pursuing AMD's professional cards that have 32GB of VRAM on them. Gaming isn't all I do; I also run Einstein@Home and Rosetta@home.

        Heck, the only reason I have a system that nice is all the overtime I worked in 2020-2021. I dropped a metric ton of cash on that thing. Enough to buy a quality used car. In today's economy, that would be highly unwise to do for sure. And I'm sure the wife wouldn't be happy about it either.

        Been a long-time lurker on this forum, and decided it was about time to start getting involved.
        Wow! Do you have any videos of what that heavily modded version of Skyrim looks like? I would love to see what a beefy PC like that can do. I was thinking about getting 48GB on my PC, but that's still a long way away from 256GB.
        I used to do Einstein@Home back in the day but moved over to Folding@home. While I am cautiously optimistic that AMD can put out some competitive cards with the RDNA3 releases, I am concerned about whether AMD will have a competitive real-time ray-tracing product to give you something to upgrade to from the 3090.

        What brought you back to this Linux-focused forum?



        • #14
          IMO, AMD is really well positioned with GPUs, and for the economic downturn.

          Intel GFX won't compete with what's available used.

          Nvidia is too expensive, too hot, and too power-hungry. And it's not clear that they've pushed their arch balance in the right direction for gaming (frame insertion over more frames delivered).

          Plus, unlike Nvidia, AMD has the flexibility to shift its wafer agreements between CPU and GPU production, both of which should be very competitive. They can go where the market takes them, and take share in both places.

          Nvidia can really only hope that data center will eat up their wafer agreements, or get out of them at a penalty.



          • #15
            Originally posted by Qarium
            "Better yields due to the chiplet design"

            It's not a chiplet design in the sense of multiple GPU dies.

            The 24GB VRAM RDNA3 card has 3D-stacked cache on top of the GPU, like the 5800X3D CPU.
            This card also has 256MB of Infinity Cache. The Infinity Cache is on the GPU die, and there is 756MB of 3D-stacked cache on top of the GPU die. The 5800X3D shows that this makes the design even more energy efficient, because hitting in the 3D cache is more efficient than waiting for data to move from RAM or VRAM.
            The 24GB VRAM card has a 384-bit GDDR6 memory interface. AMD competes with GDDR6X on the Nvidia side using Infinity Cache and 3D-stacked cache.
            Not quite. Radeon 7000 high-end is a central compute die, and the 6 chiplets each contain a GDDR6 memory controller and 16MB of Infinity Cache, which has been reworked and is supposed to be a lot more efficient. There are also micro-bumps to stack more cache on top; it's said that stacked cache will only appear on cards that have twice as much memory as what's typically sold into the consumer market.

            This is another great flexibility of their design. They don't have to pay the area cost of cache and memory controllers for binned dies where CUs are defective anyway.

            Or they may have the flexibility, in the future, to populate just 4 of those channels when faster GDDR is available, if it provides BOM savings.
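
            For readers who want the totals implied by this layout, here is a rough back-of-the-envelope sketch. The 6 chiplets and 16MB per chiplet come from the post above, and the 384-bit bus comes from the quoted post. The even split of the bus across chiplets (64 bits each) and the helper name memory_config are assumptions made only for illustration, not confirmed specifications.

```python
# Back-of-the-envelope totals for the memory layout described in the post.
MEMORY_CHIPLETS = 6          # from the post
CACHE_PER_CHIPLET_MB = 16    # from the post
TOTAL_BUS_WIDTH_BITS = 384   # from the quoted post


def memory_config(populated_chiplets):
    """Hypothetical helper: totals when only some chiplets are populated.

    Assumes the 384-bit bus is split evenly across the 6 chiplets.
    """
    bus_per_chiplet = TOTAL_BUS_WIDTH_BITS // MEMORY_CHIPLETS
    return {
        "bus_width_bits": populated_chiplets * bus_per_chiplet,
        "infinity_cache_mb": populated_chiplets * CACHE_PER_CHIPLET_MB,
    }


if __name__ == "__main__":
    print("fully populated:", memory_config(6))
    print("4 channels only:", memory_config(4))
```

            Under those assumptions, a fully populated card ends up with a 384-bit bus and 96MB of Infinity Cache, while populating only 4 channels would give 256 bits and 64MB.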



            • #16
              Originally posted by ravyne
              Not quite. Radeon 7000 high-end is a central compute die, ...
              It's not a compute die like CDNA; it's an RDNA GPU die.
              And this, my friend, is not a chiplet design in the traditional sense of multiple GPUs.

              Originally posted by ravyne
              and the 6 chiplets each contain a GDDR6 memory controller and 16MB of Infinity Cache, which has been reworked and is supposed to be a lot more efficient. There are also micro-bumps to stack more cache on top; it's said that stacked cache will only appear on cards that have twice as much memory as what's typically sold into the consumer market.
              This is another great flexibility of their design. They don't have to pay the area cost of cache and memory controllers for binned dies where CUs are defective anyway.
              Or they may have the flexibility, in the future, to populate just 4 of those channels when faster GDDR is available, if it provides BOM savings.
              I think there is still Infinity Cache in the main GPU die. We will see what the final design looks like, but one thing is for sure: it has many more chips than the Nvidia design.



              • #17
                I'm aware of the differences between CDNA (roughly, an evolved GCN stripped of fixed-function graphics silicon like texture units) and RDNA.

                The current big CDNA is two fully-functioning dies, with their own control circuitry, caches and memory controllers, lashed together with one of them in control. They could (and maybe do) sell a single-die variant, but I'm not sure that's of any interest to those buying CDNA.

                The RDNA3 "compute" die is basically everything a traditional GPU would be, except with Infinity Fabric links where the memory controllers would be. The RDNA3 compute die isn't functional on its own. The IF links do invite interesting speculation about potential multi-die RDNA3 GPUs, or upgrades to different/future memory technologies.

                There will be cache on the main die, L1/L2 of course, as well as scratchpad, in their traditional places. There might be another level of cache as well, and they might even brand it as part of 'Infinity Cache', but it will almost certainly be a different cache level, and very small. No larger than 32MB would be my bet.

                I am very interested to see what the memory hierarchy will look like.



                • #18
                  Originally posted by ravyne
                  They could (and maybe do) sell a single-die variant, but I'm not sure that's of any interest to those buying CDNA.
                  We do offer the MI210, which is a single die on a PCIe/CEM form factor card:



                  The primary advantage is being able to use it in a wider range of systems.
                  Test signature
