AMD Announces The Radeon RX 7700 XT & RX 7800 XT Graphics Cards


  • #61
    Originally posted by Myownfriend View Post
AMD doesn't sell PS5s and XBSX/S's to consumers. They were commissioned by Sony and Microsoft respectively to make them. They sell because of the total package that Sony and Microsoft offer in terms of their OSs, controllers, and game libraries. Their games will perform better on their hardware than they would on general PC hardware, too, because their software stack is all built to exploit everything about those SOCs.
DDR5 is the most current memory standard for desktop computers, so there's no DDR6 for them to use. GDDR6 and 6X are for VRAM. They're high bandwidth but very high latency, which means that CPU performance would take a pretty significant hit whenever the CPU needs to access it.
    There's also the issue of GDDR6 DIMMs not being a thing.
They already agreed that they don't want to do DDR6; instead they will do DDR5 MRDIMM at 17,600 MT/s.

    JEDEC has revealed a new standard that accesses two DDR5 memory ranks simultaneously to double bandwidth.


    "The MR stands for "Multi-Ranked Buffered DIMMs," and it is not entirely unlike RAID-ing your RAM.
    MRDIMMs achieve double the data rate that the same hardware would offer in standard DDR5 mode by simultaneously accessing two memory ranks, whether on a single module or a pair of DIMMs. This is made possible by placing a mux between the memory and the CPU that combines the two 64-bit accesses into a single 128-bit data path for the CPU. Obviously, this buffering is going to add a bit of latency to the transfers, but JEDEC seems to believe that this will be offset by the higher transfer rate.
    The main benefit of this approach is that it has a minimal price premium; aside from the buffer/mux, MRDIMMs can be created from existing DDR5 memory stocks. Likewise, machines using MRDIMMs should in theory be backward compatible with standard DDR5 modules. A slide from a JEDEC presentation at Memcon in San Jose, posted by AMD's VP of Datacenter on LinkedIn, seems to imply that JEDEC expects MRDIMMs to start at 8800 MT/s and scale up to 17,600 MT/s by the third generation of the technology.
Interestingly, the slide also says that the need for DDR6 memory is "unclear" due to uncertainty about its value proposition."
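To put rough numbers on the quoted claim, here is a minimal back-of-the-envelope sketch in Python. The 8800 and 17,600 MT/s figures and the two-ranks-muxed-into-one-128-bit-path description come straight from the quote above; the rest is plain arithmetic.

# Peak bandwidth = transfers/s * bytes moved per transfer.
def peak_gb_per_s(mt_per_s, bus_bits):
    return mt_per_s * 1e6 * (bus_bits / 8) / 1e9

# Standard DDR5 mode per the quote: one rank accessed over 64 bits.
print(peak_gb_per_s(8_800, 64))    # 70.4 GB/s
# MRDIMM mode: two ranks muxed into one 128-bit path at the same rate -> 2x.
print(peak_gb_per_s(8_800, 128))   # 140.8 GB/s (first generation, 8800 MT/s)
print(peak_gb_per_s(17_600, 128))  # 281.6 GB/s (third generation, 17,600 MT/s)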


    • #62
      Originally posted by bridgman View Post
      Right... we could make & sell a PC based on the same kind of architecture we use in game consoles (dedicated wide/fast memory) but it would mean that (a) we would need to sell a package with APU and memory, similar to what Apple did with M1/M2 and (b) someone (us or a partner) would need to make and sell a mobo with the APU/memory package soldered down.
Otherwise you would be looking at an EPYC-sized package in order to support all the high speed memory channels (EPYC has a 512-bit memory bus like the M2 Max).
I don't think a sufficient market would exist if the product was in a desktop form factor, and our mobile parts have been moving in that direction (wide, fast, soldered-down memory plus as many CUs as that memory can support) for a while. There is still a big difference between the CU count that can be supported with 128-bit DDR5 and what you can support with 256-bit GDDR6 or 512-bit LPDDR5.
It's certainly doable, but it would have to be designed as a complete product first, with the chip coming second, like we do with game console partners.
Maybe DDR5 MRDIMM (Multi-Ranked Buffered DIMMs) changes things here?

I really don't get why AMD does not do big APUs, because Apple is already all-in with big SOCs/APUs...

I also think the latency problem goes to near zero with 3D cache technology: a 7800X3D is nearly as fast on the slowest RAM as it is on the fastest RAM, because most workloads run in L3 and never go to RAM.

I also see a market in the AI field: people need big VRAM against the memory wall, and these large APUs could give them exactly this.


      • #63
        Originally posted by qarium View Post

They already agreed that they don't want to do DDR6; instead they will do DDR5 MRDIMM at 17,600 MT/s.

        JEDEC has revealed a new standard that accesses two DDR5 memory ranks simultaneously to double bandwidth.


        "The MR stands for "Multi-Ranked Buffered DIMMs," and it is not entirely unlike RAID-ing your RAM.
        MRDIMMs achieve double the data rate that the same hardware would offer in standard DDR5 mode by simultaneously accessing two memory ranks, whether on a single module or a pair of DIMMs. This is made possible by placing a mux between the memory and the CPU that combines the two 64-bit accesses into a single 128-bit data path for the CPU. Obviously, this buffering is going to add a bit of latency to the transfers, but JEDEC seems to believe that this will be offset by the higher transfer rate.
        The main benefit of this approach is that it has a minimal price premium; aside from the buffer/mux, MRDIMMs can be created from existing DDR5 memory stocks. Likewise, machines using MRDIMMs should in theory be backward compatible with standard DDR5 modules. A slide from a JEDEC presentation at Memcon in San Jose, posted by AMD's VP of Datacenter on LinkedIn, seems to imply that JEDEC expects MRDIMMs to start at 8800 MT/s and scale up to 17,600 MT/s by the third generation of the technology.
That's basically the same as increasing the bus width. Assuming the same number of MRDIMMs as DIMMs, that would bring the bandwidth up to 563 GB/s shared by the CPU and GPU. That would be more than enough for a PS5- or XBSX-level SOC, but it would obviously fall way short of that if it weren't maxed out. It's also more external bandwidth than a 7700XT.
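Checking that arithmetic with a quick sketch (the RX 7700 XT's 432 GB/s figure is from its stock spec of 192-bit GDDR6 at 18 Gbps, which isn't stated in the post):

# Dual channel, each channel a 128-bit MRDIMM path at 17,600 MT/s.
soc_bw = 2 * (128 / 8) * 17_600e6 / 1e9
print(soc_bw)                  # 563.2 GB/s, shared by CPU and GPU

# RX 7700 XT for comparison: 192-bit GDDR6 at 18 Gbps per pin.
print((192 / 8) * 18e9 / 1e9)  # 432.0 GB/s, so the MRDIMM pool is indeed larger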




        • #64
          Originally posted by qarium View Post
Maybe DDR5 MRDIMM (Multi-Ranked Buffered DIMMs) changes things here?

I really don't get why AMD does not do big APUs, because Apple is already all-in with big SOCs/APUs...
AMD does make big APUs, just not for consumers. Look up the Instinct MI300. It's for supercomputers.

Apple is all-in with big SOCs because they don't make PCs. One of the selling points of a desktop is its modularity, compatibility with other parts, and upgradability. Apple doesn't do that. As mentioned in our previous discussion, Apple's biggest SOCs are created by using two SOC chiplets to make one SOC that's like 800+ mm2, its external memory is on-package, and its GPU architecture is good at keeping work on-chip.

          Until AMD can scale their GPU compute up with chiplets, they can't just bond two SOCs together like that. They'd have to use a lot more chiplets with one centralized GPU compute chiplet.

          Originally posted by qarium View Post
I also think the latency problem goes to near zero with 3D cache technology: a 7800X3D is nearly as fast on the slowest RAM as it is on the fastest RAM, because most workloads run in L3 and never go to RAM.
The 7800X3D still benefits from faster memory; it just depends on the workload. It's also not contending with a high-powered GPU for a shared pool of memory. In an SOC, those same Zen 4 cores would take a much larger hit when they do need to access external memory.
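One way to square both observations is the textbook average-memory-access-time formula, AMAT = hit time + miss rate x miss penalty: a huge L3 hides DRAM speed only to the extent that the miss rate stays low, and anything that raises effective DRAM latency (like a hungry iGPU sharing the pool) hurts exactly in proportion to those misses. A toy Python sketch, with made-up illustrative numbers rather than measurements of any chip:

# Toy AMAT model: average access time = L3 hit time + miss rate * DRAM latency.
def amat(l3_hit_ns, l3_miss_rate, dram_ns):
    return l3_hit_ns + l3_miss_rate * dram_ns

# Cache-friendly workload on a big 3D V-Cache: misses are rare, so even
# slower/contended DRAM barely moves the average.
print(amat(10, 0.02, 80))   # 11.6 ns
print(amat(10, 0.02, 120))  # 12.4 ns  <- small delta

# Cache-unfriendly workload: the same DRAM penalty now dominates.
print(amat(10, 0.30, 80))   # 34.0 ns
print(amat(10, 0.30, 120))  # 46.0 ns  <- contention hurts a lot here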

They could make an SOC that's a Zen4 chiplet connected to an IO/GPU die that's set up like a 7700XT, with cache chiplets that are connected to MRDIMMs. If the MRDIMMs are maxed out then that would provide enough bandwidth for that configuration with decently low latency, and you'd have a unified memory architecture.

Would a decent enough segment of the PC user base value that over the modularity of having a 7800X3D and a 7700XT, though? I suppose they could still have slots for PCI expansion so they could upgrade to a more powerful GPU later on, but then you're not UMA anymore and you'd just have an overpowered iGPU. On top of that, the motherboards would have to be built to deliver way more power to the socket than a regular Zen 4 CPU needs, and the cooling would have to be way more robust. More importantly, what if people want an AMD CPU and an Nvidia GPU? Nvidia still has the largest market share in discrete GPUs.

          I'm not against more integration between GPUs and CPUs in PCs. I personally have thought it would be cool to have GPU sockets on motherboards with a more specialized link between them and shared cooling but that obviously creates a lot of other issues like socket combinations, motherboard size, availability of cooling solutions, etc.

          Originally posted by qarium View Post
I also see a market in the AI field: people need big VRAM against the memory wall, and these large APUs could give them exactly this.
          People in the AI field already have access to large GPU and CPU clusters as well as large APUs.



          • #65
            Originally posted by Myownfriend View Post
AMD does make big APUs, just not for consumers. Look up the Instinct MI300. It's for supercomputers.
Apple is all-in with big SOCs because they don't make PCs. One of the selling points of a desktop is its modularity, compatibility with other parts, and upgradability. Apple doesn't do that. As mentioned in our previous discussion, Apple's biggest SOCs are created by using two SOC chiplets to make one SOC that's like 800+ mm2, its external memory is on-package, and its GPU architecture is good at keeping work on-chip.
            Until AMD can scale their GPU compute up with chiplets, they can't just bond two SOCs together like that. They'd have to use a lot more chiplets with one centralized GPU compute chiplet.
I honestly do not see the need for AMD to do that. Other companies like Intel need it much more.

And you give yourself the answer:

            "They could make an SOC that's a Zen4 chiplet connected to an IO/GPU die that's set up like a 7700XT with cache chiplets that are connected to MRDIMMs. If the MRDIMMS are maxed out then that would provide enough bandwidth for that configuration with decently low latency and you'd have a unified memory architecture."

With a design like that, a 7800X3D alone would be 3 chiplets: the 3D cache die, the IO die, and the CPU die. With an RDNA3 7800 you would have 1 GPU die and 4 Infinity Cache dies. Together that would already be 8 chiplets...

So tell me: why is Apple better if they use 2 chiplets, and if AMD could do the same with 8 chiplets, why do they need a new GPU design to allow chiplets???

            Originally posted by Myownfriend View Post
The 7800X3D still benefits from faster memory; it just depends on the workload. It's also not contending with a high-powered GPU for a shared pool of memory. In an SOC, those same Zen 4 cores would take a much larger hit when they do need to access external memory.
They could make an SOC that's a Zen4 chiplet connected to an IO/GPU die that's set up like a 7700XT, with cache chiplets that are connected to MRDIMMs. If the MRDIMMs are maxed out then that would provide enough bandwidth for that configuration with decently low latency, and you'd have a unified memory architecture.
Would a decent enough segment of the PC user base value that over the modularity of having a 7800X3D and a 7700XT, though? I suppose they could still have slots for PCI expansion so they could upgrade to a more powerful GPU later on, but then you're not UMA anymore and you'd just have an overpowered iGPU. On top of that, the motherboards would have to be built to deliver way more power to the socket than a regular Zen 4 CPU needs, and the cooling would have to be way more robust. More importantly, what if people want an AMD CPU and an Nvidia GPU? Nvidia still has the largest market share in discrete GPUs.
            I'm not against more integration between GPUs and CPUs in PCs. I personally have thought it would be cool to have GPU sockets on motherboards with a more specialized link between them and shared cooling but that obviously creates a lot of other issues like socket combinations, motherboard size, availability of cooling solutions, etc.
            People in the AI field already have access to large GPU and CPU clusters as well as large APUs.
Well, on desktop and workstation some people want more flexibility, but that's only a small part of the market. Most people who buy OEM computers don't care; they want a final product and never upgrade.

Also, in the notebook/laptop market such a big SOC would eliminate the need for a dGPU, and 99% of all people do not expect to upgrade a notebook/laptop... and 1% buy a Framework laptop and try to upgrade the dGPU...

If I needed to buy a laptop, instead of a laptop with a dGPU I would buy such a big-SOC laptop.


            • #66
              Originally posted by qarium View Post
With a design like that, a 7800X3D alone would be 3 chiplets: the 3D cache die, the IO die, and the CPU die. With an RDNA3 7800 you would have 1 GPU die and 4 Infinity Cache dies. Together that would already be 8 chiplets...
It would be 7 chiplets, because you wouldn't need the Zen 4 I/O die. The iGPU would be provided by the Navi 32-ish GCD, and the interface to external memory would be provided by the chiplets connected to that. But considering that a 7800XT GCD is 68mm2 larger than a Zen I/O die, they could easily make it 8 by adding another Zen 4 chiplet.

The issue with a 7800XT-level IO/GPU die is that the current 7800XT reaches its level of performance because it has 624GB/s of external memory bandwidth, which is more than the max 563GB/s that this fictional SOC would have to share between the GPU and CPU. Current motherboards usually offer four DIMM slots with two 64-bit memory channels. MRDIMMs multiplex the equivalent of two DDR5 DIMMs into one 128-bit memory channel. So assuming motherboards remain dual channel, that would put a 17,600 MT/s dual-channel setup at 563GB/s. That's why I'm treating it as a theoretical max for now. I think that's fair considering we're comparing the external memory bandwidth requirements of current-generation AMD CPUs and GPUs with the bandwidth that could be provided by the third generation of an unreleased DIMM standard.
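For concreteness, a quick sketch of that comparison (the 19.5 Gbps per-pin rate is inferred from the 624GB/s and 256-bit figures, not stated in the post):

# Fictional SOC's shared pool vs. the discrete 7800XT's dedicated VRAM.
mrdimm_bw = 2 * (128 / 8) * 17_600e6 / 1e9  # dual-channel MRDIMM max: 563.2 GB/s
rx7800xt_bw = (256 / 8) * 19.5e9 / 1e9      # 256-bit at 19.5 Gbps:    624.0 GB/s
print(rx7800xt_bw - mrdimm_bw)              # ~60.8 GB/s short, before the CPU even takes its share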

This fictional Navi 32 IO die wouldn't be a regular Navi 32 GCD re-used; it would be Navi 32 with the addition of the four SerDes interfaces, 2 IFOP PHYs, the security processor, and the USB controllers and PHYs that are on the IO die, which would make the Navi 32 I/O die larger than a regular Navi 32.

The next GPU that slots into the 7800XT's slot, whether that be an 8800XT or a 9800XT, will require even more bandwidth.

              Originally posted by qarium View Post
I honestly do not see the need for AMD to do that. Other companies like Intel need it much more.
...
So tell me: why is Apple better if they use 2 chiplets, and if AMD could do the same with 8 chiplets, why do they need a new GPU design to allow chiplets???
Yes, AMD doesn't need to bond two SOCs together; they can use a few chiplets, like I said. I'm not saying that using 2 chiplets is better than 8. I'm just saying that they can't do exactly what Apple did, because they can't scale their GPUs with chiplets.

The advantage of chiplets is that they vastly reduce costs by improving yields, requiring fewer large chips to be created, and allowing dies to be allocated to different designs, but they have their downsides, too. Chiplets tend to put components physically farther away from each other than they would be if the blocks were monolithic. Bringing traffic off-die uses more power, increases latency, and makes it harder to connect them at high bandwidths, and the interfaces needed to connect them often make the individual chiplets larger than they would be if they were blocks in a monolithic chip. That's why AMD has always put eight cores on a CCD, even when CCDs contained two CCXs.

Currently, chiplets aren't really more advanced than what was done in multi-chip modules, and those have been around for years. The Wii U literally had a 3-core CPU chip connected to a GPU/IO die with memory controllers on one package back in 2012. There are ideas to get around the disadvantages of chiplets, though, like bridging chiplets, active interposers, etc. They're not here yet, so chiplet designs need to figure out what makes the most sense to divide and what should be kept together. Those decisions are often going to be determined by what you want your whole product to look like.

In AMD's case, they already have 8-core SOCs, but their iGPUs fall way short of a 7800XT. If they could scale up their GPU compute with chiplets, then they could just use GPU chiplets to expand their small SOCs into larger SOCs. The Ryzen Pro 7940HS has eight Zen 4 cores and 12 CUs. Let's say it was given 16MB of Infinity Cache. Then it could be paired with a chiplet that includes 48 CUs, 48MB of Infinity Cache, and a few more 64-bit memory controllers, and you'd get an SOC with eight Zen 4 cores and an iGPU with 60 CUs, a 256-bit bus, and 64MB of Infinity Cache, just like the 7800XT.
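A toy sketch of that composition, using only the numbers from this paragraph (the GpuBlock bookkeeping is hypothetical, not any real AMD interface):

from dataclasses import dataclass

@dataclass
class GpuBlock:
    cus: int           # compute units
    inf_cache_mb: int  # Infinity Cache size
    bus_bits: int      # external memory bus width

def combine(a, b):
    # Composing an SOC is just summing what each chiplet contributes.
    return GpuBlock(a.cus + b.cus,
                    a.inf_cache_mb + b.inf_cache_mb,
                    a.bus_bits + b.bus_bits)

base_soc = GpuBlock(cus=12, inf_cache_mb=16, bus_bits=128)     # 7940HS-style, with the assumed 16MB
gpu_chiplet = GpuBlock(cus=48, inf_cache_mb=48, bus_bits=128)  # the hypothetical add-on chiplet
print(combine(base_soc, gpu_chiplet))
# GpuBlock(cus=60, inf_cache_mb=64, bus_bits=256) -- 7800XT-like, as described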

That same 48 CU chiplet could be doubled up and paired with a front-end chiplet to make something between a 7600 and a 7700XT. Add another one of those chiplets and you now have a 7900XTX. Imperfect chiplets could have CUs, memory controllers, and Infinity Cache banks disabled to make other SKUs.

              That way you get your monolithic mobile SOCs, your big SOCs, and your entire GPU line out of 3 chiplets and none of them are as large as a Navi 32.

              That can't happen though because they can't scale their GPU compute with chiplets.

              Originally posted by qarium View Post
Well, on desktop and workstation some people want more flexibility, but that's only a small part of the market. Most people who buy OEM computers don't care; they want a final product and never upgrade.
You'd still have to sell those people on the benefits of these large SOCs, though, and so far there isn't much of one. And if they want Nvidia GPUs, then this doesn't help them.

              Originally posted by qarium View Post
Also, in the notebook/laptop market such a big SOC would eliminate the need for a dGPU, and 99% of all people do not expect to upgrade a notebook/laptop... and 1% buy a Framework laptop and try to upgrade the dGPU...

If I needed to buy a laptop, instead of a laptop with a dGPU I would buy such a big-SOC laptop.
That would be a sensible place for a large SOC, and it would make what I suggested above make even more sense. Though if they're in laptops, then they'll also be more bandwidth-limited, since SO-DIMMs often lag behind DIMMs in speed. They'd have to use soldered LPDDR5 with fat buses or just use GDDR6.
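Rough numbers on that trade-off (the memory speeds below are typical current figures I'm assuming, not taken from the post):

def bw(mt_per_s, bus_bits):
    return mt_per_s * 1e6 * bus_bits / 8 / 1e9

print(bw(5_600, 128))   # dual-channel DDR5-5600 SO-DIMMs:         89.6 GB/s
print(bw(7_500, 256))   # soldered 256-bit LPDDR5X-7500 "fat bus": 240.0 GB/s
print(bw(18_000, 128))  # 128-bit GDDR6 at 18 Gbps:                288.0 GB/s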
