AMD Ryzen 7 5800X vs. Ryzen 7 5800X3D On Linux 6.0 Benchmarks

  • #11
    Originally posted by Jabberwocky View Post
    AI, video processing, compression and (from other benchmarks) gaming see big improvements from 3D stacked cache.

    It was very difficult for me to choose between the 5800X3D and the 5900X.

    Having a fast chip like the 5800X3D helps when you are running 4 DIMMs, heck, even 2 DIMMs when you are overclocking RAM.

    I ended up going for the 5900X despite this. I took convenient security over RAM speed and cache size.

    For my workloads, two CCDs help with isolating software and allow both CCDs to run without as many current or future mitigations. At least that's the idea...

    My 5900X struggles with 4 DIMMs + overclocking; fortunately, I don't need to overclock at this point in time.
    That makes me think... AMD could have replaced Threadripper with a 5900X3D/5950X3D + more 1P EPYC SKUs.

    The cache would make up for the relatively narrow dual channel bus. If you are really memory bandwidth or PCIe bound, EPYC would serve you better than TR anyway.



    • #12
      I have this 5800X3D and it's very good at feeding my 6900XT Merc for Windows gaming. I recommend people get it, or the 7000-series equivalents, with some fast DDR5 RAM.



      • #13
        Originally posted by brucethemoose View Post

        That makes me think... AMD could have replaced Threadripper with a 5900X3D/5950X3D + more 1P EPYC SKUs.

        The cache would make up for the relatively narrow dual channel bus. If you are really memory bandwidth or PCIe bound, EPYC would serve you better than TR anyway.
        AMD's Robert talked in a podcast, shortly before the release of the 5800X3D, about why they were not releasing a 3D cache version of the 5900X/5950X SKUs. Long story short: due to the Infinity Fabric bandwidth between the two CCDs, when there was a performance gain it was smaller than what the 5800X3D was getting, and overall it was a big regression in performance. What they found was that giving both CCDs massive amounts of L3 cache caused cores to want to cross over and access the L3 on the other CCD more often than with the 32 MB versions. This mostly happened when one CCD had more free L3 available than the other. The 5800X3D was perfect because it was a single CCD with zero core-to-core communication going over Infinity Fabric.

        With the 7000 series it seems like AMD made bigger improvements to Infinity Fabric bandwidth, along with 3D cache improvements, to help mitigate the regressions from the fabric. I'm sure running on DDR5, with its higher default speeds, helps with this as well.



        • #14
          Originally posted by middy View Post
          AMD's Robert talked in a podcast, shortly before the release of the 5800X3D, about why they were not releasing a 3D cache version of the 5900X/5950X SKUs. Long story short: due to the Infinity Fabric bandwidth between the two CCDs, when there was a performance gain it was smaller than what the 5800X3D was getting, and overall it was a big regression in performance. What they found was that giving both CCDs massive amounts of L3 cache caused cores to want to cross over and access the L3 on the other CCD more often than with the 32 MB versions. This mostly happened when one CCD had more free L3 available than the other. The 5800X3D was perfect because it was a single CCD with zero core-to-core communication going over Infinity Fabric.

          With the 7000 series it seems like AMD made bigger improvements to Infinity Fabric bandwidth, along with 3D cache improvements, to help mitigate the regressions from the fabric. I'm sure running on DDR5, with its higher default speeds, helps with this as well.
          That's very interesting. I assume turning off L3 cache sharing across dies isn't as easy as flipping a switch, either.





          • #15
            For future generations, they could add a larger and "slower" memory stacked onto the I/O die, like 2-8 GB of HBM3 or something along those lines.
            Have it software-controlled, such that the OS could split this additional memory into "cache" and a "high-speed memory area" mapped into the physical address space, so that pages can be allocated on it.

            It would even benefit peripherals like faster network cards, as they could have their memory buffers allocated in that "fast" memory / write to it via DMA instead of into main DRAM, avoiding the full latency penalty of a memory access when a CPU core reads it later on.

            Maybe AMD being required to use GlobalFoundries 12/14 nm capacity held that back, due to power/thermal restrictions or the non-availability of stacking technology for the GF process?

            Especially with the integrated GPU on the upcoming Ryzen CPUs, extra "fast memory" coupled to the I/O die (which includes the GPU) could be another win, as the GPU is memory-bandwidth limited most of the time, especially for certain compute tasks.
            I guess they are saving this for an upcoming product... It will be interesting to see I/O die shots of Ryzen 7xxx CPUs.
            Last edited by Spacefish; 16 September 2022, 02:34 PM.



            • #16
              Originally posted by Spacefish View Post
              Maybe AMD being required to use GlobalFoundries 12/14 nm capacity held that back, due to power/thermal restrictions or the non-availability of stacking technology for the GF process?
              No, HBM price per GB and supply are what's holding that back :P. Stacked memory is indeed the future, but it just doesn't make economic sense on consumer stuff in the short term.


              I am told the latency benefits of HBM wouldn't even be that significant... but I'm not sure about that. Surely the shorter traces have to help?


              A wider (LP)DDR5 IGP bus is not out of the question, but it seems that laptop/desktop OEMs could not care less about IGP performance, seeing how they allegedly shunned Van Gogh (the Steam Deck APU). Apple seems to be the only one who cares; hence they ordered a big Broadwell IGP from Intel before they just did it themselves. Heck, many still don't use symmetric dual-channel configs.
              Last edited by brucethemoose; 16 September 2022, 06:11 PM.



              • #17
                Originally posted by Spacefish View Post
                For future generations, they could add a larger and "slower" memory stacked onto the I/O die, like 2-8 GB of HBM3 or something along those lines.
                I'm not sure there is much to be gained from that. X3D operates at a native ~50-cycle L3 latency and over 2 TB/s, while Infinity Fabric is somewhere in the low three-digit GB/s range with much higher latency. Sure, you would relieve the memory a bit, but you would also use more power.

                What's much more likely is that future GPU dies could get X3D stacked on top. For example, Phoenix might get a separate GPU die, and then CPU and GPU could each get their own L3 extension, or an RDNA 3 refresh could come with stacked cache.



                • #18
                  Michael Will there also be gaming benchmarks? That's what the CPU was marketed for.
                  It would also be interesting to compare both of them vs. the fastest 65 W TDP Zen 3 (the 5700X). Do you happen to have one of those?



                  • #19
                    Originally posted by aaahaaap View Post
                    Michael Will there also be gaming benchmarks? That's what the CPU was marketed for.
                    It would also be interesting to compare both of them vs. the fastest 65 W TDP Zen 3 (the 5700X). Do you happen to have one of those?
                    5800X/X3D gaming benchmarks will be part of my Zen 4 review.

                    No, I don't have a 5700X.
                    Michael Larabel
                    https://www.michaellarabel.com/



                    • #20
                      Originally posted by brucethemoose View Post
                      Stacked memory is indeed the future, but
                      "Kinda sorta", but also not.

                      The biggest problem with it (as Apple users are discovering) is that it means you're committing to that exact memory size and can't change it. Given that a RAM upgrade is the first and most significant thing users can do to extend a machine's lifetime, if you don't already have a predefined load you're either paying for something VERY expensive that serves no purpose for years, or you're forced to discard and replace a fully functional CPU for no reason. Both are very poor outcomes at best, and even if you DO have a known load, there's a pretty good chance you'll want to add more RAM a couple of years from now, as all the cloud providers are discovering now that there's no more free money to be had.

                      So yeah, from a tech standpoint it's undeniably great, but from a practical perspective it's a disaster. Improved processes etc. won't change the calculus on that, nor will anything else on the roadmap. I can see it being used for SoC-level systems like IoT devices and cellphones, but for "real" machinery it's just never going to be viable unless you can make enough money from it to keep repurposing (or, essentially, throwing away) machines roughly annually. Trading firms and DCs will do that, but nobody else will.

