**drakonas777** · 09 March 2023, 04:30 AM

I just don't see how scheduling for AMD's 3D V-Cache CCDs is "a lot easier" than scheduling for Intel's hybrid architecture. Personally, I think both are going to be quite complex, especially if the goal is to optimize not only for performance but for power efficiency as well. I'm not saying you are wrong about the L3 performance counters argument; I just don't think it's as easy as you think it is.

**Weasel** · 09 March 2023, 10:15 AM

Originally posted by agd5f

In that case buy a 7800X3D or an Epyc.

Unfortunately, only 8 cores, so it's not any better than 7950X3D if I could "disable" the others without 3D VCache.

EPYC seems a bit on the expensive side. Although I don't mind that as much, the real issue is that it won't work with consumer motherboards and so on. So I'd need a completely different system.

Originally posted by agd5f

You can get performance variations across cores from any number of factors (asymmetry in cores, thermal headroom, differences in silicon, etc.).

Right but at least those aren't by design, or at least, not as impacting.

**Weasel** · 09 March 2023, 10:21 AM

Originally posted by coder

I think you're missing a key point about the E-cores, which is that they're actually faster than when a thread is sharing a P-core with another hyperthread. Not only that, they occupy only a quarter the area of a P-core and use even less power than that! So, they're quite simply the most cost-efficient and energy-efficient way to scale multi-threaded performance, period.

The only reason to have any P-cores is for lightly-threaded workloads. Once your thread-count increases enough, you're better off with more E-cores.

Then just make all cores E-cores and problem solved. Ok maybe not true E-cores but a mix of P and E core. Decently strong single-threaded perf and not huge like P cores.

For me the issue is mixing two different types of cores.

Originally posted by coder

Well, then Intel has got you covered!

https://www.anandtech.com/show/18741...pcie-5-0-lanes

If you prefer AMD, you can buy a Zen 4 EPYC today. Or you can wait for their Zen 4 Threadrippers, later this year.

Those don't seem to have huge 3D VCache though?

Especially since I'll definitely go with ECC memory, which is on the slower side, so cache is even more valuable than what you'll find in most benchmarks.

**Weasel** · 09 March 2023, 10:23 AM

Originally posted by F.Ultra

The 5800X3D was crippled in the sense that it runs at an 11% lower base clock and a 4% lower boost clock. So far, trading frequency for L3 tends to mostly favour games, and since games are still far from fully utilizing more than 6 cores, it makes financial sense for AMD to spec the 7950X3D and the 7900X3D the way they did. That way they get better performance in games while still maintaining some performance in non-game applications.

The sad truth is that a "non crippled" 7900x3d or 7950x3d would just be equal in performance in games to what the current models are and be slower for other workloads. Yes scheduling would be linear and non-complex, but there would be no other benefit and only drawbacks.

Unfortunately that's not the case for me. I don't care about insane frequency boosts. Cache is way more important. Especially since I go with slower ECC memory.

**F.Ultra** · 09 March 2023, 06:59 PM

Originally posted by Weasel

Unfortunately that's not the case for me. I don't care about insane frequency boosts. Cache is way more important. Especially since I go with slower ECC memory.

Yes, not everyone has the same workloads, that is true. But it's not just the boost clock that is lower; the base clock is also quite a bit lower. What I hope for is that AMD solves this problem so we can have both a large cache and a high base clock.

**coder** · 09 March 2023, 07:03 PM

Originally posted by Weasel

Right but at least those aren't by design, or at least, not as impacting.

So, you run with hyperthreading disabled? Because that can introduce performance variability on par with P- vs. E- cores.

Originally posted by Weasel

Then just make all cores E-cores and problem solved. Ok maybe not true E-cores but a mix of P and E core. Decently strong single-threaded perf and not huge like P cores.

You need to decide on your priorities. I think most Intel boards will let you disable the E-cores in BIOS, if you have them. You should also disable hyperthreading, while you're at it.

If you care more about energy efficiency, then just stick with all E-cores. Buy an 8-core Alder Lake-N today, and then Sierra Forest in 2025. Or AMD's Zen 4c-based Bergamo, later this year.

If you can deal with not having 3D cache, then buy a non-3D Ryzen, disable SMT, and learn to be content with that.

If you can't deal with not having 3D cache and want more than 8 cores, then accept that you've upsold yourself out of the desktop segment.

Originally posted by Weasel

Especially since I'll definitely go with ECC memory, which is on the slower side, so cache is even more valuable than what you'll find in most benchmarks.

ECC memory isn't intrinsically slower. The biggest performance impact backed by empirical evidence I've seen is 1-2%. The DIMMs are typically marketed at slower, JEDEC speeds, though. People have overclocked them with a decent amount of success.

Also, now that Intel joined their HEDT segment with Xeon W (allowing the retail-boxed models to be overclocked) and dropped UDIMM support from the platform, there are now some factory-overclocked ECC RDIMMs on the market.

https://www.tomshardware.com/news/gs...-ecc-rdimm-ram

**Weasel** · 10 March 2023, 11:01 AM

Originally posted by coder

So, you run with hyperthreading disabled? Because that can introduce performance variability on par with P- vs. E- cores.

Doubt it. It mainly goes into effect when you use more than the "physical" threads. Of course discounting threads that live for a very short time, those are mostly insignificant. I'm talking 100% load here for each core.

If the kernel puts two 100% load threads on the same core with nothing else then it's the kernel scheduler's problem. But that doesn't happen. SMT/HyperThreading is not that hard to schedule.

Also the thing with SMT is that it gives you more for the same physical cores/hardware, by utilizing it better. Meanwhile splitting cache to only half the cores is designed to give you less. It's the complete opposite.

Originally posted by coder

ECC memory isn't intrinsically slower. The biggest performance impact backed by empirical evidence I've seen is 1-2%. The DIMMs are typically marketed at slower, JEDEC speeds, though. People have overclocked them with a decent amount of success.

Yeah, that's what I meant.

**coder** · 10 March 2023, 01:15 PM

Originally posted by Weasel

Also the thing with SMT is that it gives you more for the same physical cores/hardware, by utilizing it better. Meanwhile splitting cache to only half the cores is designed to give you less. It's the complete opposite.

E-cores give you more performance per Watt, and more performance per mm^2 (therefore more performance per dollar). So, in a way, they're also about giving you more of what you paid for (i.e. the silicon).

With 3D cache, you didn't pay for the second cache chiplet. Furthermore, you get more clockspeed on that compute die, giving you a benefit even without it.

Concerns about performance inconsistency can easily be addressed, if the scheduler simply rotates threads between E-cores and P-cores. The same can also be done for partially-subscribed SMT. All that's needed is for the scheduler to have a rough idea of the cores' relative performance. Admittedly, this approach doesn't extend well to additional L3 cache, since the impact is so workload-specific.
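To make the rotation idea concrete, here is a toy Python model. It is only a sketch under simplifying assumptions: one thread per core, made-up relative core speeds, equal time slices, and no migration or cache-warmup cost. It is not how the Linux scheduler actually works, and `simulate` is a name invented for this example.

```python
def simulate(core_speeds, slices, rotate):
    """Return the work each thread completes after `slices` time slices.

    core_speeds: relative performance of each core (hypothetical numbers).
    rotate: if True, thread i runs on core (i + t) % n during slice t;
            if False, thread i stays pinned to core i.
    """
    n = len(core_speeds)
    work = [0] * n  # one thread per core
    for t in range(slices):
        for thread in range(n):
            core = (thread + t) % n if rotate else thread
            work[thread] += core_speeds[core]
    return work


# Two "P-cores" at relative speed 4 and two "E-cores" at relative speed 2.
speeds = [4, 4, 2, 2]
print(simulate(speeds, 8, rotate=False))  # [32, 32, 16, 16]
print(simulate(speeds, 8, rotate=True))   # [24, 24, 24, 24]
```

Total throughput is the same in both runs (96 units); rotation only evens out how much progress each thread makes. In a real system each migration would also cost something, which this model ignores.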

**HerrLange** · 11 March 2023, 05:38 AM

On Windows, the gaming optimization is handled through the Xbox Game Bar, which seems to park the cores on the CCD without the extra 3D V-Cache when it detects a running game. It seems that avoiding any load on the other CCD that tries to access the 3D V-Cache is very beneficial in most gaming cases.

Leaving the Game Bar aside, is there a way to do something similar manually on Linux from the shell? Could one script a tool that detects which programs/processes are running based on a positive-list, so that when a listed program/process is running, the non-cache-CCD cores are all parked? Even better would be of course just limit those processes to certain cores having direct 3dvcache access. But I guess this way would be harder to achieve.
Has anybody here ever done something like this? Is the idea worth it? I mean, done is maybe better than perfect.

Edit: If such a tool existed, one could use the Phoronix Test Suite to determine which configuration performs best and so maintain such a positive-list in a more automated way.
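A minimal sketch of such a watcher in Python, assuming Linux with `/proc` mounted. It covers the "limit listed processes to certain cores" variant rather than parking cores. The function names are invented for this example, and the CPU set you pass in is something you have to work out for your own system, since which logical CPUs sit on the cache CCD is not the same everywhere.

```python
import os
from pathlib import Path


def find_listed_pids(positive_list, proc_root="/proc"):
    """Return {pid: name} for running processes whose name is on the list.

    Names come from /proc/<pid>/comm, which the kernel truncates to
    15 characters, so list entries need to be given in that short form.
    """
    wanted = set(positive_list)
    found = {}
    for entry in Path(proc_root).iterdir():
        if not entry.name.isdigit():
            continue  # not a process directory
        try:
            name = (entry / "comm").read_text().strip()
        except OSError:
            continue  # process exited meanwhile, or not readable
        if name in wanted:
            found[int(entry.name)] = name
    return found


def restrict_to_cpus(pid, cpus, proc_root="/proc"):
    """Restrict every existing thread of `pid` to `cpus`.

    Returns the number of threads whose affinity was changed.
    """
    try:
        tids = [int(t.name) for t in (Path(proc_root) / str(pid) / "task").iterdir()]
    except OSError:
        return 0  # process is gone or not accessible
    changed = 0
    for tid in tids:
        try:
            os.sched_setaffinity(tid, cpus)
            changed += 1
        except OSError:
            pass  # thread exited or we lack permission
    return changed
```

Run periodically, something like `for pid in find_listed_pids({"mygame"}): restrict_to_cpus(pid, cache_cpus)` would do the job, where `"mygame"` and `cache_cpus` are placeholders. Changing the affinity of other users' processes needs root.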

**coder** · 11 March 2023, 06:18 AM

Originally posted by HerrLange

Even better would be of course just limit those processes to certain cores having direct 3dvcache access.

I think taskset is probably what you want for that:

https://man7.org/linux/man-pages/man1/taskset.1.html
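If you would rather do it from a script than from the `taskset` command line, here is a rough Python sketch of the same idea. It assumes a Linux system that exposes cache topology under `/sys/devices/system/cpu/cpuN/cache/indexM/` (the `level`, `size` and `shared_cpu_list` files), and it assumes the 3D V-Cache CCD is simply the one reporting the largest L3. The function names are made up for this example.

```python
import os
from pathlib import Path


def parse_cpu_list(text):
    """Expand a kernel CPU list such as '0-7,16-23' into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def parse_size_kib(text):
    """Convert a sysfs cache size such as '98304K' or '32M' to KiB."""
    text = text.strip().upper()
    if text.endswith("M"):
        return int(text[:-1]) * 1024
    return int(text.rstrip("K"))


def largest_l3_cpus(sysfs_root="/sys/devices/system/cpu"):
    """Return (size_kib, cpus) for the L3 cache with the largest size."""
    best_size, best_cpus = 0, set()
    for cache_dir in Path(sysfs_root).glob("cpu[0-9]*/cache/index*"):
        try:
            if (cache_dir / "level").read_text().strip() != "3":
                continue  # only L3 entries are of interest
            size = parse_size_kib((cache_dir / "size").read_text())
            cpus = parse_cpu_list((cache_dir / "shared_cpu_list").read_text())
        except (OSError, ValueError):
            continue  # entry missing or in an unexpected format
        if size > best_size:
            best_size, best_cpus = size, cpus
    return best_size, best_cpus


def pin_to_largest_l3(pid=0):
    """Pin `pid` (0 = the calling process) to the CPUs sharing the largest L3."""
    _, cpus = largest_l3_cpus()
    if cpus:
        os.sched_setaffinity(pid, cpus)
    return cpus
```

Calling `pin_to_largest_l3()` at the start of a launcher script and then starting the game from it should work, since child processes inherit the affinity mask. For a program that is already running, `taskset -a -cp <cpu-list> <pid>` is probably simpler, because it covers all of its existing threads.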

AMD Ryzen 9 7900X3D Linux Performance
