I just don't see how scheduling for AMD's 3D V-Cache CCDs is "a lot easier" than scheduling for Intel's hybrid architecture. Personally, I think both are going to be quite complex, especially if the goal is to optimize not only for performance but for power efficiency as well. I'm not saying you are wrong about the L3 performance counters argument; I just don't think it's as easy as you think it is.
AMD Ryzen 9 7900X3D Linux Performance
-
Originally posted by agd5f: In that case buy a 7800X3D or an EPYC.
EPYC seems a bit on the expensive side. Although I don't mind that as much, the real issue is that it won't work with consumer motherboards and so on. So I'd need a completely different system.
Originally posted by agd5f: You can get performance variations across cores from any number of factors (asymmetry in cores, thermal headroom, differences in silicon, etc.).
-
Originally posted by coder: I think you're missing a key point about the E-cores, which is that they're actually faster than when a thread is sharing a P-core with another hyperthread. Not only that, they occupy only a quarter the area of a P-core and use even less power than that! So, they're quite simply the most cost-efficient and energy-efficient way to scale multi-threaded performance, period.
The only reason to have any P-cores is for lightly-threaded workloads. Once your thread-count increases enough, you're better off with more E-cores.
For me the issue is mixing two different types of cores.
Originally posted by coder: Well, then Intel has got you covered!
If you prefer AMD, you can buy a Zen 4 EPYC today. Or you can wait for their Zen 4 Threadrippers, later this year.
Especially since I'll definitely go with ECC memory, which is on the slower side, so cache is even more valuable than what you'll find in most benchmarks.
-
Originally posted by F.Ultra: The 5800x3d was crippled in the way that it runs at 11% lower base clock and 4% lower boost. So far swapping L3 for frequency tends to mostly favour games, and since games are still far from fully utilizing more than 6 cores, it makes financial sense for AMD to spec the 7950x3d and the 7900x3d the way they did; that way they can get better performance in games while still maintaining some performance in non-game applications.
The sad truth is that a "non-crippled" 7900x3d or 7950x3d would just be equal in game performance to the current models and be slower for other workloads. Yes, scheduling would be linear and non-complex, but there would be no other benefit and only drawbacks.
-
Originally posted by Weasel: Unfortunately that's not the case for me. I don't care about insane frequency boosts. Cache is way more important. Especially since I go with slower ECC memory.
-
Originally posted by Weasel: Right but at least those aren't by design, or at least, not as impacting.
Originally posted by Weasel: Then just make all cores E-cores and problem solved. Ok maybe not true E-cores but a mix of P and E core. Decently strong single-threaded perf and not huge like P cores.
If you care more about energy efficiency, then just stick with all E-cores. Buy an 8-core Alder Lake-N today, and then Sierra Forest in 2025. Or AMD's Zen 4c-based Bergamo, later this year.
If you can deal with not having 3D cache, then buy a non-3D Ryzen, disable SMT, and learn to be content with that.
If you can't deal with not having 3D cache and want more than 8 cores, then accept that you've upsold yourself out of the desktop segment.
Originally posted by Weasel: Especially since I'll definitely go with ECC memory, which is on the slower side, so cache is even more valuable than what you'll find in most benchmarks.
Also, now that Intel joined their HEDT segment with Xeon W (allowing the retail-boxed models to be overclocked) and dropped UDIMM support from the platform, there are now some factory-overclocked ECC RDIMMs on the market.
Last edited by coder; 09 March 2023, 07:19 PM.
-
Originally posted by coder: So, you run with hyperthreading disabled? Because that can introduce performance variability on par with P- vs. E- cores.
If the kernel puts two 100%-load threads on the same core while other cores sit idle, then that's the kernel scheduler's problem. But that doesn't happen. SMT/Hyper-Threading is not that hard to schedule.
Also the thing with SMT is that it gives you more for the same physical cores/hardware, by utilizing it better. Meanwhile splitting cache to only half the cores is designed to give you less. It's the complete opposite.
Originally posted by coder: ECC memory isn't intrinsically slower. The biggest performance impact backed by empirical evidence I've seen is 1-2%. The DIMMs are typically marketed at slower, JEDEC speeds, though. People have overclocked them with a decent amount of success.
-
Originally posted by Weasel: Also the thing with SMT is that it gives you more for the same physical cores/hardware, by utilizing it better. Meanwhile splitting cache to only half the cores is designed to give you less. It's the complete opposite.
With 3D cache, you didn't pay for the second cache chiplet. Furthermore, you get more clockspeed on that compute die, giving you a benefit even without it.
Concerns about performance inconsistency can easily be addressed if the scheduler simply rotates threads between E-cores and P-cores. The same can also be done for partially-subscribed SMT. All that's needed is for the scheduler to have a rough idea of the cores' relative performance. Admittedly, this approach doesn't extend well to additional L3 cache, since the impact is so workload-specific.
Last edited by coder; 10 March 2023, 01:19 PM.
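To illustrate the rotation idea, here is a toy simulation, not how any real kernel scheduler works. The core speeds (two "P-cores" at 100 and two "E-cores" at 60, in arbitrary units per time slice) are made-up numbers, and `run` is a hypothetical helper. It compares threads that stay on one core with threads that are shifted to the next core every time slice.

```python
def run(core_speeds, slices, rotate):
    """Return the work each thread completes after `slices` time slices.

    There is one thread per core. With rotate=False, thread i stays on
    core i. With rotate=True, every thread moves to the next core at each
    slice, so it visits all cores in turn (simple round-robin rotation).
    """
    n = len(core_speeds)
    progress = [0] * n
    for t in range(slices):
        for thread in range(n):
            core = (thread + t) % n if rotate else thread
            progress[thread] += core_speeds[core]
    return progress


# Assumed relative speeds: two fast cores, two slow cores (arbitrary units).
speeds = [100, 100, 60, 60]

pinned = run(speeds, slices=4, rotate=False)
rotated = run(speeds, slices=4, rotate=True)

print("pinned: ", pinned)
print("rotated:", rotated)
```

Total throughput is the same in both cases; rotation only evens out how it is distributed between threads. The sketch ignores migration cost and cache effects, which is exactly why the approach doesn't carry over to the extra-L3 case.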
-
Looking at Windows, the efficiency for games is handled by the Xbox Game Bar, which seems to park the cores on the CCD without the extra 3D cache when it detects that a game is running. It seems that avoiding any load on the other CCD that tries to access the 3D V-Cache is very beneficial in most gaming cases.
Leaving the Game Bar thing aside, is there a way to do something similar manually in Linux via the shell? Could you somehow script a tool that detects which programs/processes are running, based on a positive list? So if a listed program/process is running, the non-cache-CCD cores are all parked? Even better would of course be to just limit those processes to the cores that have direct 3D V-Cache access. But I guess that would be harder to achieve.
Has anybody here ever done something like this? Is the idea worth it? I mean, done is maybe better than perfect.
Edit: If such a tool existed, one could use the Phoronix Test Suite to find the best-performing setup and maintain such a positive list in a more automated way.
Last edited by HerrLange; 11 March 2023, 05:43 AM.
-
Originally posted by HerrLange: Even better would be of course just limit those processes to certain cores having direct 3dvcache access.
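Restricting listed processes to the cache CCD can be sketched with CPU affinity, which is arguably easier than parking cores. The following is a minimal, untested-on-X3D sketch in Python rather than shell. It assumes that `index3` under `/sys/devices/system/cpu/cpuN/cache/` is the L3 and that the V-Cache CCD is simply the one reporting the larger L3 size. The helper names and the `my_game` entry in the positive list are hypothetical.

```python
import os
from pathlib import Path


def parse_cpu_list(text):
    """Turn a kernel-style CPU list such as '0-5,12-17' into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        elif part:
            cpus.add(int(part))
    return cpus


def pick_big_cache_cpus(domains):
    """`domains` maps a CPU-list string to that L3's size in KiB.

    Returns the CPUs sharing the largest L3. If every L3 is the same size
    (no V-Cache CCD to prefer), returns all CPUs instead.
    """
    if len(set(domains.values())) < 2:
        return set().union(*(parse_cpu_list(c) for c in domains))
    return parse_cpu_list(max(domains, key=domains.get))


def read_l3_domains(root="/sys/devices/system/cpu"):
    """Collect {shared_cpu_list: size_in_KiB} for each L3 (assumed index3)."""
    domains = {}
    for cache in Path(root).glob("cpu[0-9]*/cache/index3"):
        cpus = (cache / "shared_cpu_list").read_text().strip()
        size = (cache / "size").read_text().strip()  # e.g. "98304K"
        domains[cpus] = int(size.rstrip("K"))
    return domains


def pin_allowlisted(allowlist, cpus):
    """Restrict every thread of each positive-listed process to `cpus`."""
    pinned = []
    for proc in Path("/proc").iterdir():
        if not proc.name.isdigit():
            continue
        try:
            # comm holds the process name, truncated by the kernel to 15 chars.
            name = (proc / "comm").read_text().strip()
            if name not in allowlist:
                continue
            for task in (proc / "task").iterdir():
                os.sched_setaffinity(int(task.name), cpus)
            pinned.append((int(proc.name), name))
        except (FileNotFoundError, ProcessLookupError, PermissionError):
            continue  # process exited or belongs to another user
    return pinned


if __name__ == "__main__":
    ALLOWLIST = {"my_game"}  # hypothetical process names
    cpus = pick_big_cache_cpus(read_l3_domains())
    if cpus:
        print(pin_allowlisted(ALLOWLIST, cpus))
```

This only catches processes that are already running, so it would have to be re-run periodically or from a launcher script. For a one-off, `taskset -cp <cpu-list> <pid>` from util-linux does the same thing by hand. Note that this limits the listed processes to the cache CCD; it does not keep other load off that CCD, which is what the core parking on Windows appears to do.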