**bridgman** · 08 March 2023, 04:10 PM

Originally posted by coder

But, to just throw up your hands strikes me as very lame.

Nobody is "throwing up their hands". The statement from agd5f was more like "don't make buying decisions based on the assumption that there will be performance improvements from some future scheduler".

**Weasel** · 08 March 2023, 04:13 PM

Originally posted by schmidtbag

Asymmetric designs are the appropriate future for desktop and laptop use, and perhaps specific HEDT workstations and servers as well. After all, using GPUs for compute is effectively an asymmetric approach, but there are some obvious challenges associated with that. We're reaching a point where it's getting harder to balance cost, performance, and efficiency.

I think they need to stop trying to make a CPU into a GPU, because that's not what CPUs are for. See my reply above.

There's 0 logic for asymmetric designs other than "most people don't need so many powerful cores". Well then they shouldn't be buying a high-end high core count CPU? "Most" doesn't mean all. And some CPUs shouldn't be for most people because they don't even go for such expensive CPUs in the first place.

I don't care if it uses more power or whatever. I never asked for efficiency. So I want such an option. I'd pay even 50% more for it. ffs.

**coder** · 08 March 2023, 04:41 PM

Originally posted by Weasel

There's 0 logic for asymmetric designs other than "most people don't need so many powerful cores". Well then they shouldn't be buying a high-end high core count CPU? "Most" doesn't mean all.

I think you're missing a key point about the E-cores, which is that they're actually faster than when a thread is sharing a P-core with another hyperthread. Not only that, they occupy only a quarter the area of a P-core and use even less power than that! So, they're quite simply the most cost-efficient and energy-efficient way to scale multi-threaded performance, period.

The only reason to have any P-cores is for lightly-threaded workloads. Once your thread-count increases enough, you're better off with more E-cores.

Originally posted by Weasel

I don't care if it uses more power or whatever. I never asked for efficiency. So I want such an option. I'd pay even 50% more for it. ffs.

Well, then Intel has got you covered!

https://www.anandtech.com/show/18741...pcie-5-0-lanes

If you prefer AMD, you can buy a Zen 4 EPYC today. Or you can wait for their Zen 4 Threadrippers, later this year.

**agd5f** · 08 March 2023, 04:45 PM

Originally posted by Weasel

Yeah. The previous-gen 3D version wasn't crippled. What I mean is that I want a big cache on every core. I expect that when I run an app, it has the big cache. Because that's what I pay for. That's why I buy the 3D version of the CPU. I want a big cache on every app I run. No exceptions.

In that case buy a 7800X3D or an Epyc.

Originally posted by Weasel

I don't want performance variations just because one day it decided to place the app on a core without the cache and tomorrow on one with cache. How do you not see this as a problem?

You can get performance variations across cores from any number of factors (asymmetry in cores, thermal headroom, differences in silicon, etc.).
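For anyone who does want an app to land on the same cache domain every run, one option on Linux is to set CPU affinity by hand. What follows is only a rough sketch, not how the AMD driver or the kernel scheduler decides placement: it assumes the L3 is exposed as `index3` under `/sys/devices/system/cpu/cpuN/cache/`, and the helper names (`parse_cpu_list`, `parse_cache_size_kib`, `l3_groups`, `pin_to_largest_l3`) are made up for illustration. `os.sched_setaffinity` is the real Python call and is Linux-only.

```python
import os

def parse_cpu_list(text):
    """Expand a sysfs CPU list such as '0-7,16-23' into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)

def parse_cache_size_kib(text):
    """Convert a sysfs cache size such as '98304K' or '96M' to KiB."""
    text = text.strip().upper()
    units = {"K": 1, "M": 1024, "G": 1024 * 1024}
    if text and text[-1] in units:
        return int(text[:-1]) * units[text[-1]]
    return int(text) // 1024  # assumption: a bare number is in bytes

def l3_groups(sysfs_root="/sys/devices/system/cpu"):
    """Map each L3 domain (tuple of CPUs sharing it) to its size in KiB."""
    groups = {}
    for name in sorted(os.listdir(sysfs_root)):
        if not (name.startswith("cpu") and name[3:].isdigit()):
            continue
        # Assumption: index3 is the L3 cache, as is typical on x86.
        base = os.path.join(sysfs_root, name, "cache", "index3")
        try:
            with open(os.path.join(base, "shared_cpu_list")) as f:
                cpus = tuple(parse_cpu_list(f.read()))
            with open(os.path.join(base, "size")) as f:
                size = parse_cache_size_kib(f.read())
        except OSError:
            continue  # CPU offline or no L3 information exposed
        groups[cpus] = size
    return groups

def pin_to_largest_l3(pid=0):
    """Restrict `pid` (0 = this process) to the CPUs sharing the largest L3."""
    groups = l3_groups()
    if not groups:
        raise RuntimeError("no L3 topology found in sysfs")
    cpus = max(groups, key=groups.get)
    os.sched_setaffinity(pid, cpus)
    return cpus
```

Calling `pin_to_largest_l3()` at the start of a script would keep it on the larger-cache cores, at the cost of ignoring the other cores entirely. It does nothing about the other sources of variation listed above (thermal headroom, silicon differences).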

**pieman** · 08 March 2023, 05:07 PM

Originally posted by coder

I think you're missing a key point about the E-cores, which is that they're actually faster than when a thread is sharing a P-core with another hyperthread. Not only that, they occupy only a quarter the area of a P-core and use even less power than that! So, they're quite simply the most cost-efficient and energy-efficient way to scale multi-threaded performance, period.

The only reason to have any P-cores is for lightly-threaded workloads. Once your thread-count increases enough, you're better off with more E-cores.

Well, then Intel has got you covered!

https://www.anandtech.com/show/18741...pcie-5-0-lanes

If you prefer AMD, you can buy a Zen 4 EPYC today. Or you can wait for their Zen 4 Threadrippers, later this year.

The individual tiles for a Sapphire Rapids XCC chip are all identical/symmetrical, so each tile provides a quarter of the CPU cores, I/O, and memory channels of the entire chip. As such, each tile can provide up to a maximum of 32 PCIe 5.0 lanes (112 total on the w9-3495X), while each tile also includes up to two memory controllers providing eight-channel memory across the W-3400 series.

Are tiles Intel's version of AMD's CCDs?

**F.Ultra** · 08 March 2023, 07:33 PM

Originally posted by Weasel

Yeah. The previous-gen 3D version wasn't crippled. What I mean is that I want a big cache on every core. I expect that when I run an app, it has the big cache. Because that's what I pay for. That's why I buy the 3D version of the CPU. I want a big cache on every app I run. No exceptions.

I don't want performance variations just because one day it decided to place the app on a core without the cache and tomorrow on one with cache. How do you not see this as a problem?

The 5800X3D was crippled in the sense that it runs at an 11% lower base clock and a 4% lower boost clock. So far, trading frequency for more L3 tends to mostly favour games, and since games are still far from fully utilizing more than 6 cores, it makes financial sense for AMD to spec the 7950X3D and the 7900X3D the way they did: that way they get better performance in games while still maintaining some performance in non-game applications.

The sad truth is that a "non-crippled" 7900X3D or 7950X3D would just equal the current models in games and be slower for other workloads. Yes, scheduling would be linear and non-complex, but there would be no other benefit and only drawbacks.
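For reference, the 11% and 4% figures can be checked with a couple of lines of arithmetic. The clock speeds below are not from this thread; they are AMD's published specs as I recall them (5800X: 3.8 GHz base / 4.7 GHz boost, 5800X3D: 3.4 GHz base / 4.5 GHz boost), so treat them as an assumption. `percent_lower` is just an illustrative helper.

```python
def percent_lower(reference, value):
    """How many percent `value` is below `reference`."""
    return (reference - value) / reference * 100

# Assumed published clocks in GHz (not stated in the thread).
BASE_5800X, BOOST_5800X = 3.8, 4.7
BASE_5800X3D, BOOST_5800X3D = 3.4, 4.5

base_drop = percent_lower(BASE_5800X, BASE_5800X3D)
boost_drop = percent_lower(BOOST_5800X, BOOST_5800X3D)

print(f"base clock:  {base_drop:.1f}% lower")   # 10.5% lower
print(f"boost clock: {boost_drop:.1f}% lower")  # 4.3% lower
```

That rounds to the roughly 11% and 4% quoted above.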

**coder** · 08 March 2023, 07:38 PM

Originally posted by agd5f

You can get performance variations across cores from any number of factors (asymmetry in cores, thermal headroom, differences in silicon, etc.).

Not to mention SMT/hyperthreading, which we've been living with for the better part of 2 decades.

**coder** · 08 March 2023, 07:40 PM

Originally posted by pieman

Are tiles Intel's version of AMD's CCDs?

It's like a throwback to the original Zen 1 EPYC.

The main difference is that Intel's EMIB technology is a lot higher-bandwidth than AMD's organic substrate was. But, we should still see fairly pronounced NUMA effects.

**drakonas777** · 09 March 2023, 02:31 AM

Originally posted by coder

I think you're missing a key point about the E-cores, which is that they're actually faster than when a thread is sharing a P-core with another hyperthread. Not only that, they occupy only a quarter the area of a P-core and use even less power than that! So, they're quite simply the most cost-efficient and energy-efficient way to scale multi-threaded performance, period.

The only reason to have any P-cores is for lightly-threaded workloads. Once your thread-count increases enough, you're better off with more E-cores.

I don't get why you are comparing a P-core "shared by 2 threads" with a single-threaded E-core. Seems like you are artificially creating this situation to make the E-core look better. It makes no sense, since an E-core can also be effectively shared by 2 threads through OS time slicing.

Lightly threaded workloads are specifically the workloads where Intel hybrid arch is much easier to schedule than AMD 3D if we care mostly about optimal performance and not so much about energy efficiency. Desktops in other words.

**coder** · 09 March 2023, 04:04 AM

Originally posted by drakonas777

I don't get why you are comparing a P-core "shared by 2 threads" with a single-threaded E-core.

Because that's a decision that schedulers have to make, as well - which threads to pair up on a core that supports SMT! On average, whichever ones you pair up will run even slower than if they ran on an E-core!

Originally posted by drakonas777

Seems like you are artificially creating this situation to make the E-core look better. It makes no sense, since an E-core can also be effectively shared by 2 threads through OS time slicing.

The situation was created by SMT, not me. I don't get your point about temporal multiplexing, as you can do that on any core, including ones that support SMT.

The reason I brought up SMT is that it already creates a hierarchy of cores. So, even a Ryzen CPU without 3D V-Cache has cores in 2 different speed classes: those with only 1 thread and those with 2. Optimal scheduling requires that a scheduler be judicious in determining which threads to pair up, especially if the number of threads is greater than the number of cores but less than the CPU's cumulative SMT capacity.
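To put numbers on the "more threads than cores, fewer than SMT capacity" case, here is a small sketch. It assumes 2-way SMT and a scheduler that spreads threads across idle cores before doubling any core up; `smt_occupancy` is a hypothetical helper for illustration, not anything a real scheduler exposes.

```python
def smt_occupancy(threads, cores):
    """Return (idle, single, shared) core counts for 2-way SMT.

    Assumes threads are spread one per core first, and a second thread is
    only placed on a core once every core already has one.
    """
    if threads < 0 or cores <= 0:
        raise ValueError("threads must be >= 0 and cores must be > 0")
    running = min(threads, 2 * cores)   # threads beyond SMT capacity must wait
    shared = max(0, running - cores)    # cores carrying two threads
    single = min(running, cores) - shared
    idle = cores - single - shared
    return idle, single, shared

# Example: 20 runnable threads on a 16-core, 32-thread CPU.
idle, single, shared = smt_occupancy(20, 16)
print(idle, single, shared)  # 0 12 4
```

So in that example 12 threads get a core to themselves and 8 threads are paired up on 4 cores, which is exactly the two speed classes described above. Which 8 threads end up paired is the part the scheduler has to be judicious about.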

Originally posted by drakonas777

Lightly threaded workloads are specifically the workloads where Intel hybrid arch is much easier to schedule than AMD 3D if we care mostly about optimal performance and not so much about energy efficiency. Desktops in other words.

If you only think about the easy case, where you have the same or fewer threads running than P-cores, and don't care about energy efficiency, then you've completely missed my point.

I wish, in life, we could think about only the easy problems and just ignore everything else.

AMD Ryzen 9 7900X3D Linux Performance
