AMD Launches EPYC 9004 "Genoa" Processors - Up To 96 Cores, AVX-512, Incredible Performance
Originally posted by Espionage724 View Post
Consumer chips don't need 96 cores; 16 is maybe average, 8 a minimum, and 32-64 for the few games that might actually be able to use them on higher-end CPUs. If 96 cores can go into that Epyc chip, surely 16 can work fine in consumer parts at a minimum. Actual workstations and professional app users can deal with SMT and higher thread counts where all that might be a benefit.
I'm of the opinion that SMT/HT and even CCX/CCD are shortcuts and marketing gimmicks that cause nothing but scheduler difficulties and issues across different platforms. I have a 2700X now, and I'd rather have 16 real cores than 8 cores split into two groups of 4 that communicate over a slower path, causing performance issues for non-aware schedulers and drivers. Sure, I could pin applications to certain cores, make sure IRQs for certain devices only run on certain cores, and deal with the manual setup of all of that, but why does this complexity need to exist for such a small number of cores?
Look at this nonsense: https://www.neowin.net/news/windows-...zen-7000-cpus/
Crossing the CCD and/or having SMT on lowers performance even on 7000-series Ryzen CPUs.
Leaving it in the design doesn't really negatively impact users that much, especially when you can turn it off or alter core affinity yourself.
... And I don't really understand what that has to do with CCDs.
That being said, you do see ARM, Apple, and such avoiding SMT, as it really is unnecessary in those kinds of workloads. I wouldn't be surprised if Ampere or Nuvia introduce it, though, as we already saw Broadcom do with their SMT4 server processors.
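For anyone who wants to try the manual setup described above, here is a minimal sketch (an editorial illustration, not something from the thread) of what CCX-aware pinning can look like on Linux. It assumes the usual sysfs/procfs files are present and that cache index3 is the L3; the CPU numbering, the L3 grouping, and the IRQ number 42 are machine-specific example values.
[CODE]
import os

def l3_group(cpu=0):
    """Return the set of logical CPUs sharing the L3 cache with `cpu`.
    On Zen parts this is one CCX (the whole CCD on Zen 3 and later)."""
    path = f"/sys/devices/system/cpu/cpu{cpu}/cache/index3/shared_cpu_list"
    with open(path) as f:
        text = f.read().strip()              # e.g. "0-7"; layout varies by machine
    cpus = set()
    for part in text.split(","):
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus

if __name__ == "__main__":
    group = l3_group(0)
    print("CPUs sharing cpu0's L3:", sorted(group))

    # Pin this process (pid 0 means "self") so it never migrates across the
    # CCX/CCD boundary; children started afterwards inherit the mask.
    os.sched_setaffinity(0, group)
    print("now restricted to:", sorted(os.sched_getaffinity(0)))

    # IRQ steering is a separate knob: each IRQ has an allowed-CPU list that
    # root can rewrite. IRQ 42 is just a made-up example number.
    try:
        with open("/proc/irq/42/smp_affinity_list") as f:
            print("IRQ 42 may run on CPUs:", f.read().strip())
    except FileNotFoundError:
        pass  # that IRQ may not exist on this machine
[/CODE]
The same mask can be applied to an already-running process with taskset -cp, and an IRQ's allowed-CPU list can be rewritten (as root) by writing a CPU list back into the same smp_affinity_list file.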
Originally posted by anarki2 View Post
Yes, Dell can keep sitting on their @rses while people switch to self-built workstations or other brands. We waited for them to finally release Ryzen workstations, then ended up building them ourselves from off-the-shelf parts.
They have ridiculously priced Alienware gimmicks with bundled GPUs. No thanks.
Originally posted by brucethemoose View Post
That being said, you do see ARM, Apple, and such avoiding SMT, as it really is unnecessary in those kinds of workloads.
It seems like it should have a slightly negative impact on perf/W, in some cases. For Apple, excluding it is a no-brainer, because they seem happy to spend $ on dies with greater silicon area, as long as doing so can scale performance without hurting efficiency.
Originally posted by coder View Post
It seems like it should have a slightly negative impact on perf/W, in some cases. For Apple, excluding it is a no-brainer, because they seem happy to spend $ on dies with greater silicon area, as long as doing so can scale performance without hurting efficiency.
I wonder if asymmetric SMT would be practical, where you have one "main" thread in a core as usual, and one "background" thread that strictly sucks up idle resources. It would have its own tiny L1, not put anything into L2/L3, and always stop to yield other resources so that it doesn't reduce the main thread's performance at all. And scheduling would basically be the same as big.LITTLE or Alder Lake.
Last edited by brucethemoose; 11 November 2022, 08:29 PM.
Originally posted by brucethemoose View Post
I wonder if asymmetric SMT would be practical, where you have one "main" thread in a core as usual, and one "background" thread that strictly sucks up idle resources. It would have its own tiny L1, not put anything into L2/L3, and always stop to yield other resources so that it doesn't reduce the main thread's performance at all. And scheduling would basically be the same as big.LITTLE or Alder Lake.
But, I think I see where you're coming from. I guess you want a thread running on a P-core to have roughly the same performance, whether or not it's sharing the core with a second thread. That way, the core can be more fully utilized while still offering a performance advantage over threads running on an E-core. Nice idea!
Originally posted by josmat View Post
Couldn't it be because RISC architectures like ARM don't benefit as much as x86 from SMT?
However, the wider a CPU micro-architecture, the bigger the payoff you tend to get from SMT. So, in a way, it gives you a better return-on-investment from making your architecture wider. Furthermore, you can get better pipeline utilization for a given reorder buffer size, the more SMT threads you have. At the extreme, you end up with a GPU-like ~12-way SMT in-order core. Yet, SMT isn't a bottomless well - each SMT thread requires additional state and software never scales linearly with respect to the number of threads. So, if Intel and AMD have found SMT-2 adequate to get good pipeline utilization, that could explain why they haven't gone further.
One interesting thing that's happened, in the past few years, is that instruction reorder buffers have nearly closed the gap vs. (best-case) DRAM access latency. That doesn't mean you can find enough useful work to do while waiting for a L3 cache miss, but it's pretty impressive that it's even theoretically possible.
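As a rough way to see that pipeline sharing in action, here is a small sketch (an editorial illustration, not from the thread) that times a CPU-bound workload on two SMT siblings of one core versus two separate cores. It assumes a Linux machine with SMT enabled and relies on the sysfs thread_siblings_list file; exact CPU numbering varies by system.
[CODE]
import os
import time
from multiprocessing import Process

def spin(n=30_000_000):
    # CPU-bound stand-in workload; any integer-heavy loop will do.
    x = 0
    for i in range(n):
        x += i * i

def worker(cpus):
    # Restrict this worker to the given CPU set, then burn CPU.
    os.sched_setaffinity(0, cpus)
    spin()

def run_two(cpus):
    """Run two workers restricted to `cpus` and return the wall-clock time."""
    procs = [Process(target=worker, args=(cpus,)) for _ in range(2)]
    start = time.perf_counter()
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return time.perf_counter() - start

if __name__ == "__main__":
    # SMT siblings of CPU 0, e.g. "0,16" or "0-1" depending on enumeration.
    with open("/sys/devices/system/cpu/cpu0/topology/thread_siblings_list") as f:
        raw = f.read().strip().replace("-", ",")
    siblings = [int(tok) for tok in raw.split(",")]

    same_core = set(siblings[:2])
    # Pick a CPU that is not a sibling of cpu0, i.e. a different physical core.
    other = next(c for c in sorted(os.sched_getaffinity(0)) if c not in siblings)
    two_cores = {siblings[0], other}

    print("two workers sharing one core (SMT):", round(run_two(same_core), 2), "s")
    print("two workers on two cores          :", round(run_two(two_cores), 2), "s")
[/CODE]
If the two runs finish in roughly the same time, SMT is giving that workload close to the full benefit of a second thread on one core; if the shared-core run takes nearly twice as long, the siblings are mostly contending for the same execution resources.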
Originally posted by anarki2 View Post
Yes, Dell can keep sitting on their @rses while people switch to self-built workstations or other brands. We waited for them to finally release Ryzen workstations, then ended up building them ourselves from off-the-shelf parts.
They have ridiculously priced Alienware gimmicks with bundled GPUs. No thanks.
Self-built is not an option for any medium or large organization. And most consumers lack the skill or desire to build their own. DIY builds are an enthusiast niche, at best, not a solution to Intel's anti-competitive practices.
Last edited by torsionbar28; 14 November 2022, 01:26 PM.