AMD EPYC 7773X "Milan-X" Benchmarks Show Very Strong HPC Performance Upgrade
-
Originally posted by ddriver
Waiting for AMD to stack another 4-8 GB of L4 cache onto the I/O die as well.
But on a $5k+ Epyc CPU, slapping 8-24 GiB of HBM on the I/O die as L4 cache seems like a no-brainer, even with the 3D V-Cache.
Originally posted by petko
I'm curious: would 3D V-Cache increase the data transfer rates to the GPU in GPU-enabled workloads?
Last edited by jaxa; 21 March 2022, 06:20 PM.
-
Originally posted by Oppenheimer
Impressive numbers. My only query is how comparable are the results of vanilla Milan and 'MilanX with cache disabled' when comparing otherwise equivalent CPUs?
- Likes 1
-
AMD had only publicly cited a few select workloads of popular commercial applications. After testing, it's great to see a fairly wide range of HPC workloads benefiting from this large L3 cache.
-
Originally posted by Oppenheimer
Impressive numbers. My only query is how comparable are the results of vanilla Milan and 'MilanX with cache disabled' when comparing otherwise equivalent CPUs?
And note that in at least one case (Timed Kernel Compilation) where dual-7773X lost to dual-7763, the single-CPU configuration better leveraged the strengths of the 7773X to let it eke out a win.
Indeed, the single-CPU results make 3D cache even more of a slam dunk, for 1P configurations.
Last edited by coder; 21 March 2022, 09:29 PM.
- Likes 1
-
Really appreciate that Michael tested all CPUs with the performance governor, so that Linux's subpar default schedutil governor doesn't impair any of the results with less-than-stellar "clever" clockspeed decisions.
That's the spirit and how everyone should run Linux to get the most [...], well, performance out of their systems!
(I already pity those who buy these "arm&leg" expensive workstations, only to then achieve subpar ROI because of a poor governor choice...)
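For anyone wanting to check which governor their own box is actually using, here is a minimal sketch. It assumes the usual Linux cpufreq sysfs layout (`/sys/devices/system/cpu/cpuN/cpufreq/scaling_governor`); the function name `governor_counts` and the `cpu_root` parameter are just illustrative, not from any tool.

```python
from collections import Counter
from pathlib import Path


def governor_counts(cpu_root="/sys/devices/system/cpu"):
    """Count how many CPUs are using each cpufreq governor.

    cpu_root is assumed to follow the standard Linux sysfs layout;
    it is a parameter only so the function can be pointed elsewhere.
    """
    counts = Counter()
    # cpu[0-9]* matches cpu0, cpu1, ... but not sibling dirs like cpufreq/ or cpuidle/
    for gov_file in Path(cpu_root).glob("cpu[0-9]*/cpufreq/scaling_governor"):
        counts[gov_file.read_text().strip()] += 1
    return dict(counts)


if __name__ == "__main__":
    for name, n in sorted(governor_counts().items()):
        print(f"{name}: {n} CPUs")
```

On a system without cpufreq support (many VMs, for instance) this simply prints nothing, since no `scaling_governor` files exist.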
- Likes 2
-
Originally posted by pentaprism
Considering nothing is actually optimized for this yet
I get the impression you think this is like AVX etc, where a -march will magically make things better if you're lucky, and it just isn't. There are, maybe, a few pathological cases where you could optimize for a known-large L3 by keeping a very small amount of extra information in a data structure that saves you a dereference or some trivial math, etc, but even so it would nearly always implicitly be the wrong choice *even if* you knew you would only be running on this specific CPU, because if there was room for it in the cache line in the first place it would be in there already for the L1 case, which is the one that matters.
Don't get me wrong: this is great stuff for even the "common" case as it is. It's even better for stuff that thrashes large sets of pages around with some degree of locality, like databases, statistics, and so on. But you have unrealistic expectations of what benefit it can and can't provide beyond that, or what fraction of software could even attempt to leverage it at the code level without actually just making things worse instead.
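To make the "does my working set fit" question concrete, here is a rough sketch that reads the L3 size Linux reports for one CPU and compares it against a working-set size. It assumes the standard sysfs cache layout (`cpuN/cache/indexM/level` and `size`, with sizes written like `32768K`); the helper names are made up for illustration. Note that this value is the L3 slice shared by that CPU's neighbours, not the package total, and fitting in cache is only a rough indicator, not a performance guarantee.

```python
from pathlib import Path


def parse_cache_size(text):
    """Convert a sysfs cache size string such as '32768K' to bytes."""
    text = text.strip()
    units = {"K": 1024, "M": 1024**2, "G": 1024**3}
    if text and text[-1].upper() in units:
        return int(text[:-1]) * units[text[-1].upper()]
    return int(text)  # assume plain bytes when there is no suffix


def l3_size_bytes(cpu=0, cpu_root="/sys/devices/system/cpu"):
    """Return the L3 size visible to one CPU, or None if it is not exposed."""
    cache_dir = Path(cpu_root) / f"cpu{cpu}" / "cache"
    for index in sorted(cache_dir.glob("index[0-9]*")):
        level, size = index / "level", index / "size"
        if level.is_file() and size.is_file() and level.read_text().strip() == "3":
            return parse_cache_size(size.read_text())
    return None


def fits_in_l3(working_set_bytes, l3_bytes):
    """True if the working set is no larger than the reported L3 slice."""
    return l3_bytes is not None and working_set_bytes <= l3_bytes


if __name__ == "__main__":
    l3 = l3_size_bytes()
    working_set = 64 * 1024**2  # hypothetical 64 MiB working set
    print("L3 bytes:", l3, "fits:", fits_in_l3(working_set, l3))
```

The point stands either way: nothing in the application has to change for a larger L3 to help; a check like this only tells you whether a given data set is a plausible candidate.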
-
Originally posted by Linuxxx
subpar default schedutil governor
> (I already pity those who buy these "arm&leg" expensive workstations, only to then achieve subpar ROI because of a poor governor choice...)
meh - half the people who buy that sort of kit do so for vanity, not performance. The ones who buy workstations because they DO need them will also have either an IT team, or a VAR, or the knowledge themselves to make sure they run properly.
- Likes 1