Linux Prepares For Next-Gen AMD CPUs With Up To 12 CCDs


  • smitty3268
    replied
    Originally posted by TemplarGR View Post
    So, apparently, "power efficiency is king" (which is correct), but Intel having that power efficiency due to big.LITTLE is a failure because we are all AMD fanbois here and we just have to push the red team, amirite?
    The issue is that Intel doesn't have the power efficiency right now. That's the problem I'm pointing out.

    It doesn't really matter on the desktop, because nobody cares if they use more power. It will matter for Raptor Lake, and it's a very big question exactly how that will look.

    Intel's e-cores are really efficient compared to their p-cores. They're not nearly as impressive versus Zen 3 cores, though. How high will they be clocked on Raptor Lake, and what kind of efficiency will they get there? I have no idea. Maybe you're tuned into the Intel rumors more than I am, but I don't think there's currently much out there.

    Raptor Lake isn't coming out much before Zen 4, by the way, so that is its competition, not the current Zen 3-based Epyc systems.

    AMD's approach of just adding more p-cores that are slightly weaker than Intel's won't be better in efficiency in any multi-threaded use case. It is just a matter of time until schedulers get optimized, and by then AMD's Zen 4 is going to be a dinosaur. AMD's "plan" is to just add 50% more cores and some more cache.
    Except that's not their plan. Zen 4 is supposed to bring massive IPC increases, just like Alder Lake did. And they've got an all-e-core version of their own that's going to come out as well. I'd be pretty surprised if it doesn't beat Intel in terms of efficiency, although where Intel will have them beat is the mix of both big and little cores on a single chip. Still, it's yet to be proven how that will perform compared to Zen 4.

    The reason AMD came back from the dead was not that they had the better architecture; they never had it. Ryzen has always been a me-too copycat of Intel's designs.
    Ehh, that's debatable. It's got a lot of truth to it, but at the same time AMD has done some genuinely interesting work on their architecture, such as going to chiplets, or, in older designs, the whole x86_64 design and the integrated memory controllers. Typically AMD comes out with a few good ideas that give them the edge for a short while; then Intel copies them and goes back to the lead its massive financial advantage typically gives it.

    That will teach them to price their products sky-high every time they have a competitive product while pretending to be the pro-consumer company.
    I don't think AMD has ever said anything like that; it sounds like you're responding more to fanboys than to AMD itself. They really have no choice but to cash in whenever possible. Keeping prices low does nothing for them - they can't expand their market share any further, because they're already production-limited, and being nice to people for the sake of being nice isn't very good business. Better for them to get money to fuel R&D while they can.

    Four e-cores equal one p-core in die area. That means that instead of a 128-core Ryzen you can get, in a theoretical scenario, 512 Intel e-cores.
    There's a bit of confusion here - Intel's e-cores are only half the size of a Zen 3 core, and about the same size as a Zen 2 core, so they're not actually all that tiny. It's just that Intel's p-cores are absolutely giant compared to everything else. The 128-core Bergamo Epyc is also meant to use AMD's equivalent of e-cores, so even though it's a newer architecture it's quite possible they'll be equivalent in size to Intel's e-cores. I don't think any information about exactly how big those chips will be has leaked out yet.
    Sure, lower clocked and with slightly lower IPC, but you get 4 times the cores. You need a heavily multi-core system, remember? Which is going to be more efficient?
    So only 2 times the cores, then. And while 2 e-cores will beat a Zen 3 core in performance, I'm not sure how much more efficient they actually would be. The Zen 3 server cores run on pretty low power (~4W each), and we don't know what Zen 4 will be yet, or what Intel's e-cores will be on Raptor Lake. Too many unknowns to project much at the moment, in my opinion.
    Last edited by smitty3268; 25 November 2021, 03:05 AM.



  • TemplarGR
    replied
    Originally posted by coder View Post
    In terms of scalar integer IPC, only. Not in absolute performance (because they clock lower) and definitely not in floating-point or integer vector performance.


    Depends on what you're doing. Vectorized workloads would better benefit from big cores, especially those supporting AVX-512.
    1) E-cores don't have to have the best absolute performance in any kind of load. Four e-cores equal one p-core in die area. That means that instead of a 128-core Ryzen you can get, in a theoretical scenario, 512 Intel e-cores. Sure, lower clocked and with slightly lower IPC, but you get 4 times the cores. You need a heavily multi-core system, remember? Which is going to be more efficient?

    2) Alder Lake's successor will more than likely have official AVX-512 support again. They probably cut it from Alder Lake because they weren't ready to have it enabled at the same time as the e-cores. So instead of a theoretical 32 p-cores from AMD, imagine you get 16 p-cores AND 64 e-cores. Even in highly vectorized workloads this is probably going to be the most efficient solution, don't you think?



  • coder
    replied
    Originally posted by TemplarGR View Post
    That will teach them to price their products sky-high every time they have a competitive product while pretending to be the pro-consumer company.
    U mad, bro?

    They gotta make money while they can. They've always offered decent value for money, but mo' cores gonna cost mo' money. It's not their fault Intel couldn't scale up to as many cores.

    P.S. When did they ever do this "pretending to be the pro-consumer company" bit? Some people like them because they're the underdog, but I think you're projecting.



  • coder
    replied
    Originally posted by TemplarGR View Post
    Seriously, I don't think you understand just how much better Raptor Lake is going to be in power efficiency
    What's your use case, again? Because people in this thread seem to be simultaneously talking about Alder Lake and cloud. Well, cloud uses server CPUs, like Sapphire Rapids, which won't have E-cores.

    Alder Lake and Raptor Lake are client-only (i.e. laptops and desktops). If you buy one in the form of a Xeon E-series (which are just rebranded desktop chips with a few extra features), you could put it in a small server, but that's a niche market. Mainstream servers use Xeon Scalable (i.e. Sapphire Rapids).

    If Intel announced a true cloud CPU based on E-cores, I sure haven't heard about it. To my knowledge, they only offer Atom-branded server chips for embedded server applications, like 5G base stations and enterprise NAS boxes.
    Last edited by coder; 25 November 2021, 02:38 AM.



  • coder
    replied
    Originally posted by TemplarGR View Post
    E-cores of Alder Lake ARE high performance. They are the equivalent in performance of Core i10xxx.
    In terms of scalar integer IPC, only. Not in absolute performance (because they clock lower) and definitely not in floating-point or integer vector performance.

    Originally posted by TemplarGR View Post
    So it makes more sense to have more of THOSE when you need 128-256 cores etc.
    Depends on what you're doing. Vectorized workloads would better benefit from big cores, especially those supporting AVX-512.



  • TemplarGR
    replied
    Originally posted by smitty3268 View Post

    It's actually the exact opposite. Power efficiency is king in most of the big data centers.

    It's not about the raw price of the power; it's about the actual power and cooling systems installed in their buildings. That's the limiting factor: the more efficient the processors are, the more of them they can pack into the same building in one datacenter, rather than having to build a dozen different datacenters across hundreds of miles.

    It's the workstation and HEDT markets that don't care about power use.
    So, apparently, "power efficiency is king" (which is correct), but Intel having that power efficiency due to big.LITTLE is a failure because we are all AMD fanbois here and we just have to push the red team, amirite?

    Seriously, I don't think you understand just how much better Raptor Lake is going to be in power efficiency (with the rumor of many more e-cores). Alder Lake is already great. It sports both the best p-core IPC in the business and numerous efficient e-cores. AMD's approach of just adding more p-cores that are slightly weaker than Intel's won't be better in efficiency in any multi-threaded use case. It is just a matter of time until schedulers get optimized, and by then AMD's Zen 4 is going to be a dinosaur. AMD's "plan" is to just add 50% more cores and some more cache.

    The reason AMD came back from the dead was not that they had the better architecture; they never had it. Ryzen has always been a me-too copycat of Intel's designs. It is just that Intel's fabs failed unexpectedly and they got stuck at 14nm for too long while AMD exploited TSMC. That gave AMD their supposed "efficiency". But this is coming to an end, I am afraid, and AMD is going back to where it belongs: the budget bin. That will teach them to price their products sky-high every time they have a competitive product while pretending to be the pro-consumer company.



  • TemplarGR
    replied
    Originally posted by skeevy420 View Post
    Y'all can't see the forest for the trees. All these "clouds" need lots of high-performance CPU threads. It doesn't matter if LibreOffice, Call of Duty, compilers, or anything else isn't optimized for 128 cores. What matters to them is being able to sell off some cores for some time and knowing that any task on any thread performs just as well. The end user running poorly optimized software isn't the concern of the cloud providers. It's not their fault you didn't set up CMake or used an inferior solution during premium, paid-for time... shit, they want you to run unoptimized solutions so you have to pay for extended runtime.

    On the desktop side AMD can start selling better APUs, since it's not like most non-workstation desktops need more than 6C12T. I'd go with 8C16T to mirror the game consoles. Instead of giving desktops more computing cores they can give them more graphics cores where the removed 120 computing cores would otherwise be.

    Or do something similar to Intel with an 8C16T high-performance CCD, an 8C16T low-performance CCD, and more graphics cores where the removed 112 computing cores would otherwise be.
    Yeah sure buddy, we "can't see the forest for the trees"; it is not like you are a blatant AMD fanboi. E-cores of Alder Lake ARE high performance. They are the equivalent in performance of Core i10xxx. They are not ARM or something low power. So it makes more sense to have more of THOSE when you need 128-256 cores etc. Pure high-performance cores that offer the ultimate per-core performance aren't typically needed in such high numbers in the "clouds". There is a reason Intel went with that approach, and Intel has always been a leading force in x86 trends.... Intel's architectures are simply going to be much more efficient than AMD's, whether you are on a desktop, a workstation, or a server farm.



  • coder
    replied
    Originally posted by pipe13 View Post
    BTW, "make -jN" is a feature I use many times each day. Sure "make -j32" is quick, but plain old default single-threaded "make -j1" stops much closer to the actual command-line compiler error. I usually do "make -j32; make" ftw.
    I have many years' experience with GNU Make. When we switched to CMake, I initially kept using Make as the backend, but found that Ninja was significantly faster. Maybe that has more to do with inefficiencies in CMake's backend for Make than with GNU Make itself.

    And, at least when CMake is used to drive Ninja, it has the nice property of echoing the failed command line and its errors at the end. This eliminates the need to use a serial build or go searching through a logfile to find the cause of a failed build.
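
    For reference, the basic workflow looks something like this (the paths and job count are just examples, and the -S/-B/-j options need a reasonably recent CMake):

        cmake -G Ninja -S . -B build   # generate a Ninja build instead of Makefiles
        cmake --build build -j 32      # on failure, the failing command line and its errors are echoed at the end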

    Prior to using CMake, the GNU Make buildsystem I wrote used extensive metaprogramming techniques to achieve most of the same properties as CMake (e.g. public/private dependencies, with public dependencies' include paths being automatically inherited). The advantage was that it's single-pass. The main benefit we got from switching to CMake is that it's standardized and fairly well documented vs. my ad hoc buildsystem.

    Something I wish Ninja would do is record the previous (user + sys) time to perform each step, so that subsequent builds of the same sources could be more optimally scheduled.



  • pipe13
    replied
    BTW, "make -jN" is a feature I use many times each day. Sure "make -j32" is quick, but plain old default single-threaded "make -j1" stops much closer to the actual command-line compiler error. I usually do "make -j32; make" ftw.



  • coder
    replied
    Originally posted by Sonadow View Post
    And why would someone building FOSS for personal use as a hobby require ECC?
    I use ECC in my own machines when possible (i.e. except for my laptop... grrr). My reason is simple: I value stability and my time more than the price delta between an ECC RAM/platform and non-ECC.

    Where I consider ECC to be a must is in work on high-value data and in servers.

    Speaking specifically of FOSS, I'd say anyone building packages for redistribution should consider the cost to downstream users if they produce a bad build due to memory errors. For that reason, it's probably also a good idea to use a filesystem with checksums, like BTRFS. That said, it seems like most distros have their own build service, which presumably runs on appropriately spec'd server hardware.
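
    For example, on BTRFS you can at least make silent corruption visible after the fact (the mount point below is just a placeholder):

        sudo btrfs scrub start -B /srv/build   # re-verify all data and metadata checksums; -B waits for completion
        sudo btrfs device stats /srv/build     # print the accumulated per-device error counters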

