Apple Announces Its New M2 Processor


  • schmidtbag
    replied
    Originally posted by Developer12 View Post
    You can run a full copy of Debian on the M1 today, and the only thing that would affect benchmarks is the lack of power management. Yet the benchmarks largely come out the same between macOS and Linux.
    Well, that ultimately proved the point I was trying to make. So, while I wasn't aware you could daily drive an M1 (part of me questions how usable it really is...), my point was that Linux isn't going to have such a distinct lead once you get a full-blown desktop running on it.

    Leave a comment:


  • kgardas
    replied
    Originally posted by Developer12 View Post

    x86 chips still pay the price despite all the instruction caching they claim. There's no free lunch for a bad ISA. That caching is of limited size, consumes massive amounts of die area on top of the decoding circuitry, and the ISA still imposes a low limit on how quickly you can decode *new* code while following program execution. Since the dawn of the Pentium, x86 has always spent more than double the number of transistors to achieve the same performance.
    Yes, sure, x64 is indeed paying the price. And so what? ARM is paying a price too, although not as big a one. Every OoOE CPU pays a translation price. The question is price/energy/perf/availability. Here Apple loses on 'availability'. Let's see what their new Mac Pro will look like...

    Leave a comment:


  • kgardas
    replied
    Originally posted by mangeek View Post

    I've been saying this for a while, but if Intel made a similar consumer CPU package that had 16GB of very high speed RAM next to the processor on one bus, and then had any additional/user-added RAM hang off of a CXL link, I think they'd sell like hotcakes and reduce their costs. They could make one part that covered 90% of desktop/laptop use cases, maybe laser-off half of the RAM or a few cores for the 'low end' models.

    I really don't think Intel is doing themselves any favors by making 80+ flavors of Alder Lake for every use case. Just make the one I mentioned above for 'casual computing' (with a handful of clock limits to stratify the market) and call them "Evo '22 [7/5/3]".
    With ADL they are making just 3 die types, not more.

    Leave a comment:


  • kgardas
    replied
    Originally posted by Developer12 View Post

    I'm speaking in general. Even as early as the Pentium Pro and the Sun Ultra 5, the Pentium needed double the transistors for its instruction decoding and control. Its "RISC-like core" didn't save it that embarrassment.
    Oh. So you are probably comparing the UltraSPARC IIi with the Pentium II, both from 1997. Hmm, all right, so the Pentium required a bit more energy for the same performance. But that was 25 years ago. And where do you see UltraSPARC now? And what's its price/perf?

    So yes, I very much like SPARC, but honestly it lost. The question now is whether x86/x64 is going to lose to ARM/RISC-V or not.

    Leave a comment:


  • Developer12
    replied
    Originally posted by HEL88 View Post

    So, ARM is a bad ISA too, because the Cortex-A77, A78, X1, X2 and the Neoverse V1, V2 and N2 (every top-performance ARM processor since 2019) have a uOP cache like x86.

    The A78, X1, X2, V1 and V2 have a big ~3000-uOP cache like Zen 2/3; the others (A77, N2) have a smaller 1500-uOP one.
    There's a difference between a uop cache and an instruction decoding cache, much as Intel/AMD like to muddy the waters with a "but our CPUs are RISC on the inside!" tagline.

    The micro-op cache holds uops that have already been decoded and are waiting for reordering. An instruction decode cache stores translation mappings between instructions and the resulting uops. At any rate, for those chips that *do* feature some kind of instruction translation cache, you might actually want to check how much die area it consumes. (A toy sketch of what a uop cache caches follows after this post.)

    Leave a comment:
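
A minimal sketch of the uop-cache idea discussed above, assuming a simple address-keyed, LRU-evicted structure; the class name and the 1500-entry capacity are illustrative only (the capacity echoes the figure quoted earlier), not a model of any specific core:

```python
# Toy model of a micro-op cache: already-decoded uops are stored keyed by
# fetch address, so a hit lets the front end skip the decoders entirely.
from collections import OrderedDict

class ToyUopCache:
    def __init__(self, capacity_uops=1500):      # illustrative capacity
        self.capacity = capacity_uops
        self.entries = OrderedDict()              # fetch address -> decoded uops
        self.used = 0

    def lookup(self, addr):
        uops = self.entries.get(addr)
        if uops is not None:
            self.entries.move_to_end(addr)        # mark as most recently used
        return uops                               # None means "run the decoders"

    def fill(self, addr, uops):
        # Evict least recently used entries until the new uops fit.
        while self.used + len(uops) > self.capacity and self.entries:
            _, evicted = self.entries.popitem(last=False)
            self.used -= len(evicted)
        self.entries[addr] = list(uops)
        self.used += len(uops)

# First fetch decodes and fills; a second fetch of the same address hits.
cache = ToyUopCache()
cache.fill(0x1000, ["load", "add", "store"])
print(cache.lookup(0x1000))                       # served without re-decoding
```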


  • YamashitaRen
    replied
    Originally posted by tunnelblick View Post
    The M2 MBA is, again, fan-less. What else is there to say? Wake me up when AMD or Intel come up with a chip that can be cooled with a heat pipe and delivers the same performance the M1/2 offers.
    You can dislike Apple and their OS or their philosophy or whatever but their CPUs are great.
    This.
    I hate Apple so much, but I still use a 13.3" M1 Pro and it is the best laptop I have ever bought.
    Not once have I heard its fan.
    Meanwhile, my Ryzen 3 laptop made me want to kill myself. I had to use it unplugged so that the CPU would be throttled, which made the user experience merely average.

    You can all throw your fancy copium numbers around, but the reality is that the M1 design is vastly superior from a perf/watt perspective. And that should be one of the main design goals for laptops, given how they are used.

    Leave a comment:


  • Developer12
    replied
    Originally posted by kgardas View Post

    Seriously, I very much doubt your claim here. What performance? Raw numbers or perf/watt? And in comparison with what exactly? I hope you understand that you can't precisely compare two CPUs when they are on different processes, right? And you don't know their transistor counts either...
    I'm speaking in general. Even as early as the Pentium Pro and the Sun Ultra 5, the Pentium needed double the transistors for its instruction decoding and control. Its "RISC-like core" didn't save it that embarrassment.

    Leave a comment:


  • Developer12
    replied
    Originally posted by piotrj3 View Post

    The efficiency cost x86 pays for being an old architecture is about 5%, according to some Intel engineer. Internally the processor is RISC-like anyway; the only extra work is decoding CISC into RISC.

    In fact, AMD's 6xxx-series mobile CPUs trade blows very well with the M1 on efficiency, losing just a little in single-threaded scores but actually beating the M1 per watt in the 8-core configuration.

    By far the biggest impact is Apple's memory configuration (RAM chips soldered very close to the CPU), so Apple pays a smaller price for anything not in cache, plus probably a less complicated I/O part of the die thanks to the closed platform.

    And lastly, the number of transistors... oh boy.

    The 12900K is around ~10 billion transistors (no official data, but that is likely an overestimate).
    The RTX 3090 is 28.3 billion transistors (official data).

    The M1 Ultra is 114 billion transistors. In a nutshell, in the silicon budget of one M1 Ultra you can fit roughly three 12900K + RTX 3090 combos.

    Now, yes, the M1 Ultra is faster in multithreaded workloads than the 12900K and consumes less power. But we are talking about an insane transistor difference here. A 12900K spending 5 times the number of transistors on just E cores would bring huge efficiency improvements if they were tuned towards, let's say, 3 GHz. And in single-threaded work the 12900K wins.

    At least in the CPU war Apple is competitive. But when we get to the RTX 3090... oh boy. In Blender/V-Ray etc. (and I am talking about CUDA/Vulkan vs Metal) we are looking at a 500% performance difference in favor of the RTX 3090. In fact a stock RTX 3090 is more power efficient per unit of work done, measured at whole-computer consumption, than the M1 Ultra. And the RTX 3090 is seen as "the inefficient one", where dropping the TDP to 75% still retains 96% of the performance. (A quick check of the transistor arithmetic follows after this post.)
    >according to some intel engineer
    >"internally a risc processor anyway"

    Maybe don't get your figures and taglines from the guys getting creamed.

    Leave a comment:
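
A quick back-of-the-envelope check of the transistor figures quoted in the post above (a sketch; the 12900K count is the unofficial ~10 billion estimate mentioned there, not a confirmed number):

```python
# Transistor budgets cited in the post (in transistors).
m1_ultra  = 114e9    # M1 Ultra, official figure
i9_12900k = 10e9     # Core i9-12900K, rough unofficial estimate
rtx_3090  = 28.3e9   # GeForce RTX 3090, official figure

combo = i9_12900k + rtx_3090
print(f"12900K + RTX 3090 combo: {combo / 1e9:.1f}B transistors")
print(f"M1 Ultra / combo:        {m1_ultra / combo:.2f}x")   # ~2.98, i.e. roughly 3x
```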


  • Developer12
    replied
    Originally posted by Weasel View Post
    Must be why x86 chips, despite having lower transistor density (because they are built on an inferior node), are still faster than ARM, right? Or why the fastest supercomputer is x86-based, huh?

    I don't think performance means what you think it does. If you mention power efficiency please never touch the internet again.

    Apple fanboys are more delusional than clowns eating Russian propaganda.
    1. The nodes aren't that different. Only one generation.
    2. The dies aren't even close to the same size. x86 dies are waaaaaay bigger.

    No, I mean literal performance: instructions completed per second. Even if you go all the way back to the Pentium Pro and a Sun Ultra 5, the Pentium needed twice as many transistors to get even close to the same level of performance. The difference between them, or between modern x86 chips and the M1, is that you burn too many transistors on pipeline control, instruction decoding and instruction caching when implementing the x86 ISA.

    And sure, fewer transistors can also get you more power efficiency. To be honest, power efficiency is Intel's biggest problem right now even for performance: the heat dissipation of Alder Lake is seriously knee-capping the clock speeds it can hit and sustain. (A minimal sketch of the instructions-per-second metric follows after this post.)

    Leave a comment:
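
A minimal sketch of the "instructions completed per second" metric invoked above, assuming the usual throughput identity (instructions per second = IPC x clock frequency); the core labels and numbers are hypothetical placeholders, not measurements of any real chip:

```python
def instructions_per_second(ipc: float, clock_hz: float) -> float:
    # Average instructions retired per cycle times cycles per second.
    return ipc * clock_hz

# Hypothetical example: a wide, lower-clocked core vs a narrower, higher-clocked one.
wide_core   = instructions_per_second(ipc=6.0, clock_hz=3.2e9)  # placeholder values
narrow_core = instructions_per_second(ipc=4.0, clock_hz=5.0e9)  # placeholder values
print(f"wide core:   {wide_core / 1e9:.1f} G instructions/s")
print(f"narrow core: {narrow_core / 1e9:.1f} G instructions/s")
```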


  • Developer12
    replied
    Originally posted by schmidtbag View Post

    You do both realize that you can't yet daily drive Linux on an M1 Mac, right? Running a few synthetic benchmarks through a primitive interface isn't exactly an Apples to apples (pun intended) comparison.
    This was the same sort of thinking as back in the days when games ran faster in WINE than on Windows, because WINE simply lacked the rendering capabilities. It was essentially running games at a lower detail level.
    Strip macOS down to just a command line (which, last time I checked, was actually possible - not sure if it still is) and I'm sure the benchmarks would turn out roughly the same.
    Run Linux with GNOME or KDE with compositing effects on, benchmark graphical programs like Chrome or a game, and Linux isn't going to have such an obvious lead anymore.

    I'm not favoring Apple here, I'm just saying that when all things are equal, they do a damn good job optimizing form and function. We can whine about their closed nature all day but they know what they're doing. Apple might not have the most optimal solution for each individual program but their forced homogeneity allows more complex software to run more efficiently. In the modern world, everything is layered with abstraction. Apple is effectively removing some of these layers.


    It's loosely based on a BSD kernel. Some here would argue BSD is even faster than Linux.
    Yes, by all means continue to tell me about how I can't daily drive this primitive interface: https://regmedia.co.uk/2021/08/23/asahi-gnome.jpg
    Go do some research before you open your mouth.

    You can run a full copy of Debian on the M1 today, and the only thing that would affect benchmarks is the lack of power management. Yet the benchmarks largely come out the same between macOS and Linux.

    Leave a comment:
