With 80 cores (even if individual cores aren't that great), this could be a hell of a video editing chip, acting rather like Bulldozer does for that job. If you recall, Bulldozer sucked for most workloads, but was something like 2.5 times faster than Phenom II X6 in a straight libx264 encode benchmark. In real-world 1080p video editing, I can render to a finished 1080p file in kdenlive at between 1x and about 1.3x realtime, with 8 threads on four very wide cores.
With 80 individual cores that are again a bit slow, and a video editor written so that nothing runs single-threaded and throttles the rest, you would expect ten times the speed, minus whatever is lost to ARM being slower than x86. That used to be a 4:1 loss, with a 1 GHz single-core ARM performing rather like a 233 MHz Pentium II, caused by in-order execution as I recall. On the other hand, I once tested turning off 2 Bulldozer modules vs. enabling "one thread per core" (disabling SMT) and found that adding the second thread to a core only really added 30% more throughput on a multithreaded job, similar to Intel Hyper-Threading. So you lose some to a slower per-thread architecture and gain some back from each thread having a whole core to itself.
To put numbers on it: single-threaded, ARM is 25% as fast if it is still 1/4 as fast per clock and per core as x86. But 80 threads on 80 full cores should be 2x as fast as a 40-core setup, where 80 threads paired up on 40 cores would only be 1.3x as fast. That makes individual cores 2/1.3, or about 1.5x, as fast per thread as paired cores. 1.5 * 0.25 gives 0.375x as fast per clock per core as Bulldozer, if Bulldozer were as fast per core as other x86 chips, which it is not. Multiply by ten times the thread count and you get 3.75 times as fast, plus a bit more because the per-core, per-clock loss relative to Bulldozer is smaller than relative to a "normal" x86 design.
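For anyone who wants to play with the guesses, here is a rough sketch of that back-of-envelope estimate in Python. The function name and parameter names are made up for illustration, and the default values are just the recalled figures from the post (the 4:1 ARM penalty, the 30% gain from a second thread per core), not measurements of any real chip.

```python
def estimated_speedup(
    arm_vs_x86_per_clock=0.25,  # assumed: the recalled 4:1 ARM penalty
    paired_throughput=1.3,      # assumed: two threads sharing one core/module
    full_throughput=2.0,        # assumed: two threads on two whole cores
    arm_threads=80,
    bulldozer_threads=8,
):
    """Estimate throughput of the 80-core ARM chip relative to an 8-thread Bulldozer."""
    # Per-thread gain from each thread owning a whole core (2 / 1.3, about 1.5x).
    full_core_gain = full_throughput / paired_throughput
    # Per-thread speed relative to one Bulldozer thread at the same clock.
    per_thread = arm_vs_x86_per_clock * full_core_gain
    # Scale by the ratio of thread counts (80 / 8 = 10).
    return per_thread * (arm_threads / bulldozer_threads)


if __name__ == "__main__":
    print(round(estimated_speedup(), 2))
```

Without rounding 2/1.3 down to 1.5, this comes out slightly higher than the 3.75 in the post, at about 3.85. Changing `arm_vs_x86_per_clock` shows how much the whole estimate depends on that one guess.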
Thus, a cluster of 80 overclocked ARM cores clocked as high as my overclocked Bulldozer (4.3 GHz here) should end up being at least 4x as fast in the real world for a perfectly scaling multithreaded job. This would require that no single-threaded job makes all the others wait while it uses more than 1/80th of the total resources. If that worked, we would have realtime rendering of 4K video to H.264, since 4K has four times the pixels of 1080p (rejecting patent-troll favorite H.265, which is twice as CPU intensive).
Right now, this might be an expensive server chip. Ten years from now, that same rack-mount server box with everything in it might sell at a computer show for a few hundred bucks, if even that, as something even faster comes along. Assuming my Bulldozer chip lives that long, this could make a good replacement for it.
Ampere Altra Announced - Offering Up To 80 Cores Per Socket
-
Originally posted by edwaleni View Post
Now that Ampere has a "TM" notation all over their documents, maybe NVidia will stop using it.
I guess all those electronics textbook publishers better start paying royalties.
-
Originally posted by ms178 View Post
It seems they don't support SVE or SVE2 extensions.
-
It is not always about the total performance in synthetic benchmarks. Sometimes many slower cores may be beneficial instead of fewer faster cores.
-
Originally posted by boxie View Post
I would like you to stop for a second and appreciate what you just said... About an ARM server thrashing Intel and just pipping AMD in a benchmark.
Sure, these are vendor numbers and we shouldn't trust them until validated, but - an ARM server going up against high end x86 and not being totally humiliated...
kinda awesome huh?
- Likes 4
-
Now that Ampere has a "TM" notation all over their documents, maybe NVidia will stop using it.
- Likes 1
-
Originally posted by willmore View Post
80 cores vs 64 cores and just eking out a win in one of the easiest multi-core benchmarks?
Sure, these are vendor numbers and we shouldn't trust them until validated, but - an ARM server going up against high end x86 and not being totally humiliated...
kinda awesome huh?
- Likes 8
-
80 cores vs 64 cores and just eking out a win in one of the easiest multi-core benchmarks? That's not much of an endorsement. The TCO graphs are notorious for BS as well, as the 'T' is often not very total. Even if the processors were free, a big machine stuffed full of solid state drives and TBs of DRAM would see little improvement.
I do appreciate more entrants in the server market. It's been clear what even one competitive alternative to Intel has done for performance and performance/cost. A second should help some more. But it's too soon to say if they're competitive in any real sense. I'm quite willing to recompile the world and retune software for a completely different architecture, but that comes with different costs for different workloads. I'm lucky enough to run most of my software from well-maintained open source projects. Not everyone shares similar benefits.
- Likes 2
-
"Ampere has yet to publish much in the way of Altra benchmarks, but from one of the numbers they did disclose during the press briefing was that for SPEC int rate performance"
It sounds like: hey, there is one test we won (but we suck at most of the other ones).
Especially since this company is built around ex-Intel guys.
- Likes 5