Apple M2 Support Added To Upstream LLVM Along With The A15, A16
Originally posted by name99
I think you do not understand/appreciate what instruction fusion does.
Last edited by coder; 26 September 2022, 10:53 PM.
Originally posted by coder
Not sure where you got the 10% figure, but it's not consistent with what ARM reported (as I quoted in comment 13).
In any case, I just think it's interesting. I don't have a dog in this fight -- just a bemused observer.
Originally posted by coder
I considered that, but I still think things like the ratio of different execution ports can have a measurable effect. Maybe in just a few compute-heavy corner cases, but I'm not convinced it's irrelevant.
Interesting. I'd have expected their OoO would handle that, too. I guess, if you can just patch the compiler, then why bother doing it in hardware?
Originally posted by name99
Losing 3x from not having a SIMD ISA is a big deal. Losing 10% by having autovectorization go down one path rather than another is no big deal.
Originally posted by name99
You don’t need a scheduling model when you’re as OoO as Apple, you really don’t!
Originally posted by name99
All you need is hints to ensure that fused pairs are always placed adjacent in the instruction stream.
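To make the adjacency point concrete, here is a toy sketch in Python. It only checks whether the two halves of a fusible pair sit next to each other in an instruction stream. The pair table is a made-up illustration, not a list of what any Apple core actually fuses, and `fused_pairs` is a hypothetical helper, not anything from LLVM.

```python
# Toy model only: these opcode pairs are illustrative assumptions,
# not a statement of what any real core fuses.
FUSIBLE_PAIRS = {("cmp", "b.ne"), ("adrp", "add")}


def fused_pairs(opcodes):
    """Return the indices i where opcodes[i] and opcodes[i + 1] form a fusible pair."""
    hits = []
    i = 0
    while i < len(opcodes) - 1:
        if (opcodes[i], opcodes[i + 1]) in FUSIBLE_PAIRS:
            hits.append(i)
            i += 2  # both instructions are consumed by the fused op
        else:
            i += 1
    return hits


# Same instructions, two orderings a compiler scheduler might emit.
adjacent = ["adrp", "add", "ldr", "cmp", "b.ne"]
split = ["adrp", "ldr", "add", "cmp", "mul", "b.ne"]

print(fused_pairs(adjacent))  # both pairs are adjacent, so both can fuse
print(fused_pairs(split))     # an unrelated op sits between each pair
```

In this model the second ordering loses every fusion opportunity, even though a wide OoO core would otherwise cope fine with either order. That is the sense in which the compiler only needs to keep the pairs together.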
Originally posted by coder
Thanks for the tip, and I will check it out, but my point still stands about them missing out on SVE-optimized software. So, I think they'll eventually need to add it.
Apple’s bet is that little code will be written specifically for SVE (as opposed to auto-vectorized code). They are probably correct.
It’s no longer the 1990s, not even the 2010s.
Losing 3x from not having a SIMD ISA is a big deal. Losing 10% by having autovectorization go down one path rather than another is no big deal.
Originally posted by coder
Okay, thanks for pointing that out. What I meant was the scheduling model. I was expecting to see a custom scheduler model for the new cores, but I now see that Apple is always just using Cyclone. I'm also noticing they haven't bothered to tune the prefetch parameters since the A7.
Do you think they maintain a different scheduler model, on their internal fork? I guess a way to find out would be to compile the same code with the same version of public LLVM that Apple's tools seem sync'd with.
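A rough sketch of how that comparison could be scripted, assuming you have already produced two assembly listings for the same source (for example with `-S` from each compiler). The helper names and the `public-llvm` / `vendor-clang` labels are placeholders of mine, and the filtering is deliberately crude.

```python
import difflib


def instructions_only(asm_text):
    """Drop comments, blank lines and assembler directives from a listing.

    Crude on purpose: anything starting with '.' is treated as a directive,
    which also drops ELF-style local labels such as '.LBB0_1:'.
    """
    kept = []
    for raw in asm_text.splitlines():
        # ';' starts a comment in Darwin arm64 output, '//' in ELF AArch64 output.
        line = raw.split(";")[0].split("//")[0].strip()
        if line and not line.startswith("."):
            kept.append(line)
    return kept


def asm_diff(public_asm, vendor_asm):
    """Return unified-diff lines between the instruction streams of two listings."""
    return list(
        difflib.unified_diff(
            instructions_only(public_asm),
            instructions_only(vendor_asm),
            fromfile="public-llvm",
            tofile="vendor-clang",
            lineterm="",
        )
    )
```

An empty result would only show that the two compilers agree on that one input. Differences in instruction order would be a hint, not proof, that a different scheduler model is in play.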
Originally posted by name99
Actually it does. Look at the feature list, e.g. the fuse options. You can track these through LLVM to see the exact patterns that are fused.
Originally posted by name99
Or maybe the plan is to provide an alternative to SVE…
SVE is better than the hash Intel has made of AVX but it’s far from perfect in various ways.
Look up the Macroscalar architecture…