I found this test weird. It's not just *tuning* options that are being changed, it's also which instruction-set extensions are enabled. That's interesting to test too (e.g. whether Zen benefits a lot or a little from letting the compiler auto-vectorize with AVX2, use BMI2's more efficient shift instructions, and stuff like that).
-O3 -march=znver1 -mtune=generic would enable all of Zen's instruction-sets, but *tune* the same as plain -O3.
-march=k8-sse3 is a really weird choice. K10 (-mtune=amdfam10 or -mtune=barcelona) would seem to be more sensible. -mtune=bdver4 (Excavator) would also be a good choice to compare against, since it's the next-most-recent AMD CPU.
Anyway, it's hard to know which effects are from different tuning and which are from instruction-sets.
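One way to separate the two effects would be to vary -march and -mtune independently and benchmark every combination. Here's a minimal sketch that just generates the flag strings. The helper name `flag_matrix` and the particular lists are my own choices for illustration, and using -march=x86-64 as the baseline assumes a GCC whose default target is plain x86-64 (so that row should match plain -O3).

```python
from itertools import product

# ISA selection (-march) and tuning (-mtune), varied independently.
# Assumption: "x86-64" is the compiler's default baseline, so
# "-march=x86-64 -mtune=generic" behaves like plain -O3.
MARCH = ["x86-64", "znver1"]
MTUNE = ["generic", "znver1", "bdver4"]


def flag_matrix(opt="-O3"):
    """Return one flag string per (march, mtune) combination."""
    return [f"{opt} -march={m} -mtune={t}" for m, t in product(MARCH, MTUNE)]


if __name__ == "__main__":
    for flags in flag_matrix():
        print(flags)
```

Comparing rows that share -march shows the effect of tuning alone, and comparing rows that share -mtune shows the effect of the extra instruction sets alone.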