GCC On AArch64 Handles Rewriting "-march=native" To "-mcpu=native"
Toward the end of 2022, Arm quietly made a change to the GCC AArch64 compiler so that "-march=native" on 64-bit ARM is treated as the equivalent "-mcpu=native" option. The change flew under my radar at the time and drew little attention at large, but it is now finally being officially documented in the hope that other AArch64 compilers adopt similar behavior.
NVIDIA compiler engineer Kyrylo Tkachov landed a documentation patch for the GNU Compiler Collection that outlines the GCC AArch64 behavior change made back in 2022. Mapping "-march=native" to "-mcpu=native" is a convenient helper, given that a lot of software aiming for peak performance has long used "-march=native", which works out well in the vast x86_64 landscape. To avoid shortchanging software that is now entering the Arm space and cares about performance, this helper has been quietly in place since GCC 13.
The commit making the change back in 2022 explained:
aarch64: Rewrite -march=native to -mcpu if no other -mcpu or -mtune is given
We have received requests to improve the out-of-the box experience and performance of AArch64 GCC users, particularly those porting software from other architectures. This has many aspects. One such aspect are apps built natively with an -march=native used as a tuning flag in the Makefile. On AArch64 this selects the right architecture features on GNU+Linux for the host system but tunes for the "generic" CPU target. This patch makes GCC also tune for the host CPU, as well as selecting its architecture. That is, it translates -march=native into -mcpu=native. This maintains the documentation that it "causes the compiler to pick the architecture of the host system" since -mcpu=native does that, but it also gives a better performance experience for the user.
If the user explicitly asked for a particular CPU tuning through -mcpu or -mtune then we don't do this rewriting so that the user option is honoured.
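To make the rule in the commit message concrete, here is a minimal Python sketch of the rewriting logic as described above. It is only a toy model of the behavior, not GCC's actual driver code, and the function name rewrite_march_native is hypothetical.

```python
def rewrite_march_native(args):
    """Toy model of the rewrite described in the commit message.

    -march=native becomes -mcpu=native, unless the user already passed
    an explicit -mcpu= or -mtune= option, in which case nothing changes.
    This is an illustration only, not how GCC implements it internally.
    """
    has_explicit_tuning = any(
        arg.startswith(("-mcpu=", "-mtune=")) for arg in args
    )
    if has_explicit_tuning:
        # The user asked for a particular CPU or tuning: honour it as given.
        return list(args)
    return ["-mcpu=native" if arg == "-march=native" else arg for arg in args]


if __name__ == "__main__":
    # Typical flags carried over from an x86_64 Makefile.
    print(rewrite_march_native(["-O2", "-march=native"]))
    # An explicit -mtune disables the rewrite.
    print(rewrite_march_native(["-O2", "-march=native", "-mtune=neoverse-n1"]))
```

In this model the first call has its "-march=native" swapped for "-mcpu=native", while the second is returned unchanged because an explicit "-mtune" was supplied.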
This wasn't documented back then, though, and only today did a patch land to document the behavior. NVIDIA's Kyrylo Tkachov noted:
Commit dd9e5f4db2debf1429feab7f785962ccef6e0dbd changed -march=native to treat it as -mcpu=native if no other mcpu or mtune option was given. It would make sense to document this, especially if we try to persuade compilers like LLVM to take the same approach. This patch documents that behaviour.
Here's to hoping other compilers now follow suit.