A Look At The GCC Compiler Tuning Performance Impact For Intel Ice Lake
For those wondering whether it is worthwhile, performance-wise, to recompile key Linux binaries with the microarchitecture-specific instruction set extensions and tuning for Ice Lake, here are some GCC compiler benchmarks looking at that impact for the Core i7-1065G7 in the Dell XPS 7390.
In particular, this article looks at the effect on the generated benchmark binaries when they are built with the following CFLAGS/CXXFLAGS configurations:
-O3 -march=skylake - Just optimizing for conventional Skylake processors.
-O3 -march=skylake-avx512 - Optimizing for Skylake AVX-512 processors like Skylake-SP/Skylake-X. The skylake-avx512 target enables use of the AVX512F, CLWB, AVX512VL, AVX512BW, AVX512DQ, and AVX512CD instructions.
-O3 -march=icelake-client - Optimizing for Ice Lake client/desktop processors. New instructions exposed here that are not found with Skylake/Skylake-AVX512 include AVX512VBMI, AVX512IFMA, SHA, UMIP, RDPID, GFNI, AVX512VBMI2, AVX512VPOPCNTDQ, AVX512BITALG, AVX512VNNI, VPCLMULQDQ, and VAES. Note there is also the "icelake-server" target for future Ice Lake Xeon Scalable processors, where PCONFIG and WBNOINVD are additionally flipped on.
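To check which target options a given -march value actually turns on with a locally installed compiler, GCC can be asked directly via "gcc -Q --help=target -march=...". The following is a minimal Python sketch, not something used for this article, that compares two targets. The helper names (parse_enabled, enabled_for, new_options) are made up for illustration, and it assumes an x86 GCC that recognizes these -march values and prints options in the usual "-mfoo [enabled]" layout.

```python
import subprocess


def parse_enabled(help_text):
    """Return the set of -m options marked [enabled] in `gcc -Q --help=target` output."""
    enabled = set()
    for line in help_text.splitlines():
        parts = line.split()
        # Boolean target options appear as e.g. "-mavx512f   [enabled]".
        # Valued options such as "-march=  skylake" are skipped.
        if len(parts) == 2 and parts[0].startswith("-m") and parts[1] == "[enabled]":
            enabled.add(parts[0])
    return enabled


def enabled_for(march, gcc="gcc"):
    """Ask the compiler which target options -march=<march> enables."""
    result = subprocess.run(
        [gcc, "-Q", "--help=target", f"-march={march}"],
        capture_output=True, text=True, check=True,
    )
    return parse_enabled(result.stdout)


def new_options(base, target, gcc="gcc"):
    """Options enabled for `target` that are not enabled for `base`."""
    return sorted(enabled_for(target, gcc) - enabled_for(base, gcc))
```

Calling new_options("skylake-avx512", "icelake-client") on an x86 system with GCC installed should list the extra -m switches for the Ice Lake client target. The exact set depends on the GCC version installed.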
Tests were done using Intel's Clear Linux with the Linux 5.3 kernel and the GCC 9.2.1 compiler. No other changes were made to the hardware/software during testing besides the CFLAGS/CXXFLAGS used to build the benchmarks for each run.
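For anyone wanting to repeat this kind of comparison on their own software, the only variable between runs is the pair of environment variables. Here is a rough Python sketch of setting them per configuration. The names CONFIGS, flags_for, and build_env are hypothetical, and how each benchmark is actually rebuilt is left to whatever build system it uses.

```python
import os

# The three -march values compared in this article.
CONFIGS = ["skylake", "skylake-avx512", "icelake-client"]


def flags_for(march):
    """Compiler flags for one configuration: -O3 plus the chosen -march."""
    return f"-O3 -march={march}"


def build_env(march, base_env=None):
    """Copy the environment and override only CFLAGS and CXXFLAGS."""
    env = dict(os.environ if base_env is None else base_env)
    env["CFLAGS"] = env["CXXFLAGS"] = flags_for(march)
    return env
```

The resulting dictionary can be passed as the env argument to subprocess.run() when invoking a build command, so that everything else about the environment stays the same from one run to the next.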