Intel Makes One Line Tweak To GCC To Fix A "Random Performance Penalty"
On my radar today is a commit to the GNU Compiler Collection (GCC) that adjusts the loop alignment used by the generic tuning path for Intel processors. The change is meant to address "some random performance penalty in benchmarks" tied to small loops straddling cache lines.
Intel compiler engineer Haochen Jiang made this commit to GCC that landed today in GCC 15 Git:
"Previously, we use 16:11:8 in generic tune for Intel processors, which lead to cross cache line issue and result in some random performance penalty in benchmarks with small loops commit to commit.
After changing to always aligning to 16 bytes, it will somehow solve the issue."
That's the entire explanation: the loop alignment change is said to take care of the performance penalty, but there is no further insight into the workload(s) tested and no numbers quantifying the impact. And, yes, "it will somehow solve the issue" isn't exactly reassuring. Hopefully, though, this is a nice one-line improvement for GCC-built binaries that use this tuning path and run on Intel CPUs.
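For those wondering how a "16:11:8" setting can leave a small loop straddling a cache line while a plain 16 does not, here is a rough Python sketch. It is only a toy model and not GCC's actual code: it assumes the n:m:n2 alignment semantics as described in the GCC documentation for options like -falign-loops (align to n bytes if that skips at most m-1 bytes, otherwise fall back to n2), a 64-byte cache line, and a made-up 16-byte loop body. The helper names are hypothetical.

```python
CACHE_LINE = 64  # assumed cache line size in bytes


def align_up(addr, n):
    """Round addr up to the next multiple of n."""
    return (addr + n - 1) // n * n


def place_loop(addr, n, m=None, n2=None):
    """Toy model of an n:m:n2 alignment request (hypothetical helper).

    Align to n if that needs at most m-1 padding bytes (m defaults to n);
    otherwise fall back to aligning to n2 when a fallback is given.
    """
    max_skip = (m if m is not None else n) - 1
    pad = align_up(addr, n) - addr
    if pad <= max_skip:
        return addr + pad
    if n2 is not None:
        return align_up(addr, n2)
    return addr


def crosses_line(start, size):
    """True if [start, start + size) spans more than one cache line."""
    return start // CACHE_LINE != (start + size - 1) // CACHE_LINE


LOOP_SIZE = 16  # made-up loop body size in bytes

# Try every starting offset within one cache line and count straddling loops.
old_bad = [a for a in range(CACHE_LINE)
           if crosses_line(place_loop(a, 16, 11, 8), LOOP_SIZE)]
new_bad = [a for a in range(CACHE_LINE)
           if crosses_line(place_loop(a, 16), LOOP_SIZE)]

print("offsets that straddle a line with 16:11:8:", old_bad)
print("offsets that straddle a line with 16:     ", new_bad)
```

In this model the old setting falls back to 8-byte alignment whenever more than 10 bytes of padding would be needed, so a loop that would have started at offset 49 lands at 56 and spills into the next line. Which offset a loop gets depends on all the code placed before it, which would fit the "commit to commit" randomness the commit message mentions. With a flat 16-byte alignment, a loop of up to 16 bytes cannot cross a 64-byte line at all.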
In any case, every little bit counts, and Intel engineers continue contributing a lot to the upstream GCC and LLVM/Clang compilers.