GCC 6/7 Gets A Performance-Sensitive Fix
Written by Michael Larabel in GNU on 9 May 2017 at 09:06 AM EDT.
A Phoronix reader pointed out a performance regression fix now available for GCC 6 and GCC 7 that could help some rather trivial C code perform much better.

A GCC bug was opened over the poor code the compiler generated for the following snippet, which can show up in distance measurements and other areas:
x_x = (x * x) / 200;
y_y = (y * y) / 200;

The instructions generated by GCC came down to:
movl %edi, %r13d
imull %edi, %r13d
movl %r13d, %eax
sarl $31, %r13d
imull %ebx
sarl $6, %edx
movl %edx, %ecx
subl %r13d, %ecx

While Clang was generating a more efficient alternative:

movl %edx, %ebp
imull %ebp, %ebp
imulq $1374389535, %rbp, %rbp # imm = 0x51EB851F
shrq $38, %rbp
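
For readers wondering where the constant in the Clang listing comes from: 1374389535 (0x51EB851F) is 2^38 / 200 rounded up, so a 64-bit multiply followed by a right shift of 38 gives the same result as dividing a non-negative 32-bit value by 200. The GCC listing uses the same shift total (the high 32 bits of the product, then 6 more) but also carries a sign-correction step (the sarl $31 and subl), which is not needed when the operand cannot be negative. Below is a minimal C sketch of the multiply-and-shift idea; the function names are hypothetical and this is only an illustration, not the code from the GCC bug report or the fix itself.

```c
#include <stdint.h>

/* Hypothetical helper: divide an unsigned 32-bit value by 200 using the
   multiply-and-shift form seen in the Clang listing, with no divide
   instruction involved. */
static uint32_t div200_mulshift(uint32_t n)
{
    /* 1374389535 == 0x51EB851F == ceil(2^38 / 200) */
    const uint64_t magic = 1374389535u;

    /* Widen to 64 bits so the product cannot overflow, then shift by 38. */
    return (uint32_t)(((uint64_t)n * magic) >> 38);
}

/* Hypothetical wrapper mirroring the article's snippet: (x * x) / 200.
   Assumes x * x fits in a signed 32-bit int, as the original C code does,
   so the square is known to be non-negative. */
static int squared_over_200(int x)
{
    return (int)div200_mulshift((uint32_t)(x * x));
}
```

In practice there is no reason to write this by hand, since the compiler performs the transformation on its own; the sketch only shows why the shorter instruction sequence computes the same value.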

This case has now been fixed in GCC's code as of this week. The better-generated code is approximately 15% faster. Thanks to Sven for pointing it out; more details can be found in the GCC bug report.

There will be fresh GCC/Clang Linux compiler benchmarks coming up on Phoronix in the next few weeks.
About The Author
Michael Larabel is the principal author of Phoronix.com and founded the site in 2004 with a focus on enriching the Linux hardware experience. Michael has written more than 20,000 articles covering the state of Linux hardware support, Linux performance, graphics drivers, and other topics. Michael is also the lead developer of the Phoronix Test Suite, Phoromatic, and OpenBenchmarking.org automated benchmarking software. He can be followed via Twitter or contacted via MichaelLarabel.com.
