**skeevy420** · 22 October 2021, 07:00 AM

More results that show that we need repositories that can push out packages based on gcc feature level.

How come there aren't -O2 native and flto tests?

**Michael** · 22 October 2021, 07:13 AM

Originally posted by skeevy420

How come there aren't -O2 native and flto tests?

Only so much time in a day, especially after seeing not much change and this would end up only as a 1 page article.

**skeevy420** · 22 October 2021, 07:30 AM

Originally posted by Michael

Only so much time in a day, especially after seeing not much change and this would end up only as a 1 page article.

I totally get that. I'm on around hour 19 of a massive file system reshuffle. Long story short: I'm moving all my (mostly) WORM data from LZ4 over to Zstd-19.

Update:

Code:

took 15h 36m 8s at 08:02:12

**syrjala** · 22 October 2021, 08:46 AM

"GCC 12 Compiler Performance"
-> I was expecting a benchmark on how unbearably slow gcc itself has become in recent years. I think a really interesting benchmark would be to take a few significantly older gcc releases and benchmark their compilation speed vs. the speed of the generated code, to see if the increase in compilation time can be justified. Of course, one slight problem is that you also need benchmarks that can be built on both old and new gcc releases. That might limit which benchmarks can be used, and/or which gcc versions can be tested.

**brucethemoose** · 22 October 2021, 09:50 AM

Originally posted by syrjala

"GCC 12 Compiler Performance"
-> I was expecting a benchmark on how unbearably slow gcc itself has become in recent years. I think a really interesting benchmark would be to take a few significantly older gcc releases and benchmark their compilation speed vs. the speed of the generated code, to see if the increase in compilation time can be justified. Of course, one slight problem is that you also need benchmarks that can be built on both old and new gcc releases. That might limit which benchmarks can be used, and/or which gcc versions can be tested.

Default flags/settings will not be the same, so it would be apples to oranges.

**syrjala** · 22 October 2021, 09:59 AM

Originally posted by brucethemoose

Default flags/settings will not be the same, so it would be apples to oranges.

You can test whatever combination of flags you want on any of the compilers. They don't even have to match. The point is whether those extra optimization passes (or just bloat) that make it so slow are actually worth it.
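A rough sketch of what such a comparison could look like, purely as an illustration: time one build and one run per compiler, then compare the compile-time slowdown against the runtime speedup. The helper names (`time_command`, `measure`, `tradeoff`) and the compiler/file names in the usage comment are made up for this example; a real comparison would need repeated runs and a proper benchmark suite.

```python
import subprocess
import time


def time_command(cmd):
    """Run cmd (a list of arguments) and return the wall-clock seconds it took."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start


def measure(compiler, flags, source, binary):
    """Time one build and one run of a single-file benchmark.

    Returns (compile_seconds, run_seconds).
    """
    compile_s = time_command([compiler, *flags, source, "-o", binary])
    run_s = time_command([binary])
    return compile_s, run_s


def tradeoff(old, new):
    """Compare two (compile_seconds, run_seconds) measurements.

    Returns (compile_slowdown, runtime_speedup) as ratios. A value above
    1.0 means the newer compiler takes longer to build, or produces
    faster code, respectively.
    """
    old_compile, old_run = old
    new_compile, new_run = new
    return new_compile / old_compile, old_run / new_run


# Hypothetical usage (compiler binaries and bench.c are assumptions):
#   old = measure("gcc-6", ["-O2"], "bench.c", "./bench-old")
#   new = measure("gcc-11", ["-O2"], "bench.c", "./bench-new")
#   slowdown, speedup = tradeoff(old, new)
```

The flags passed to each compiler are independent arguments here, so they don't have to match between the two measurements.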

**smitty3268** · 22 October 2021, 11:12 AM

Originally posted by syrjala

You can test whatever combination of flags you want on any of the compilers. They don't even have to match. The point is whether those extra optimization passes (or just bloat) that make it so slow are actually worth it.

Newer compilers are also adding more language features. It's not just optimization passes.

**hubicka** · 23 October 2021, 08:33 AM

Periodic testers (maintained by Martin Liska) at SUSE also compare different GCC release branches & trunk
https://lnt.opensuse.org/db_default/..._report/branch (slow to load)

https://lnt.opensuse.org/db_default/v4/SPEC/spec_report/branch?sorting=gcc-11%2Cgcc-trunk&all_elf_detail_stats=on (slow to load showing only gcc11 vs trunk)

long story short:

zen1 (Ryzen 5 1600) with -O2 -flto:

| test | base (GCC 6) | GCC 7 | GCC 8 | GCC 9 | GCC 10 | GCC 11 | trunk (GCC 12) |
|---|---|---|---|---|---|---|---|
| SPECint 2017 | 3.886 | 4.20% | 5.04% | 4.59% | 5.12% | 4.47% | 10.11% |

zen2 (AMD EPYC 7702) -O2 -flto:

| test | base (GCC 6) | GCC 7 | GCC 8 | GCC 9 | GCC 10 | GCC 11 | trunk (GCC 12) |
|---|---|---|---|---|---|---|---|
| SPECint 2017 | 3.609 | 4.32% | 4.12% | 3.51% | 5.46% | 5.89% | 12.13% |

The change between GCC 11 and GCC 12 is mostly due to vectorization being enabled by default at -O2, which greatly improves the x264 benchmark (by 44%). The difference between GCC 6 and GCC 7 comes from the exchange2 benchmark. Changes in SPECfp scores are within a 1% range.
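To read the GCC 11 to GCC 12 step directly off those numbers, the two percentages have to be divided rather than subtracted. A small sketch, assuming every percentage is the change in SPECint score relative to the GCC 6 base on the same machine (the function name `step_gain` is made up here):

```python
def step_gain(prev_pct, next_pct):
    """Relative change (in %) between two releases, given each release's
    percentage change versus the same baseline."""
    return ((1 + next_pct / 100) / (1 + prev_pct / 100) - 1) * 100


# SPECint 2017, -O2 -flto, GCC 11 -> trunk (GCC 12), values from the post
zen1 = step_gain(4.47, 10.11)
zen2 = step_gain(5.89, 12.13)
print(f"zen1: {zen1:.2f}%  zen2: {zen2:.2f}%")
```

Under that assumption the step from GCC 11 to trunk works out to roughly 5.4% on zen1 and 5.9% on zen2.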

zen2 -Ofast -march=native -flto is:

| test | base (GCC 6) | GCC 7 | GCC 8 | GCC 9 | GCC 10 | GCC 11 | GCC 12 |
|---|---|---|---|---|---|---|---|
| SPECfp 2017 | 7.096 | -3.80% | 10.97% | 18.56% | 19.05% | 20.13% | 22.74% |
| SPECint 2017 | 4.020 | 4.00% | 8.20% | 6.47% | 10.23% | 15.49% | 14.48% |

zen1 -Ofast -march=native -flto is:

| test | base (GCC 6) | GCC 7 | GCC 8 | GCC 9 | GCC 10 | GCC 11 | trunk (GCC 12) |
|---|---|---|---|---|---|---|---|
| SPECfp 2017 | 4.240 | 4.67% | 7.89% | 8.56% | 13.90% | 17.50% | 17.26% |
| SPECint 2017 | 7.489 | ~ | 5.32% | 7.06% | 7.52% | 7.75% | 9.02% |

zen2 -Ofast -march=native -flto and pgo is:

| test | base (GCC 6) | GCC 7 | GCC 8 | GCC 9 | GCC 10 | GCC 11 | trunk (GCC 12) |
|---|---|---|---|---|---|---|---|
| SPECfp 2017 | 7.404 | ~ | 11.37% | 16.99% | 17.75% | 18.58% | 18.39% |
| SPECint 2017 | 4.346 | 3.94% | 5.63% | 2.75% | 8.94% | 9.46% | 9.70% |

and zen1 -Ofast -march=native -flto and pgo is:

| test | base (GCC 6) | GCC 7 | GCC 8 | GCC 9 | GCC 10 | GCC 11 | trunk (GCC 12) |
|---|---|---|---|---|---|---|---|
| SPECfp 2017 | 4.538 | 4.61% | 4.56% | 4.44% | 9.14% | 9.01% | 9.68% |
| SPECint 2017 | 7.829 | ~ | 5.68% | 6.13% | 6.73% | 6.73% | 10.22% |

Compile time of complete spec2017 (in seconds) with -O2 -flto:

| base (gcc 6) | gcc 7 | gcc 8 | gcc 9 | gcc 10 | gcc 11 | trunk |
|---|---|---|---|---|---|---|
| 729.993 | 3.42% | 13.22% | ~ | 37.52% | 36.09% | 57.81% |

So performance keeps improving, but compile times keep growing too. I will definitely spend some time again on speeding up trunk once stage1 development ends.
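For a feel of what the compile-time percentages mean in wall-clock terms, they can be converted back to seconds. This is only a sketch: it assumes each percentage is the increase over the 729.993 s GCC 6 base, and that "~" means no significant change (treated as equal to the base here).

```python
BASE_SECONDS = 729.993  # GCC 6 build time of SPEC 2017 quoted in the post

# Percent change versus GCC 6; None stands for "~"
# (assumption: no significant change from the base).
compile_change = {
    "gcc 7": 3.42,
    "gcc 8": 13.22,
    "gcc 9": None,
    "gcc 10": 37.52,
    "gcc 11": 36.09,
    "trunk": 57.81,
}


def to_seconds(base, pct):
    """Convert a percentage change versus base into absolute seconds."""
    if pct is None:
        return base
    return base * (1 + pct / 100)


seconds = {name: to_seconds(BASE_SECONDS, pct) for name, pct in compile_change.items()}

for name, secs in seconds.items():
    print(f"{name:7s} {secs:8.1f} s")
```

On those assumptions, trunk comes out at about 1152 s for the full build, versus about 730 s for GCC 6.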

More results (for Kaby Lake, zen1 and zen2, and also for SPEC 2006 and other flags) can be seen at the link above, as well as a breakdown into individual tests. Curiously zen1 machine seems to

**arQon** · 23 October 2021, 07:31 PM

Originally posted by skeevy420

More results that show that we need repositories that can push out packages based on gcc feature level.

Once upon a time, you could do this on a practical level just by building from source. But then clusterf**ks like CMake happened, and more and more people started jumping on the NIH bandwagon and reinventing autoconf and make, so now you need 20 different build systems installed to do that.

It's probably just about doable still if you have infinite patience and near-infinite free time (and I'm sure there'll be an Arch user or somesuch who does exactly that, and will be more than happy to tell us all about it! :P) but unless you have a specific package that you run for hours per day it's a huge net loss overall.

An Early Look At The GCC 12 Compiler Performance On AMD Zen 3
