Most of the geomean difference is due to the two oneDNN benchmarks. I profiled it with perf on Zen 4: it is a JIT-based benchmark, and most of the time is spent in JIT-produced code and the OpenMP runtime:
5.15% benchdnn libomp.so [.] 0000000000058a…
3.38% benchdnn libomp.so [.] 0x00000000000c…
2.69% benchdnn libm.so.6 [.] __ieee754_logl…
1.16% benchdnn benchdnn [.] rnn::fill_memo…
0.90% benchdnn libdnnl.so.3.1 [.] std::_Function…
0.78% benchdnn libdnnl.so.3.1 [.] std::_Function…
0.41% benchdnn libdnnl.so.3.1 [.] dnnl::impl::cp…
0.38% benchdnn libomp.so [.] 0x00000000000c…
0.27% benchdnn libdnnl.so.3.1 [.] std::_Function…
0.24% benchdnn libomp.so [.] 0x00000000000c…
0.24% benchdnn libdnnl.so.3.1 [.] dnnl_memory_de…
0.23% benchdnn libm.so.6 [.] __logl
0.23% benchdnn libdnnl.so.3.1 [.] dnnl::impl::cp…
0.19% benchdnn libomp.so [.] 0x00000000000c…
0.16% benchdnn benchdnn [.] rnn::fill_weig…
0.16% benchdnn libomp.so [.] 0x000000000008…
0.14% benchdnn libomp.so [.] 0x000000000008…
0.12% benchdnn libomp.so [.] 0x00000000000c…
0.12% benchdnn libomp.so [.] 0x00000000000c…
0.12% benchdnn libomp.so [.] 0x00000000000c…
0.12% benchdnn libomp.so [.] 0x00000000000c…
0.12% benchdnn libomp.so [.] 0x00000000000c…
0.11% benchdnn libomp.so [.] 0x000000000005…
0.11% benchdnn libomp.so [.] 0x00000000000c…
0.10% benchdnn libomp.so [.] 0x000000000005…
0.10% benchdnn libomp.so [.] 0x00000000000c…
0.09% benchdnn libomp.so [.] 0x00000000000c…
0.09% benchdnn libomp.so [.] 0x00000000000c…
0.07% benchdnn libomp.so [.] 0x00000000000c…
0.07% benchdnn libomp.so [.] 0x00000000000c…
0.07% benchdnn libdnnl.so.3.1 [.] std::_Function…
0.06% benchdnn libomp.so [.] 0x00000000000c…
0.05% benchdnn libomp.so [.] 0x00000000000c…
0.05% benchdnn libdnnl.so.3.1 [.] dnnl::impl::cp…
0.05% benchdnn benchdnn [.] dnn_mem_t::set…
0.05% benchdnn libdnnl.so.3.1 [.] dnnl::impl::cp…
0.05% benchdnn libomp.so [.] 0x00000000000c…
0.04% benchdnn libomp.so [.] 0x00000000000c…
0.03% benchdnn libomp.so [.] 0x00000000000c…
0.03% benchdnn libm.so.6 [.] __logf_fma
0.03% benchdnn libomp.so [.] 0x000000000005…
0.03% benchdnn libdnnl.so.3.1 [.] std::_Function…
0.03% benchdnn benchdnn [.] round_to_neare…
0.02% benchdnn libdnnl.so.3.1 [.] std::_Function…
0.02% benchdnn libomp.so [.] 0x00000000000c…
0.02% benchdnn libdnnl.so.3.1 [.] std::_Function…
0.02% benchdnn libomp.so [.] 0x00000000000c…
0.02% benchdnn libc.so.6 [.] __sched_yield
0.02% benchdnn libc.so.6 [.] __memset_avx2_…
0.02% benchdnn libdnnl.so.3.1 [.] std::_Function…
0.02% benchdnn libomp.so [.] 0x00000000000c…
0.02% benchdnn libomp.so [.] 0x00000000000c…
0.02% benchdnn [unknown] [k] 0xffffffffb244…
0.02% benchdnn [unknown] [k] 0xffffffffb257…
0.01% benchdnn libdnnl.so.3.1 [.] std::_Function…
So it is not really testing the quality of code generation in the compiler.
LLVM Clang 16 vs. GCC 13 Compiler Performance On AMD 4th Gen EPYC "Genoa"
Originally posted by rene:
So cool this Linux distribution now has a clang by default as sys-cc and clang lot the linux kernel on other supported architectures, ... https://www.youtube.com/watch?v=nLyUhEMwGws !

Hmm, one of those is not like the rest.
Last edited by skeevy420; 31 May 2023, 12:30 PM.
Originally posted by oleid:
So in the end it's a "benchmark your code to see what works best".

At work I use clang for development and gcc for deployment, mostly due to clang's faster compile times for C++ code. Diagnostics are mostly on par nowadays.
Last edited by carewolf; 31 May 2023, 03:37 AM.
Originally posted by filbo:
I'd also be quite curious to see the entire set of tests also run on both compilers with '-march=blended', or whatever is used these days to generate broadly operable output code.
-march=x86-64-v2
-march=x86-64-v3
-march=x86-64-v4
All of these are generic targets for x86-64 processors. Each level requires a minimum set of instruction-set extensions, and they are backwards compatible: a binary built with plain -march=x86-64 will run just fine on a CPU that supports x86-64-v4, albeit without the benefits that the newer instruction sets bring.
These tests should also be run against the preceding releases, to judge whether it is worthwhile to invest DevOps and CI/CD effort in switching:
- gcc-11.4
- gcc-12.3
- clang-14
- clang-15
The 'Number Of First Place Finishes' chart uses two nicely offsetting colors. Please use those same colors for all the individual bar charts so it is more readily apparent which is which. (I do see that in this article, the two contenders always appear in the same order, but you don't always do it that way; and color hinting is an improvement in any case.)
Since this was nominally a test of '-march=znver4', it would be nice to see that explicitly tested vs. '-march=native'. It would also be sufficient to say in the text that you ran a few representative tests on each compiler and confirmed that the two are equivalent, as intended.
I'd also be quite curious to see the entire set of tests also run on both compilers with '-march=blended', or whatever is used these days to generate broadly operable output code.
Finally, in the final geometric mean chart, it would be very interesting to see a 'fastest' contender. This would be a fake run where each individual benchmark got the fastest score of any of the individual real contenders. 'Fastest' would be the 100% contender, reducing each of the other competitors to a somewhat smaller number showing what fraction of the overall fastest possible score it can deliver. This seems particularly interesting in cases like this, where the final geomean seems to show that they're quite close together, yet there are many individual tests where one or the other rockets ahead. Seeing real contenders with, say, 96% scores would tell you it isn't that important to pick and choose for each binary; real scores in the 80% range might indicate that you really *should* think about mix and match.
So cool this Linux distribution now has a clang by default as sys-cc and clang lot the linux kernel on other supported architectures, ... https://www.youtube.com/watch?v=nLyUhEMwGws !