Originally posted by Barley9432
Benchmarking The Linux 5.19 Kernel Built With "-O3 -march=native"
-
First, why GCC 11 on a 12th-gen processor? Quoting Phoronix itself from a few months ago: "GCC 11 as the stable compiler introduced earlier this year there was the initial Intel "alderlake" target. However, that initial implementation was carrying the existing Ice Lake cost table that was not tuned for Alder Lake processors that launched last month. Merged for GCC 12 is that tuned Alder Lake support in place for those compiling binaries specifically using the "-march=alderlake" option."
Secondly, AFAIK the kernel has built-in flags such as -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -mno-avx to keep GCC from using those instruction sets in unexpected optimizations, so I don't know whether a simple KCFLAGS setting would override them. If it doesn't, there won't be much difference between native and x86-64.
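One way to check the question above empirically, rather than guess, is to ask the compiler which ISA feature macros it still predefines when the -mno-* flags and -march=native are combined. The following is only a rough sketch: the helper names (`parse_defined`, `isa_from_dump`, `enabled_isa`) are made up for illustration, it assumes an x86 machine with `gcc` on the PATH, and it only looks at a handful of macros.

```python
import subprocess

# A few ISA feature macros GCC predefines when the extension is enabled.
ISA_MACROS = ("__MMX__", "__SSE__", "__SSE2__", "__AVX__", "__AVX2__",
              "__BMI__", "__BMI2__")


def parse_defined(dump):
    """Return the set of macro names found in `gcc -dM -E` output."""
    names = set()
    for line in dump.splitlines():
        parts = line.split(None, 2)
        if len(parts) >= 2 and parts[0] == "#define":
            # Drop a parameter list such as FOO(x) if there is one.
            names.add(parts[1].split("(")[0])
    return names


def isa_from_dump(dump):
    """List which of the ISA_MACROS appear in a macro dump."""
    defined = parse_defined(dump)
    return [m for m in ISA_MACROS if m in defined]


def enabled_isa(flags, cc="gcc"):
    """Run the compiler (hypothetical helper) and report surviving ISA macros."""
    result = subprocess.run(
        [cc, *flags, "-dM", "-E", "-x", "c", "/dev/null"],
        capture_output=True, text=True, check=True,
    )
    return isa_from_dump(result.stdout)
```

Comparing `enabled_isa(["-march=native"])` with `enabled_isa(["-mno-sse", "-mno-mmx", "-mno-sse2", "-mno-3dnow", "-mno-avx", "-march=native"])` on the build machine would show whether the explicit -mno-* flags still win once -march=native is appended. Extensions that are not disabled explicitly (BMI, for example) may well remain enabled, so "native" could still differ from plain x86-64.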
- Likes 2
-
Originally posted by mercuriete:
As a Gentoo user I wanted to say ...
The time spent in the kernel is very small compared with userspace.
This article didn't benchmark boot time, where the kernel runs 100% of the time (after firmware/BIOS and before init).
In my personal experience I gained one or two seconds of boot time after enabling -march=native with -O2 on the kernel.
I need to redo my test, but at the time the gain was very clear to me using systemd-analyze.
TL;DR:
Boot time is better with -march=native in my tests.
The kernel itself completely loads in less than a second.
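For anyone who wants to redo that comparison, the kernel's share of boot time can be pulled out of the first line that `systemd-analyze time` prints. This is a minimal sketch, not a finished tool: the function names are invented here, it only understands durations written with min, s and ms, and the line format is assumed to look like "Startup finished in ... (kernel) + ... (userspace) = ...".

```python
import re
import subprocess

_UNIT_SECONDS = {"min": 60.0, "s": 1.0, "ms": 0.001}
_TOKEN = re.compile(r"(\d+(?:\.\d+)?)(min|ms|s)")
# One or more duration tokens followed by the phase name in parentheses.
_PHASE = re.compile(r"((?:\d+(?:\.\d+)?(?:min|ms|s)\s*)+)\((\w+)\)")


def to_seconds(text):
    """Convert a duration such as '1min 2.25s' to seconds."""
    return sum(float(value) * _UNIT_SECONDS[unit]
               for value, unit in _TOKEN.findall(text))


def parse_phases(line):
    """Map each boot phase (kernel, initrd, userspace, ...) to seconds."""
    return {name: to_seconds(duration)
            for duration, name in _PHASE.findall(line)}


def kernel_share(phases):
    """Fraction of the summed phase time that was spent in the kernel."""
    total = sum(phases.values())
    return phases.get("kernel", 0.0) / total if total else 0.0


def boot_phases():
    """Run systemd-analyze (must be installed) and parse its first line."""
    out = subprocess.run(["systemd-analyze", "time"],
                         capture_output=True, text=True, check=True).stdout
    lines = out.splitlines()
    return parse_phases(lines[0]) if lines else {}
```

Running `boot_phases()` a few times on each kernel build and comparing the "kernel" entry would show whether a one-to-two-second gain is plausible when the kernel phase itself is under a second.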
- Likes 3
-
Originally posted by brad0:
Except it is not at all.
As someone else already commented, maybe rerun this test on Ice Lake or one of the Skylake derivatives, and maybe a Zen 3, so the picture becomes a bit clearer.
- Likes 2
-
Originally posted by mlau:
To me it is ... I'd have expected that using all the additional instruction set extensions that have come out since the original K8 was designed (BMI1/2 and MOVBE in particular) would have had a more positive impact. Or the GCC tuning model for Alder Lake is simply garbage.
As someone else already commented, maybe rerun this test on Ice Lake or one of the Skylake derivatives, and maybe a Zen 3, so the picture becomes a bit clearer.
-
I would suggest an -O2 -march=native benchmark as well. If that beats plain -O2, fine; if it beats -O3 as well, even better. You could also go the other way and try -Os -march=native.
I can probably also explain some of the benchmarks: the kernel hogs the MMX/SSE/AVX registers, the handcrafted code in some userspace programs wants those registers free, and userspace loses in those cases, while kernel time is better for I/O, MMU and scheduler work.
Last edited by erniv2; 13 July 2022, 01:42 PM.
-
Originally posted by Michael:
Yep it's all open source. People are lazy?
- Likes 3
-
Maybe the problem is that running the kernel isn't the sole purpose of the machine. If you optimize the kernel too much (say, it can now keep an additional variable in a register instead of on the stack), that register can't be used by the next instruction, and more push/pop is needed to free it. But I ain't an expert, so what do I know.
- Likes 2
-
Originally posted by Anux:
Do all Gentoo users have the same i5-12600K CPU, or how did you come to that conclusion?
You know that GCC with "native" can only optimize for one CPU core type and one cache size? This CPU has different cores with different amounts of cache, so it's going to run badly on one or the other.
Not sure what a bug report would achieve; I know of no way to put two different binaries in one and then run each on the corresponding core. The easiest fix is: don't use native on such a CPU.
I guess Phoronix needs to retest with the E-cores disabled.
It is also possible that GCC's native check ran on the E-cores, so the kernel was compiled with optimizations for the E-cores instead of the P-cores.
Last edited by zamroni111; 13 July 2022, 07:01 PM.
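Whether the native detection really depends on which core GCC happens to run on could be tested by pinning the process to one CPU and asking GCC what -march=native resolves to. A rough sketch under several assumptions: Linux only (it relies on `os.sched_setaffinity`), `gcc` on the PATH, helper names invented for this post, and which CPU numbers are P-cores or E-cores is machine-specific and has to be looked up separately.

```python
import os
import re
import subprocess

# Lines in `gcc -Q --help=target` look like "  -march=    <value>".
_TARGET = re.compile(r"^[ \t]*-(march|mtune)=[ \t]+(\S+)[ \t]*$", re.MULTILINE)


def parse_target(help_text):
    """Extract the resolved -march/-mtune values from the help output."""
    return {key: value for key, value in _TARGET.findall(help_text)}


def native_target_on(cpu, cc="gcc"):
    """Pin this process (and the compiler it spawns) to one CPU, then query GCC."""
    previous = os.sched_getaffinity(0)
    os.sched_setaffinity(0, {cpu})
    try:
        out = subprocess.run(
            [cc, "-march=native", "-Q", "--help=target"],
            capture_output=True, text=True, check=True,
        ).stdout
    finally:
        # Restore the original affinity even if the compiler call fails.
        os.sched_setaffinity(0, previous)
    return parse_target(out)
```

If `native_target_on(n)` gives the same march and mtune for a CPU number known to be a P-core and one known to be an E-core, the "compiled for the wrong core" theory becomes less likely; if the results differ, that would support it.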
- Likes 3