Torvalds Is Unconvinced By LTO'ing A Linux Kernel


  • Guest
    replied
    Originally posted by hubicka View Post
    For fully standard-compliant software, LTO is fully transparent, and you can just enable -flto and expect improvements. Many key packages (glibc, the kernel, web browsers, ...), however, do a lot of non-standard things and need some care to work with LTO. So yes, a blind rebuild of your Gentoo with -flto is going to show interesting problems, but how quickly they disappear really depends on the upstream developers of these packages.

    LTO is new and has issues; this is a chicken-and-egg problem we need to solve - because LTO already works well in our tests and environments, the remaining problems won't be hammered out effectively without feedback. The more LTO users there are, the faster it will mature.
    I don't see how this is in any way related to my comment, which talked about LTO in the kernel, not about specific use cases/distributions/userspace stuff.



  • Garp
    replied
    Looking through that original email thread, even the proponents are saying it's going to result in negligible speed benefits, mostly just some size reduction, largely dependent on how limited your kernel config is (LTO seems to be good at not compiling unnecessary code from disabled code paths).

    A few choice quotes:
    Originally posted by Tim Bray
    LTO shows promise for allowing more automation in configuration
    handling (that is, requiring less CONFIG options).

    People should definitely be warned off using this in any production
    setting, but I think it's valuable for developers experimenting with
    tiny-size systems to have this easily available in mainline.
    Originally posted by Honza
    4) LTO brings noticeable performance wins on average, but it is largely benchmark
    dependent; some see huge improvements, others no improvements at all.

    The basic observation is that there is not much LTO can do that cannot be done
    by hand. A careful developer can just identify the important spots and
    restructure the sources.
    The runtime benefits are more visible on bigger, bloated and less
    optimized projects than on a hand-tuned video encoder implementation.
    I believe the kernel largely falls into the hand-tuned category despite its size.
    ...
    6) LTO will pay back more in long term.

    It is not only because the LTO implementation in GCC (and LLVM) has more room
    for improvement than the per-file optimizers.

    The main thing is that, despite the aim to be transparent to the user,
    LTO is an invasive change. Existing programs were developed and tuned for the
    per-file optimization model, and many of them contain a lot of hacks to
    work around its limitations (such as a lot of inline code in headers, etc.).
    With LTO becoming mainstream, developers will have time to work on different
    hacks.
    Originally posted by Andi Kleen
    > I would be curious about the results on Kernel.

    We saw some upsides in performance with some standard tests, but nothing
    too significant.
    At the moment this seems like something certainly interesting, but a bunch of changes without a real payoff yet, as the tooling and maturity of LTO compilers/optimisers isn't there. A rough sketch of the cross-file inlining (and the header hacks it replaces) that Honza alludes to is below.
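    To make Honza's point about inline code in headers concrete, here is a minimal sketch (hypothetical file names and a made-up helper, not kernel code; C, with the GCC invocations noted in the comments):

        /* counter.c -- trivial helper living in its own translation unit */
        int counter_next(int c)
        {
                return c + 1;
        }

        /* main.c */
        int counter_next(int c);        /* normally declared via a header */

        int main(void)
        {
                int total = 0;
                for (int i = 0; i < 1000000; i++)
                        total = counter_next(total);    /* per-file model: an out-of-line call */
                return total == 1000000 ? 0 : 1;
        }

        /* gcc -O2 counter.c main.c        -> counter_next() stays an opaque call,
         *                                    so the loop cannot be simplified.
         * gcc -O2 -flto counter.c main.c  -> the optimizer sees both files at link
         *                                    time, can inline counter_next() and
         *                                    fold the loop away.
         * The classic per-file workaround is to move such helpers into headers as
         * static inline, which is exactly the kind of hack Honza is describing. */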



  • Garp
    replied
    Originally posted by Azrael5 View Post
    So someone has to make tests and benchmarks to evaluate its capabilities, problems and advantages.
    I think this is being missed in all the fuss. This is such a basic and simple step in getting software from 'hobby code' to 'production worthy'. Given the scope of the changes, everything needs to be fully tested, and changes of this scope and potential complexity should be fully justified with compelling arguments, rather than vague, hand-wavy "it does stuff quicker/smaller".



  • caligula
    replied
    Originally posted by curaga View Post
    Of course it matters. In Firefox's case, it directly affects startup speed.
    But how much does it really matter? Firefox's static initialization takes a while, especially if you have bookmarks, history, and an old session. An SSD can read 600 MB/s and the binary is smaller than 300 MB, so reading it from disk takes at most about half a second. To me the real bottleneck is elsewhere.



  • curaga
    replied
    Originally posted by caligula View Post
    I understand that 5% off is important in 4 MB flash storage. However, in desktop apps it doesn't matter at all. Drives are now 1 TB (SSD) and 4 TB (3.5" HDD). You can also set up RAID 6 or ZFS. So you get tens of terabytes and it's very cheap. You shouldn't bother with binary sizes. In fact there's plenty of room for more functionality in Firefox. Luckily they're working hard at implementing more new features with each release.
    Of course it matters. In Firefox's case, it directly affects startup speed.



  • hubicka
    replied
    Originally posted by tpruzina View Post
    Actually -march=native is even more problematic than LTO (depends on the platform, though); for example, it generates SSE instructions, which make context switches slower since it takes a while to set them up. That's pretty bad, especially if you have realtime requirements.

    LTO seems much safer, though any gain for desktop users is doubtful because distribution kernels are super robust and have everything in modules, which is presumably the least effective use case for LTO (and yet the most widely used).
    For fully standard-compliant software, LTO is fully transparent, and you can just enable -flto and expect improvements. Many key packages (glibc, the kernel, web browsers, ...), however, do a lot of non-standard things and need some care to work with LTO. So yes, a blind rebuild of your Gentoo with -flto is going to show interesting problems, but how quickly they disappear really depends on the upstream developers of these packages.

    LTO is new and has issues; this is a chicken-and-egg problem we need to solve - because LTO already works well in our tests and environments, the remaining problems won't be hammered out effectively without feedback. The more LTO users there are, the faster it will mature.
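    A typical example of such non-standard code (a minimal, illustrative sketch in C, not actual kernel source; the attribute names are standard GCC, everything else is made up):

        /* lto_sketch.c -- illustrative only.
         *
         * A function that is reached only from a separate assembly file or a
         * linker script has no C-level callers, so LTO's whole-program view
         * treats it as dead code and may drop or rename the symbol.  Annotating
         * it keeps it alive and externally visible; the kernel LTO patches wrap
         * annotations like these in their own macros. */

        __attribute__((used, externally_visible))
        void entry_from_asm(void)
        {
                /* ...handler body, called only from a .S file... */
        }

        /* With plain per-file compilation this just works; with -flto it only
         * survives whole-program dead-code elimination because of the
         * attributes above. */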



  • Azrael5
    replied
    Dear Torvalds, don't make us angry.



  • tjwhaynes
    replied
    It's not lunchtime

    Originally posted by caligula View Post
    For example, in browsers they already switched to slower JavaScript because NaCl, .NET, and OpenJDK/Sun JVM started to become too fast.
    Don't feed the trolls.



  • caligula
    replied
    Originally posted by milkylainen View Post
    Saying that a 5% size reduction does not matter on the desktop is a misunderstanding of how computers work.
    While stability is a major concern, as Linus points out, smaller size generally benefits everything:
    loading times, memory contention, cache usage, etc.
    Just reducing size while maintaining everything else will generate a speedup.
    Whether it is measurable against general code behavior is another question.
    I'm just saying that you're wasting storage resources with binaries that are too small. The manufacturers can't sell new equipment if the software doesn't grow "naturally", that is, according to Moore's law. It is important that you spend twice as much disk space and other resources every 18 months while doing the same thing. For example, in browsers they already switched to slower JavaScript because NaCl, .NET, and OpenJDK/Sun JVM started to become too fast.



  • milkylainen
    replied
    Smaller size benefits all, not just embedded.

    Originally posted by caligula View Post
    I understand that 5% off is important in 4 MB flash storage. However, in desktop apps it doesn't matter at all. Drives are now 1 TB (SSD) and 4 TB (3.5" HDD). You can also set up RAID 6 or ZFS. So you get tens of terabytes and it's very cheap. You shouldn't bother with binary sizes. In fact there's plenty of room for more functionality in Firefox. Luckily they're working hard at implementing more new features with each release.
    Saying that a 5% size reduction does not matter on the desktop is a misunderstanding of how computers work.
    While stability is a major concern, as Linus points out, smaller size generally benefits everything:
    loading times, memory contention, cache usage, etc.
    Just reducing size while maintaining everything else will generate a speedup.
    Whether it is measurable against general code behavior is another question.

