GCC Compiler Tests At A Variety Of Optimization Levels Using Clear Linux


  • #31
    Originally posted by duby229 View Post

    You didn't even try it yet, like I've been urging you to. Right now you are just assuming that it is definable. But you've already heard from at least two people in this thread, and you came into it with a prior misunderstanding. Maybe it is definable, but I just don't know how. If you really do want to get a grasp on this, then you really do need to try it for yourself. You've already been told how to duplicate the experiences we had. Now you just need to duplicate them and see if you can define this behaviour. Or not.

    EDIT: The steps to duplicate are: build a Gentoo system with -O3. We don't know how to define it; if you think you do, then please feel free to try. But in the meantime, telling folks that -O3 is safe is misleading at a minimum. Whether it's the fault of the compiler or of a compiled dependency is irrelevant. If it truly is the fault of compiled dependencies, then that is the bigger problem, because there must be many of them. Hopefully it is actually a problem with GCC itself, because that is a single point of failure instead of many.
    I didn't try it yet because that will take me at least 3 hours, based on past experience installing Linux, if I find a suitable hard disk partition. Right now I don't have that time to spare. I think this work should be done by Linux distributors (those who get paid for their work). Maybe some time later I will decide to devote a few hours up to a day, but not right now.

    If you encounter build problems, and I'd assume cj.wijtmans is talking about compiler errors when he says packages "don't compile", why don't you just post the error you are getting?

    And in case I do try it out, and in case I am not getting build errors, how long do I have to do things until I can expect to encounter problems, and what specifically are typical problems, so that I know it is not my own error in setting things up?



    • #32
      Originally posted by indepe View Post
      There is a lot of info one could give about that, and a software engineer would be able, quite easily, to bisect the source code and find the specific code that causes problems. Unless the build problems are different each time, in which case it should be identifiable as a compiler (GCC) problem.
      Unfortunately that's not really been the case in practice. It can actually be really hard to write C code that has no potential for undefined behaviour, and it can be very hard to spot erroneous code (hint: accessing an array with an unsigned rather than an int variable avoids one such case). An older article relating to the kernel, from when the sanitisers first came out, discusses the kind of common issues: https://lwn.net/Articles/575563/

      Things are getting better with improved compiler warnings, but there's a huge amount of legacy code that's not particularly well written and has no test suites, and the reward/risk ratio for changing it means cargo-cult advice like "use -O2" has been around forever. Then just look at the benchmarks: theoretically superior options like -O3 and -march=native tend not to deliver significant benefits. The same effect meant that programs built for the AMD64 ABI with 32-bit pointers (x32) were often slower than plain 32-bit or 64-bit builds, despite avoiding the wastage of registers or memory.

      Whilst the real world is annoying and I have shared your frustration, I have seen almost NO progress on this in 25 years, so I expect we're stuck with this one.



      • #33
        Originally posted by rob11311 View Post
        Unfortunately that's not really been the case in practice. It can actually be really hard to write C code that has no potential for undefined behaviour, and it can be very hard to spot erroneous code (hint: accessing an array with an unsigned rather than an int variable avoids one such case). An older article relating to the kernel, from when the sanitisers first came out, discusses the kind of common issues: https://lwn.net/Articles/575563/
        I was talking about a build problem, as it was mentioned. You know right away if something doesn't compile (at least in the usual meaning of those words). So, as a developer, you can find out which code you have to remove in order to let it compile. If you don't get a line number as output, you can still bisect the source code.

        Originally posted by rob11311 View Post
        Things are getting better with improved compiler warnings, but there's a huge amount of legacy code that's not particularly well written and has no test suites, and the reward/risk ratio for changing it means cargo-cult advice like "use -O2" has been around forever.
        I am aware that there are legacy problems with old code, but here I am told that there are "modern" packages having problems. People keep talking about these problems without giving any specific information that gives a clue about what specifically goes wrong. IMHO it is wrong to generally discourage use of -O3 in that way. If there are problems, the right way to go about them is to gather the information that leads to fixing them.

        Originally posted by rob11311 View Post
        Then just look at the benchmarks: theoretically superior options like -O3 and -march=native tend not to deliver significant benefits. The same effect meant that programs built for the AMD64 ABI with 32-bit pointers (x32) were often slower than plain 32-bit or 64-bit builds, despite avoiding the wastage of registers or memory.
        Benchmarks show that Clear Linux is consistently faster than other distributions, and my understanding is that using -O3 (for many packages, but not all) is one part of the reason, regardless of the atypical results in this article. In general my impression is that -O3 is not always faster, but in many cases it is, including in previous benchmarks here on Phoronix.

        Originally posted by rob11311 View Post
        Whilst the real world is annoying and I have shared your frustration, I have seen almost NO progress on this in 25 years, so I expect we're stuck with this one.
        I don't find that an acceptable situation, but I am glad to hear someone shares my frustration.



        • #34
          The problem isn't necessarily that packages fail to compile; sure, sometimes that does happen, but usually the package does compile and simply doesn't behave properly at runtime. Yet the only place this could be caught automatically is at compile time, and the Gentoo devs have not made a solution for that yet. There is no automated detection of undefined behaviour. Nobody has done it yet.

          So for right now the only method available is for a user to test each individual dependency for correct runtime behaviour, one at a time. It just isn't something that can effectively be done.
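One partial workaround, once a suspect dependency has been identified, is Portage's per-package environment mechanism, which lets you rebuild just that one package at -O2 while the rest of the system stays at -O3. A sketch follows; the package name is hypothetical, but the /etc/portage/env and /etc/portage/package.env files are a real Portage feature:

```shell
# Define a conservative flag set once...
mkdir -p /etc/portage/env
cat > /etc/portage/env/conservative.conf <<'EOF'
CFLAGS="-O2 -pipe"
CXXFLAGS="${CFLAGS}"
EOF
# ...apply it to one suspect package only (hypothetical name), then rebuild it:
echo "media-libs/libfoo conservative.conf" >> /etc/portage/package.env
emerge --oneshot media-libs/libfoo
```

This narrows the bisection from "the whole system" to one package at a time, though it still relies on the user noticing the misbehaviour in the first place.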



          • #35
            Originally posted by rob11311 View Post
            ..., theoretically superior options like -O3 and -march=native tend not to deliver significant benefits.
            ...
            By the way, I don't consider -march=native superior, since the resulting code can (usually) run only on the machine it was built on. I don't know why this article doesn't include "-O2" by itself, but for that purpose one would also need a much larger selection of benchmarks, since results vary greatly per benchmark.



            • #36
              Originally posted by indepe View Post
              Benchmarks show that Clear Linux is consistently faster than other distributions, and my understanding is that using -O3 (for many packages, but not all) is one part of the reason, regardless of the atypical results in this article. In general my impression is that -O3 is not always faster, but in many cases it is, including in previous benchmarks here on Phoronix.
              You say that it's atypical, but it's fully typical. Year after year new ricers show up on Gentoo's forum, and year after year it is demonstrated to them that ricing generally breaks their systems and doesn't yield any performance gain.

              The best performance gain I've ever experienced was with LTO (link-time optimisation), but it isn't universally possible yet for every package.

              EDIT: Clear Linux gets most of its performance advantage from tweaks to the source code, not from compiler optimizations.
              Last edited by duby229; 28 March 2017, 03:22 PM.



              • #37
                Originally posted by duby229 View Post
                The problem isn't necessarily that packages fail to compile; sure, sometimes that does happen, but usually the package does compile and simply doesn't behave properly at runtime. Yet the only place this could be caught automatically is at compile time, and the Gentoo devs have not made a solution for that yet. There is no automated detection of undefined behaviour. Nobody has done it yet.

                So for right now the only method available is for a user to test each individual dependency for correct runtime behaviour, one at a time. It just isn't something that can effectively be done.
                Of course end-users should not have to deal with this problem at the individual package level, not even Gentoo end-users who compile their own system. But it certainly should be dealt with by someone. It is surely a solvable problem, once identified. Maybe it just needs a good bug report. And short of a solution, it would be great to know what the actual problem is. Don't you think so?

                I'd agree that finding the source of sporadic runtime problems is some work, and I am not prepared to do that work myself, especially with my lack of experience in building Linux, and with Linux altogether. So I am not sure if my installing Gentoo would be productive, except that I might be able to confirm what you already said. But build errors, even if they occur only sometimes, are a different thing. A single such report can lead to the problem getting fixed, depending on the compiler log. And with that, you can identify one package that fails, and that may already be a start. Or does Gentoo not tell you which package, and which file, failed to compile?



                • #38
                  Originally posted by indepe View Post

                  Of course end-users should not have to deal with this problem at the individual package level, not even Gentoo end-users who compile their own system. But it certainly should be dealt with by someone. It is surely a solvable problem, once identified. Maybe it just needs a good bug report. And short of a solution, it would be great to know what the actual problem is. Don't you think so?

                  I'd agree that finding the source of sporadic runtime problems is some work, and I am not prepared to do that work myself, especially with my lack of experience in building Linux, and with Linux altogether. So I am not sure if my installing Gentoo would be productive, except that I might be able to confirm what you already said. But build errors, even if they occur only sometimes, are a different thing. A single such report can lead to the problem getting fixed, depending on the compiler log. And with that, you can identify one package that fails, and that may already be a start. Or does Gentoo not tell you which package, and which file, failed to compile?
                  It logs the build process in incredible detail. And GCC's logging capability has improved these past few years, especially since Clang introduced a few concepts that got literally ported over to GCC. But there still is nothing that can interpret all that data to say for sure what is the cause of undefined behaviour and what is just noise. And the noise gets far worse the more optimizations you enable.

                  EDIT: I think you are correct that undefined behaviour probably can be identified by careful examination of the compile-time logs. But there are literally millions of lines of log produced. The volume of information is tremendous.
                  Last edited by duby229; 28 March 2017, 04:03 PM.



                  • #39
                    Originally posted by duby229 View Post

                    You say that it's atypical, but it's fully typical. Year after year new ricers show up on Gentoo's forum, and year after year it is demonstrated to them that ricing generally breaks their systems and doesn't yield any performance gain.

                    The best performance gain I've ever experienced was with LTO (link-time optimisation), but it isn't universally possible yet for every package.

                    EDIT: Clear Linux gets most of its performance advantage from tweaks to the source code, not from compiler optimizations.
                    Past tests here on Phoronix have shown a variety of results, but most of them show -O2 and -O3 either at the same speed or -O3 noticeably faster. In my time on this forum I haven't read that Michael encountered any problems with -O3, whereas he did encounter problems with a lot of other things. In my short time with Linux I also encountered problems, but using -O3 wasn't one of them. I still don't think that it is a problem that goes beyond some specific OS packages within Linux. I've heard some of them don't build with Clang either, and I guess that's because they are doing weird things and depending on special compiler features (not surprising for operating system code). That doesn't mean there is anything wrong with using Clang, either.

                    One of Intel's engineers who posts here occasionally has given me the impression that -O3 is a meaningful part of their optimizations. He didn't directly say so, but described how they spend some effort identifying the packages that benefit from -O3.



                    • #40
                      Originally posted by duby229 View Post

                      It logs the build process in incredible detail. And GCC's logging capability has improved these past few years, especially since Clang introduced a few concepts that got literally ported over to GCC. But there still is nothing that can interpret all that data to say for sure what is the cause of undefined behaviour and what is just noise. And the noise gets far worse the more optimizations you enable.

                      EDIT: I think you are correct that undefined behaviour probably can be identified by careful examination of the compile-time logs. But there are literally millions of lines of log produced. The volume of information is tremendous.
                      The compiler itself is behaving in an "undefined" way?

