Ubuntu 24.04 LTS To Enable Frame Pointers By Default For Better Profiling/Debugging

    Phoronix: Ubuntu 24.04 LTS To Enable Frame Pointers By Default For Better Profiling/Debugging

    Canonical has decided that, for Ubuntu 24.04 LTS, it will enable frame pointers by default when building packages. Frame pointers will still be disabled selectively for some packages because of the performance overhead, but the focus of this change is to improve the out-of-the-box debugging and profiling support on the Linux distribution...

  • #2
    So first Fedora and now Ubuntu too.
    At least Ubuntu just says outright that it's for ease of debugging, whereas Fedora tried to sell it with reasons like cloud computing, where environments that are not easy to recompile benefit from this.

    I find the move toward this "ease of debugging" to be just developer laziness.

    I maintain:
    - Release builds should be built for performance. No extra debug aids.
    - Debug builds contain the full set of debug features (slower, bigger).
    - Release with debug symbols. Those are trade-off builds where you gain debug information at the cost of a little performance and a lot of size. These builds are best to selectively send to people who have an issue to try and debug things. This mode also is often enough to develop software in. You still have much of the speed and much of the debug features.

    What distributions provide should be release builds.

    Edit.
    Small correction. Thanks for pointing it out, spicfoo! My remark above about Fedora is apparently wrong. I swear there was some Fedora debug-like flag added to packages for their Fedora Server/Cloud/CoreOS versions, but this wasn't it.
    Last edited by markg85; 13 December 2023, 08:47 PM.

    • #3
      How can frame pointers not be enabled in the first place? Aren't they needed for making function calls? How will called functions know how to reference their local variables without a frame pointer?

      • #4
        Just to clarify two points:
        • The main use case for this is not debugging, where DWARF-based stack unwinding is fine, it's profiling. DWARF was never designed for computing stack traces thousands of times per second, and the way perf works around that (by sampling the top of the stack and hoping that's enough to make a full stack trace in a later post-processing phase) is a terrible hack that works poorly as soon as programs have lots of stack-allocated data (which they should, given how slow heap allocators are). This results in huge profiles, thus slow analysis and high measurement bias, and as soon as you hit perf's hard limit of 64KB per sample you have no choice but to live with a corrupted profile where %Children overheads are misleading.
        • While frame pointers do have a (usually negligible) performance cost, compiling with debug symbols does not slow down a program in any way. It does not affect code generation, and if the debug symbols are split on disk, as is the case on modern Linux distros, the existence of debug symbols physically cannot have any impact on the way programs are loaded and executed, since the only remaining trace of the debug symbols in the final ELF binary is a few tiny headers.
          • I suspect part of this urban legend comes from CMake's stupid decision to historically make RelWithDebInfo -O2 and not -O3, but IIRC that performance bug is fixed in newer CMake releases and RelWithDebInfo is now -O3 -g as it should be.
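To put rough numbers on the profile-size point above, here is a back-of-the-envelope sketch in C. Everything in it is an assumption for illustration: the function names are made up, the 1000 Hz sampling rate, the 8192-byte stack copy and the 30-frame call depth are example figures rather than measurements, and per-sample header overhead is ignored. Only the 64KB per-sample ceiling comes from the discussion above.

```c
#include <assert.h>
#include <stdint.h>

/* Upper bound on the stack copy per sample, per the 64KB limit mentioned above. */
#define MAX_STACK_COPY_BYTES 65536u

/* Bytes of call-stack data recorded per second on one sampled CPU.
   Hypothetical helper: ignores sample headers and any compression. */
uint64_t stack_bytes_per_second(uint64_t samples_per_second,
                                uint64_t bytes_per_sample)
{
    assert(bytes_per_sample <= MAX_STACK_COPY_BYTES);
    return samples_per_second * bytes_per_sample;
}

/* With frame pointers, a sample only needs one 8-byte return address
   per stack frame (64-bit target assumed). */
uint64_t fp_sample_bytes(uint64_t frames)
{
    return frames * 8;
}
```

With those assumed figures, `stack_bytes_per_second(1000, 8192)` is about 8.2 MB/s for the stack-copy approach, while `stack_bytes_per_second(1000, fp_sample_bytes(30))` is 240 KB/s for a 30-frame frame-pointer trace. Raising the copy size toward the 64KB ceiling to capture deeper stacks widens the gap further.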
        Last edited by HadrienG; 13 December 2023, 04:48 PM.

        • #5
          Originally posted by markg85
          So first Fedora and now Ubuntu too.
          At least Ubuntu just says outright that it's for ease of debugging, whereas Fedora tried to sell it with reasons like cloud computing, where environments that are not easy to recompile benefit from this.
          What? The Fedora change proposal said nothing about cloud computing whatsoever. It was talking about why production usage of this feature is useful. You are just confused.
          • #6
            Originally posted by sarmad
            How can frame pointers not be enabled in the first place? Aren't they needed for making function calls? How will called functions know how to reference their local variables without a frame pointer?
            The compiler knows exactly how large a stack frame it created for the function, so it doesn't need a frame pointer to keep track: locals can be addressed at fixed offsets from the stack pointer.
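A small sketch of where that reasoning stops applying, with hypothetical function names and an arbitrary size cap: when a function's frame size is only known at run time (a variable-length array here), offsets from the stack pointer are no longer compile-time constants. On x86-64, GCC and Clang will typically keep a frame pointer in such a function even when frame pointers are otherwise omitted. That is worth checking with `-S` on your own compiler rather than taking on faith.

```c
#include <assert.h>
#include <stddef.h>

/* Fixed-size frame: the size of buf is a compile-time constant, so every
   local can be addressed as a constant offset from the stack pointer. */
int sum_fixed(const int *src)
{
    volatile int buf[8];          /* volatile only so the copy is not optimized away */
    int total = 0;
    for (size_t i = 0; i < 8; i++) {
        buf[i] = src[i];
        total += buf[i];
    }
    return total;
}

/* Variable-size frame: the size of buf depends on n, so the distance from
   the stack pointer to the other locals is only known at run time. */
int sum_variable(const int *src, size_t n)
{
    assert(n > 0 && n <= 1024);   /* arbitrary cap to keep the VLA small */
    volatile int buf[n];          /* variable-length array (C99) */
    int total = 0;
    for (size_t i = 0; i < n; i++) {
        buf[i] = src[i];
        total += buf[i];
    }
    return total;
}
```

Both functions return the same sum for the same input; the interesting difference is in the generated prologue and epilogue, not in the result.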

            • #7
              Originally posted by sarmad
              How can frame pointers not be enabled in the first place? Aren't they needed for making function calls? How will called functions know how to reference their local variables without a frame pointer?
              Just by using the stack pointer (SP).

              An example:

              Code:
              int bar(int *);
              
              int foo(int arg)
              {
                  int local = arg;
              
                  return bar(&local);
              }
              compiles to

              Code:
              foo:
                      subq    $24, %rsp
                      movl    %edi, 12(%rsp)
                      leaq    12(%rsp), %rdi
                      call    bar
                      addq    $24, %rsp
                      ret
              but with -fno-omit-frame-pointer

              Code:
              foo:
                      pushq   %rbp
                      movq    %rsp, %rbp
                      subq    $16, %rsp
                      movl    %edi, -4(%rbp)
                      leaq    -4(%rbp), %rdi
                      call    bar
                      leave
                      ret
              So the register %rbp is used as the frame pointer: it keeps the value the stack pointer had at function entry (just after the old %rbp was pushed), but it cannot be used for any other purpose any more.
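Why this helps profilers can be shown with a toy model in C. It is a sketch, not a real unwinder: `struct frame` and `walk_frames` are hypothetical names, and the structure only mimics what the `pushq %rbp` / `movq %rsp, %rbp` prologue leaves in memory on x86-64 (the caller's saved frame pointer, with the return address right above it). Because every frame links to its caller's frame, a stack trace is just a linked-list walk.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of the two words found at the frame pointer:
   the caller's saved frame pointer, then the return address. */
struct frame {
    const struct frame *caller;   /* saved frame pointer; NULL at the outermost frame */
    const void *return_address;   /* where execution resumes in the caller */
};

/* Walk the chain starting at fp, storing up to max return addresses in out.
   Returns the number of addresses stored. */
size_t walk_frames(const struct frame *fp, const void **out, size_t max)
{
    assert(out != NULL || max == 0);
    size_t depth = 0;
    while (fp != NULL && depth < max) {
        out[depth++] = fp->return_address;
        fp = fp->caller;
    }
    return depth;
}
```

A real unwinder would additionally have to validate each pointer before dereferencing it, since any function built without a frame pointer breaks the chain. That is the reason for wanting the whole distribution built with them.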
              Last edited by George99; 13 December 2023, 04:15 PM.

              • #8
                Originally posted by HadrienG
                I suspect part of this urban legend comes from CMake's stupid decision to historically make RelWithDebInfo -O2 and not -O3, but IIRC that performance bug is fixed in newer CMake releases and RelWithDebInfo is now -O3 -g as it should be.
                I am pretty sure RelWithDebInfo is '-O2 -g' and will forever stay that way (CMake is almost pathological about keeping things the same).

                I agree that RelWithDebInfo and Release not being identical is pretty damn annoying, but I would've picked '-O2' for both.

                Ideally, debuginfo or not would be a separate flag, defaulting to on for debug and off for release.

                • #9
                  They just need to use the frame pointers to improve performance by more than 2% to get a return on their investment. I think that's hard, but feasible.

                  • #10
                    Originally posted by discordian
                    I am pretty sure RelWithDebInfo is '-O2 -g' and will forever stay that way (CMake is almost pathological about keeping things the same).

                    I agree that RelWithDebInfo and Release not being identical is pretty damn annoying, but I would've picked '-O2' for both.

                    Ideally, debuginfo or not would be a separate flag, defaulting to on for debug and off for release.
                    Not sure if this has changed, but for a long time -ftree-vectorize wasn't included in GCC's -O2 builds, and that was leaving quite a bit of performance on the table wrt -O3 when you have number crunching loops that vectorize well.
