XZ 5.6 Released: Sandboxing Improvements, Prefers -O2 Instead Of -O3


  • #11
    Anux
    Rust is similar to C++, maybe worse, in terms of code bloat due to monomorphisation of all the generics. I wouldn't be surprised if that produces too much code with O3, filling up the caches and making it slower overall. I can see this tradeoff working out differently with plain C. But yes, in general you are correct that it depends on the specific CPU, and there's a threshold after which optimizations that increase code size hurt performance. I also suspect this might somewhat depend on system load, meaning a large number of concurrent processes would benefit from smaller code vs. a single process hogging the CPU, but I have seen no testing on this yet.
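
    (Illustration, not from the thread: a minimal, hypothetical C sketch of the size/speed tradeoff being discussed here. The file and function names are made up; the exact outcome depends on the gcc version and target CPU, but the extra unrolling, peeling and vectorization at -O3 will often grow .text for a hot loop like this.)

    /* hot_loop.c -- hypothetical example; compare the generated code size:
     *   gcc -O2 -c hot_loop.c && size hot_loop.o
     *   gcc -O3 -c hot_loop.c && size hot_loop.o
     * To see exactly which optimizer flags each level enables on a given gcc,
     * compare the output of `gcc -Q -O2 --help=optimizers` with -O3 and -Os. */
    #include <stddef.h>
    #include <stdint.h>

    /* A simple reduction loop: at -O3, gcc typically unrolls and vectorizes it
     * more aggressively than at -O2, which is faster in isolation but adds
     * unrolled copies plus prologue/epilogue code that occupy instruction cache. */
    uint64_t sum_bytes(const uint8_t *buf, size_t len)
    {
        uint64_t sum = 0;
        for (size_t i = 0; i < len; i++)
            sum += buf[i];
        return sum;
    }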



    • #12
      Originally posted by binarybanana View Post
      Anux
      Rust is similar to C++, maybe worse, in terms of code bloat due to monomorphisation of all the generics.
      The only generics I use are Vec and the image crate, and for the image I only use u8, so it should compile to only one version and therefore be smaller and faster than polymorphic code.
      I also use only one codegen unit in release/bench, which should lead to smaller binaries for monomorphized code.

      I also suspect this might somewhat depend on system load, meaning a large number of concurrent processes would benefit from smaller code vs. a single process hogging the CPU, but I have seen no testing on this yet.
      No need for testing: of course, if your L3 is filled with different programs, there will be more evictions/misses with larger binaries than with smaller ones.



      • #13
        Originally posted by Anux View Post
        O3 leading to a smaller binary seems wrong to me. The main thing that O3 does is more unrolling, so it should always lead to bigger binaries, or the same size if O2 already unrolled everything.

        Whether O3 is faster depends on your code and the CPU you use. Older or low-end CPUs typically suffer more from cache misses the larger your binaries get.
        Not all optimizations of -O3 necessarily increase code size over -O2, though many of them do. If you check e.g. the gcc manpage, -Os and -O3 have considerable overlap.

        When -O3 increases code size, there is also the problem that, if you multitask, processes are more likely to evict each other from the L2 cache. This kind of problem usually doesn't show up in single-purpose benchmarks.
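
        (Illustration, not from the thread: a hypothetical C sketch of how one of the passes that -O3 adds can grow code. -funswitch-loops hoists a loop-invariant condition out of a loop by emitting one copy of the loop body per branch: fewer branches inside the loop, but more bytes of code. The file and function names are made up, and whether or how much .text grows depends on the gcc version.)

        /* unswitch.c -- hypothetical example; compare:
         *   gcc -O2 -c unswitch.c && size unswitch.o
         *   gcc -O2 -funswitch-loops -c unswitch.c && size unswitch.o */
        #include <stddef.h>

        void scale_or_clear(float *dst, const float *src, size_t n, int clear)
        {
            for (size_t i = 0; i < n; i++) {
                /* `clear` is loop-invariant: loop unswitching can lift this test
                 * out of the loop and duplicate the loop, once per outcome. */
                if (clear)
                    dst[i] = 0.0f;
                else
                    dst[i] = src[i] * 2.0f;
            }
        }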



        • #14
          Originally posted by chithanh View Post
          Not all optimizations of -O3 necessarily increase code size over -O2, though many of them do.
          • -fgcse-after-reload (not sure, sounds like a slight size reduction?)
          • -fipa-cp-clone (size increase)
          • -floop-interchange (no size effect)
          • -floop-unroll-and-jam (size increase)
          • -fpeel-loops (size increase)
          • -fpredictive-commoning (don't know)
          • -fsplit-loops (size increase)
          • -fsplit-paths (not sure)
          • -ftree-loop-distribution (size increase)
          • -ftree-partial-pre (size increase)
          • -funswitch-loops (size increase)
          • -fvect-cost-model=dynamic (size increase)
          • -fversion-loops-for-strides (size increase)
          Apart from 3 unknowns, everything screams more binary size (see the -fipa-cp-clone sketch after this post for a concrete example).

          If you check e.g. the gcc manpage, -Os and -O3 have considerable overlap.
          Optimize for size. -Os enables all -O2 optimizations except those that often increase code size:
          Of course, but what does this have to do with the O2/O3 topic? Edit: if there were any considerable size improvements with O3, those options would also be used with Os or Oz.
          Last edited by Anux; 26 February 2024, 09:59 AM.
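
          (Illustration, not from the thread: a hypothetical C sketch of why -fipa-cp-clone from the list above is flagged as a size increase. It allows gcc to clone a function per constant argument value, so call sites get specialized copies: faster calls and loops, larger binary. The file and function names are made up, and whether gcc actually clones here depends on its version and cost model.)

          /* ipa_clone.c -- hypothetical example; compare:
           *   gcc -O2 -c ipa_clone.c && size ipa_clone.o && nm ipa_clone.o
           *   gcc -O2 -fipa-cp-clone -c ipa_clone.c && size ipa_clone.o && nm ipa_clone.o
           * Specialized copies, if gcc creates them, appear as scale.constprop.* symbols. */
          #include <stddef.h>

          /* Kept out of line so any cloning shows up as extra symbols instead of
           * being hidden by inlining. */
          __attribute__((noinline))
          static long scale(const long *v, size_t n, long k)
          {
              long s = 0;
              for (size_t i = 0; i < n; i++)
                  s += v[i] * k;
              return s;
          }

          long weighted(const long *v, size_t n)
          {
              /* Two call sites with different constant k: with -fipa-cp-clone,
               * gcc may emit one specialized clone of scale() per constant. */
              return scale(v, n, 3) + scale(v, n, 5);
          }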



          • #15
            Originally posted by Anux View Post
            Of course, but what does this have to do with the O2/O3 topic? Edit: if there were any considerable size improvements with O3, those options would also be used with Os or Oz.
            The point is that if an optimization is part of both -O3 and -Os then it does not increase code size.
            The gcc manpage lists dozens of such optimizations.

            Which optimizations end up being applied depends, of course, on the code, so how much (if at all) binaries increase in size may vary. Most will, but some will not.



            • #16
              Originally posted by chithanh View Post
              The point is that if an optimization is part of both -O3 and -Os then it does not increase code size.
              The gcc manpage lists dozens of such optimizations.
              Yes, that's why I listed all the opts that O3 adds; none of them are in Os or Oz. Or do you have any other resource?
