GCC 6.1 Compiler Optimization Level Benchmarks: -O0 To -Ofast + FLTO


  • GCC 6.1 Compiler Optimization Level Benchmarks: -O0 To -Ofast + FLTO

    Phoronix: GCC 6.1 Compiler Optimization Level Benchmarks: -O0 To -Ofast + FLTO

    Here are some extra GCC 6.1 compiler benchmarks to share this weekend, complementing the recent GCC 4.9 vs. GCC 5 vs. GCC 6 comparison and the GCC 6.1 vs. Clang 3.9 compiler comparison...

    http://www.phoronix.com/scan.php?pag...-Optimizations

  • #2
    Ahhh, good old compiler flags. I think of two things here: Gentoo, and the impact they have on my own software.


    • #3
      Originally posted by b15hop:
      Ahhh, good old compiler flags. I think of two things here: Gentoo, and the impact they have on my own software.
      Yeah. Too bad the results didn't include the standard -O2 -march=native.


      • #4
        I wouldn't use anything built using the -Ofast compilation flag, except maybe video players.


        • #5
          Coward.


          • #6
            Originally posted by birdie:
            I wouldn't use anything built using the -Ofast compilation flag, except maybe video players.
            Perhaps you mean video libraries. Other good candidates are archiving tools such as bzip2 and gzip. Graphics code would gain from it as well.


            • #7
              Originally posted by birdie:
              I wouldn't use anything built using the -Ofast compilation flag, except maybe video players.
              -Ofast is only a very small improvement compared to VDPAU.

              With an R9 390 and Mesa, "mplayer -vo=vdpau -vc ffh264vdpau,ffmpeg12vdpau,ffwmv3vdpau,ffvc1vdpau," has CPU utilization of about 5%, while "mplayer -vo=gl" has CPU utilization of 25-35%. VDPAU is also more power efficient than CPU decoding.

              The only problem is that using VDPAU raises the memory clock (Linux kernel 4.5.0), so you need to watch /sys/kernel/debug/dri/0/radeon_pm_info if you want to save power.
              Last edited by atomsymbol; 05-14-2016, 04:04 PM.


              • #8
                Originally posted by birdie:
                I wouldn't use anything built using the -Ofast compilation flag, except maybe video players.
                No reason not to, at least for well-made programs.
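
One reason for caution with -Ofast is that, per the GCC documentation, it turns on -ffast-math, which lets the compiler regroup floating-point additions. A minimal sketch in C of why the grouping matters (the function names are made up for illustration; the volatile temporaries only pin each grouping so the result does not depend on the flags used):

```c
/* Sum three doubles left to right: (a + b) + c. */
double sum_left_to_right(double a, double b, double c)
{
    volatile double t = a + b;  /* force the intermediate to be rounded to double */
    return t + c;
}

/* Sum the same three doubles with a different grouping: a + (b + c).
 * Under -ffast-math the compiler may pick this grouping on its own. */
double sum_regrouped(double a, double b, double c)
{
    volatile double t = b + c;  /* force the intermediate to be rounded to double */
    return a + t;
}
```

Called from a main() with a = 1e16, b = -1e16, c = 1.0, the first function returns 1.0 and the second returns 0.0 in IEEE double precision, because -1e16 + 1.0 rounds back to -1e16. Code that tolerates either answer is presumably unaffected by the flag; code that relies on one of them is not.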


                • #9
                  Originally posted by GreatEmerald:
                  Yeah. Too bad the results didn't include the standard -O2 -march=native.
                  I agree. I don't get why Michael doesn't use "-march=native" for all of them. I would like to see the difference between the optimization levels without the support for SSE/AVX/etc. changing.


                  • #10
                    Originally posted by Azpegath:
                    I agree. I don't get why Michael doesn't use "-march=native" for all of them. I would like to see the difference between the optimization levels without the support for SSE/AVX/etc. changing.
                    Using -march=native on AMD CPUs (and maybe the newest Intel CPUs) works against optimization in the "bigger picture", because Valgrind's callgrind tool does not support some of the instructions emitted by the compiler. Using -mavx is OK with callgrind, but -march=bdver3, for example, isn't.

                    I am not sure whether callgrind understands the instructions generated with -mavx2; I did not test this case.
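
The post does not say which instructions callgrind rejects. To at least see which x86 extensions a given set of flags enables, one option is to check GCC's predefined macros. A rough sketch in C (the helper names and the selection of macros are illustrative, not exhaustive):

```c
#include <stdio.h>
#include <string.h>

/* Space-delimited list of extensions whose GCC macros are defined for this
 * build. The leading space lets has_extension() match whole names only. */
static const char enabled_list[] =
    " "
#ifdef __SSE2__
    "sse2 "
#endif
#ifdef __SSE4_2__
    "sse4.2 "
#endif
#ifdef __AVX__
    "avx "
#endif
#ifdef __AVX2__
    "avx2 "
#endif
#ifdef __FMA__
    "fma "
#endif
#ifdef __FMA4__
    "fma4 "
#endif
#ifdef __XOP__
    "xop "
#endif
    ;

/* Return the list of extensions enabled at compile time. */
const char *enabled_isa_extensions(void)
{
    return enabled_list;
}

/* Return 1 if the named extension (e.g. "avx") is in the list, else 0. */
int has_extension(const char *name)
{
    char token[16];
    int n = snprintf(token, sizeof token, " %s ", name);
    if (n < 0 || (size_t)n >= sizeof token)
        return 0;  /* name too long to be one of the entries above */
    return strstr(enabled_list, token) != NULL;
}
```

Building the same file once with gcc -O2 -mavx and once with gcc -O2 -march=bdver3, then printing enabled_isa_extensions() from a main(), shows what the second set of flags adds on top of AVX. For the full picture, gcc -march=native -dM -E - < /dev/null dumps every predefined macro.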
