GCC 7.3 vs. GCC 8.0 vs. LLVM Clang 6.0 On The POWER9 Raptor Talos II


  • GCC 7.3 vs. GCC 8.0 vs. LLVM Clang 6.0 On The POWER9 Raptor Talos II

    Phoronix: GCC 7.3 vs. GCC 8.0 vs. LLVM Clang 6.0 On The POWER9 Raptor Talos II

    As part of the remote testing of the Raptor Talos II Workstation, which consists of fully free software down to the firmware and is powered by high-end POWER9 processors, over Easter weekend I carried out some GCC vs. Clang benchmarks...

    http://www.phoronix.com/scan.php?pag...ER9-Benchmarks

  • #2
    The CacheBench and C-ray benchmarks are both off by a factor of 4 in opposite directions, so that could be a missed autovectorization on the innermost loop. The LAME encoding is weirder.


    • #3
      Originally posted by carewolf
      The CacheBench and C-ray benchmarks are both off by a factor of 4 in opposite directions, so that could be a missed autovectorization on the innermost loop. The LAME encoding is weirder.
      I was guessing that the OpenMP directive enabling parallel SIMD was missing, but the LAME MP3 results say otherwise. Very odd.
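For readers unfamiliar with the OpenMP SIMD hint mentioned above, here is a minimal sketch of what such a directive looks like in C. The function `dot` and its arguments are hypothetical and are not taken from CacheBench, C-ray, or LAME; the example only assumes a compiler that accepts `#pragma omp simd` (GCC honors it when built with `-fopenmp` or `-fopenmp-simd`, and otherwise ignores the pragma and compiles the loop normally).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: dot product with an explicit OpenMP SIMD hint.
 * The reduction clause tells the compiler that partial sums may be
 * accumulated per vector lane and combined at the end. */
float dot(const float *a, const float *b, size_t n)
{
    assert((a != NULL && b != NULL) || n == 0);

    float sum = 0.0f;
    #pragma omp simd reduction(+:sum)
    for (size_t i = 0; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}
```

If a benchmark relied on a hint like this and the OpenMP flag were missing from the build, the loop would still produce correct results but might not be vectorized, which is the kind of gap the poster was guessing at.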


      • #4
        GCC 7 and 8 both vectorize the loop in the CacheBench write test. What they do not do (and LLVM apparently does) is unroll it. If you build with -funroll-all-loops, the CacheBench write result improves by 3x on a POWER8; it should be similar on POWER9.
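To illustrate the difference between vectorizing and unrolling described above, here is a rough sketch of a simple write loop and a hand-unrolled variant. This is not CacheBench's actual code; `fill_simple` and `fill_unrolled4` are hypothetical names, and the unroll factor of 4 is an arbitrary choice for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Plain write loop. At -O3 GCC can vectorize this, but it does not
 * unroll it unless an unrolling flag such as -funroll-loops or
 * -funroll-all-loops is given. */
void fill_simple(int *buf, size_t n, int value)
{
    assert(buf != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        buf[i] = value;
}

/* The same loop unrolled by hand: four stores per iteration, so the
 * loop counter update and branch are paid once per four elements.
 * A remainder loop handles the last n % 4 elements. */
void fill_unrolled4(int *buf, size_t n, int value)
{
    assert(buf != NULL || n == 0);
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        buf[i]     = value;
        buf[i + 1] = value;
        buf[i + 2] = value;
        buf[i + 3] = value;
    }
    for (; i < n; i++)
        buf[i] = value;
}
```

Both functions write the same data; only the loop overhead per store differs. Building the simple version with something like `gcc -O3 -funroll-all-loops` asks the compiler to do the unrolling itself, and GCC's `-fopt-info-vec` option can be used to check whether the loop was vectorized.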


        • #5
          Is optimization enabled for CacheBench, 7-Zip Compression, LAME, or Redis? The "gcc options" footnote doesn't show that it is.


          • #6
            Have you tried using -O3? I've heard it makes a bigger difference on POWER.
