The Performance Of Clear Linux With GCC 8

  • The Performance Of Clear Linux With GCC 8

    Phoronix: The Performance Of Clear Linux With GCC 8

    Intel's Clear Linux operating system has been among the first notable Linux distributions upgrading to the recently-released GCC 8.1 as the default system compiler and then proceeding to rebuild its packages against this annual update to the GNU Compiler Collection. Here are some before/after benchmarks of their GCC 8 deployment for those interested.

  • #2
    Do you track CPU thermals during the tests? Is it possible that Xeon Silver hit over-temp while running the AVX-512 code?


    • #3
      Originally posted by Zan Lynx
      Do you track CPU thermals during the tests? Is it possible that Xeon Silver hit over-temp while running the AVX-512 code?
      PTS can, and it does so via Phoromatic, which shows the current values if you are watching there, but they are not archived by default. In this case, though, I don't think the FFTE result was thermal-related: FFTE runs relatively quickly and was one of the first tests run after the system booted up from cold, so thermal thresholds were likely not being crossed.
      Michael Larabel
      https://www.michaellarabel.com/


      • #4
        Originally posted by monraaf
        Re-bench after passing:
        -D_GLIBCXX_USE_CXX11_ABI=0
        New ABI impedes single-core performance.
        That's interesting. I do some C++ coding for work. What things in particular should I watch out for?


        • #5
          Why no AMD CPUs?


          • #6
            Originally posted by Michael
            PTS can, and it does so via Phoromatic, which shows the current values if you are watching there, but they are not archived by default. In this case, though, I don't think the FFTE result was thermal-related: FFTE runs relatively quickly and was one of the first tests run after the system booted up from cold, so thermal thresholds were likely not being crossed.
            Thermal throttling from AVX2 FMA on Haswell, for example, can ramp up almost instantly. That Xeon is a Skylake, so it will behave differently with AVX-512 down-clocking, but its Tcase is just 77°C, so it might well be throttling quite quickly.


            • #7
              Originally posted by monraaf
              Modern ABI forbids copy-on-write for std::string and requires std::list to keep track of its size. If your app is multithreaded, and you do lots of string manipulations, then -D_GLIBCXX_USE_CXX11_ABI=0 is a good approach for winning back some lost performance.
              Ah, ok. Yeah, I knew about those. Actually, in my application the C++11 strings work a lot better than the copy-on-write ones. Plus, I'd already had to do ugly work-arounds to force string copies on RHEL 5 and Ubuntu 10 because some versions of libstdc++ there had thread bugs in string CoW.


              • #8
                Originally posted by monraaf

                Watch out for mixing ABIs when linking against libs that are compiled with GCC <5.x, as they don't have the modern ABI. Beginning with GCC 5.x, the modern ABI is the default. More info: https://gcc.gnu.org/onlinedocs/libst..._dual_abi.html
                Modern ABI forbids copy-on-write for std::string and requires std::list to keep track of its size.
                This statement is correct.

                If your app is multithreaded, and you do lots of string manipulations, then -D_GLIBCXX_USE_CXX11_ABI=0 is a good approach for winning back some lost performance.
                This is a gross oversimplification and very misleading.

                Where CoW helps:
                • you reuse strings, but don't want to deal with ownership

                Where CoW is pointless:
                • you pass a std::string to a method which will not modify it or store it permanently (e.g. key lookup in a std::map) - use a const &
                • you pass a temporary which will not be used by the caller afterwards - use an rvalue reference (std::string &&)
                • you really have a string which is used in a lot of places - use std::shared_ptr<std::string>
                CoW should be avoided, especially in multithreading. CoW requires atomic operations, which can lead to cacheline bouncing and pipeline stalls.
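A rough sketch of the three alternatives from the list above. Registry and its members are made-up names for illustration only, not taken from any real code base.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical container, used only to show the three parameter styles.
struct Registry {
  std::map<std::string, int> ids;
  std::string owned;
  std::shared_ptr<std::string> shared;

  // Read-only use: a const reference neither copies nor touches a refcount.
  int lookup(const std::string& key) const {
    auto it = ids.find(key);
    return it == ids.end() ? -1 : it->second;
  }

  // The caller gives the string up: move the buffer instead of copying it.
  void adopt(std::string&& s) { owned = std::move(s); }

  // Text used in many places: the sharing is explicit in the type.
  void share(std::shared_ptr<std::string> s) { shared = std::move(s); }
};

void demo_alternatives() {
  Registry r;
  r.ids["gcc"] = 8;
  const std::string key = "gcc";
  assert(r.lookup(key) == 8);

  std::string tmp(100, 'x');
  r.adopt(std::move(tmp));  // tmp is not used by the caller afterwards
  assert(r.owned.size() == 100);

  auto text = std::make_shared<std::string>("shared text");
  r.share(text);
  assert(r.shared == text);  // both handles point at the same string
}
```

Only the shared_ptr case pays for atomic reference counting, and only where the handle itself is copied.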

                With CoW, you never know when a detach happens, so the generated code has to check for shared ownership each time you manipulate the string (the check can be omitted when the compiler knows the string has already detached, e.g. within the function scope, but only if the string is not passed to another function in the meantime).

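                Code:
                #include <string>

                void bar(const std::string& s);

                void foo_03(std::string s) {
                  s += 'a';  // possible detach: s may still share its buffer with the caller's copy
                  bar(s);    // bar may keep a copy of s, so s may be shared again after it returns
                  s += 'b';  // another ownership check, and possibly another detach
                }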
                Code:
                #include <string>
                #include <utility>

                void bar(const std::string& s);
                std::string baz(std::string&& s);

                void foo_11(std::string&& s) {  // rvalue reference guarantees by contract that we are the exclusive owner of s
                  s += 'a';               // no atomic operations necessary
                  bar(s);                 // may copy, may not, but does not manipulate s in any way
                  s = baz(std::move(s));  // moves ownership of s to baz and takes ownership of the returned
                                          // string. baz may return the passed-in value, so no copying, no atomics.
                  s += 'b';               // we own s exclusively, so no atomics
                }
                In general, string manipulations are much better with the C++11 semantics. Having no CoW overhead helps, as does SSO (Short String Optimization), which in libstdc++ is only available with the C++11 ABI strings.
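To see SSO in action, here is a small sketch that checks whether a string's characters live inside the std::string object itself. stored_inline is a made-up helper, and the result for short strings depends on the standard library and ABI in use, so only the long-string case is asserted.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical helper: true when the character data sits inside the
// std::string object (SSO) rather than in a separate heap allocation.
bool stored_inline(const std::string& s) {
  const char* obj_begin = reinterpret_cast<const char*>(&s);
  const char* obj_end = obj_begin + sizeof(std::string);
  const char* data = s.data();
  // std::less gives a well-defined order even for unrelated pointers.
  std::less<const char*> before;
  return !before(data, obj_begin) && before(data, obj_end);
}

void demo_sso() {
  // A string far larger than the object itself can never be stored inline.
  std::string big(1000, 'x');
  assert(!stored_inline(big));

  // Typically true with the C++11 ABI and false with the old CoW ABI,
  // so it is computed here but deliberately not asserted.
  std::string small = "hi";
  bool small_inline = stored_inline(small);
  (void)small_inline;
}
```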

                As soon as you manipulate a string, you have to detach, so you have to copy. The benefit of reduced memory usage no longer applies, but you still have the overhead.
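A toy copy-on-write string makes the detach visible. This is only a single-threaded sketch under my own assumptions: CowString is a made-up class, std::shared_ptr stands in for the reference count, and it is not how libstdc++ implemented its old std::string.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Toy CoW string for illustration; NOT the libstdc++ implementation.
class CowString {
 public:
  explicit CowString(std::string s)
      : buf_(std::make_shared<std::string>(std::move(s))) {}

  // Copying only bumps the (atomic) reference count; no characters move.
  CowString(const CowString&) = default;
  CowString& operator=(const CowString&) = default;

  // Every mutation must first check for sharing and detach if needed.
  void append(char c) {
    if (buf_.use_count() > 1) {
      buf_ = std::make_shared<std::string>(*buf_);  // detach: allocate + copy
    }
    buf_->push_back(c);
  }

  const std::string& str() const { return *buf_; }
  bool shares_buffer_with(const CowString& o) const { return buf_ == o.buf_; }

 private:
  std::shared_ptr<std::string> buf_;
};

void demo_cow() {
  CowString a("hello");
  CowString b = a;  // cheap: b shares a's buffer
  assert(a.shares_buffer_with(b));

  b.append('!');    // shared, so this detaches first
  assert(!a.shares_buffer_with(b));
  assert(a.str() == "hello" && b.str() == "hello!");
}
```

After the first append, b carries its own full copy, so the memory saving is gone while the ownership check stays on every later mutation.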

                In the case of upgrading compilers in horrendous legacy applications, -D_GLIBCXX_USE_CXX11_ABI=0 is often a must to begin with.
                When mixing ABIs, you have to pay each time you cross the ABI boundary and back. Each crossing will likely involve two costly string constructions (std::[03]::string: allocation of the string data on the heap, copying; std::__cxx11::string: heap allocation, copying).

                You should only define _GLIBCXX_USE_CXX11_ABI=0 where you are calling legacy code.

                If you don't mix ABIs, e.g. when you have everything in source, you are better off using the C++11 ABI.
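If you are unsure which ABI a given translation unit ends up with, a check along these lines can help. It relies on libstdc++ defining the _GLIBCXX_USE_CXX11_ABI macro once any of its headers is included; string_abi_name and the returned labels are my own made-up names.

```cpp
#include <cassert>
#include <cstring>
#include <string>  // any libstdc++ header defines _GLIBCXX_USE_CXX11_ABI

// Hypothetical helper: reports which std::string ABI this file was built with.
const char* string_abi_name() {
#if defined(_GLIBCXX_USE_CXX11_ABI)
#  if _GLIBCXX_USE_CXX11_ABI
  return "cxx11";      // default since GCC 5: no CoW, SSO strings
#  else
  return "pre-cxx11";  // built with -D_GLIBCXX_USE_CXX11_ABI=0
#  endif
#else
  return "unknown";    // macro not defined: older libstdc++ or another library
#endif
}

void demo_abi() {
  const char* name = string_abi_name();
  assert(name != nullptr && std::strlen(name) > 0);
}
```

Calling this from each side of a library boundary is a cheap way to notice an accidental ABI mix before the linker errors (or the conversion costs) show up.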
