LLVM Clang Will Finally Honor "-mtune=" On x86/x86_64 CPUs

  • LLVM Clang Will Finally Honor "-mtune=" On x86/x86_64 CPUs

    Phoronix: LLVM Clang Will Finally Honor "-mtune=" On x86/x86_64 CPUs

    Starting with LLVM Clang 12.0 next year, the Clang compiler on x86/x86_64 CPUs will finally honor -mtune= in a similar manner to GCC...

    http://www.phoronix.com/scan.php?pag...lang-x86-mtune

  • #2
    Overdue...


    • #3
      Why wasn't it built in already, together with -march? That would have been my naive approach to implementing this feature.


      • #4
        Could anyone tell me what the practical implications of this are?

        To be honest, I barely understand what -mtune does in GCC, if at all. My understanding is that -march sets the instructions the compiler may use when generating code, and when native is specified even cache sizes are taken into account. But what does the other option tune?

        Is it meaningful? I mean, performance-wise or otherwise, does it play an important part in creating more efficient binaries?


        • #5
          Originally posted by KaoDome:
          Could anyone tell me what the practical implications of this are?

          To be honest, I barely understand what -mtune does in GCC, if at all. My understanding is that -march sets the instructions the compiler may use when generating code, and when native is specified even cache sizes are taken into account. But what does the other option tune?

          Is it meaningful? I mean, performance-wise or otherwise, does it play an important part in creating more efficient binaries?

          CPUs with different microarchitectures may support the same instructions, but different instructions perform better on different microarchitectures. For example, some instructions may be microcoded (and thus slower) on a small core while being implemented in silicon on big cores. Other commonly microcoded instructions are rarely used, old instructions kept for compatibility's sake. In these cases, it may be faster to execute one or more different, in-silicon instructions that accomplish the same goal as the microcoded instruction. For this reason, GCC includes cost tables for each CPU it can tune for, which describe how fast each of the supported instructions is.

          Basically, -march determines the set of instructions the compiler is allowed to use, and -mtune selects the cost tables the compiler uses to determine which of those instructions perform best - and thus which it should prefer.
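The split described above can be sketched as a toy model. This is only an illustration of the idea, not how GCC or Clang are implemented: the architecture levels, instruction names, and cycle counts below are all made up for the example.

```python
# Toy model: -march limits WHICH instructions may be used,
# -mtune picks the cost table used to choose among them.
# Every name and number here is hypothetical.

# Instructions available at each (made-up) architecture level.
ISA = {
    "base": {"shift_add", "imul"},
    "newer": {"shift_add", "imul", "fused_op"},
}

# Per-CPU cost tables (made-up cycle counts). On the small core,
# "fused_op" is treated as microcoded and therefore expensive.
COST = {
    "small_core": {"shift_add": 2, "imul": 5, "fused_op": 9},
    "big_core": {"shift_add": 3, "imul": 2, "fused_op": 1},
}


def pick_instruction(candidates, march, mtune):
    """Pick the cheapest candidate that the target ISA allows."""
    # Step 1 (-march): drop anything the target cannot execute.
    allowed = [c for c in candidates if c in ISA[march]]
    if not allowed:
        raise ValueError("no candidate is supported by this -march")
    # Step 2 (-mtune): rank what is left by the selected cost table.
    return min(allowed, key=lambda c: COST[mtune][c])


ALL = ["shift_add", "imul", "fused_op"]
print(pick_instruction(ALL, "base", "small_core"))
print(pick_instruction(ALL, "base", "big_core"))
print(pick_instruction(ALL, "newer", "big_core"))
```

Changing only the second argument changes which instructions are legal; changing only the third changes which legal instruction wins.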


          • #6
            Nice! Maybe I can finally use it as a replacement for GCC for optimising binaries for my computers.


            • #7
              Originally posted by airminer:

              CPUs with different microarchitectures may support the same instructions, but different instructions perform better on different microarchitectures. For example, some instructions may be microcoded (and thus slower) on a small core while being implemented in silicon on big cores. Other commonly microcoded instructions are rarely used, old instructions kept for compatibility's sake. In these cases, it may be faster to execute one or more different, in-silicon instructions that accomplish the same goal as the microcoded instruction. For this reason, GCC includes cost tables for each CPU it can tune for, which describe how fast each of the supported instructions is.

              Basically, -march determines the set of instructions the compiler is allowed to use, and -mtune selects the cost tables the compiler uses to determine which of those instructions perform best - and thus which it should prefer.

              Yes, pretty much. As a minor nit, -march=foo also implies -mtune=foo. Being able to specify -mtune= separately (which overrides the -mtune= implied by -march=) is useful when producing binaries for deployment on many different computers (e.g. a Linux distro): one can set -march= to the lowest common denominator one wants to support, and then use -mtune= to specify a somewhat newer and more common CPU model.
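A distro-style build along those lines might assemble its compiler command as in the sketch below. The helper name, the source file name, and the choice of x86-64 as baseline with haswell as tuning target are assumptions picked for illustration. The helper only passes -mtune= when one is explicitly given, and otherwise leaves the implied tuning to the compiler.

```python
def compiler_args(source, march, mtune=None, compiler="gcc"):
    """Build an argument list for compiling one source file.

    march: lowest common denominator the binary must run on.
    mtune: optional CPU model to tune for; if omitted, the compiler
           applies whatever tuning -march implies.
    """
    args = [compiler, "-O2", f"-march={march}"]
    if mtune is not None:
        # An explicit -mtune= overrides the tuning implied by -march=.
        args.append(f"-mtune={mtune}")
    args += ["-c", source]
    return args


# Runs on any x86-64 CPU, but scheduled/tuned for a newer model.
print(" ".join(compiler_args("foo.c", march="x86-64", mtune="haswell")))
```

The resulting list can be handed to `subprocess.run` as is. Note that -mtune= never adds instructions, so the binary still runs on the baseline CPU.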


              • #8
                Originally posted by KaoDome:
                Could anyone tell me what the practical implications of this are?

                To be honest, I barely understand what -mtune does in GCC, if at all. My understanding is that -march sets the instructions the compiler may use when generating code, and when native is specified even cache sizes are taken into account. But what does the other option tune?

                With -march=native, instead of the user specifying the CPU family on the command line, the compiler checks the CPU model (via the CPUID instruction on x86) and automatically selects the CPU family based on that.
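To see which CPU family the compiler picked, GCC can print its resolved target options with `gcc -march=native -Q --help=target`. The sketch below wraps that call; it assumes GCC is installed and on the PATH, and the parsing assumes the usual "option, whitespace, value" layout of that listing. The helper names are hypothetical, and this invocation is GCC-specific, so it is not expected to work with Clang.

```python
import subprocess


def parse_target_option(help_text, option):
    """Return the value listed next to `option` (e.g. "-march="), or None."""
    for line in help_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == option:
            return fields[1]
    return None


def resolved_native_march(compiler="gcc"):
    """Ask the compiler what -march=native resolves to on this machine."""
    result = subprocess.run(
        [compiler, "-march=native", "-Q", "--help=target"],
        capture_output=True, text=True, check=True,
    )
    return parse_target_option(result.stdout, "-march=")


# Illustrative text in the expected layout, not captured compiler output.
sample = "  -march=                     \t\tskylake\n  -mtune=                     \t\tskylake\n"
print(parse_target_option(sample, "-march="))
```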


                • #9
                  Originally posted by jabl:

                  Yes, pretty much. As a minor nit, -march=foo also implies -mtune=foo. Being able to specify -mtune= separately (which overrides the -mtune= implied by -march=) is useful when producing binaries for deployment on many different computers (e.g. a Linux distro): one can set -march= to the lowest common denominator one wants to support, and then use -mtune= to specify a somewhat newer and more common CPU model.

                  Yeah, you might want to optimize for Haswell features but not have the tuning workarounds for the inefficiencies/brokenness of Haswell. In that case you would use either -mtune=generic or -mtune=skylake.


                  • #10
                    Thanks, airminer and jabl. I understand the implications of -mtune now. It's good news that the Clang devs are looking at honoring that switch in a newer release.
