Clear Linux's make-fmv-patch Eases The Creation Of GCC FMV-Enabled Code Paths

    Phoronix: Clear Linux's make-fmv-patch Eases The Creation Of GCC FMV-Enabled Code Paths

    One GCC compiler feature that most Linux distributions unfortunately do not take advantage of is FMV, Function Multi-Versioning. FMV lets the compiler build several tuned code paths for a function and pick the right one at run-time based on the processor. You can optimize to your heart's content with AVX, SSE4, and other instruction set extensions, compile all of that into a single binary, and have the preferred code path taken depending upon the CPU running the binary, so it will still run on older CPUs as well as today's most powerful processors...

    http://www.phoronix.com/scan.php?pag...make-fmv-patch

  • #2
    Originally posted by tichun
    Does that imply Clear Linux support for older CPUs?
    I've run their compatibility tester and it failed on my pretty old E7300.
    Penryn is too old. IIRC, Clear is only supported going back to Westmere (~2010 CPUs and newer).
    Michael Larabel
    https://www.michaellarabel.com/


    • #3
      I wonder if this tech could be integrated with Flatpak and Snap to allow for the creation of binaries that take advantage of modern CPU features while still being compatible with old AMD64 CPUs


      • #4
        Originally posted by Michael

        Penryn is too old. IIRC, Clear is only supported going back to Westmere (~2010 CPUs and newer).
        IF the Westmere uses EFI firmware.

        My Westmere box doesn't.

        EDIT: From my Dell T5500 box (2x Xeon X5687):

        Code:
        Checking if host is capable of running Clear Linux* OS
        
        SUCCESS: 64-bit CPU (lm)
        SUCCESS: Supplemental Streaming SIMD Extensions 3 (ssse3)
        SUCCESS: Streaming SIMD Extensions v4.1 (sse4_1)
        SUCCESS: Streaming SIMD Extensions v4.2 (sse4_2)
        SUCCESS: Advanced Encryption Standard instruction set (aes)
        SUCCESS: Carry-less Multiplication extensions (pclmulqdq)
        FAIL: EFI firmware
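
The CPU half of a check like the one above can be approximated from C with GCC's `__builtin_cpu_supports()`. This is only a sketch, not the actual Clear Linux checker: the helper name `count_missing_features` and the `CHECK` macro are made up for illustration, the EFI firmware test is left out, and GCC spells some features differently from `/proc/cpuinfo` (`sse4.1` rather than `sse4_1`, `pclmul` rather than `pclmulqdq`). It assumes an x86-64 build with GCC or a compatible compiler.

```c
#include <stdio.h>

/* __builtin_cpu_supports() only accepts a string literal, so a macro is
 * used instead of a loop over feature names. Evaluates to 1 on success. */
#define CHECK(feature, label)                                        \
    (__builtin_cpu_supports(feature)                                 \
         ? (printf("SUCCESS: %s (%s)\n", label, feature), 1)         \
         : (printf("FAIL: %s (%s)\n", label, feature), 0))

/* Hypothetical helper: returns how many of the five CPU features are missing. */
int count_missing_features(void)
{
    int found = 0;

    __builtin_cpu_init(); /* harmless if the CPU model was already detected */
    found += CHECK("ssse3", "Supplemental Streaming SIMD Extensions 3");
    found += CHECK("sse4.1", "Streaming SIMD Extensions v4.1");
    found += CHECK("sse4.2", "Streaming SIMD Extensions v4.2");
    found += CHECK("aes", "Advanced Encryption Standard instruction set");
    found += CHECK("pclmul", "Carry-less Multiplication extensions");

    return 5 - found;
}
```

Calling `count_missing_features()` from `main` and returning a non-zero exit code when the result is above zero would give a pass/fail tool for the CPU side only.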


        • #5
          Originally posted by cybertraveler
          I wonder if this tech could be integrated with Flatpak and Snap to allow for the creation of binaries that take advantage of modern CPU features while still being compatible with old AMD64 CPUs
          Technically speaking, using march=generic mtune=skylake or mtune=znver1 would work. At the minimum we'd want/need AMD tuned and Intel tuned binaries.

          Thinking deeper, we'd want generational divides combined with FMV like generic to westmere, sandy to skylake, skylake and up so a lot of the older processors aren't using unnecessarily bloated code -- targeted AVX code, for example, only adds code bloat for Westmere users.


          • #6
            Awesome. Thanks for this!

            I actually very recently updated the README.md on this project in the hope that it would receive wider adoption from the open source community. For the record, I don't work for Intel


            • #7
              Originally posted by skeevy420

              Technically speaking, using march=generic mtune=skylake or mtune=znver1 would work. At the minimum we'd want/need AMD tuned and Intel tuned binaries.

              Thinking deeper, we'd want generational divides combined with FMV like generic to westmere, sandy to skylake, skylake and up so a lot of the older processors aren't using unnecessarily bloated code -- targeted AVX code, for example, only adds code bloat for Westmere users.
              You are overthinking this way too much. FMV only adds a few KB here and there, since it uses the executable format and glibc infrastructure to pick the correct implementation at runtime, and that dispatch is usually extremely small, like a regular C++ polymorphic function.

              Remember, FMV only multi-versions the specific functions that have been vectorized, not the data or the whole executable/library. Your solution would imply a massive infrastructure on the distro side, with a bunch of code changes to package managers as well to pull CPU-specific binaries, to save maybe 5 MB of hard drive space on a distro like Ubuntu.

              Also, please stop assuming FMV requires a new processor, because that is incorrect. Clear Linux chose to add a default fallback that requires a Westmere+ CPU because of another, FMV-unrelated set of features in the kernel, not because it is limited by GCC.

              GCC effectively allows you to set a fallback as low as you wish (let's say SSE2, which should include every 64-bit CPU ever made) with optimizations all the way up to AVX-512.
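
For anyone who has not seen it, the kind of annotation being discussed looks roughly like this in C, using GCC's `target_clones` attribute. It is a minimal sketch: the function `scale_add` and the chosen target list are made up for illustration and are not taken from Clear Linux or from make-fmv-patch output. It assumes GCC 6 or newer on x86-64 Linux with glibc, since the dispatch relies on ifunc support.

```c
#include <stddef.h>

/* GCC compiles one clone of this function per listed target plus a resolver.
 * "default" is the fallback built for the baseline given on the command
 * line (plain x86-64, i.e. SSE2, if no -march is passed). */
__attribute__((target_clones("avx512f", "avx2", "sse4.2", "default")))
void scale_add(float *restrict dst, const float *restrict src, float k, size_t n)
{
    /* A simple loop the auto-vectorizer can handle at -O3 (or -O2 -ftree-vectorize). */
    for (size_t i = 0; i < n; i++)
        dst[i] += k * src[i];
}
```

Callers just call `scale_add()` as usual. The same source is compiled once per target, so the size cost is limited to the extra copies of the annotated functions.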


              • #8
                Originally posted by skeevy420

                Technically speaking, using march=generic mtune=skylake or mtune=znver1 would work. At the minimum we'd want/need AMD tuned and Intel tuned binaries.

                Thinking deeper, we'd want generational divides combined with FMV like generic to westmere, sandy to skylake, skylake and up so a lot of the older processors aren't using unnecessarily bloated code -- targeted AVX code, for example, only adds code bloat for Westmere users.
                Also note that FMV is brand agnostic. As long as the CPU supports the proper vector extensions, it doesn't give a damn which CPU brand you have. This is not ICC.


                • #9
                  Originally posted by jrch2k8

                  You are overthinking this way too much. FMV only adds a few KB here and there, since it uses the executable format and glibc infrastructure to pick the correct implementation at runtime, and that dispatch is usually extremely small, like a regular C++ polymorphic function.

                  Remember, FMV only multi-versions the specific functions that have been vectorized, not the data or the whole executable/library. Your solution would imply a massive infrastructure on the distro side, with a bunch of code changes to package managers as well to pull CPU-specific binaries, to save maybe 5 MB of hard drive space on a distro like Ubuntu.
                  I was thinking more about processor cache and fewer if/thens to detect the correct code to use. Disk space wasn't even a factor. I was thinking about people who want every last oomph from their hardware -- people like that don't want to waste cycles for the code to determine if it should use the AVX or AVX2 parts.

                  Originally posted by jrch2k8
                  Also, please stop assuming FMV requires a new processor, because that is incorrect. Clear Linux chose to add a default fallback that requires a Westmere+ CPU because of another, FMV-unrelated set of features in the kernel, not because it is limited by GCC.
                  I'm not assuming FMV requires a new processor. Technically it can be for SSE and SSE2 or for various ARMs. Not positive, but I think the Westmere cutoff line is due to AES.

                  Originally posted by jrch2k8
                  GCC effectively allows you to set a fallback as low as you wish (let's say SSE2, which should include every 64-bit CPU ever made) with optimizations all the way up to AVX-512.
                  Which is why I used march=generic as the base and picked an mtune from newer CPUs like Skylake and Ryzen. It was just a quick example that would cover damn near everyone and include optimized code for newer processors. IMHO, if one compares CPU features it makes sense to lump similar x86_64 architectures together and to create some generational divides similar to i386, i486, i686, etc.
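
On the worry about spending cycles on if/thens: with ifunc-based FMV the resolver runs once, when the dynamic loader binds the symbol, and later calls are plain indirect calls. A hand-written stand-in for that pattern might look like the sketch below. It is not what GCC generates; the names `sum`, `sum_generic`, `sum_avx2`, `resolve_sum`, and `resolver_calls` are all made up for illustration, and it assumes GCC or a compatible compiler on x86-64.

```c
#include <stddef.h>

/* Baseline version, built for whatever -march the file is compiled with. */
static long sum_generic(const int *v, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}

/* Same loop, but the compiler is allowed to use AVX2 inside this function. */
__attribute__((target("avx2")))
static long sum_avx2(const int *v, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}

typedef long (*sum_fn)(const int *, size_t);

static int resolver_calls; /* illustration only: counts CPU detections */
static sum_fn sum_impl;

/* Picks an implementation; this is the only place the CPU is inspected. */
static sum_fn resolve_sum(void)
{
    resolver_calls++;
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx2") ? sum_avx2 : sum_generic;
}

/* Runs before main(), roughly where an ifunc resolver would have run. */
__attribute__((constructor))
static void init_sum(void)
{
    sum_impl = resolve_sum();
}

/* Public entry point: no feature test per call, just an indirect call. */
long sum(const int *v, size_t n)
{
    return sum_impl(v, n);
}
```

However many times `sum()` is called, `resolver_calls` stays at 1, so the per-call overhead is the indirect call rather than repeated CPU detection.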


                  • #10
                    Originally posted by jrch2k8

                    Also note that FMV is brand agnostic. As long as the CPU supports the proper vector extensions, it doesn't give a damn which CPU brand you have. This is not ICC.
                    I know. Again, it was about how AMD supports some extensions that Intel doesn't and vice versa, like 3DNow! to go old school, but it's more about Ryzen and Ice Lake.
