Glibc's strncasecmp / strcasecmp Get AVX2 & EVEX Optimized Versions, Drops AVX

    Phoronix: Glibc's strncasecmp / strcasecmp Get AVX2 & EVEX Optimized Versions, Drops AVX

    The GNU C Library (glibc) has landed a set of 23 patches providing optimized AVX2 and EVEX versions of strcasecmp/strncasecmp functions while dropping support for the original AVX implementation...

  • #2
    In case anyone wonders what the heck these "EVEX optimized" versions are, it appears to be AVX-512 code that uses only the lower 256 bits of the registers, presumably to avoid the downclocking that many CPUs suffer from when using the full 512-bit AVX-512 instructions.

    "EVEX" comes from the EVEX prefix used by AVX-512 instructions.


    • #3
      So does glibc automatically detect that you have support for AVX2 and fall back if not?
      Or is it something that needs to be selected during compilation?


      • #4
        Originally posted by jabl
        Presumably to avoid the downclocking that many CPUs suffer from when using the full 512-bit AVX-512 instructions.
        IIRC AVX2 also causes slowdown, that's why this change surprises me.


        • #5
          Originally posted by jabl
          In case anyone wonders what the heck these "EVEX optimized" versions are, it appears to be AVX-512 code that uses only the lower 256 bits of the registers, presumably to avoid the downclocking that many CPUs suffer from when using the full 512-bit AVX-512 instructions.

          "EVEX" comes from the EVEX prefix used by AVX-512 instructions.
          Thank you, I was!

          The processors that principally benefit from the AVX(1) version are Sandy Bridge and Ivy Bridge to which they are "becoming outdated" and thus freeing that code from the Glibc code-base.
          *looks down at IVB based laptop* Well, screw me, then?


          • #6
            Not really, no. Read the article again.


            • #7
              Originally posted by willmore
              *looks down at IVB based laptop* Well, screw me, then?
              Well, as the commit mentions, SSE4.2 (which your IVB has, and which the code still supports) is roughly equivalent to AVX in performance (3-4% difference), so while your IVB laptop is showing its age, this particular change is unlikely to matter much in any real-world scenario.


              • #8
                Originally posted by MastaG
                So does glibc automatically detect that you have support for AVX2 and fall back if not?
                Or is it something that needs to be selected during compilation?
                It's all automatic. When a program uses one of these glibc functions that has several optimized versions for different ISA features, glibc does a CPU check and stores the result. The next time such a function is used, it only needs to look up the stored result and jump to the correct function.
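
The check-once-then-jump idea described above can be sketched in plain C. This is only an illustration, not glibc's code: glibc relies on the GNU indirect function (IFUNC) mechanism, where the dynamic linker runs the selection, whereas this sketch caches a function pointer by hand. All names here (`casecmp_fn`, `casecmp_generic`, `casecmp_avx2_standin`, `resolve_casecmp`, `my_strcasecmp`) are hypothetical, both "variants" simply call the system `strcasecmp`, and the lazy initialisation is not thread-safe. The `__builtin_cpu_supports` check assumes GCC or Clang on x86.

```c
#include <assert.h>
#include <stddef.h>
#include <strings.h>   /* strcasecmp (POSIX), used here as a stand-in body */

/* Hypothetical names throughout; glibc's real variants and resolver differ. */
typedef int (*casecmp_fn)(const char *, const char *);

/* Baseline variant: works on every CPU. */
static int casecmp_generic(const char *a, const char *b)
{
    return strcasecmp(a, b);
}

/* Stand-in for an AVX2 variant. A real one would use AVX2 instructions;
 * this one only marks where such code would be selected. */
static int casecmp_avx2_standin(const char *a, const char *b)
{
    return strcasecmp(a, b);
}

/* Stored result of the CPU check (NULL until the first call). */
static casecmp_fn cached_impl = NULL;

/* Run the CPU check and pick a variant. */
static casecmp_fn resolve_casecmp(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2"))
        return casecmp_avx2_standin;
#endif
    return casecmp_generic;
}

/* Public entry point: resolve on first use, then reuse the stored pointer.
 * Not thread-safe as written. */
int my_strcasecmp(const char *a, const char *b)
{
    if (cached_impl == NULL)
        cached_impl = resolve_casecmp();
    assert(cached_impl != NULL);
    return cached_impl(a, b);
}
```

Calling `my_strcasecmp("Hello", "hELLO")` resolves the variant on the first call and returns 0; later calls skip the CPU check and go straight through the stored pointer.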


                • #9
                  The processors that principally benefit from the AVX(1) version are Sandy Bridge and Ivy Bridge to which they are "becoming outdated"
                  Why did they need to remove it in the first place?


                  • #10
                    Originally posted by Aryma
                    Why did they need to remove it in the first place?
                    Maybe they don't want the maintenance burden and bloat for only a few % performance difference compared to the SSE4.2 version.

                    More generally, it wouldn't surprise me if they are focusing on optimized code for the ISA extensions specified in the x86-64 microarchitecture levels instead of for every single ISA extension separately (x86-64-v2 has SSE4.2, v3 has AVX2, and v4 has AVX-512).
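
To make the level mapping concrete, here is a deliberately simplified C sketch. It classifies a CPU using only the three marker features named above (SSE4.2, AVX2, AVX-512F). The real level definitions require more features than these, so treat the result as an approximation. The names `cpu_features`, `approx_x86_64_level` and `detect_features` are made up for this example, and the detection helper assumes GCC or Clang on x86.

```c
#include <stdbool.h>

/* Hypothetical, simplified feature set: one marker feature per level.
 * The actual x86-64-v2/v3/v4 definitions list additional requirements. */
struct cpu_features {
    bool sse4_2;   /* marker for x86-64-v2 */
    bool avx2;     /* marker for x86-64-v3 */
    bool avx512f;  /* marker for x86-64-v4 */
};

/* Levels are cumulative: each one requires everything below it. */
int approx_x86_64_level(struct cpu_features f)
{
    if (!f.sse4_2)
        return 1;
    if (!f.avx2)
        return 2;
    if (!f.avx512f)
        return 3;
    return 4;
}

/* Fill the struct from the running CPU where the compiler builtins exist;
 * elsewhere report no features, which maps to level 1. */
struct cpu_features detect_features(void)
{
    struct cpu_features f = { false, false, false };
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    f.sse4_2  = __builtin_cpu_supports("sse4.2") != 0;
    f.avx2    = __builtin_cpu_supports("avx2") != 0;
    f.avx512f = __builtin_cpu_supports("avx512f") != 0;
#endif
    return f;
}
```

Under this approximation an Ivy Bridge machine (SSE4.2 and AVX, but no AVX2) lands at level 2, which is why it would be served by the SSE4.2 code path rather than a dedicated AVX one.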
