AMD Sends Out Patches Adding "Znver3" Support To GNU Binutils With New Instructions


  • AMD Sends Out Patches Adding "Znver3" Support To GNU Binutils With New Instructions

    Phoronix: AMD Sends Out Patches Adding "Znver3" Support To GNU Binutils With New Instructions

    One of AMD's compiler experts this week sent out a patch wiring up Zen 3 support in the important GNU Binutils collection for Linux systems...


  • #2
    a little smattering of AVX-512 hey, will be interesting to see how it goes



    • #3
      Originally posted by boxie View Post
      a little smattering of AVX-512 hey, will be interesting to see how it goes
      I'm pretty sure that znver3 only supports the 256-bit variant (AVX2) of this instruction.
      Last edited by mlau; 19 October 2020, 02:04 AM.



      • #4
        AVX-512 is supposed to come with Zen 4, hopefully with a better implementation than Intel's.



        • #5
          Originally posted by mlau View Post

          I'm pretty sure that znver3 only supports the 256-bit variant (AVX2) of this instruction.
          That is the problem, though: the 256-bit versions are not AVX2. The AVX-512 instructions use EVEX encodings, natively include masking via mask registers, and have access to twice as many vector registers (also in the 128- and 256-bit "modes").

          This could be an odd, partially working version of those extensions, or it could be the foundation for full AVX-512 support that just doesn't include a complete AVX-512F yet. We will have to see when we get the compiler patches.

          Edit: Okay, at least VPCLMULQDQ has a VEX-encoded version of the extension for non-AVX-512 CPUs. VAES could have the same. Damn shame they overloaded the mnemonics, so we can't tell the versions apart.

          Second Edit: Yes, VAES also has a VEX-encoded version. They seem to be the only two new AVX-512 extensions with VEX versions.
          Last edited by carewolf; 19 October 2020, 04:08 AM.
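To make the VEX/EVEX distinction above concrete, here is a minimal sketch that classifies which encodings of VAES and VPCLMULQDQ a CPU could execute, based on the feature names Linux prints on the `flags` line of /proc/cpuinfo. The function names and the sample flag list are made up for illustration; the sample list is an assumption about what a VEX-only part would report, not data read from real znver3 hardware.

```python
def vector_crypto_encodings(flags):
    """Report which encodings of VAES / VPCLMULQDQ the given flags allow.

    `flags` is an iterable of feature names as they appear on the
    `flags` line of /proc/cpuinfo on Linux.
    """
    f = set(flags)
    result = {}
    for ext in ("vaes", "vpclmulqdq"):
        has_ext = ext in f
        result[ext] = {
            # VEX form: 256-bit ymm operands, no mask registers.
            "vex_256": has_ext and "avx" in f,
            # EVEX 128/256-bit forms additionally need AVX512VL.
            "evex_256": has_ext and "avx512f" in f and "avx512vl" in f,
            # EVEX 512-bit form needs the AVX-512 foundation.
            "evex_512": has_ext and "avx512f" in f,
        }
    return result


def read_cpu_flags(path="/proc/cpuinfo"):
    """Return the feature flags of the first CPU listed (Linux only)."""
    with open(path) as fh:
        for line in fh:
            if line.startswith("flags"):
                return line.split(":", 1)[1].split()
    return []


# Hypothetical flag list for a CPU with the extensions but no AVX-512.
vex_only_cpu = ["avx", "avx2", "aes", "pclmulqdq", "vaes", "vpclmulqdq"]
print(vector_crypto_encodings(vex_only_cpu))
```

On a Linux box, `vector_crypto_encodings(read_cpu_flags())` would run the same check against the local CPU. A result with `vex_256` true and both EVEX entries false would match the "VEX version only" reading discussed in this thread.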



          • #6
            Originally posted by ms178 View Post
            AVX-512 is supposed to come with Zen 4, hopefully with a better implementation than Intel's.
            Zen 4's AVX-512 support might be like Zen 1's AVX2 support, i.e., emulating 512-bit operations with 256-bit ALUs.
            However, given the tremendous die-area cost of AVX-512, this approach might also be another "worse is better" solution.
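As a purely conceptual sketch of what "emulating 512-bit operations with 256-bit ALUs" means, the snippet below models a 512-bit vector as 16 lanes of 32-bit unsigned integers and performs one 512-bit add as two passes through an 8-lane adder. This is only an illustration of the idea; it makes no claim about how any real AMD core schedules or splits micro-ops, and the function name is hypothetical.

```python
LANES_512 = 16       # 16 x 32-bit lanes = 512 bits
LANES_256 = 8        # 8 x 32-bit lanes = 256 bits
MASK32 = 0xFFFFFFFF  # lanes wrap around like 32-bit unsigned integers


def add_256(x, y):
    """Model of a 256-bit ALU: lane-wise 32-bit add on 8 lanes."""
    assert len(x) == len(y) == LANES_256
    return [(p + q) & MASK32 for p, q in zip(x, y)]


def add_512_as_two_256(a, b):
    """One 512-bit add issued as two 256-bit halves ("double pumping")."""
    assert len(a) == len(b) == LANES_512
    low = add_256(a[:LANES_256], b[:LANES_256])
    high = add_256(a[LANES_256:], b[LANES_256:])
    return low + high


print(add_512_as_two_256(list(range(16)), [1] * 16))
```

The result is identical to a native 512-bit add; the cost is that the narrow unit is occupied twice per instruction, which is where the "half throughput" figure in the replies comes from.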



            • #7
              Originally posted by zxy_thf View Post
              Zen 4's AVX-512 support might be like Zen 1's AVX2 support, i.e., emulating 512-bit operations with 256-bit ALUs.
              However, given the tremendous die-area cost of AVX-512, this approach might also be another "worse is better" solution.
              To me it seems to be a very good solution: half the throughput at full CPU speed, without the need to reduce the core clock, so it might actually get 60-70% of Intel's AVX-512 speed (a guess; I don't know what clock rate a Xeon can sustain at full AVX-512 throughput).
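The guess above can be turned into back-of-the-envelope arithmetic: peak vector throughput scales roughly with datapath width times sustained clock. The sketch below computes that ratio. All clock values are hypothetical placeholders chosen only to show the calculation, not measured figures for any Xeon or Ryzen part, and the model ignores everything else (number of vector units, memory bandwidth, instruction mix).

```python
def relative_throughput(width_a_bits, clock_a_ghz, width_b_bits, clock_b_ghz):
    """Peak throughput of design A relative to design B.

    Crude model: throughput is proportional to datapath width times clock.
    """
    return (width_a_bits * clock_a_ghz) / (width_b_bits * clock_b_ghz)


# Hypothetical: 256-bit units holding 4.0 GHz vs 512-bit units
# that must drop to 3.0 GHz under heavy vector load.
with_clock_drop = relative_throughput(256, 4.0, 512, 3.0)

# Hypothetical: both designs sustain the same clock.
same_clock = relative_throughput(256, 4.0, 512, 4.0)

print(f"with clock drop: {with_clock_drop:.2f}, same clock: {same_clock:.2f}")
```

Under these made-up numbers the half-width design lands at about two thirds of the full-width one, and at exactly one half if there is no clock penalty at all, so the 60-70% estimate depends entirely on how far the wider design has to downclock.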



              • #8
                Originally posted by carewolf View Post

                That is the problem, though: the 256-bit versions are not AVX2. The AVX-512 instructions use EVEX encodings, natively include masking via mask registers, and have access to twice as many vector registers (also in the 128- and 256-bit "modes").
                There's a VEX version of these instructions. To support the EVEX one, AMD would have to support at least AVX-512F, which it seems they don't.



                • #9
                  Originally posted by zxy_thf View Post
                  Zen 4's AVX-512 support might be like Zen 1's AVX2 support, i.e., emulating 512-bit operations with 256-bit ALUs.
                  However, given the tremendous die-area cost of AVX-512, this approach might also be another "worse is better" solution.
                  [Deleted the first sentence because I was wrong about Zen 1's AVX2 implementation.] If they implemented AVX-512 like that, it wouldn't be that beneficial after all, would it? I am not an ISA expert, but don't the performance gains come from wider vector units and fewer cycles per instruction? Looking back, that approach left them lagging quite a bit in AVX performance. Not that it mattered much back then, as AVX2 wasn't that important, but it might matter now if they want to go after Intel in AI and HPC workloads where AVX-512 is fully utilized. And with the x86-64-v4 target, it will probably soon be used more widely, at least on Linux. (Does anyone know whether these new baselines will carry over into the Windows world? I'd love to see such a Windows version.)
                  Last edited by ms178; 19 October 2020, 08:39 AM.



                  • #10
                    Originally posted by ms178 View Post

                    I guess you meant Bulldozer? As far as I know Zen 1's AVX2 implementation was up to par with Intel's. If they implemented AVX-512 like that, it wouldn't be that beneficial after all, would it?
                    No, that is how AVX2 was implemented in Zen: https://en.wikichip.org/wiki/amd/mic...Floating_Point
                    In addition, one of the main improvements in Zen 2 is the introduction of a "real" 256-bit FP pipeline.

                    In terms of performance there is little benefit, but users don't need to worry about SIGILL on Zen 4 if AVX-512 is used inside the binary.
