Gallium3D's Gallivm Gets Basic AVX2 Support

  • Gallium3D's Gallivm Gets Basic AVX2 Support

    Phoronix: Gallium3D's Gallivm Gets Basic AVX2 Support

    José Fonseca of VMware has added basic support for AVX2 to Gallivm, the driver-independent portion of LLVM integration with Gallium3D...

  • #2
    The best part is the date on those patches


    • #3
      from the patch linked...

      + if (type.width * type.length == 128) {
      +    if(util_cpu_caps.has_sse2) {
      +       if(type.width == 8)
      +          intrinsic = type.sign ? "llvm.x86.sse2.padds.b" : "llvm.x86.sse2.paddus.b";
      +       if(type.width == 16)
      +          intrinsic = type.sign ? "llvm.x86.sse2.padds.w" : "llvm.x86.sse2.paddus.w";
      +    } else if (util_cpu_caps.has_altivec) {
      +       if(type.width == 8)
      +          intrinsic = type.sign ? "llvm.ppc.altivec.vaddsbs" : "llvm.ppc.altivec.vaddubs";
      +       if(type.width == 16)
      +          intrinsic = type.sign ? "llvm.ppc.altivec.vaddshs" : "llvm.ppc.altivec.vadduhs";
      +    }
      + }
      + if (type.width * type.length == 256) {



      It seems obvious to me that the last line I pasted should be an "else if", or does the compiler handle that automatically? There's no reason to do the multiply/compare twice. (The commit linked from the article shows the code with its original indentation, in case the paste is hard to read in this post.)


      • #4
        Hmm, using the AVX2 gather instruction... I have really been looking forward to using this instruction, but it turns out to be much slower than doing all the fetches with normal load instructions, at least on Haswell. On Broadwell and Skylake it is on par, but nowhere is it any faster.
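For readers unfamiliar with the operation being compared, here is a minimal sketch of what an eight-lane 32-bit gather computes, written as the "normal loads" alternative mentioned above. The function name `gather8_scalar` is hypothetical and used only for illustration. The sketch makes no performance claim; it only shows the semantics.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: eight independent scalar loads, dst[i] = base[idx[i]].
 * The AVX2 intrinsic _mm256_i32gather_epi32(base, vindex, 4) produces the
 * same eight values with a single gather instruction; this loop is the
 * plain-load formulation the post compares it against.
 */
static void
gather8_scalar(int32_t dst[8], const int32_t *base, const int32_t idx[8])
{
    for (size_t i = 0; i < 8; ++i)
        dst[i] = base[idx[i]];   /* one ordinary load per lane */
}
```

Whether the single-instruction form or the loop is faster depends on the microarchitecture, which is the point the post makes about Haswell versus Broadwell and Skylake.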


        • #5
          Originally posted by Holograph View Post
          It seems obvious to me that the last line I pasted should be an "else if", or does the compiler handle that automatically? There's no reason to do the multiply/compare twice. (The commit linked from the article shows the code with its original indentation, in case the paste is hard to read in this post.)
          Here is the patch I'm referring to: https://cgit.freedesktop.org/mesa/me...235ffc430ac736

          I'm not sure why you'd trust the compiler to optimize that when you could just add an "else".


          • #6
            Originally posted by Holograph View Post
            from the patch linked...

            + if (type.width * type.length == 128) {
            +    if(util_cpu_caps.has_sse2) {
            +       if(type.width == 8)
            +          intrinsic = type.sign ? "llvm.x86.sse2.padds.b" : "llvm.x86.sse2.paddus.b";
            +       if(type.width == 16)
            +          intrinsic = type.sign ? "llvm.x86.sse2.padds.w" : "llvm.x86.sse2.paddus.w";
            +    } else if (util_cpu_caps.has_altivec) {
            +       if(type.width == 8)
            +          intrinsic = type.sign ? "llvm.ppc.altivec.vaddsbs" : "llvm.ppc.altivec.vaddubs";
            +       if(type.width == 16)
            +          intrinsic = type.sign ? "llvm.ppc.altivec.vaddshs" : "llvm.ppc.altivec.vadduhs";
            +    }
            + }
            + if (type.width * type.length == 256) {

            It seems obvious to me that the last line I pasted should be an "else if", or does the compiler handle that automatically? There's no reason to do the multiply/compare twice. (The commit linked from the article shows the code with its original indentation, in case the paste is hard to read in this post.)
            I'm not sure I understand.
            An "else if" still requires a condition, so it would still be a pointless multiply/compare. It's pretty much the same.
            A bare "else" is bad, because any other value (not 128) would fire the second code block, and that is not acceptable.
            The best theoretical way would have been to do the multiplication beforehand, store the result in a variable and then check it as many times as you want, or to use a switch statement.

            But I'm pretty sure the compiler can figure out basic stuff like this on its own during its optimization phases.
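The "compute once, then switch" idea can be sketched as below. This is only an illustration under stated assumptions: `struct vec_type`, `pick_sat_add` and the `has_sse2` parameter are hypothetical stand-ins for the patch's `type` and `util_cpu_caps`. The AltiVec branch is omitted for brevity, and the 256-bit case uses a placeholder string because the quoted excerpt stops before that branch's body.

```c
#include <stddef.h>

/* Hypothetical stand-in for the type descriptor used in the quoted patch. */
struct vec_type {
    unsigned width;   /* bits per element */
    unsigned length;  /* elements per vector */
    unsigned sign;    /* nonzero for signed elements */
};

/* Returns an intrinsic name, or NULL when no case applies. */
static const char *
pick_sat_add(struct vec_type type, int has_sse2)
{
    /* The product is computed once and reused by every case. */
    const unsigned bits = type.width * type.length;

    switch (bits) {
    case 128:
        if (has_sse2) {
            if (type.width == 8)
                return type.sign ? "llvm.x86.sse2.padds.b"
                                 : "llvm.x86.sse2.paddus.b";
            if (type.width == 16)
                return type.sign ? "llvm.x86.sse2.padds.w"
                                 : "llvm.x86.sse2.paddus.w";
        }
        return NULL;
    case 256:
        /* Body not shown in the quoted excerpt; placeholder only. */
        return "placeholder-256-bit";
    default:
        return NULL;
    }
}
```

A switch makes the mutual exclusion of the 128 and 256 cases explicit in the source, so nothing depends on the optimizer noticing it.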


            • #7
              Originally posted by starshipeleven View Post
              I'm not sure I understand.
              An "else if" still requires a condition, so it would still be a pointless multiply/compare. It's pretty much the same.
              A bare "else" is bad, because any other value (not 128) would fire the second code block, and that is not acceptable.
              The best theoretical way would have been to do the multiplication beforehand, store the result in a variable and then check it as many times as you want, or to use a switch statement.

              But I'm pretty sure the compiler can figure out basic stuff like this on its own during its optimization phases.
              He's just saying that if the width*length == 128, there's no way that the width*length can be 256. If the first "if" is true, the second can never happen, so it could've been done as an "else if" instead. You're right that a switch statement would've worked as well.

              It's not really a bug, just slightly suboptimal control flow. You'll still get the same results as if you had used an "else if", just with one more comparison/jump when you take the first branch and then have to skip the second, unless the compiler figures it out for you.
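The "same result, one extra comparison" point can be made concrete with a small sketch. The function names and the `compares` counter are hypothetical. The counter tracks comparisons as written in the source, not what an optimizing compiler actually emits.

```c
/* Two independent ifs: both size tests always run. */
static int
classify_two_ifs(unsigned bits, int *compares)
{
    int result = 0;

    ++*compares;
    if (bits == 128)
        result = 1;

    ++*compares;
    if (bits == 256)
        result = 2;

    return result;
}

/* if / else if: the second test is skipped once the first one matches. */
static int
classify_else_if(unsigned bits, int *compares)
{
    int result = 0;

    ++*compares;
    if (bits == 128) {
        result = 1;
    } else {
        ++*compares;
        if (bits == 256)
            result = 2;
    }

    return result;
}
```

For an input of 128 both functions return the same value, but the first counts two comparisons and the second counts one. For any other input the counts are identical.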


              • #8
                Here is the assembly that GCC produces. If / Else-If make no difference, but a switch should be much better.

                https://godbolt.org/g/yoKVcb


                • #9
                  Originally posted by doom_Oo7 View Post
                  Here is the assembly that GCC produces. If / Else-If make no difference, but a switch should be much better.

                  https://godbolt.org/g/yoKVcb
                  Thanks for that. I would've added switch in as an idea if there were an edit button, heh. I tend to have the opinion that the compiler's optimizer should only be relied on when it can't always optimize something (i.e. optimizations that get enabled only for some configurations) or optimizing the code manually would be too messy. That said, I realize (and realized previously) this is a nit-pick. I'm not familiar enough with gallium3d or GPU-related stuff at all to make any super deep comments on the code, so I just commented on the one thing that jumped out at me. As such I will stop bringing it up, but just wanted to say thanks for taking my idea seriously.


                  • #10
                    Originally posted by doom_Oo7 View Post
                    Here is the assembly that GCC produces. If / Else-If make no difference, but a switch should be much better.

                    https://godbolt.org/g/yoKVcb
                    Really cool this godbolt.org, thanks for the link (y)
