AMD Lands Support For Vendor Flavored SPIR-V Within LLVM

  • AMD Lands Support For Vendor Flavored SPIR-V Within LLVM

    Phoronix: AMD Lands Support For Vendor Flavored SPIR-V Within LLVM

    SPIR-V, used by the likes of OpenGL, OpenCL, and Vulkan, is a common intermediate representation (IR) / intermediate language for consumption by device drivers. With code now merged into LLVM, AMD has introduced the notion of vendor "flavored" SPIR-V for containing extra information pertinent to the GPU device/driver being targeted...

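    For context on what "vendor flavored" means in practice, the user-facing side is an extra HIP offload target. A minimal sketch, assuming a ROCm 6.x toolchain where the experimental amdgcnspirv offload architecture emits AMD-flavored SPIR-V instead of machine code for one fixed gfx ISA:

    // vec_add.hip -- trivial HIP kernel used to contrast the two build modes.
    #include <hip/hip_runtime.h>

    __global__ void vec_add(const float* a, const float* b, float* c, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) c[i] = a[i] + b[i];
    }

    int main() { return 0; }  // launch elided; the build lines are the point

    // Conventional build (ISA-specific): the binary embeds machine code for
    // one exact chip and is only guaranteed to run on that chip:
    //   hipcc --offload-arch=gfx1030 vec_add.hip -o vec_add
    //
    // SPIR-V build (assumed flag, per ROCm 6.x release notes): the binary
    // embeds vendor-flavored SPIR-V that the driver finalizes at load time
    // for whichever AMD GPU is present:
    //   hipcc --offload-arch=amdgcnspirv vec_add.hip -o vec_add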

  • #2
    AMD continue to blow their own fucking foot off.

    The whole reason AMD cards are absolutely useless is that HIP doesn't fucking work on most of them. There are probably fewer than 10 cards that are officially supported, and maybe twice that number that kind of work unofficially if you use old libraries, but they often crash. So long as HIP doesn't work reliably on consumer cards, people building workstations won't choose them. This is especially damaging in AI, where the inability to prototype small models on AMD laptops/desktops stops anyone from then buying big enterprise GPUs from AMD for the actual deployment.

    Why is HIP/ROCm so terrible? Because AMD fucked the design. While they put a lot of work into easing the transition from CUDA to HIP, they made it so that HIP applications compile down into actual assembly for a specific GPU architecture. This severely limits which cards the code will reliably work on and which can be officially supported; on related hardware the code might run, or it might crash on an illegal instruction.
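    To make the mismatch concrete, here is a minimal sketch, assuming only the documented HIP runtime API (hipGetDeviceProperties() and its gcnArchName field), that prints the exact ISA string the installed GPU expects code objects for:

    // arch_check.cpp -- ask the HIP runtime which ISA this GPU needs.
    // Build sketch: hipcc arch_check.cpp -o arch_check
    #include <hip/hip_runtime.h>
    #include <cstdio>

    int main() {
        int count = 0;
        if (hipGetDeviceCount(&count) != hipSuccess || count == 0) {
            std::puts("no HIP devices found");
            return 1;
        }
        hipDeviceProp_t prop;
        hipGetDeviceProperties(&prop, 0);
        // A HIP fat binary runs only if it carries a code object matching
        // this string (e.g. "gfx1030"); a near-miss like gfx1031 is not
        // guaranteed to work.
        std::printf("device 0 needs code objects for: %s\n", prop.gcnArchName);
        return 0;
    }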

    Everyone else was smarter. Nvidia uses PTX, a portable assembly, and has other tricks too for working across generations. You don't need to match a CUDA application someone is shipping against a hyper-specific compute capability for it to work. OpenCL goes even further, shipping actual kernel source which can then be compiled for ANY GPU, from any manufacturer. And now AMD wants to include manufacturer- and architecture-specific hints in that widely compatible intermediate language? What?
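    For contrast, the OpenCL model looks like this: a minimal host-side sketch (no error handling, assuming any installed OpenCL GPU driver) where the kernel ships as source text and is compiled on the user's machine for whatever device is present:

    // cl_build.cpp -- ship kernel source, let the driver compile it.
    // Build sketch: g++ cl_build.cpp -lOpenCL
    #define CL_TARGET_OPENCL_VERSION 120
    #include <CL/cl.h>
    #include <cstdio>

    static const char* kSrc =
        "__kernel void vec_add(__global const float* a,\n"
        "                      __global const float* b,\n"
        "                      __global float* c) {\n"
        "    size_t i = get_global_id(0);\n"
        "    c[i] = a[i] + b[i];\n"
        "}\n";

    int main() {
        cl_platform_id plat; cl_device_id dev;
        clGetPlatformIDs(1, &plat, nullptr);
        clGetDeviceIDs(plat, CL_DEVICE_TYPE_GPU, 1, &dev, nullptr);
        cl_context ctx = clCreateContext(nullptr, 1, &dev, nullptr, nullptr, nullptr);
        // The portability trick: compilation happens here, at runtime, using
        // the installed driver's compiler for this exact GPU.
        cl_program prog = clCreateProgramWithSource(ctx, 1, &kSrc, nullptr, nullptr);
        clBuildProgram(prog, 1, &dev, "", nullptr, nullptr);
        cl_kernel k = clCreateKernel(prog, "vec_add", nullptr);
        std::printf("kernel built: %p\n", (void*)k);
        return 0;
    }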

    AMD are truly incapable of good software engineering or learning from their mistakes.

    • #3
      Originally posted by Developer12 View Post
      AMD continue to blow their own fucking foot off. […]
      I agree. Their good hardware is hindered by their awful software; they are probably understaffed in software engineering.

      • #4
        SPV_INTEL_inline_assembly is such a dirty hack.
        They can't be bothered to represent their instructions in a properly specified SPIR-V extension? Or is their shader compiler so slow that they can't afford to run the optimizer on that SPIR-V?

        • #5
          Originally posted by Developer12 View Post
          AMD continue to blow their own fucking foot off. […]
          ROCm/HIP only sucks because it is a CUDA clone at the source-code level...
          ZLUDA is only possible because ROCm/HIP is a CUDA clone.

          Nvidia with CUDA has the big advantage that they do not need to clone or copy anything; they just build whatever fits them best, and everyone else adopts their way.

          "AMD Lands Support For Vendor Flavored SPIR-V Within LLVM"

          About this, I am sure AMD would never do it if it were not 100% necessary.
          They simply found use cases where it is 100% necessary, and only because of that are they doing it.

          Keep in mind that SPIR-V is used in Vulkan compute; this could just be a sign that Vulkan compute is maturing.

          You say ROCm/HIP is bad, but then you do not want Vulkan compute to mature?

          • #6
            Originally posted by qarium View Post

            ROCm/HIP only sucks because it is a CUDA clone at the source-code level... […]
            To be honest, I don't care about Vulkan compute. It's "ok", but it's an awkward middle ground between OpenCL 3.0 and still trying to be a graphics API. And besides, if you really want to do compute and somehow all you have is Vulkan, you can just run Rusticl on top of Zink.

            AMD are doing this because they're completely incapable of learning from their mistakes. Even now, they've just released an LLVM-based compiler that seems to target specific NPU architectures, producing binaries that have to deal with the exposed details of their VLIW architecture. That's not a recipe for portability. Unless they ship that compiler as part of their NPU driver, every application (PyTorch, etc.) will have to ship binaries compiled for each and every chipset it wants to support.

            This is already the case with ROCm on AMD GPUs, and it's the whole reason that entire ecosystem sucks. It would suck considerably LESS if they had copied Nvidia's CUDA even harder and kept the concept of portable PTX bytecode that the driver translates just before execution. Being a clone of CUDA isn't what makes HIP bad; it's all the dumb bullshit AMD have done that diverges from CUDA, injecting a ton of AMD-specific difficulties that makes their own maintenance burden higher and developers' lives harder.

            Now here they are, injecting those same bad decisions into the portable stuff used by everyone else.
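            For reference, the PTX flow described above, as a minimal CUDA driver-API sketch (no error handling): the application ships PTX text, and the driver JIT-compiles it at module-load time for whatever GPU is installed:

            // ptx_jit.cpp -- the driver JIT-compiles portable PTX at load time.
            // Build sketch: g++ ptx_jit.cpp -lcuda
            #include <cuda.h>
            #include <cstdio>

            // Trivial hand-written PTX; real apps embed nvcc's -ptx output.
            static const char* kPtx =
                ".version 7.0\n"
                ".target sm_50\n"      // a minimum target, not an exact chip
                ".address_size 64\n"
                ".visible .entry noop() { ret; }\n";

            int main() {
                cuInit(0);
                CUdevice dev;  cuDeviceGet(&dev, 0);
                CUcontext ctx; cuCtxCreate(&ctx, 0, dev);
                // Here the driver translates PTX into native code for the
                // installed GPU -- even one newer than the PTX itself.
                CUmodule mod;  cuModuleLoadData(&mod, kPtx);
                CUfunction fn; cuModuleGetFunction(&fn, mod, "noop");
                cuLaunchKernel(fn, 1, 1, 1, 1, 1, 1, 0, nullptr, nullptr, nullptr);
                cuCtxSynchronize();
                std::puts("PTX JIT-compiled and launched");
                return 0;
            }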

            • #7
              Originally posted by Developer12 View Post
              To be honest, I don't care about Vulkan compute. It's "ok", but it's an awkward middle ground between OpenCL 3.0 and still trying to be a graphics API.
              The OpenCL standard was successfully sabotaged by Nvidia, and Nvidia can only claim 3.0 support because the standard allows Nvidia, with only OpenCL 1.2, to be fully 3.0 compatible; this means OpenCL version numbers have become completely useless shit.
              OpenCL is the proof that this does not work.

              What you criticize ("[Vulkan] still trying to be a graphics API") is the only reason Vulkan compute could not be sabotaged by Nvidia: there is a very easy way to FORCE Nvidia to adopt any Vulkan extension you need for compute. You just create a popular game that uses those Vulkan extensions, and Nvidia is forced to adopt them.

              This is the only reason Nvidia was successful in sabotaging OpenCL in favor of CUDA but is not able to sabotage Vulkan compute.

              Originally posted by Developer12 View Post
              And besides, if you really want to do compute and somehow all you have is Vulkan, you can just run Rusticl on top of Zink.
              You really fail to understand anything... again: Vulkan compute is the only standard Nvidia cannot sabotage the way they sabotaged OpenCL. Every time Nvidia does not want to implement a Vulkan extension, the open-source compute people just make sure a new popular game uses that functionality; think of a game like Doom Eternal, and such a game then forces Nvidia to implement the extension.

              A historical example of this was FP64 support in OpenGL: companies like Nvidia did NOT implement FP64, yet they still claimed to be compatible with OpenGL 3.0... it took over 10 years until a popular game that depended on FP64 finally forced them to implement the feature.

              See, Nvidia did successfully sabotage OpenCL, yet they were not successful in sabotaging Vulkan or Vulkan compute. You can see the power of the games and tools that enforce Vulkan extensions in Valve's Proton, and Intel was hit by this very hard... Intel was FORCED, unwillingly, to implement every Vulkan extension Valve's Proton used, or else no games would work on Intel hardware...

              Intel was hit by this very hard; it took them years after the release of their Intel Arc GPUs to catch up on all the Vulkan extensions used on the Valve Steam Deck.

              See, Vulkan compute is the only compute standard Nvidia cannot sabotage: just create a AAA blockbuster game that uses the Vulkan extension you need, and Nvidia is forced to implement the feature.

              Originally posted by Developer12 View Post
              AMD are doing this because they're completely incapable of learning from their mistakes. Even now, they've just released an LLVM-based compiler that seems to target specific NPU architectures, producing binaries that have to deal with the exposed details of their VLIW architecture.
              Vulkan/DX12/Metal were created after AMD abolished VLIW, which means any Vulkan app expects the architecture not to be VLIW... It is plainly not possible to support these VLIW NPU architectures without vendor-flavored SPIR-V, because these VLIW NPU architectures violate essential principles of Vulkan.

              I recommend reading this bridgman quote you already know:

              Originally posted by bridgman View Post
              Depends on the environment... if you are in control of memory and cache residency then in-order can be very efficient, and most GPUs are still in-order today.
              For VLIW it's a function of how much you can control the workload. VLIW worked pretty well for graphics in our GPUs, and it was really the emerging use of compute both in graphics workloads and compute workloads that prompted us to move away from it.
              AI is arguably one of the most controlled workloads in the sense that the vast majority of processing happens in library code, and there is a trend towards treating AI processing as a streaming workload where caches are downplayed in favour of embedded memory or wide/burst fetches from main memory.
              For general purpose workloads it's probably still fair to say that in-order and VLIW have serious performance penalties.


              Vulkan/SPIR-V was created with the idea that VLIW is dead... now the NPUs are based on VLIW...

              Originally posted by Developer12 View Post
              That's not a recipe for portability. Unless they ship that compiler as part of their NPU driver, every application (PyTorch, etc.) will have to ship binaries compiled for each and every chipset it wants to support.
              Well, the compiler is open source and can be shipped as part of their NPU driver... so what is the problem?

              Originally posted by Developer12 View Post
              This is already the case with ROCm on AMD GPUs, and it's the whole reason that entire ecosystem sucks. It would suck considerably LESS if they had copied Nvidia's CUDA even harder and kept the concept of portable PTX bytecode that the driver translates just before execution. Being a clone of CUDA isn't what makes HIP bad; it's all the dumb bullshit AMD have done that diverges from CUDA, injecting a ton of AMD-specific difficulties that makes their own maintenance burden higher and developers' lives harder.
              Now here they are, injecting those same bad decisions into the portable stuff used by everyone else.
              "It would suck considerably LESS if they had copied Nvidia's CUDA even harder"

              By law, they can't... binary compatibility is against the law.

              "and kept the concept of portable PTX bytecode"

              They can't if Nvidia has a patent on it.

              "Being a clone of CUDA isn't what makes HIP bad; it's all the dumb bullshit AMD have done that diverges from CUDA"

              They are forced to do so by law... just remember, any Nvidia patent could stop the AMD product.

              "injecting a ton of AMD-specific difficulties that makes their own maintenance burden higher and developers' lives harder."

              My interpretation of this is that they are trying to bypass the Nvidia patents and copyright, and that is only possible if the implementation is uniquely different.

              "Now here they are, injecting those same bad decisions into the portable stuff used by everyone else."

              Everyone else has the same problem with the Nvidia patents...

              • #8
                Originally posted by qarium View Post

                The OpenCL standard was successfully sabotaged by Nvidia... […]
                Jesus. Where to even begin with this.

                OpenCL 3.0 was the fixup version. It cut all the crap from the previous versions and fixed the mistakes. That's why the versions in between don't matter for it.
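                For what it's worth, 3.0 does this by making the 2.x-era features individually optional and queryable. A minimal sketch (assuming an OpenCL 3.0 header and driver; CL_DEVICE_GENERIC_ADDRESS_SPACE_SUPPORT is one of the per-feature queries) checking one such feature:

                // cl3_features.cpp -- 3.0 made 2.x features optional + queryable.
                // Build sketch: g++ cl3_features.cpp -lOpenCL
                #define CL_TARGET_OPENCL_VERSION 300
                #include <CL/cl.h>
                #include <cstdio>

                int main() {
                    cl_platform_id plat; cl_device_id dev;
                    clGetPlatformIDs(1, &plat, nullptr);
                    clGetDeviceIDs(plat, CL_DEVICE_TYPE_GPU, 1, &dev, nullptr);
                    // A conformant 3.0 device may legitimately answer "no" here;
                    // that is exactly why the version number alone says little.
                    cl_bool generic_as = CL_FALSE;
                    clGetDeviceInfo(dev, CL_DEVICE_GENERIC_ADDRESS_SPACE_SUPPORT,
                                    sizeof(generic_as), &generic_as, nullptr);
                    std::printf("generic address space: %s\n",
                                generic_as ? "supported" : "not supported");
                    return 0;
                }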

                It's clear you don't have any concept of what "Vulkan compute" actually is. Or the difference between an architecture and a graphics API. Or what VLIW is. Or the current problems with ROCm. Or the difference between using your own portable bytecode and binary compatibility with a competitor.

                Or... anything, really. Did you take your meds this morning? Maybe get yourself checked for an aneurysm.
