AMD's Latest ROCm Effort: More Blogging With A New Blog Platform


  • #11
    They absolutely still need to groom the full software ecosystem. If I cannot install a ROCm-enabled version of PyTorch from conda-forge, I won't be able to use it and won't switch away from CUDA or Nvidia GPUs. Supporting only pip or something like a Fedora base install is not enough. It's very low effort for AMD to support this.

    If they want people to flock to their MI300s for ML on Azure soon, they should make the ecosystem easier to handle so devs can switch easily. For our company only price/performance matters, as long as there is no migration barrier.


    • #12
      Poor support here is such a shame. My previous GPU was AMD, bought specifically because of their strong open-source Linux support. When I upgraded recently I wanted to go with AMD again, but the software I want to run (mostly photogrammetry and ML) is only available on Nvidia cards, so I bought one of those. I really hope the picture changes by the time I next upgrade.


      • #13
        Originally posted by UpSideDown
        Poor support here is such a shame. My previous GPU was AMD, bought specifically because of their strong open-source Linux support. When I upgraded recently I wanted to go with AMD again, but the software I want to run (mostly photogrammetry and ML) is only available on Nvidia cards, so I bought one of those. I really hope the picture changes by the time I next upgrade.
        Ditto. I buy AMD because of their superb Linux gaming support, but it's a damn shame that AMD isn't that great a choice when you have a consumer-grade GPU and want to do anything more than just play games with the default settings. That's especially true if you don't use one of the Enterprise distributions that they support. Even for gaming, compared to Windows, their Linux support could be better.

        Their competition, OTOH, has tools that work on any x86_64 distribution that meets their minimum dependencies. I'd be lying if I said I hadn't considered getting an external 4060 because there's a lot of software I don't have access to.


        • #14
          This reminds me a lot of Intel's GPU driver progress - basically bragging about updates containing features/improvements that were expected to work on day 0.


          • #15
            To be honest, and I've been a huge AMD compute critic in the past, it does seem that they're starting to get it. The blog is a bit of a joke, but there are other initiatives on show. They're at least trying to do something to catch up, even if it's very late and the horse has largely bolted.
            Now if they really wanted to make a bit of noise and rebuild a modicum of (long-lost) trust, they would release a consumer card, costing under 2 grand, and with 48-64GB of VRAM, that's fully ROCm capable. That would wake Nvidia up quickly because, as much as that company has been great showing out CUDA, it's still de facto segmenting the market with its miserly VRAM offerings on the RTX cards. There's definitely potential for a "4-core-for-past-10-years-meets-Ryzen" moment here.
            Last edited by vegabook; 21 February 2024, 06:08 PM.


            • #16
              Originally posted by vegabook
              they would release a consumer card, costing under 2 grand, and with 48-64GB of VRAM, that's fully ROCm capable
              Instead they ended support for gfx906, which WAS fully ROCm capable (MI50, Radeon Pro VII, and Radeon VII), forgot that gfx1030 (RX6800/6900) was also fully ROCm capable, and reinvented the wheel by making ROCm 6, which is reportedly not compatible with version 5. And these were not particularly cheap cards.

              And they shocked the whole industry by making as many as three of the most expensive latest-generation cards ROCm capable (at least until the new generation arrives). nVidia is probably shaking with fear.

              As a $999 gfx1030 user and former RX570 user (Polaris was dropped in ROCm 4.5), I'm definitely waiting for an "open-source-radeon-for-past-7-years-meets-Battlemage-and-Celestial" moment here.
              Last edited by sobrus; 21 February 2024, 06:52 PM.


              • #17
                Originally posted by sobrus

                Instead they ended support for gfx906, which WAS fully ROCm capable (MI50, Radeon Pro VII, and Radeon VII), forgot that gfx1030 (RX6800/6900) was also fully ROCm capable, and reinvented the wheel by making ROCm 6, which is reportedly not compatible with version 5. And these were not particularly cheap cards.

                And they shocked the whole industry by making as many as three of the most expensive latest-generation cards ROCm capable (at least until the new generation arrives). nVidia is probably shaking with fear.

                As a $999 gfx1030 user and former RX570 user (Polaris was dropped in ROCm 4.5), I'm definitely waiting for an "open-source-radeon-for-past-7-years-meets-Battlemage-and-Celestial" moment here.
                Looks like it's been a shambles for years, and there are still signs, as you point out, that it's still a mess. I was an enormous critic on these forums after being led down the garden path with Radeon VII when ROCm was an utter mess on Linux. Indeed I (and others) used to spar with poor old Bridgman non-stop, and I've noticed that he's no longer in these forums.

                What's clear is that there is still infighting at AMD, as there always was, about exactly what strategy to adopt with compute. I can imagine that within AMD, those who always wanted to go down the compute route (including Koduri) but were stymied by the beancounters/segmentation nuts, and who now see themselves finally vindicated, are just as bitter as you or I, and with higher stakes.

                And it can't be fun for those who were right about compute all along, to see the GPTs come out and NVDA become a trillion dollar company.

                Ultimately what's wrong at AMD is not technical, it's institutional. It's just a dysfunctional political mess, if I were to guess, and probably looks like these forums x10.

                But I maintain that it seems there are signs that they may, maybe, perhaps, be on some track now.

                And I'll add one more thing. The dumpster fire that AMD's compute strategy has been actually has a silver lining, because it is Exhibit A of What Not To Do, and other firms (including Intel) have clearly watched and learned. Good for us in the loooong run, I suppose.
                Last edited by vegabook; 21 February 2024, 07:08 PM.


                • #18
                  Originally posted by sobrus

                  Instead they ended support for gfx906, which WAS fully ROCm capable (MI50, Radeon Pro VII, and Radeon VII), forgot that gfx1030 (RX6800/6900) was also fully ROCm capable, and reinvented the wheel by making ROCm 6, which is reportedly not compatible with version 5. And these were not particularly cheap cards.

                  And they shocked the whole industry by making as many as three of the most expensive latest-generation cards ROCm capable (at least until the new generation arrives). nVidia is probably shaking with fear.

                  As a $999 gfx1030 user and former RX570 user (Polaris was dropped in ROCm 4.5), I'm definitely waiting for an "open-source-radeon-for-past-7-years-meets-Battlemage-and-Celestial" moment here.
                  ROCm 6.0 is fully compatible with 5.x besides some minor edge cases, so no idea where you get that from.

                  In no way did they reinvent the wheel; gfx1030 is still fully enabled and works perfectly fine on rx6800. It's true that they never marked the consumer parts as supported in the documentation, but the ROCm libs and the runtime don't care about pro vs consumer cards one bit, so this doesn't really matter much. Also, while they announced that gfx906 support will end, for now it's still supported in ROCm 6.
                  What the transition to unsupported means is unclear, however, which is a real problem. When they dropped gfx900 support at 4.5, nothing really changed except that they removed gfx900 nodes from their CI for automated testing. gfx900 still works absolutely fine to this day in ROCm 6.0 (I have several MI25s), and the automated tests on this arch are now run by the Debian team. So far the only architecture ever to have really been dropped from ROCm is gfx803 (Fiji, Polaris), and you can still get that working in ROCm 6.0 with some patches (mainly, some Tensile-selected kernels in rocBLAS are broken).
                  Last edited by DiamondAngle; 21 February 2024, 08:06 PM.


                  • #19
                    Damn, the ngreedia white knights are out in force in this one. 😂


                    • #20
                      Originally posted by vegabook
                      Looks like it's been a shambles for years
                      These are my thoughts too. I know AMD has good hardware, and their general support for open source is excellent. But there's no point denying that something is very wrong with ROCm, and it's not that they can't do it, rather they don't want to do it.

                      Originally posted by DiamondAngle
                      ROCm 6.0 is fully compatible with 5.x besides some minor edge cases, so no idea where you get that from.
                      I got it from Ollama's GitHub, when (at least at that time) it was only running with ROCm 5. So it's not fully backwards compatible, and older software might need to be fixed. For some reason, AMD seems to be maintaining both branches separately; ROCm 5.7.1 is still there. That would make no sense if 6.0 was fully compatible.

                      Originally posted by DiamondAngle
                      gfx1030 is still fully enabled and works perfectly fine on rx6800
                      I know it is, but let's face it: it's not supported. YMMV. Radeon VII, their top-tier card from 2019, was fully officially supported and hasn't been since 5.7.1. It may work, but doesn't have to. I understand that this kind of "support" may be enough for some, but for me it's like running an expensive hackintosh. Especially since AMD says it is committed to bringing ROCm to more consumer cards. Which cards? Ones yet to be released? So why drop, or not officially enable, something that already works?

                      Also, according to AMD themselves, no bugfixes, new features, or any performance optimizations will be backported to gfx906:
                      Release notes for AMD ROCm™ 6.0 ROCm 6.0 is a major release with new performance optimizations, expanded frameworks and library support, and improved developer experience. This includes initial ena...

                      Hopefully this is not true for RDNA2, as it uses the same GPU as the still-supported Radeon Pro.

                      Originally posted by NeoMorpheus
                      Damn, the ngreedia white knights are out in force in this one. 😂
                      ngreedia white knights? Lol. I've been using AMD since the Am386DX/40. I'm currently running an RX6800XT and an R5950x, and I paid over $2000 for these parts alone back in 2021.
                      Just wanted to say that my old 2014 $100 GTX750Ti still has better GPGPU support than my current rig, so I'm an Nvidia fanboy for sure.

                      AMD always saying that only their then-latest generation is supported, or only the Pro line is supported, or creating a different architecture for GPGPU so that consumer cards are crippled (and a PITA to enable), is not greedy at all. You can run CUDA on any Nvidia card, even an integrated entry-level one from 10 years ago.
                      They even still say that the 2006 GeForce 8 is compatible with CUDA, even though these are long gone.
                      Last edited by sobrus; 22 February 2024, 05:16 AM.
