The State Of ROCm For HPC In Early 2021 With CUDA Porting Via HIP, Rewriting With OpenMP


  • #81
    Originally posted by bridgman View Post

    With respect, we made zero deal about Radeon VII's FP64 capability other than eventually adding an FP64 FLOPs number to the product page alongside FP16 and FP32.

    The press made a big deal about it which is fine, and which eventually resulted in us launching the PRO version of the card (with even faster FP64) but from our perspective Radeon VII was a gaming card.
    Yes, my mistake: it was really the press that called it an FP64 brute. Nevertheless, AMD wouldn't have only 1/2 gimped it, as opposed to the 1/8 gimped Vega64, if it didn't at least hope that some of them would make their way into scientific compute. And that's why I bought it.



    • #82
      Originally posted by hoohoo View Post
      Exactly. Same here. Google can get an AMD tech to hold their hand while learning to build ROCm, but us little people, us small wallet people, could AMD maintain some basic documentation - fah.
      I never expected to actually feel this way, but the kind of patronizing responses I've read from AMD here today, I don't think I'll be spending money on Radeon again. CUDA at least actually works out of the box. Everywhere. My pair of Radeon VII's can mine ethereum, that's about what AMD seems to think they are good for.
      Ethereum is 32-bit FP... your Radeon VII is good at 64-bit FP.

      You also have to understand that bridgman only talks about this today because right now they have the money to fix all these problems.
      He did not talk about it in 2016 because back then they did not have the money to fix them.

      "I don't think I'll be spending money on Radeon again."

      I think this is completely wrong logically, because by the next time you buy something AMD will have fixed all these problems.

      It only makes sense if you are doing this because you want revenge...

      But revenge was never good advice, logically.
      Phantom circuit Sequence Reducer Dyslexia



      • #83
        Originally posted by hoohoo View Post
        I would have been very suspicious if AMD had crowdfunded the development of its GPGPU compute software stack. Seems like a statement that the company has profound financial problems, and in 2016 that would have scared away anyone who bought into the August share issue.
        Right, that is why they did not do it. But today we have to agree that the crowdfunding route would have had a better outcome.
        That's the problem with our modern economic system: people prefer to hide problems. That's bad; people should address problems and try to solve them instead of hiding them in the dark.

        You, for example, would have had a much better experience with your AMD hardware if some crowdfunding had been successful. But yes, you cannot both hide your problems and at the same time run a crowdfunding campaign to address them.


        Originally posted by hoohoo View Post
        OTOH, AMD could have leveraged the open source community in the actual development or beta testing of the code. There has been growing desire for an alternative to CUDA, as nVidia has raised and raised and raised prices. But the hardware needed to run ROCm has progressively got more and more unobtainable for normal people or SMEs. Bridgman mentioned to me that AMD did make the Radeon VII Pro available - but it was three times the price of a Radeon VII and you had to deal with corp sales channels to get one. It seems that AMD still has a substantial chunk of its corp culture that thinks it is 1990 and mainframes and minicomputers and channel partner sales are the future when in fact nVidia demonstrated that if you sell it they will come, so sell it a lot, to everyone, every way they want to buy it.
        "Bridgman mentioned to me that AMD did make the Radeon VII Pro available - but it was three times the price of a Radeon VII"

        Claims like this are a complete hoax, because the Radeon VII has PCIe 3.0 while the PRO has PCIe 4.0 (better performance); the PRO also has ECC RAM (better stability) and higher 64-bit FP performance...

        For example, I bought 128GB of ECC RAM for my 1920X Threadripper system. Normal RAM without ECC costs about 400€ and the ECC RAM about 700€, so the idea that you get ECC for free is completely stupid.

        I do have a Vega64... if you can't find a Radeon VII and the PRO is too expensive for you, why not buy a used Vega64 for 230€?

        The Vega64 has 13 TFLOPS in FP32... and is compatible with ROCm...

        People talk about the RX580 and claim the RX5700 and 6800/6900 do not work... the logical step is then to buy a used Vega64, a Radeon VII, or a Radeon VII PRO...

        I had 6 Vega64s, bought at 667-740€ each. One broke after 6-8 months of Ethereum mining; of the remaining 5, I sold 3 for about 220€ each.

        If you think it is not worth buying a used one for 220€ for ROCm, then you clearly don't want it at all.

        Because per dollar the Vega64 is the fastest card you can get.


        • #84
          Originally posted by Qaridarium View Post

          Right, that is why they did not do it. But today we have to agree that the crowdfunding route would have had a better outcome.
          That's the problem with our modern economic system: people prefer to hide problems. That's bad; people should address problems and try to solve them instead of hiding them in the dark.

          You, for example, would have had a much better experience with your AMD hardware if some crowdfunding had been successful. But yes, you cannot both hide your problems and at the same time run a crowdfunding campaign to address them.




          "Bridgman mentioned to me that AMD did make the Radeon VII Pro available - but it was three times the price of a Radeon VII"

          Claims like this are a complete hoax, because the Radeon VII has PCIe 3.0 while the PRO has PCIe 4.0 (better performance); the PRO also has ECC RAM (better stability) and higher 64-bit FP performance...

          For example, I bought 128GB of ECC RAM for my 1920X Threadripper system. Normal RAM without ECC costs about 400€ and the ECC RAM about 700€, so the idea that you get ECC for free is completely stupid.

          I do have a Vega64... if you can't find a Radeon VII and the PRO is too expensive for you, why not buy a used Vega64 for 230€?

          The Vega64 has 13 TFLOPS in FP32... and is compatible with ROCm...

          People talk about the RX580 and claim the RX5700 and 6800/6900 do not work... the logical step is then to buy a used Vega64, a Radeon VII, or a Radeon VII PRO...

          I had 6 Vega64s, bought at 667-740€ each. One broke after 6-8 months of Ethereum mining; of the remaining 5, I sold 3 for about 220€ each.

          If you think it is not worth buying a used one for 220€ for ROCm, then you clearly don't want it at all.

          Because per dollar the Vega64 is the fastest card you can get.
          I bought Radeon VII cards to run Caffe and TensorFlow atop ROCm. I bought one on launch day and a second one about 8 months later. My project was using 32 bit models, and the VII was better 32 bit bang for the buck than nVidia's 2080 Ti, and much higher performing than the similarly priced 2080. Now the deep learning project is over and I have these two cards... I could sell them on eBay, but whenever I sell something there I always get a purchaser who starts whining that I should cut the price after the sale is complete, just a PITA. I gotta do something with them so why not mine Eth? They are AFAICT the best Eth mining cards in existence, 90+ Megahash/s at ~235W.

          Meanwhile, sometime around ROCm 3.5 the hipCaffe branch actually stopped working. You cannot even stand up a modest network like GoogLeNet on a Radeon VII using ROCm post 3.5 or so. It runs out of VRAM! I really have to shake my head at that.

          My point about the VII Pro was that Bridgman thinks it is an entry level compute card, but really the VII was the entry card. Compute is also about FP32, despite HPC focus on FP64. You can do a lot with 32 bit floats. AMD has never understood there is a big market of scientists, grad students, small software companies that are buying FLOPs on a budget - three Radeon VIIs instead of one Radeon VII Pro gets you more FLOPs and they don't care about ECC and enterprise features. This is what nVidia understood - AMD on the other hand just had to EOL the VII and go for the hard upsell to the VII Pro. AMD could have kept the VII around at $700 and grown its market instead...

          I have a Vega64 also. It is slower than a Radeon VII in deep learning work, and has half the VRAM.
          Last edited by hoohoo; 23 February 2021, 06:08 PM.



          • #85
            Originally posted by Qaridarium View Post

            Ethereum is 32-bit FP... your Radeon VII is good at 64-bit FP.

            You also have to understand that bridgman only talks about this today because right now they have the money to fix all these problems.
            He did not talk about it in 2016 because back then they did not have the money to fix them.

            "I don't think I'll be spending money on Radeon again."

            I think this is completely wrong logically, because by the next time you buy something AMD will have fixed all these problems.

            It only makes sense if you are doing this because you want revenge...

            But revenge was never good advice, logically.
            What I have heard here is Bridgman more or less saying, across several posts, that: "ROCm is really only targeted at big HPC; AMD never really talked about ROCm outside the HPC space; and "Oh, that Radeon VII, the one with 16 GB of cutting edge VRAM, with 1TB/s VRAM bandwidth, with second or third place FP32 speed and the fastest FP64 under a $5000 price tag at introduction and which we promoted in content creation as well as games at launch? -- well that's just a gaming card and we never said otherwise".

            Well, the first claim may be true, IDK; the second is false, AMD has talked ROCm well outside HPC meetings; the third just seems like a slap in the face. nVidia is a money grubbing company that behaves in a predictable manner and which says where its toolkits work, and its toolkits do work there. AMD has to my mind just demonstrated that all I can count on it to be is money grubbing.

            There is some anger in me as I write this, but TBH also reason. Why would I blow another couple grand on cards from a company that might tell me a year or two down the road "Oh, you know what? Those were not compute cards you bought."
            Last edited by hoohoo; 23 February 2021, 06:10 PM.



            • #86
              Originally posted by hoohoo View Post
              I bought Radeon VII cards to run Caffe and TensorFlow atop ROCm. I bought one on launch day and a second one about 8 months later. My project was using 32 bit models, and the VII was better 32 bit bang for the buck than nVidia's 2080 Ti, and much higher performing than the similarly priced 2080. Now the deep learning project is over and I have these two cards... I could sell them on eBay, but whenever I sell something there I always get a purchaser who starts whining that I should cut the price after the sale is complete, just a PITA. I gotta do something with them so why not mine Eth? They are AFAICT the best Eth mining cards in existence, 90+ Megahash/s at ~235W.
              Yes, why sell it on eBay? I only know people who are looking for a Radeon VII, and those who have one keep it.
              And yes, I mined ETH with my 6 Vega64s... my brother sold his Ethereum at 1600€ per ETH... he made a lot of cash with it.
              Yes, mining ETH sounds very good. ETH is technically even better than Bitcoin.
              But really, I only know people who want to buy a Radeon VII on eBay; I know zero people who want to sell one.

              All the people who buy a 3080 or 3090 to mine ETH spend a lot of money; the 3090 is at 2000€ right now here in Germany.

              Originally posted by hoohoo View Post
              Meanwhile, sometime around ROCm 3.5 the hipCaffe branch actually stopped working. You cannot even stand up a modest network like GoogLeNet on a Radeon VII using ROCm post 3.5 or so. It runs out of VRAM! I really have to shake my head at that.
              My point about the VII Pro was that Bridgman thinks it is an entry level compute card, but really the VII was the entry card. Compute is also about FP32, despite HPC focus on FP64. You can do a lot with 32 bit floats. AMD has never understood there is a big market of scientists, grad students, small software companies that are buying FLOPs on a budget - three Radeon VIIs instead of one Radeon VII Pro gets you more FLOPs and they don't care about ECC and enterprise features. This is what nVidia understood - AMD on the other hand just had to EOL the VII and go for the hard upsell to the VII Pro. AMD could have kept the VII around at $700 and grown its market instead...
              I have a Vega64 also. It is slower than a Radeon VII in deep learning work, and has half the VRAM.
              Out of VRAM with 16GB of VRAM?... sounds strange. But yes, that's why the modern cards in this field have 32GB of VRAM.

              "AMD could have kept the VII"

              Believe it or not, that's not an option for AMD, because at 700 dollars they do not make money on that card.

              But what they could make is something like this: a VII PRO without ECC RAM to drop the price. It would still be an upgrade over the normal VII because of PCIe 4.0 and higher FP64 performance.

              Also, the Radeon VII and the PRO have 3840 shaders, but the original die has 4096 shaders (they sell the full version to Apple).

              But right now the 6800/6900 is the new high end and I am sure Apple will go for that.

              This means it is possible to sell a 4096-shader version of the Radeon VII PRO, and yes, maybe without ECC RAM.

              And they could also build a 32GB version of the VII PRO... because Apple's version with 4096 shaders also has 32GB of VRAM.

              But what's not an option is to sell a non-profit product at 700 dollars... if you can instead sell a 32GB VRAM version with the full 4096 shader cores for about 1500€.



              • #87
                Originally posted by hoohoo View Post
                What I have heard here is Bridgman more or less saying, across several posts, that: "ROCm is really only targeted at big HPC;
                He was talking about the past, and they are hiring people right now to change this for the future. He said they are hiring people to bring ROCm to every market, meaning desktop, notebook, and more.

                Originally posted by hoohoo View Post
                AMD never really talked about ROCm outside the HPC space; and "Oh, that Radeon VII, the one with 16 GB of cutting edge VRAM, with 1TB/s VRAM bandwidth, with second or third place FP32 speed and the fastest FP64 under a $5000 price tag at introduction and which we promoted in content creation as well as games at launch? -- well that's just a gaming card and we never said otherwise".
                What you do not understand about the Radeon VII and Radeon VII PRO is this: they are cut-down products. The real Vega 20 chip is sold to Apple as a 4096-shader version with 32GB of VRAM... just look at apple.com, you can configure the Mac Pro with the Vega 20 with 4096 shader cores and 32GB of VRAM.

                Originally posted by hoohoo View Post
                Well, the first claim may be true, IDK; the second is false, AMD has talked ROCm well outside HPC meetings; the third just seems like a slap in the face. nVidia is a money grubbing company that behaves in a predictable manner and which says where its toolkits work, and its toolkits do work there. AMD has to my mind just demonstrated that all I can count on it to be is money grubbing.
                There is some anger in me as I write this, but TBH also reason. Why would I blow another couple grand on cards from a company that might tell me a year or two down the road "Oh, you know what? Those were not compute cards you bought."
                Yes, that's the emotional side... I am autistic; I do not have these feelings about technical products.

                AMD is hiring people right now to bring ROCm to every market, meaning gaming cards, desktops, laptops, and so on.

                In the present, meaning right now, your feelings are right, but in 1-2 years those feelings will just misguide you.


                • #88
                  Originally posted by hoohoo View Post
                  What I have heard here is Bridgman more or less saying, across several posts, that: "ROCm is really only targeted at big HPC; AMD never really talked about ROCm outside the HPC space; and "Oh, that Radeon VII, the one with 16 GB of cutting edge VRAM, with 1TB/s VRAM bandwidth, with second or third place FP32 speed and the fastest FP64 under a $5000 price tag at introduction and which we promoted in content creation as well as games at launch? -- well that's just a gaming card and we never said otherwise".

                  Well, the first claim may be true, IDK...
                  OK, I was just about to ask you what I said that you felt was condescending - this helps, thanks.

                  The first is not us saying "we don't care about you" just "we didn't have enough money at the time to make all the markets happy and we chose this subset. We made the stack available for other uses but yeah it's a bit lumpy".

                  Originally posted by hoohoo View Post
                  ... the second is false, AMD has talked ROCm well outside HPC meetings
                  I don't remember us ever talking about ROCm outside HPC (specifically the SC** conferences) and ML/datacenter space - if we did then I got that wrong and would appreciate you pointing me to whatever I missed so I know for the next time. I normally see everything that goes out about ROCm; it's possible I missed something.

                  Originally posted by hoohoo View Post
                  ... the third just seems like a slap in the face. nVidia is a money grubbing company that behaves in a predictable manner and which says where its toolkits work, and its toolkits do work there. AMD has to my mind just demonstrated that all I can count on it to be is money grubbing.

                  There is some anger in me as I write this, but TBH also reason. Why would I blow another couple grand on cards from a company that might tell me a year or two down the road "Oh, you know what? Those were not compute cards you bought."
                  OK, I don't think I understand. You are saying (correctly) that we talked about content creation, but AFAIK all the reviews and tests supported that. Large memory, good OpenCL performance etc...

                  I don't think I ever said Radeon VII is not a compute card - I just disagreed that we *promoted* it as a compute card with ROCm. This goes back to the second point - again, if I missed something there I apologize and please let me know. I normally have a pretty good handle on our public announcements because I need to be careful to stay within their limits but it's possible I missed something.

                  Can we step back a bit? You have a Radeon VII which is (as far as I know) quite well supported by the ROCm stack, albeit with the same build/install warts as other cards. Is your concern just that building from source has been too problematic, or are you unable to use the prebuilt binaries as well for some reason? Or is the issue that you are able to install the ROCm stack OK but that it is not working for you?

                  You mentioned that hipCaffe stopped working - maybe I have this wrong, but my understanding was that Caffe was replaced with Caffe2 in pretty much all applications, and that Caffe2 in turn was merged into PyTorch (which we do support). Is there something you are doing with Caffe that cannot be done with PyTorch, or am I misunderstanding something bigger here?
                  Last edited by bridgman; 23 February 2021, 08:22 PM.
                  Test signature



                  • #89
                    Originally posted by bridgman View Post
                    You mentioned that hipCaffe stopped working - maybe I have this wrong, but my understanding was that Caffe was replaced with Caffe2 in pretty much all applications, and that Caffe2 in turn was merged into PyTorch (which we do support). Is there something you are doing with Caffe that cannot be done with PyTorch, or am I misunderstanding something bigger here?
                    My understanding is that he just did something that ran out of VRAM. Yes, with 16GB of VRAM this is sad...

                    But why do all the HPC people buy the 32GB VRAM versions? Yes, for gaming 16GB of VRAM looks like a lot, but for compute it is very low.
                    Compare it to doing compute on my Threadripper: I bought 128GB of ECC DDR4 memory days ago because I ran out of RAM very quickly with the 32GB of RAM I have right now. For example, at the moment, using 7zip with my 24-thread CPU at the max settings needs 92GB of RAM... now imagine I had a 2950X with 16 cores: it would need 120GB...

                    This 7zip example is just a simple example, but as you can see, in compute 16GB of RAM is a joke if you need 128GB of RAM for a 16-core CPU...
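
The 120GB figure above is roughly what a straight-line extrapolation gives. A minimal sketch of that arithmetic, assuming memory use grows linearly with thread count (an assumption made here for illustration, not something stated in the post) and using a hypothetical helper name, `estimate_memory_gb`:

```python
def estimate_memory_gb(measured_gb: float, measured_threads: int, target_threads: int) -> float:
    """Scale a measured memory footprint to another thread count.

    Assumption: memory use is proportional to the number of threads.
    """
    if measured_threads <= 0 or target_threads <= 0:
        raise ValueError("thread counts must be positive")
    per_thread_gb = measured_gb / measured_threads
    return per_thread_gb * target_threads


# Figures from the post: 92 GB at 24 threads (1920X).
# A 16-core 2950X exposes 32 threads.
estimate = estimate_memory_gb(92, 24, 32)
print(f"{estimate:.1f} GB")
```

Under that assumption the estimate lands a little above 120GB, which is in line with the poster's number and with the decision to buy 128GB.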

                    And the Vega 20 has up to 4096 cores... no wonder the Apple version of Vega 20 has 32GB of VRAM...


                    • #90
                      Originally posted by hoohoo View Post
                      Hi Bridgman. I'm replying to you, not the other guys complaining because you might be able to effect some change at AMD.

                      The install doc for ROCm is flat out confusing. If you just start reading from the top, what you get from it is to install rocm-dkms. Later in the doc, in the third sentence within a paragraph that seems to be just some fluff, it says that if you use a post 4.18 kernel you should not install rocm-dkms. IIRC it does not actually say to install rocm-dev at that place in the doc; this info is given elsewhere. There is no real table of contents to the doc, and the doc's structure does not reflect the decision tree one should be using: what kernel have I got; what distro am I using.

                      Reasonable doc would have the decision flow right up front, built into a table of contents.

                      You seem to be a sort of liaison guy to the community? Maybe you could impress on the people managing ROCm development that spaghetti documentation is even worse than spaghetti code: it pisses off potential users and pushes them to the Other Company.
                      Yep, I have the same concern - the ROCm documentation has grown huge, and it seems like it has grown past the point of maintainability without chaining an architect or two to the documentation task. I'm strongly in the "documentation has to be small enough to be kept accurate and relevant" camp but that increasingly feels like a minority view (not just inside AMD).

                      I'll try to find out who is maintaining the docco these days and pass this along.
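
The decision flow described in the quoted post could be sketched roughly as below. This is only an illustration of the rule as recalled in the thread ("IIRC"), not the official ROCm guidance: the function names are hypothetical, and reading "post 4.18" as "strictly newer than 4.18" is an assumption. Check the current install guide before relying on it.

```python
from typing import Tuple


def parse_kernel_version(release: str) -> Tuple[int, int]:
    """Extract (major, minor) from a release string such as '5.4.0-65-generic'."""
    parts = release.split(".")
    if len(parts) < 2:
        raise ValueError(f"unexpected kernel release string: {release!r}")
    major = int(parts[0])
    # The minor field may carry a suffix (e.g. '18-rc1'); keep only leading digits.
    minor_digits = ""
    for ch in parts[1]:
        if not ch.isdigit():
            break
        minor_digits += ch
    return major, int(minor_digits)


def choose_rocm_package(release: str) -> str:
    """Return the package name suggested by the rule as recalled in the thread."""
    # Assumption: "post 4.18" means strictly newer than 4.18.
    if parse_kernel_version(release) > (4, 18):
        return "rocm-dev"
    return "rocm-dkms"


for example in ("4.15.0-112-generic", "5.4.0-65-generic"):
    print(example, "->", choose_rocm_package(example))
```

On a Linux machine the release string for the running kernel can be obtained from `platform.release()` in the Python standard library. A real decision tree would also branch on the distro, which the post mentions but does not spell out.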