AMD's Latest ROCm Effort: More Blogging With A New Blog Platform

  • #21
    The problem I saw when I messed with ROCm in the past (meaning not just used it; I was having a go at compiling in support for the Picasso GPU on my notebook) was that the instructions I found involved compiling 43 different items, and you had to list which GPUs you intended to support all the way from the bottom to the top of the stack. Even when building a ROCm-supporting version of TensorFlow, the build involved setting an environment variable with the list of GPUs it would support. There was not a clean separation like with CUDA, where the client runs some CUDA lib that emits PTX bytecode and the Nvidia driver recompiles that PTX for the particular card it's running on (to the extent that you can usually run a driver supporting a newer CUDA version against binaries built for an older CUDA version and have everything work). It seemed that with ROCm there were code paths linked in all the way up the stack to generate card-specific bytecode for every card the software was intended to support.

    I did get support for Picasso built, and it "worked", but only sort of: the compute itself was correct, but the card would not simultaneously compute AND act as a GPU, so the GUI (including the mouse cursor) would totally lock up while a job was running, and if I ran a job that took more than about 5 or 10 seconds the video driver would decide the GPU had hung and reset it.
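    For what it's worth, the per-architecture target lists looked roughly like this. This is a sketch from memory, not the exact build recipe: the variable names below are the commonly documented ones, the gfx names are examples, and you'd confirm what your own card reports with `rocminfo`.

```shell
# Illustrative only: ROCm components are compiled ahead-of-time for an
# explicit list of GPU ISAs, unlike CUDA's PTX + driver-JIT model.
# Check `rocminfo` for the gfx name your card actually reports.
export AMDGPU_TARGETS="gfx803;gfx900;gfx906"   # CMake target list for the ROCm math libraries
export PYTORCH_ROCM_ARCH="gfx900;gfx906"       # arch list consumed by a PyTorch-on-ROCm build
```

    Any card whose ISA is missing from those lists simply has no kernels in the resulting binaries, which is why the whole stack has to be rebuilt to add support for one more GPU.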



    • #22
      Originally posted by vegabook View Post

      And it can't be fun for those who were right about compute all along, to see the GPTs come out and NVDA become a trillion dollar company.
      First, that is a bubble. A.I. is a bubble, that much is clear: (a) you will not need this much power to train the A.I. forever, and (b) among the big general-purpose A.I. players there will be losers who drop out of the market. Still, it's a very profitable bubble in the short term, no question.

      Second, I am not sure what this discussion about consumer GPUs has to do with Nvidia's big revenue increases. As far as I know, they did not make that much money by selling millions of RTX 4090s or 4070s, or through their big 3060 sales; it's about selling the GPU that costs $20,000. That is not a consumer card.

      So if that is their big money grab, AMD must compete not primarily with software (which of course has to work on their top card too), but by bringing out a card that is either a lot faster, or similarly fast and cheaper.

      Also, apparently, if AMD came out tomorrow and changed nothing in the drivers but just said "we officially support a few cards again now", a lot of people here would (for the most part) be happy. That is strange to me: if it works, it works, especially if we are talking about obsolete cards like the Radeon VII.

      And compared to the past, AMD seems, to my knowledge, to have caught up a lot. It used to be OpenCL vs. CUDA, in other words an ever stronger CUDA monopoly. Now you know that everything that matters, PyTorch for example, is and will be supported by ROCm.

      And because every CPU will become an A.I. processor, I don't even understand how universal support could be optional in the future. Maybe special A.I. chips will have different software, and ROCm will only be for general-purpose accelerators?



      • #23
        Originally posted by blackiwid View Post

        First, that is a bubble. A.I. is a bubble, that much is clear: (a) you will not need this much power to train the A.I. forever, and (b) among the big general-purpose A.I. players there will be losers who drop out of the market. Still, it's a very profitable bubble in the short term, no question.

        Second, I am not sure what this discussion about consumer GPUs has to do with Nvidia's big revenue increases. As far as I know, they did not make that much money by selling millions of RTX 4090s or 4070s, or through their big 3060 sales; it's about selling the GPU that costs $20,000. That is not a consumer card.

        So if that is their big money grab, AMD must compete not primarily with software (which of course has to work on their top card too), but by bringing out a card that is either a lot faster, or similarly fast and cheaper.

        Also, apparently, if AMD came out tomorrow and changed nothing in the drivers but just said "we officially support a few cards again now", a lot of people here would (for the most part) be happy. That is strange to me: if it works, it works, especially if we are talking about obsolete cards like the Radeon VII.

        And compared to the past, AMD seems, to my knowledge, to have caught up a lot. It used to be OpenCL vs. CUDA, in other words an ever stronger CUDA monopoly. Now you know that everything that matters, PyTorch for example, is and will be supported by ROCm.

        And because every CPU will become an A.I. processor, I don't even understand how universal support could be optional in the future. Maybe special A.I. chips will have different software, and ROCm will only be for general-purpose accelerators?
        It's not so much about whether it works - it's their bad documentation, and I guess they think they'll solve that with a blog?

        There's also a problem with setting it up - lots of ppl have posted on forums - if not this one, then others - about ROCm/Linux/AMD GPUs - and ppl get frustrated and either rant and quit, or they rant, quit, sell their card and switch to an Nvidia card to do AI/ML stuff.

        I rarely read about ppl having problems configuring their Nvidia GPU for AI/ML use - so that suggests to me the biggest problem isn't 'official support'. If AMD doesn't want to 'officially support' a lot of GPUs, that's one thing, but it's probably not the biggest problem, since ppl DO REPORT their GPUs working with ROCm/AI while a large volume of ppl are unable to get everything working. Imho, that is where the focus should be - the 'WHY', and solving why ppl are having trouble.

        Go ahead and research that if you doubt it - but I assure you, ppl have issues for whatever reason, so there must not be an easy, intuitive way to configure/use their cards with ROCm - be it in ML/AI etc.

        There are programs in other fields - often also using ROCm - with reports of issues too - so there's a pattern there. And beyond basic use, there are also performance complaints. It's why I haven't picked an AMD GPU yet.



        • #24
          Originally posted by Panix View Post
          There are programs in other fields - often also using ROCm - with reports of issues too - so there's a pattern there. And beyond basic use, there are also performance complaints. It's why I haven't picked an AMD GPU yet.
          Well, that's your anecdote. People talk to this day about how bad the AMD drivers are, i.e. full of bugs, in normal games and such on Windows. That was probably true 10-15 years ago, maybe 7 years ago, maybe 5, but today it's definitely untrue. Some people have problems with Nvidia just as with AMD; if it happens with Nvidia nobody thinks much about it, but if the same thing happens with AMD they say "see, my prejudice was correct" and remember it more strongly.

          Can I say for sure that it's the same with ROCm? No, but it could be either way.

          And saying that people have problems setting up a work tool, one they earn money with, seems like a minor problem to me. You could make the same argument against Linux, for example...

          What I know for sure is that people exaggerate everything that is bad about AMD GPUs and ignore all the problems with Nvidia, except maybe here, where Nvidia's horrible proprietary shit gets called out from time to time.

          Maybe graphics cards are not expensive enough. People think €2,000 graphics cards without big problems are a problem in themselves, too expensive. Maybe that changes when Nvidia's card costs $15,000 and AMD's costs $9,000 at 20% less speed; then people won't whine about having to spend an extra half hour setting the thing up.

          Also, I think AMD users are aware that not everything is always rosy, but Nvidia users have to justify paying the higher price, so they don't want to feel like dumb users and therefore don't talk about their problems. Similar to Apple: a horrible company with partially horrible problems in their products, but do you often hear large parts of their customers complain? No. I think, like Apple, Nvidia has already become a cult. That's why almost nobody complained about their horrible interface, and not only on Windows, or that you had to log into their spy-cloud stuff to do some things... if nobody thinks those are horrific anti-features, then I don't know what would be.

          The difference is that AMD might slack here and there, but Nvidia intentionally does evil stuff to you, and it seems people are comfortable with that because they know they are in "capable" hands, like women who feel safer with an abusive boyfriend because they know he can be even more abusive to an attacker; in extreme cases they even seek out murderers in prison.

          Sorry, a bit of a rant, but the point stands: at least the problems seem to be overblown nitpicking, maybe even the American culture of "convenience over everything", if not just prejudice and cherry-picked memory. Even some Linux users, when angry about some Linux thing, romanticize that in Windows everything worked perfectly out of the box, which of course is also not true.

          Also, remember the Starfield thing. Nvidia has done similar stuff with games a hundred times, and there was no real evidence that AMD forbade implementing DLSS, yet everybody was sure and hated on AMD for it. That's an example of the bias people seem to have against AMD.



          • #25
            Maybe it's also that we simply expect more from the underdog, so we measure unfairly. If two companies sell the same product, we buy from the market leader. So if Nvidia pays off some game companies to exclusively implement DLSS, or at least only the newest version of DLSS and not FSR, we say "well, we know Nvidia is evil, no surprise here" - but if the underdog dares to do the same... unbelievable. Even if they do only 10% of the bad stuff the market leader does, it's a huge scandal...



            • #26
              Originally posted by blackiwid View Post

              Well, that's your anecdote. People talk to this day about how bad the AMD drivers are, i.e. full of bugs, in normal games and such on Windows. That was probably true 10-15 years ago, maybe 7 years ago, maybe 5, but today it's definitely untrue. Some people have problems with Nvidia just as with AMD; if it happens with Nvidia nobody thinks much about it, but if the same thing happens with AMD they say "see, my prejudice was correct" and remember it more strongly.

              Can I say for sure that it's the same with ROCm? No, but it could be either way.

              And saying that people have problems setting up a work tool, one they earn money with, seems like a minor problem to me. You could make the same argument against Linux, for example...

              What I know for sure is that people exaggerate everything that is bad about AMD GPUs and ignore all the problems with Nvidia, except maybe here, where Nvidia's horrible proprietary shit gets called out from time to time.

              Maybe graphics cards are not expensive enough. People think €2,000 graphics cards without big problems are a problem in themselves, too expensive. Maybe that changes when Nvidia's card costs $15,000 and AMD's costs $9,000 at 20% less speed; then people won't whine about having to spend an extra half hour setting the thing up.

              Also, I think AMD users are aware that not everything is always rosy, but Nvidia users have to justify paying the higher price, so they don't want to feel like dumb users and therefore don't talk about their problems. Similar to Apple: a horrible company with partially horrible problems in their products, but do you often hear large parts of their customers complain? No. I think, like Apple, Nvidia has already become a cult. That's why almost nobody complained about their horrible interface, and not only on Windows, or that you had to log into their spy-cloud stuff to do some things... if nobody thinks those are horrific anti-features, then I don't know what would be.

              The difference is that AMD might slack here and there, but Nvidia intentionally does evil stuff to you, and it seems people are comfortable with that because they know they are in "capable" hands, like women who feel safer with an abusive boyfriend because they know he can be even more abusive to an attacker; in extreme cases they even seek out murderers in prison.

              Sorry, a bit of a rant, but the point stands: at least the problems seem to be overblown nitpicking, maybe even the American culture of "convenience over everything", if not just prejudice and cherry-picked memory. Even some Linux users, when angry about some Linux thing, romanticize that in Windows everything worked perfectly out of the box, which of course is also not true.

              Also, remember the Starfield thing. Nvidia has done similar stuff with games a hundred times, and there was no real evidence that AMD forbade implementing DLSS, yet everybody was sure and hated on AMD for it. That's an example of the bias people seem to have against AMD.
              I'm planning to upgrade my GPU to further learn about ML/DL. I'm planning to buy an Nvidia RTX 3080, but it only has 10GB of VRAM, which according to the videos/information I saw isn't really enough for ML/DL. However, AMD has the RX 6800 XT with 16GB of VRAM, and also they're currently dev...

              Read the reply - one guy explains it in depth - the other one is just a cheerleader?
              There are other discussions about it... take your pick... here's one, e.g.:


              There are just too many ppl who yell 'go with Nvidia', and many of those ppl HAVE used AMD GPUs with ROCm and related software.

              I don't care about the gaming-related discussions - I believe either 'side' is good for gaming, and on Linux AMD is sufficient, although I do think it sucks that you lack options - since I have a TV - but that's my own personal 'dilemma' - others who use monitors won't care - and having the open source option in the Linux ecosystem is convenient and has obvious advantages.

              My concern or beef is with the ROCm system in general and the associated software in productivity - in this case ML/AI etc. Many companies want to get away from Nvidia's proprietary system even in this sphere, but AMD is not making it easy for them.



              • #27
                Originally posted by Panix View Post
                My concern or beef is with the ROCm system in general and the associated software in productivity - in this case ML/AI etc. Many companies want to get away from Nvidia's proprietary system even in this sphere, but AMD is not making it easy for them.
                I am not sure about the first link; when I skim it, they all seem pretty positive about AMD and ROCm. And the second is two years old and pretty much points to this article as its source for why you should use Nvidia. The question is when that article was really written, because the permalink suggests:
                Here, I provide an in-depth analysis of GPUs for deep learning/machine learning and explain what is the best GPU for your use-case and budget.


                It suggests 2020, but the article itself states January 2023 - probably some (maybe minor) update. Yet even if we assume it was written, or still valid, at the start of 2023, its prediction has already come due:
                Not in the next 1-2 years.
                That is today, or in a few months depending on which end it lands on - and again, the article is probably from the end of 2020.

                So it's not really strong, even anecdotal, evidence that these problems are big today. Nvidia has lots of money; as they already said, they are a software company - or what is their new marketing gag, "we are an A.I. company"? For now Nvidia is the new Apple, but that does not mean they can crush the competition: Android dominates the smartphone market, and the PC market is bigger than Apple's walled-garden computers. So yes, a stronghold in the premium market - and, like Apple, soon probably selling almost exclusively directly to the customer after ripping off their OEMs and throwing them all under the bus - earns good money, but there will always be resistance to the absurdly high prices they ask.
                And AMD has done it once before: they crushed Intel, David defeating Goliath. They could not crush Intel and Nvidia at the same time over the last 10 years, but they seem to play the long game and get things right, while Nvidia always goes for the flashy, shiny marketing win with shortcuts - like G-Sync, something a developer would call a proof-of-concept prototype, expecting people to buy strange, expensive monitors because they could not do it with software and cheaper hardware.

                Doing it right will always take longer than taking some ugly, expensive shortcut. Also, making something run on all GPUs and all platforms takes longer than doing it for one generation of your newest products. People who are easily persuaded by a flashy car salesman will buy Nvidia; it is what it is.



                • #28
                  Originally posted by sobrus View Post
                  I like Intel's approach on this matter. First-generation hardware that can barely run any game, and BOOM: OpenCL, PyTorch, TensorFlow, OpenVINO, oneAPI, etc., and even Blender using the hardware ray-tracing cores. They didn't even need to say a word.

                  Meanwhile, AMD is silently removing any information that older cards were ever officially supported by ROCm 5.x, and creating a blog
                  *nod* I suspect, next time I buy a new GPU in 2035, I'll be going for an AMD CPU and an Intel GPU.

                  (Seriously about the 2035 part. I jumped from an Athlon II X2 270 and a GTX 750 to a Ryzen 5 7600 and an RTX 3060, and only because the GTX 750 couldn't do CUDA well enough, and putting the RTX 3060 into the Athlon II box was starting to bump up against how difficult it is to retrofit a conda environment with a non-AVX build of TensorFlow when upstream doesn't provide one.)
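                  (For anyone wondering whether they're in the same boat: the stock TensorFlow wheels are compiled with AVX, which pre-AVX chips like that Athlon II lack. A quick stdlib-only check like the sketch below - my own helper, nothing official - tells you whether your CPU advertises the flag at all.)

```python
def cpu_has_flag(flag: str, cpuinfo_text: str) -> bool:
    """Report whether a CPU feature flag appears in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        # the flag list lives on lines like "flags\t\t: fpu vme ... avx ..."
        if line.split(":")[0].strip() in ("flags", "Features"):
            return flag in line.split(":", 1)[1].split()
    return False

def this_machine_has(flag: str) -> bool:
    """Check the running machine (Linux only)."""
    try:
        with open("/proc/cpuinfo") as f:
            return cpu_has_flag(flag, f.read())
    except OSError:
        return False  # not Linux; can't tell this way
```

                  If `this_machine_has("avx")` is False, the official wheels will crash with an illegal-instruction error and you're stuck hunting for a non-AVX build.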
                  Last edited by ssokolow; 26 February 2024, 05:00 PM.



                  • #29
                    Originally posted by hwertz View Post
                    The problem I saw when I messed with ROCm in the past (meaning not just used it; I was having a go at compiling in support for the Picasso GPU on my notebook) was that the instructions I found involved compiling 43 different items, and you had to list which GPUs you intended to support all the way from the bottom to the top of the stack. Even when building a ROCm-supporting version of TensorFlow, the build involved setting an environment variable with the list of GPUs it would support. There was not a clean separation like with CUDA, where the client runs some CUDA lib that emits PTX bytecode and the Nvidia driver recompiles that PTX for the particular card it's running on (to the extent that you can usually run a driver supporting a newer CUDA version against binaries built for an older CUDA version and have everything work). It seemed that with ROCm there were code paths linked in all the way up the stack to generate card-specific bytecode for every card the software was intended to support.

                    I did get support for Picasso built, and it "worked", but only sort of: the compute itself was correct, but the card would not simultaneously compute AND act as a GPU, so the GUI (including the mouse cursor) would totally lock up while a job was running, and if I ran a job that took more than about 5 or 10 seconds the video driver would decide the GPU had hung and reset it.
                    *nod* This isn't 2007 and I'm not on Gentoo anymore. I want to just run the install script for a GPU compute thing in Firejail and have it work.
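                    Something like this, say (a sketch of the workflow I mean; `install_rocm.sh` is a stand-in name for whatever vendor installer you're sandboxing, not a real AMD script):

```shell
# Run an untrusted vendor install script in a throwaway Firejail sandbox:
# --private gives it a fresh home directory that is discarded on exit,
# and --net=none keeps it offline once everything it needs is downloaded.
firejail --private --net=none bash ./install_rocm.sh
```

                    That's the bar: one sandboxed command, no hand-maintained arch lists, no rebuilding half the stack.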



                    • #30
                      Originally posted by Beach View Post
                      We need less talking and "blogging" and more hardware support.
                      It's that simple, AMD.
                      I totally agree with you. Talking is the least of what they should be doing.

