Lisa Su Reaffirms Commitment To Improving AMD ROCm Support, Engaging The Community
Last edited by CochainComplex; 18 June 2023, 08:26 AM.
Originally posted by CochainComplex
That's great, but a lot of CUDA projects do start as small (hobby) projects on consumer cards. I have access to a pretty nice HPC cluster at a well-known research institute. Good for us. But do you know how many individual users are on it on a daily basis? Maybe 50?! Physicists analysing huge particle-accelerator data sets. So it's always pretty loaded despite the low user count. The outcome of this data is great, and yes, the papers will mention which HPC system was used; that will give AMD an image boost. But enough to win market share back from CUDA? I don't know. Sure, for this particular field. But how many IT professionals will it convince to use ROCm/HIP instead of CUDA? Most students/researchers are just going to buy a laptop with a consumer-grade NVIDIA card and start tinkering: Python -> cuBLAS + PyTorch + TensorFlow, there you go. Once they need more power, and/or once they are settled into a research position, they will upgrade to some NVIDIA Quadro system. That's how NVIDIA gets and keeps its CUDA market share.
I don't quite understand what the main problem with ROCm is. Is it a complicated install/deploy procedure, or is it being virtually non-functional?
The former is a semi-issue, I'd say. To this date AMD is more of a hardware/components company, not a complete-solution company. IMHO it's OK to provide only core tools/framework support instead of a complete solution if you price your hardware accordingly. It becomes a problem only when AMD tries to position itself as an equivalent, competitive premium brand but actually lacks competitiveness in software/features.
Originally posted by sophisticles
No they don't.
Their net income was a loss of $139 million on a GAAP basis ($970 million net income non-GAAP).
The problem is that corporations of this size have balance sheets that can be misleading, but if you download their P&L statement you will see that AMD had a net income of negative $139 million, and the P&L is what ultimately counts when judging the financial stability of a company.
On the processor side, it looks like a lot of us are staying on AM4 and riding it out. I know that, for me, the lowest end of AM5 is more than I'd care to pay, due to needing new RAM, a new motherboard, and possibly a new heat sink if I upgrade too far. If it were just the motherboard and CPU, I'd pull the trigger. Since I'm happy with my current performance, for the most part, I'd rather upgrade to a 5800X3D and get a Noctua D15... or, as of a few days ago, hold tight and see if the 5600X3D rumors are true.
In regards to the consoles, they're working on Sony's and Microsoft's schedules. There really isn't anything AMD can do about that other than offer kick-ass APUs in the interim, so that companies like Asus or Lenovo will want to pick them for their handheld gaming devices, which helps keep up AMD's "street cred" as they try to pick up Nintendo in addition to Sony, Microsoft, and Valve. Unless AMD and Nintendo are working on the Switch 2, I wouldn't expect the console division to pick up until the PS6 era.
Originally posted by drakonas777
I don't quite understand what the main problem with ROCm is. Is it a complicated install/deploy procedure, or is it being virtually non-functional?
The former is a semi-issue, I'd say. To this date AMD is more of a hardware/components company, not a complete-solution company. IMHO it's OK to provide only core tools/framework support instead of a complete solution if you price your hardware accordingly. It becomes a problem only when AMD tries to position itself as an equivalent, competitive premium brand but actually lacks competitiveness in software/features.
AMD, please wake up. The world needs real competition. Please invest in 30+ systems developers and stop the ROCm joke; it's an old one already.
Originally posted by X_m7
Aren't those the ones that AMD officially supports anyway (I see Ubuntu 20.04+22.04, SLES 15 SP4 and RHEL 8.6+8.7+9.1 on their website), so you can just get the stuff from them directly? Unless the problem is that you want to use ROCm but with whatever ancient kernel driver the distro has instead of the one AMD offers, which just sounds like the age old stable distro problem (you want "stable" stuff, you get old stuff...).
Also, afaik you can't even run RHEL or SLES without paying for a license first.
Originally posted by M@yeulC
ROCm was a pain to install. There is a PKGBUILD, but no stable releases, complex dependencies, and a lot of code to download. The code often didn't compile when building the package.
I eventually managed to install it for my R9 Fury (gfx803) some time ago, but there has been some breakage since, and IIRC compute kernels just segfault. I've seen a few repositories trying to fix this, but I threw in the towel before fixing it successfully.
How can I take AMD seriously? If I haven't been able to make their stack work on my consumer card after days of fiddling, I certainly won't recommend their stack at work. Consumer products are an entry point to the pro market. A lot of PyTorch etc. contributors only have access to consumer cards.
There is a serious lack of regression testing and of interest in consumer hardware on the ROCm team. They should look at what Espressif built with the community. SDKs and runtimes should be as easy to install as possible, and work on any consumer hardware (even if slowly).
As a matter of fact, this morning I ran OpenAI Whisper on my RX 6800 (with GPU support via ROCm) through Python and PyTorch.
OpenAI Whisper is provided by the AUR package 'whisper-git'.
The only two PKGBUILDs I had to create manually are not ROCm-related but Hugging Face ones (transformers and safetensors), and they are only needed if you plan to re-train (or fine-tune) the model (the AUR equivalents did not compile).
Below are the two PKGBUILDs.
Note that the 'depends' arrays may not be complete -- the PKGBUILDs are meant for my own system.
Code:
pkgname=huggingface-safetensors-git
_pkgname='safetensors'
_pkgver='git'
pkgver=git
pkgrel=1
pkgdesc="Safetensors is a new simple format for storing tensors safely (as opposed to pickle) and that is still fast (zero-copy)"
arch=('x86_64')
url="https://github.com/huggingface/safetensors"
license=('Apache License 2.0')
depends=(python-setuptools-rust)
provides=('huggingface-safetensors=git')
source=("${_pkgname}::git+https://github.com/huggingface/safetensors.git")
sha512sums=('SKIP')

build() {
  cd "${_pkgname}/bindings/python"
  python setup.py build
}

package() {
  cd "${_pkgname}/bindings/python/"
  python setup.py install --root="$pkgdir"
}
Code:
pkgname=huggingface-transformers-git
_pkgname='transformers'
_pkgver='git'
pkgver=git
pkgrel=1
pkgdesc="State-of-the-art Natural Language Processing for Jax, PyTorch and TensorFlow"
arch=('x86_64')
url="https://github.com/huggingface/transformers"
license=('Apache License 2.0')
depends=(python-huggingface-hub python-regex python-tokenizers huggingface-safetensors-git)
provides=('huggingface-transformers=git')
source=("${_pkgname}::git+https://github.com/huggingface/transformers.git")
sha512sums=('SKIP')

build() {
  cd "${_pkgname}"
  python setup.py build
}

package() {
  cd "${_pkgname}"
  python setup.py install --root="$pkgdir" --optimize=1
}
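Hand-written PKGBUILDs like these are easy to get subtly wrong (a misspelled field name fails silently until makepkg runs), so before building it can help to source the file in a subshell and confirm the core fields are set. A minimal sketch, using a stripped-down copy of the first PKGBUILD; the path and the reduced field set are illustrative only:

```shell
# Write a minimal PKGBUILD (illustrative; the real file has more fields).
cat > /tmp/PKGBUILD <<'EOF'
pkgname=huggingface-safetensors-git
pkgver=git
pkgrel=1
arch=('x86_64')
EOF

# Source it in a subshell so its variables don't leak into this shell,
# then print the fields makepkg would use to name the resulting package.
( . /tmp/PKGBUILD && echo "ok: ${pkgname}-${pkgver}-${pkgrel}" )
```

makepkg itself does far stricter validation; this only catches a missing or misnamed core field before a long build starts.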
Originally posted by drakonas777
I don't quite understand what the main problem with ROCm is. Is it a complicated install/deploy procedure, or is it being virtually non-functional?
The first issue is that, while individual components have documented build procedures, building the entire stack can still be challenging due to inter-component dependencies. This is not an issue for large customers using enterprise distros and our pre-built / QA'ed packages, but it has been a problem for users of consumer distros. Our goal for non-enterprise distros is to help get ROCm components included in distro packaging (as we do for graphics) rather than following the enterprise model with separately delivered drivers.
The second issue is that our development and testing initially focused on Instinct parts rather than consumer/workstation parts.
In the early years the MI* parts were much closer to consumer parts, so consumer and workstation cards were able to benefit directly from MI* testing, but Radeon VII was the last consumer part before the CDNA/RDNA divergence.
There has been a lot of work over the last couple of years to bring ROCm support to our RDNA-based workstation products, so we are getting back to the earlier state where consumer parts can leverage the development and testing done for commercial SKUs, and the results of that are already starting to be visible.
Last edited by bridgman; 18 June 2023, 03:08 PM.
Originally posted by wsippel
HIP has supported runtime compilation for quite some time now. If you check libMIOpen.so with roc-obj-ls, for example, it's a fat binary with no gfx1100 section, yet it still works on RDNA3 cards. It just takes a few seconds longer on the first run to compile everything.
Originally posted by CochainComplex
That's great, but a lot of CUDA projects do start as small (hobby) projects on consumer cards. I have access to a pretty nice HPC cluster at a well-known research institute. Good for us. But do you know how many individual users are on it on a daily basis? Maybe 50?! Physicists analysing huge particle-accelerator data sets. So it's always pretty loaded despite the low user count. The outcome of this data is great, and yes, the papers will mention which HPC system was used; that will give AMD an image boost. But enough to win market share back from CUDA? I don't know. Sure, for this particular field. But how many IT professionals will it convince to use ROCm/HIP instead of CUDA? Most students/researchers are just going to buy a laptop with a consumer-grade NVIDIA card and start tinkering: Python -> cuBLAS + PyTorch + TensorFlow, there you go. Once they need more power, and/or once they are settled into a research position, they will upgrade to some NVIDIA Quadro system. That's how NVIDIA gets and keeps its CUDA market share.
So the market is ripe for disruption, and if AMD is smart they will attack full force. Lots of these well-known institutes already have some MI210 cards here and there and are porting their stuff, and I expect the MI300 to sell very well, as it will bring the next step in perf/$ and perf/W. (Nobody cares about the absolute performance of single GPUs anymore.)