AMD Stages Latest Radeon/AMDGPU Changes For Linux 4.21 Kernel
-
Originally posted by perpetually high View Post
Ahh, it was the clocks!
Etherman and aufkrawall, you guys called it. Thank you for the suggestion to lower the clocks.
I ended up using rocm-smi to manually set sclk to level 5: $ rocm-smi --setsclk 5
For reference on my card:
Code:
GPU[0] : Supported GPU clock frequencies on GPU0
GPU[0] : 0: 300Mhz
GPU[0] : 1: 608Mhz
GPU[0] : 2: 910Mhz
GPU[0] : 3: 1077Mhz
GPU[0] : 4: 1145Mhz
GPU[0] : 5: 1191Mhz *
GPU[0] : 6: 1236Mhz
GPU[0] : 7: 1303Mhz
Annoyed I didn't think to try this sooner. Thank you guys again. This explains why it's affecting certain cards and not others. For the record, I have very good system cooling and airflow, and a 700W PSU. I know my system can handle the RX 480 at full load.
raonlinux, would be great if you could test out this theory, too. Let me know if you need any help with rocm-smi or getting the card to downclock.
Btw, if you have a dual BIOS with different clocks, switching to the other BIOS might also be a solution if you don't want to change clocks with scripts or manually.
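To double-check which sclk level is actually active after a command like `rocm-smi --setsclk 5`, the level listing can be parsed for the line carrying the `*` marker. This is only a rough sketch: the function name `active_sclk_level` is made up for illustration, and the `card0` path in the usage comment is an assumption about the system (the card index may differ).

```shell
# Print the index of the active sclk level, i.e. the line marked with "*".
# Accepts either sysfs-style lines ("5: 1191Mhz *") or rocm-smi-style lines
# ("GPU[0] : 5: 1191Mhz *") on stdin.
active_sclk_level() {
    awk '$NF == "*" {
        lvl = $(NF - 2)        # the "5:" field, third from the end
        sub(/:$/, "", lvl)     # drop the trailing colon
        print lvl
    }'
}

# Usage (path is an assumption, adjust the card index as needed):
#   active_sclk_level < /sys/class/drm/card0/device/pp_dpm_sclk
```

With the listing quoted above, this would report level 5, the 1191Mhz entry.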
Comment
-
Originally posted by perpetually high View Post
Ahh, it was the clocks!
Etherman and aufkrawall, you guys called it. Thank you for the suggestion to lower the clocks.
I ended up using rocm-smi to manually set sclk to level 5: $ rocm-smi --setsclk 5
For reference on my card:
Code:
GPU[0] : Supported GPU clock frequencies on GPU0
GPU[0] : 0: 300Mhz
GPU[0] : 1: 608Mhz
GPU[0] : 2: 910Mhz
GPU[0] : 3: 1077Mhz
GPU[0] : 4: 1145Mhz
GPU[0] : 5: 1191Mhz *
GPU[0] : 6: 1236Mhz
GPU[0] : 7: 1303Mhz
raonlinux, would be great if you could test out this theory, too. Let me know if you need any help with rocm-smi or getting the card to downclock.
Code:
OD_SCLK:
0: 300MHz 800mV
1: 608MHz 818mV
2: 910MHz 824mV
3: 1077MHz 906mV
4: 1145MHz 968mV
5: 1191MHz 1012mV
6: 1236MHz 1062mV
7: 1303MHz 1143mV
OD_MCLK:
0: 300MHz 800mV
1: 2000MHz 975mV
OD_RANGE:
SCLK: 300MHz 2000MHz
MCLK: 300MHz 2250MHz
VDDC: 800mV 1175mV
Comment
-
Originally posted by raonlinux View Post
Code:
OD_SCLK:
0: 300MHz 800mV
1: 608MHz 818mV
2: 910MHz 824mV
3: 1077MHz 906mV
4: 1145MHz 968mV
5: 1191MHz 1012mV
6: 1236MHz 1062mV
7: 1303MHz 1143mV
OD_MCLK:
0: 300MHz 800mV
1: 2000MHz 975mV
OD_RANGE:
SCLK: 300MHz 2000MHz
MCLK: 300MHz 2250MHz
VDDC: 800mV 1175mV
Don't forget to enable amdgpu.ppfeaturemask=0xffffffff in the kernel parameters.
Comment
-
Man, I just have to say, it's sooo nice to be able to game again worry-free of hangs. I played for hours today and zero hangs. Passed all the previous checkpoints in BioShock Infinite, Metro 2033 Redux, etc that I couldn't get to before... perpetually high is back, baby!
1191MHz on the core clock is only a compromise of 112 MHz from the default 1303, so I can live with that. I'm going to revisit upping the voltage on the 1303 MHz setting at a later time. Will update my post with results as well when I do.
I checked my device/pp_od_clk_voltage and it is set like that. How should I set state 6 to test it out? Do I need to install rocm to set the GPU clock, or can I do it with commands? Let me know if you don't have problems with level 6.
Originally posted by clapbr View Post
You don't need rocm; follow this guide (it's the Arch wiki, but valid for other distros): https://wiki.archlinux.org/index.php/AMDGPU
Don't forget to enable amdgpu.ppfeaturemask=0xffffffff in the kernel parameters.
rocm-smi is really nice, and doesn't require that flag to be set. I highly recommend it in general.
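For anyone who would rather edit the table by hand, here is a rough sketch of how a pp_od_clk_voltage command for one sclk state could be put together and checked against the OD_RANGE limits quoted in this thread (SCLK 300 to 2000 MHz, VDDC 800 to 1175 mV). The function name `od_sclk_cmd`, the 1150 mV figure, and the `card0` path in the usage comment are illustrative assumptions, not tested values. The `s <level> <MHz> <mV>` format and the `c` commit command are taken from the kernel's amdgpu documentation for this generation of cards; verify them against your kernel version before writing anything.

```shell
# Print the pp_od_clk_voltage command for one sclk state, refusing values
# outside the OD_RANGE limits reported in this thread.
# The parenthesised body runs in a subshell, so the variables stay local.
od_sclk_cmd() (
    level=$1; mhz=$2; mv=$3
    if [ "$mhz" -lt 300 ] || [ "$mhz" -gt 2000 ]; then
        echo "sclk ${mhz}MHz is outside OD_RANGE" >&2
        exit 1
    fi
    if [ "$mv" -lt 800 ] || [ "$mv" -gt 1175 ]; then
        echo "vddc ${mv}mV is outside OD_RANGE" >&2
        exit 1
    fi
    printf 's %s %s %s\n' "$level" "$mhz" "$mv"
)

# Usage, as root, with the overdrive feature enabled (hypothetical values):
#   od_sclk_cmd 7 1303 1150 > /sys/class/drm/card0/device/pp_od_clk_voltage
#   echo c > /sys/class/drm/card0/device/pp_od_clk_voltage   # commit
```

Nothing is written to the card until the output is redirected into the sysfs file, so the function can be tried safely on its own first.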
Comment
-
Thanks for the help, guys. So far I have run Unigine Heaven without a crash, with level 6 set as the maximum state and no problems. I should try more games to know whether this works with them too.
In the end I set the allowed states to 0 through 6.
Code:
echo "0 1 2 3 4 5 6" > pp_dpm_sclk
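One detail worth spelling out: per the kernel's amdgpu documentation, writes to pp_dpm_sclk only take effect once power_dpm_force_performance_level has been set to manual. A minimal sketch of both steps is below. The function names `sclk_mask` and `apply_sclk_mask` and the default `card0` directory are assumptions made for illustration, and the real write needs root.

```shell
# Build the level list "0 1 ... N" for a given maximum level N.
sclk_mask() (
    max=$1; i=0; out=""
    while [ "$i" -le "$max" ]; do
        out="${out:+$out }$i"   # append with a single space separator
        i=$((i + 1))
    done
    printf '%s\n' "$out"
)

# Switch to manual control, then restrict the allowed sclk levels.
# $1 = highest allowed level, $2 = device directory (default is an assumption).
apply_sclk_mask() (
    dev=${2:-/sys/class/drm/card0/device}
    echo manual > "$dev/power_dpm_force_performance_level"
    sclk_mask "$1" > "$dev/pp_dpm_sclk"
)

# Usage, as root:
#   apply_sclk_mask 6
```

Writing auto back to power_dpm_force_performance_level should return the card to its default behaviour.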
Comment
-
Originally posted by perpetually high View Post
Man, I just have to say, it's sooo nice to be able to game again worry-free of hangs. I played for hours today and zero hangs. Passed all the previous checkpoints in BioShock Infinite, Metro 2033 Redux, etc that I couldn't get to before... perpetually high is back, baby!
1191MHz on the core clock is only a compromise of 112 MHz from the default 1303, so I can live with that. I'm going to revisit upping the voltage on the 1303 MHz setting at a later time. Will update my post with results as well when I do.
Yeah, you could go that route also. As a warning, though: I had issues with setting the amdgpu.ppfeaturemask=0xffffffff flag. Others have too, from what I've seen online. You might not, but then again we have the exact same card, so you likely will.
rocm-smi is really nice, and doesn't require that flag to be set. I highly recommend it in general.
Of all the issues I've ever had, I luckily never had a full system crash with AMD drivers except when I messed with overclocking. Good to know about rocm-smi, I will try it.
I don't know if you can set custom clock states using rocm-smi, but if you can, it might be worth trying a simple trial-and-error method to find the maximum clock that doesn't crash for you.
Comment
-
Originally posted by debianxfce View Post
The auto setting should use the BIOS of the GPU card. You might have a poor BIOS on your GPU card. I hope you have the latest drivers, Linux amdgpu firmware, and BIOS.
Comment
-
Originally posted by perpetually high View Post
Took a photo of a GPU hang occurring in Metro 2033 Redux with GALLIUM_HUD env var set:
- GPU temp: 66 °C
- GPU load: 99%
- CPUs were at 3.6 GHz (Turbo Boost from the 3.4 base, apparently)
- CPU loads were 71, 57, 50, 75
- FPS was at 163
- VRAM usage was reasonable at 1.175GB
So the GPU load at 99% is the only thing that sticks out here. You'll also see on the bottom left that the textures became screwed up. Usually when that happens, the hang follows about 1 or 2 seconds later, as was the case here.
Comment