Linux 6.10 Improves AMD ROCm Compute Support For "Small" Ryzen APUs


  • #11
    Originally posted by agd5f

    It avoids potential duplicate locking and prevents a locking splat in the kernel log.



    It allows additional apps to run that look for a certain amount of VRAM. On APUs, VRAM is just system memory so whether you use system memory or VRAM is irrelevant performance-wise. However, a number of applications don't take this into account and just always use VRAM. Since the VRAM carve out is relatively small on APUs, apps that require large amounts of VRAM won't run.



    It's not applicable to dGPUs. On dGPUs, VRAM is significantly more performant than system memory so you can't use the pools interchangeably. It's currently enabled on all APUs.
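    To make the VRAM-versus-system-memory point concrete, here is a minimal sketch. It assumes the amdgpu sysfs files `mem_info_vram_total` and `mem_info_gtt_total` are present and that the GPU is `card0` (the index can differ per system). The `fits` helper and the sizes used with it are hypothetical; it only illustrates why an app that checks VRAM alone may refuse to run on an APU, and is not what the kernel change itself does.

```python
from pathlib import Path

# Assumption: card0 is the amdgpu device; the index can differ per system.
SYSFS_DEVICE = Path("/sys/class/drm/card0/device")


def read_pool_bytes(device_dir=SYSFS_DEVICE):
    """Return (vram_total, gtt_total) in bytes, or None if the files are missing."""
    try:
        vram = int((device_dir / "mem_info_vram_total").read_text())
        gtt = int((device_dir / "mem_info_gtt_total").read_text())
    except (OSError, ValueError):
        return None
    return vram, gtt


def fits(required, vram_total, gtt_total):
    """Hypothetical helper: compare a VRAM-only check with one that counts both pools."""
    return {
        "vram_only": required <= vram_total,
        "vram_plus_gtt": required <= vram_total + gtt_total,
    }


if __name__ == "__main__":
    pools = read_pool_bytes()
    if pools is None:
        print("amdgpu memory info not found under", SYSFS_DEVICE)
    else:
        GIB = 1024 ** 3
        # Would a 4 GiB allocation pass each kind of check on this machine?
        print(fits(4 * GIB, *pools))
```

    With a 512 MiB carve-out and 8 GiB of GTT, for example, a 4 GiB request fails the VRAM-only check but passes once both pools are counted.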



    GFX9 is still well supported. All CDNA parts are based on gfx9. Kernel driver issues can be reported here:
    amd (amdgpu, amdkfd, radeon) drm project, currently for issues only.

    Kernel driver patches should be submitted to:

    Patches or bug reports for ROCm user mode components should be filed here:
    AMD ROCm™ Software - GitHub Home. Contribute to ROCm/ROCm development by creating an account on GitHub.

    You are a beautiful soul, thank you so much for replying! I asked some questions because:

    - A Reddit user in the /r/AMD group stated Vega APUs are not included in this, and that they bought an MI50 and it was no longer supported with ROCm.

    Silliness, I suppose.

    - The /r/LocalLLaMA group wants to be able to enable their GPUs to use system RAM in low-VRAM situations, such as running a 32GB model on a 24GB GPU, and hoped this would also cover their situation. (There was also discussion there of using their GFX8 GPUs in a similar fashion: RX 580 8GB GPUs + using more VRAM.)

    - There was also discussion about using their GFX8 APUs on some older HP “ROCm certified” mini PCs as well as other APUs of that generation.

    Does
    Code:
    ROC_ENABLE_PRE_VEGA=1
    work for those older APUs + this latest fix or is there something else needed to enable those?



    • #12
      Originally posted by filbo
      Eirikr1848 those sound like excellent questions and a great opening offer to do useful testing for them. I hope you are attempting to contact the players by some more direct channel than the comments forum under a Phoronix article!
      Looks like the biggest, baddest, coolest playa replied, so there ya have it. 🥳



      • #13
        I installed Linux kernel 6.10-rc1 on my 6800U laptop and ran Stable Diffusion without problems. However, when I tried to play a video in the browser, the system froze, and I had to hold the power button to turn off my laptop. Here is the kern.log message.

        Code:
        2024-05-29T09:08:35.105988+08:00 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring vcn_dec_0 timeout, signaled seq=9117, emitted seq=9120
        2024-05-29T09:08:35.106015+08:00 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process RDD Process pid 18573 thread firefox-bi:cs0 pid 20093
        2024-05-29T09:08:35.106018+08:00 kernel: amdgpu 0000:03:00.0: amdgpu: GPU reset begin!
        2024-05-29T09:08:36.296719+08:00 kernel: [drm] Register(0) [mmUVD_POWER_STATUS] failed to reach value 0x00000001 != 0x00000002n
        2024-05-29T09:08:36.585714+08:00 kernel: [drm] Register(0) [mmUVD_RBC_RB_RPTR] failed to reach value 0x000002c0 != 0x00000200n
        2024-05-29T09:08:36.639720+08:00 kernel: [drm] Register(0) [mmUVD_POWER_STATUS] failed to reach value 0x00000001 != 0x00000002n
        2024-05-29T09:08:36.639735+08:00 kernel: amdgpu 0000:03:00.0: amdgpu: MODE2 reset
        2024-05-29T09:08:36.651699+08:00 kernel: amdgpu 0000:03:00.0: amdgpu: GPU reset succeeded, trying to resume
        2024-05-29T09:08:36.652723+08:00 kernel: [drm] PCIE GART of 1024M enabled (table at 0x000000F41FC00000).
        2024-05-29T09:08:36.652739+08:00 kernel: [drm] VRAM is lost due to GPU reset!
        2024-05-29T09:08:36.652743+08:00 kernel: amdgpu 0000:03:00.0: amdgpu: PSP is resuming...
        2024-05-29T09:08:36.674720+08:00 kernel: amdgpu 0000:03:00.0: amdgpu: reserve 0xa00000 from 0xf41e000000 for PSP TMR
        2024-05-29T09:08:37.003715+08:00 kernel: amdgpu 0000:03:00.0: amdgpu: RAS: optional ras ta ucode is not available
        2024-05-29T09:08:37.014719+08:00 kernel: amdgpu 0000:03:00.0: amdgpu: RAP: optional rap ta ucode is not available
        2024-05-29T09:08:37.014734+08:00 kernel: amdgpu 0000:03:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
        2024-05-29T09:08:37.014737+08:00 kernel: amdgpu 0000:03:00.0: amdgpu: SMU is resuming...
        2024-05-29T09:08:37.017713+08:00 kernel: amdgpu 0000:03:00.0: amdgpu: SMU is resumed successfully!
        2024-05-29T09:08:37.018712+08:00 kernel: [drm] DMUB hardware initialized: version=0x04000044
        2024-05-29T09:08:37.684726+08:00 kernel: [drm] kiq ring mec 2 pipe 1 q 0
        2024-05-29T09:08:37.974903+08:00 kernel: amdgpu 0000:03:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring vcn_dec_0 test failed (-110)
        2024-05-29T09:08:37.974925+08:00 kernel: [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <vcn_v3_0> failed -110
        2024-05-29T09:08:37.974929+08:00 kernel: amdgpu 0000:03:00.0: amdgpu: GPU reset(1) failed
        2024-05-29T09:08:37.974937+08:00 kernel: amdgpu 0000:03:00.0: amdgpu: GPU reset end with ret = -110
        2024-05-29T09:08:37.975743+08:00 kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* GPU Recovery Failed: -110
        2024-05-29T09:08:39.260728+08:00 kernel: [drm] Register(0) [mmUVD_POWER_STATUS] failed to reach value 0x00000001 != 0x00000002n
        2024-05-29T09:08:39.545722+08:00 kernel: [drm] Register(0) [mmUVD_RBC_RB_RPTR] failed to reach value 0x00000010 != 0x00000000n
        I am not sure where I should post this, but hopefully someone can forward it to the AMD Linux kernel developers. Thanks.
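
        For anyone preparing a similar report, a small sketch like the following could pull the ring-timeout lines out of a long kern.log so they are easy to quote. The regular expression is written against the line format in the log above and is an assumption about that format only; `find_ring_timeouts` is a hypothetical helper name.

```python
import re

# Assumption: matches the ring-timeout line format seen in the log above.
RING_TIMEOUT = re.compile(
    r"ring (?P<ring>\S+) timeout, "
    r"signaled seq=(?P<signaled>\d+), emitted seq=(?P<emitted>\d+)"
)


def find_ring_timeouts(lines):
    """Return one dict per ring-timeout line: ring name and sequence numbers."""
    hits = []
    for line in lines:
        match = RING_TIMEOUT.search(line)
        if match:
            hits.append({
                "ring": match.group("ring"),
                "signaled": int(match.group("signaled")),
                "emitted": int(match.group("emitted")),
            })
    return hits


# Two lines copied from the log above: one timeout, one unrelated message.
sample = [
    "2024-05-29T09:08:35.105988+08:00 kernel: [drm:amdgpu_job_timedout [amdgpu]] "
    "*ERROR* ring vcn_dec_0 timeout, signaled seq=9117, emitted seq=9120",
    "2024-05-29T09:08:36.639735+08:00 kernel: amdgpu 0000:03:00.0: amdgpu: MODE2 reset",
]
print(find_ring_timeouts(sample))
```

        On the sample it reports the `vcn_dec_0` ring with signaled seq 9117 and emitted seq 9120, i.e. three jobs were still outstanding when the timeout fired.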

