AmdGPU crashes with memory issue

    I have a problem with the OS drivers for my AMD GPU (integrated into a Ryzen 5). It manifests as Civ6 crashing. I'm not a heavy user of graphical interfaces, so there may be other symptoms I haven't noticed.

    From the logs, it looks like the GPU driver has a problem handling memory. Any advice?

    My kernel is:
    Code:
    Linux manila 5.3.6 #2 SMP Thu Oct 17 23:37:12 BST 2019 x86_64 AMD Ryzen 5 2400G with Radeon Vega Graphics AuthenticAMD GNU/Linux
    and my hardware is:
    Code:
    Hardware name: Micro-Star International Co., Ltd. MS-7A38/B450M PRO-VDH PLUS (MS-7A38), BIOS 9.00 01/07/2019

    In Xorg I see:
    Code:
    [241183.219] (WW) AMDGPU(0): flip queue failed: Cannot allocate memory
    [241183.219] (WW) AMDGPU(0): Page flip failed: Cannot allocate memory
    [241183.219] (EE) AMDGPU(0): present flip failed
    In the kernel logs I see plenty of entries like the following:

    Code:
    Oct 20 19:37:32 manila kernel: [240791.720571] [drm:amdgpu_cs_ioctl] *ERROR* Not enough memory for command submission!
    Oct 20 19:37:32 manila kernel: [240791.721520] [TTM] Failed to find memory space for buffer 0x0000000041db0218 eviction
    Oct 20 19:37:32 manila kernel: [240791.721527] [TTM]  No space for 0000000041db0218 (768 pages, 3072K, 3M)
    Oct 20 19:37:32 manila kernel: [240791.721530] [TTM]    placement[0]=0x00060002 (1)
    Oct 20 19:37:32 manila kernel: [240791.721531] [TTM]      has_type: 1
    Oct 20 19:37:32 manila kernel: [240791.721532] [TTM]      use_type: 1
    Oct 20 19:37:32 manila kernel: [240791.721534] [TTM]      flags: 0x0000000A
    Oct 20 19:37:32 manila kernel: [240791.721536] [TTM]      gpu_offset: 0x00000000
    Oct 20 19:37:32 manila kernel: [240791.721538] [TTM]      size: 786432
    Oct 20 19:37:32 manila kernel: [240791.721539] [TTM]      available_caching: 0x00070000
    Oct 20 19:37:32 manila kernel: [240791.721540] [TTM]      default_caching: 0x00010000
    Oct 20 19:37:32 manila kernel: [240791.721542] [TTM]  0x0000000000000400-0x0000000000000401: 1: used
    Oct 20 19:37:32 manila kernel: [240791.721543] [TTM]  0x0000000000000401-0x0000000000000443: 66: used
    Oct 20 19:37:32 manila kernel: [240791.721545] [TTM]  0x0000000000000443-0x0000000000000445: 2: used
    Oct 20 19:37:32 manila kernel: [240791.721546] [TTM]  0x0000000000000445-0x0000000000000447: 2: used
    Oct 20 19:37:32 manila kernel: [240791.721547] [TTM]  0x0000000000000447-0x0000000000000449: 2: used
    Oct 20 19:37:32 manila kernel: [240791.721548] [TTM]  0x0000000000000449-0x000000000000044b: 2: used
    Oct 20 19:37:32 manila kernel: [240791.721549] [TTM]  0x000000000000044b-0x000000000000044d: 2: used
    Oct 20 19:37:32 manila kernel: [240791.721550] [TTM]  0x000000000000044d-0x000000000000044f: 2: used
    Oct 20 19:37:32 manila kernel: [240791.721550] [TTM]  0x000000000000044f-0x0000000000000451: 2: used
    Oct 20 19:37:32 manila kernel: [240791.721551] [TTM]  0x0000000000000451-0x0000000000000453: 2: used
    Oct 20 19:37:32 manila kernel: [240791.721552] [TTM]  0x0000000000000453-0x0000000000000455: 2: used
    Oct 20 19:37:32 manila kernel: [240791.721553] [TTM]  0x0000000000000455-0x0000000000000456: 1: used
    Oct 20 19:37:32 manila kernel: [240791.721554] [TTM]  0x0000000000000456-0x0000000000000556: 256: used
    Oct 20 19:37:32 manila kernel: [240791.721555] [TTM]  0x0000000000000556-0x0000000000000557: 1: used
    Oct 20 19:37:32 manila kernel: [240791.721556] [TTM]  0x0000000000000557-0x0000000000000558: 1: used
    Oct 20 19:37:32 manila kernel: [240791.721556] [TTM]  0x0000000000000558-0x0000000000000559: 1: used
    Oct 20 19:37:32 manila kernel: [240791.721557] [TTM]  0x0000000000000559-0x000000000000055a: 1: used
    Oct 20 19:37:32 manila kernel: [240791.721558] [TTM]  0x000000000000055a-0x000000000000055b: 1: used
    Oct 20 19:37:32 manila kernel: [240791.721559] [TTM]  0x000000000000055b-0x000000000000055c: 1: used
    Oct 20 19:37:32 manila kernel: [240791.721559] [TTM]  0x000000000000055c-0x000000000000055d: 1: used
    Oct 20 19:37:32 manila kernel: [240791.721560] [TTM]  0x000000000000055d-0x000000000000055e: 1: used
    Oct 20 19:37:32 manila kernel: [240791.721561] [TTM]  0x000000000000055e-0x0000000000000560: 2: used
    Oct 20 19:37:32 manila kernel: [240791.721562] [TTM]  0x0000000000000560-0x0000000000000561: 1: used
    Oct 20 19:37:32 manila kernel: [240791.721563] [TTM]  0x0000000000000561-0x0000000000000562: 1: used
    Oct 20 19:37:32 manila kernel: [240791.721563] [TTM]  0x0000000000000562-0x0000000000000563: 1: used
    Oct 20 19:37:32 manila kernel: [240791.721566] [TTM]  0x0000000000000563-0x0000000000000565: 2: used
    Oct 20 19:37:32 manila kernel: [240791.721566] [TTM]  0x0000000000000565-0x0000000000000665: 256: used
    Oct 20 19:37:32 manila kernel: [240791.721567] [TTM]  0x0000000000000665-0x000000000000075f: 250: used
    Oct 20 19:37:32 manila kernel: [240791.721568] [TTM]  0x000000000000075f-0x0000000000000800: 161: free
    Oct 20 19:37:32 manila kernel: [240791.721569] [TTM]  0x0000000000000800-0x0000000000000900: 256: used
    Oct 20 19:37:32 manila kernel: [240791.721570] [TTM]  0x0000000000000900-0x0000000000040000: 259840: free
    Oct 20 19:37:32 manila kernel: [240791.721571] [TTM]  total: 261120, used 1119 free 260001
    Oct 20 19:37:32 manila kernel: [240791.721573] [TTM]  man size:786432 pages, gtt available:3 pages, usage:3071MB
    Oct 20 19:37:32 manila kernel: [240791.721578] amdgpu 0000:38:00.0: 00000000351c08d9 pin failed
    Oct 20 19:37:32 manila kernel: [240791.721585] [drm:dm_plane_helper_prepare_fb] *ERROR* Failed to pin framebuffer with error -12
    Oct 20 19:37:32 manila kernel: [240791.723955] [drm:amdgpu_cs_ioctl] *ERROR* Not enough memory for command submission!
    Oct 20 19:37:32 manila kernel: [240791.726364] [TTM] Failed to find memory space for buffer 0x000000004cf968b6 eviction
    Oct 20 19:37:32 manila kernel: [240791.726370] [TTM]  No space for 000000004cf968b6 (3072 pages, 12288K, 12M)
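
    As a rough sanity check of the numbers in this dump, the "[TTM]" range lines can be tallied with a short script. This is only a sketch: the regular expression, the function name summarize_ttm_ranges and the 4 KiB page size are assumptions for illustration (the page size is inferred from "768 pages, 3072K"), and none of it is part of any amdgpu tooling.

```python
import re

# Matches range lines such as:
#   [TTM]  0x0000000000000400-0x0000000000000401: 1: used
# The pattern is an assumption based on the log excerpt above.
RANGE_RE = re.compile(
    r"\[TTM\]\s+0x[0-9a-fA-F]+-0x[0-9a-fA-F]+:\s+(\d+):\s+(used|free)"
)

PAGE_KIB = 4  # assumption: 4 KiB pages, consistent with "768 pages, 3072K"


def summarize_ttm_ranges(lines):
    """Sum used/free pages and find the largest free range in a TTM dump."""
    totals = {"used": 0, "free": 0}
    largest_free = 0
    for line in lines:
        match = RANGE_RE.search(line)
        if match is None:
            continue  # skip headers, "total:" lines and unrelated messages
        pages = int(match.group(1))
        state = match.group(2)
        totals[state] += pages
        if state == "free":
            largest_free = max(largest_free, pages)
    return {
        "used_pages": totals["used"],
        "free_pages": totals["free"],
        "largest_free_pages": largest_free,
        "largest_free_kib": largest_free * PAGE_KIB,
    }


# Three range lines copied from the dump, plus the summary line (ignored).
sample = [
    "kernel: [240791.721568] [TTM]  0x000000000000075f-0x0000000000000800: 161: free",
    "kernel: [240791.721569] [TTM]  0x0000000000000800-0x0000000000000900: 256: used",
    "kernel: [240791.721570] [TTM]  0x0000000000000900-0x0000000000040000: 259840: free",
    "kernel: [240791.721571] [TTM]  total: 261120, used 1119 free 260001",
]
print(summarize_ttm_ranges(sample))
```

    Fed the whole dump (for example by passing an open log file to the function), the totals should match the "total: 261120, used 1119 free 260001" line. In this excerpt the largest free range (259840 pages) is far bigger than the 768-page request, while the next line reports "gtt available:3 pages, usage:3071MB", so the failure may not be simple fragmentation of the address range. That reading is a guess from the log alone.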

    They are followed by entries similar to the one below:

    Code:
    Oct 20 19:44:04 manila kernel: [241182.718765] ------------[ cut here ]------------
    Oct 20 19:44:04 manila kernel: [241182.718770] WARNING: CPU: 7 PID: 2241 at drivers/gpu/drm/amd/amdgpu/../display/dc/dcn10/dcn10_hw_sequencer.c:932 dcn10_verify_allow_pstate_change_high+0x2c/0x228
    Oct 20 19:44:04 manila kernel: [241182.718770] Modules linked in: ctr ccm iptable_nat nf_nat nf_conntrack nf_defrag_ipv4 ipv6 nf_defrag_ipv6 rtl8192ee btcoexist rtl_pci snd_hda_codec_realtek rtlwifi snd_hda_codec_generic mac80211 snd_hda_codec_hdmi snd_hda_intel sha256_generic snd_hda_codec snd_hwdep snd_hda_core snd_pcm cfg80211 snd_timer snd k10temp video acpi_cpufreq xhci_pci xhci_hcd 8250 8250_base serial_core
    Oct 20 19:44:04 manila kernel: [241182.718776] CPU: 7 PID: 2241 Comm: InputThread Tainted: G        W         5.3.6 #2
    Oct 20 19:44:04 manila kernel: [241182.718776] Hardware name: Micro-Star International Co., Ltd. MS-7A38/B450M PRO-VDH PLUS (MS-7A38), BIOS 9.00 01/07/2019
    Oct 20 19:44:04 manila kernel: [241182.718777] RIP: 0010:dcn10_verify_allow_pstate_change_high+0x2c/0x228
    Oct 20 19:44:04 manila kernel: [241182.718778] Code: 48 8b 87 f8 02 00 00 48 89 fb 48 8b b8 b0 01 00 00 e8 af ef 00 00 84 c0 0f 85 05 02 00 00 48 c7 c7 33 ca 00 82 e8 f7 5a af ff <0f> 0b 80 bb 9f 01 00 00 00 0f 84 ea 01 00 00 31 ed 48 8b 83 f8 02
    Oct 20 19:44:04 manila kernel: [241182.718779] RSP: 0018:ffffc90000c07a18 EFLAGS: 00010246
    Oct 20 19:44:04 manila kernel: [241182.718779] RAX: 0000000000000024 RBX: ffff888409200000 RCX: 0000000000000000
    Oct 20 19:44:04 manila kernel: [241182.718780] RDX: 0000000000000000 RSI: ffffc90000c07904 RDI: ffffffff8272482c
    Oct 20 19:44:04 manila kernel: [241182.718780] RBP: ffff888409200000 R08: 00000000000017f8 R09: 0000000000000004
    Oct 20 19:44:04 manila kernel: [241182.718781] R10: 0000000000001700 R11: 00031661ef6d9184 R12: ffff8882cbb901b8
    Oct 20 19:44:04 manila kernel: [241182.718782] R13: ffff8882cbb91d18 R14: ffff88835e474000 R15: ffff88838f75ed00
    Oct 20 19:44:04 manila kernel: [241182.718783] FS:  00007f2ce77fe700(0000) GS:ffff888410bc0000(0000) knlGS:0000000000000000
    Oct 20 19:44:04 manila kernel: [241182.718783] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    Oct 20 19:44:04 manila kernel: [241182.718784] CR2: 00007f41d415ffe0 CR3: 0000000408bd6000 CR4: 00000000003406e0
    Oct 20 19:44:04 manila kernel: [241182.718784] Call Trace:
    Oct 20 19:44:04 manila kernel: [241182.718786]  dc_stream_set_cursor_attributes+0xf4/0xff
    Oct 20 19:44:04 manila kernel: [241182.718787]  handle_cursor_update+0x1e9/0x289
    Oct 20 19:44:04 manila kernel: [241182.718788]  drm_atomic_helper_async_commit+0x5d/0xaf
    Oct 20 19:44:04 manila kernel: [241182.718790]  drm_atomic_helper_commit+0x2c/0xec
    Oct 20 19:44:04 manila kernel: [241182.718791]  drm_atomic_helper_update_plane+0xc4/0xe1
    Oct 20 19:44:04 manila kernel: [241182.718792]  drm_mode_cursor_universal+0x177/0x1ee
    Oct 20 19:44:04 manila kernel: [241182.718794]  drm_mode_cursor_common+0x104/0x1e1
    Oct 20 19:44:04 manila kernel: [241182.718795]  ? drm_mode_setplane+0x211/0x211
    Oct 20 19:44:04 manila kernel: [241182.718796]  drm_mode_cursor_ioctl+0x3a/0x54
    Oct 20 19:44:04 manila kernel: [241182.718798]  drm_ioctl_kernel+0x8d/0xe1
    Oct 20 19:44:04 manila kernel: [241182.718799]  drm_ioctl+0x1f6/0x2d5
    Oct 20 19:44:04 manila kernel: [241182.718800]  ? drm_mode_setplane+0x211/0x211
    Oct 20 19:44:04 manila kernel: [241182.718802]  ? __wake_up_common_lock+0x82/0xac
    Oct 20 19:44:04 manila kernel: [241182.718803]  amdgpu_drm_ioctl+0x45/0x72
    Oct 20 19:44:04 manila kernel: [241182.718804]  vfs_ioctl+0x19/0x26
    Oct 20 19:44:04 manila kernel: [241182.718806]  do_vfs_ioctl+0x51d/0x545
    Oct 20 19:44:04 manila kernel: [241182.718807]  ksys_ioctl+0x4b/0x6b
    Oct 20 19:44:04 manila kernel: [241182.718809]  __x64_sys_ioctl+0x11/0x14
    Oct 20 19:44:04 manila kernel: [241182.718810]  do_syscall_64+0x49/0x56
    Oct 20 19:44:04 manila kernel: [241182.718811]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
    Oct 20 19:44:04 manila kernel: [241182.718811] RIP: 0033:0x7f2d1b1db167
    Oct 20 19:44:04 manila kernel: [241182.718812] Code: 00 00 90 48 8b 05 29 9d 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d f9 9c 0c 00 f7 d8 64 89 01 48
    Oct 20 19:44:04 manila kernel: [241182.718813] RSP: 002b:00007f2ce77fc498 EFLAGS: 00003246 ORIG_RAX: 0000000000000010
    Oct 20 19:44:04 manila kernel: [241182.718814] RAX: ffffffffffffffda RBX: 000055a8a777d290 RCX: 00007f2d1b1db167
    Oct 20 19:44:04 manila kernel: [241182.718814] RDX: 00007f2ce77fc4d0 RSI: 00000000c01c64a3 RDI: 000000000000000b
    Oct 20 19:44:05 manila kernel: [241182.718815] RBP: 00007f2ce77fc4d0 R08: 0000000000000001 R09: 0000000000000001
    Oct 20 19:44:05 manila kernel: [241182.718815] R10: 0000000000000780 R11: 0000000000003246 R12: 00000000c01c64a3
    Oct 20 19:44:05 manila kernel: [241182.718816] R13: 000000000000000b R14: 0000000000000330 R15: 000000000000025c
    Oct 20 19:44:05 manila kernel: [241182.718817] ---[ end trace 603c680d6992af3a ]---
    Oct 20 19:44:05 manila kernel: [241182.719290] [drm] pstate TEST_DEBUG_DATA: 0x3EF6000F