I've been suffering from hard freezes since I upgraded my system to a Ryzen 1800X. Most of the time the system didn't even respond to the REISUB emergency sequence. However, I managed to get the kernel logs, and from the stack traces it actually looks like an AMDGPU bug.
Symptoms:
Hard freeze when I leave the system for more than half an hour. It doesn't even respond to REISUB if I come back after 1 hour, but I can use that if I come back soon enough.
I thought it was because the system was going idle, but I left it with "stress -c6" running in the background and it froze anyway.
My config:
I tried Ubuntu 17.10 and 18.04, and every stable kernel version from the official one that ships with 17.10 up to the latest, 4.16.1 (always installed from the Ubuntu kernel PPA).
CPU: Ryzen 1800X - 32GB RAM
Motherboard: Gigabyte AX370 Gaming K7
GPU: AMD 390X 8GB
I didn't get any freezes before the upgrade, and I didn't reinstall the OS.
I thought it was one of these bugs, but my stack trace didn't look anything like that:
https://bugs.launchpad.net/ubuntu/+s...x/+bug/1690085
Stack trace:
Code:
Apr 9 09:43:11 net kernel: [ 3902.972311] ------------[ cut here ]------------
Apr 9 09:43:11 net kernel: [ 3902.972313] kernel BUG at /home/kernel/COD/linux/mm/slub.c:296!
Apr 9 09:43:11 net kernel: [ 3902.972320] invalid opcode: 0000 [#1] SMP NOPTI
Apr 9 09:43:11 net kernel: [ 3902.972321] Modules linked in: pci_stub vboxpci(OE) vboxnetadp(OE) vboxnetflt(OE) vboxdrv(OE) edac_mce_amd kvm_amd snd_hda_codec_realtek kvm snd_hda_codec_generic irqbypass snd_hda_codec_hdmi crct10dif_pclmul snd_hda_intel crc32_pclmul snd_seq_midi ghash_clmulni_intel snd_seq_midi_event pcbc snd_rawmidi snd_hda_codec aesni_intel snd_seq snd_hda_core snd_hwdep snd_seq_device aes_x86_64 snd_pcm crypto_simd glue_helper joydev input_leds cryptd wmi_bmof k10temp snd_timer ccp i2c_piix4 snd soundcore shpchp mac_hid binfmt_misc parport_pc ppdev lp parport ip_tables x_tables autofs4 hid_generic usbhid hid uas usb_storage amdkfd amd_iommu_v2 amdgpu chash gpu_sched radeon ttm igb drm_kms_helper syscopyarea dca sysfillrect sysimgblt ptp fb_sys_fops mxm_wmi alx pps_core drm i2c_algo_bit mdio ahci libahci wmi
Apr 9 09:43:11 net kernel: [ 3902.972355] gpio_amdpt gpio_generic
Apr 9 09:43:11 net kernel: [ 3902.972358] CPU: 6 PID: 1361 Comm: Xorg Tainted: G OE 4.16.1-041601-generic #201804081334
Apr 9 09:43:11 net kernel: [ 3902.972359] Hardware name: Gigabyte Technology Co., Ltd. AX370-Gaming K7/AX370-Gaming K7, BIOS F22 03/15/2018
Apr 9 09:43:11 net kernel: [ 3902.972363] RIP: 0010:__slab_free+0x17a/0x2c0
Apr 9 09:43:11 net kernel: [ 3902.972365] RSP: 0018:ffffb3fd8927b980 EFLAGS: 00010246
Apr 9 09:43:11 net kernel: [ 3902.972366] RAX: ffff9b89929ac800 RBX: ffff9b89929ac800 RCX: 0000000180200017
Apr 9 09:43:11 net kernel: [ 3902.972367] RDX: ffff9b89929ac800 RSI: ffffd88a204a6a00 RDI: ffff9b899e806e80
Apr 9 09:43:11 net kernel: [ 3902.972368] RBP: ffffb3fd8927ba20 R08: 0000000000000001 R09: ffffffffc07468e4
Apr 9 09:43:11 net kernel: [ 3902.972369] R10: ffffb3fd8927ba40 R11: ffff9b899413e000 R12: ffff9b899e806e80
Apr 9 09:43:11 net kernel: [ 3902.972370] R13: ffffd88a204a6a00 R14: ffff9b89929ac800 R15: ffff9b899413f800
Apr 9 09:43:11 net kernel: [ 3902.972372] FS: 00007f7b14886500(0000) GS:ffff9b899ed80000(0000) knlGS:0000000000000000
Apr 9 09:43:11 net kernel: [ 3902.972373] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 9 09:43:11 net kernel: [ 3902.972374] CR2: 00007fad30a30000 CR3: 00000008181d6000 CR4: 00000000003406e0
Apr 9 09:43:11 net kernel: [ 3902.972375] Call Trace:
Apr 9 09:43:11 net kernel: [ 3902.972407] ? dc_sink_free+0x34/0x40 [amdgpu]
Apr 9 09:43:11 net kernel: [ 3902.972409] kfree+0x166/0x180
Apr 9 09:43:11 net kernel: [ 3902.972411] ? kfree+0x166/0x180
Apr 9 09:43:11 net kernel: [ 3902.972438] dc_sink_free+0x34/0x40 [amdgpu]
Apr 9 09:43:11 net kernel: [ 3902.972464] dc_sink_release+0x24/0x30 [amdgpu]
Apr 9 09:43:11 net kernel: [ 3902.972490] dc_stream_free+0x22/0x50 [amdgpu]
Apr 9 09:43:11 net kernel: [ 3902.972515] dc_stream_release+0x2c/0x30 [amdgpu]
Apr 9 09:43:11 net kernel: [ 3902.972544] dm_update_crtcs_state+0x126/0x370 [amdgpu]
Apr 9 09:43:11 net kernel: [ 3902.972571] amdgpu_dm_atomic_check+0x2ad/0x4d0 [amdgpu]
Apr 9 09:43:11 net kernel: [ 3902.972580] drm_atomic_check_only+0x389/0x550 [drm]
Apr 9 09:43:11 net kernel: [ 3902.972588] drm_atomic_commit+0x18/0x60 [drm]
Apr 9 09:43:11 net kernel: [ 3902.972596] drm_atomic_connector_commit_dpms+0xef/0x100 [drm]
Apr 9 09:43:11 net kernel: [ 3902.972603] drm_mode_obj_set_property_ioctl+0x176/0x280 [drm]
Apr 9 09:43:11 net kernel: [ 3902.972611] ? drm_mode_connector_set_obj_prop+0x80/0x80 [drm]
Apr 9 09:43:11 net kernel: [ 3902.972618] drm_mode_connector_property_set_ioctl+0x3f/0x60 [drm]
Apr 9 09:43:11 net kernel: [ 3902.972625] drm_ioctl_kernel+0x5f/0xb0 [drm]
Apr 9 09:43:11 net kernel: [ 3902.972631] drm_ioctl+0x31b/0x3d0 [drm]
Apr 9 09:43:11 net kernel: [ 3902.972638] ? drm_mode_connector_set_obj_prop+0x80/0x80 [drm]
Apr 9 09:43:11 net kernel: [ 3902.972640] ? __check_object_size+0xac/0x1a0
Apr 9 09:43:11 net kernel: [ 3902.972658] amdgpu_drm_ioctl+0x4f/0x90 [amdgpu]
Apr 9 09:43:11 net kernel: [ 3902.972661] do_vfs_ioctl+0xa8/0x620
Apr 9 09:43:11 net kernel: [ 3902.972663] ? vfs_read+0x115/0x130
Apr 9 09:43:11 net kernel: [ 3902.972665] SyS_ioctl+0x79/0x90
Apr 9 09:43:11 net kernel: [ 3902.972668] do_syscall_64+0x73/0x130
Apr 9 09:43:11 net kernel: [ 3902.972670] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
Apr 9 09:43:11 net kernel: [ 3902.972672] RIP: 0033:0x7f7b11cdfef7
Apr 9 09:43:11 net kernel: [ 3902.972673] RSP: 002b:00007ffc354544b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Apr 9 09:43:11 net kernel: [ 3902.972675] RAX: ffffffffffffffda RBX: 000055800f2d87e0 RCX: 00007f7b11cdfef7
Apr 9 09:43:11 net kernel: [ 3902.972676] RDX: 00007ffc354544f0 RSI: 00000000c01064ab RDI: 000000000000000d
Apr 9 09:43:11 net kernel: [ 3902.972677] RBP: 00007ffc354544f0 R08: 0000000000000001 R09: 0000000000000000
Apr 9 09:43:11 net kernel: [ 3902.972678] R10: 00007f7b11d64280 R11: 0000000000000246 R12: 00000000c01064ab
Apr 9 09:43:11 net kernel: [ 3902.972679] R13: 000000000000000d R14: 000055800f2d81a0 R15: 000055800dde6e01
Apr 9 09:43:11 net kernel: [ 3902.972680] Code: 0f 84 ee fe ff ff 44 0f b6 7d 8b 80 7d ab 00 79 05 45 84 ff 74 61 48 83 c4 70 5b 41 5a 41 5c 41 5d 41 5e 41 5f 5d 49 8d 62 f8 c3 <0f> 0b 4c 89 d0 4c 89 d7 45 89 fa 48 85 c0 44 0f b6 7d 8b 74 cb
Apr 9 09:43:11 net kernel: [ 3902.972701] RIP: __slab_free+0x17a/0x2c0 RSP: ffffb3fd8927b980
Apr 9 09:43:11 net kernel: [ 3902.972703] ---[ end trace f87e1d03970b7d09 ]---
Stack trace 2, one day later:
Code:
Apr 10 08:30:11 net kernel: [ 1877.879005] ------------[ cut here ]------------
Apr 10 08:30:11 net kernel: [ 1877.879007] kernel BUG at /home/kernel/COD/linux/mm/slub.c:296!
Apr 10 08:30:11 net kernel: [ 1877.879016] invalid opcode: 0000 [#1] SMP NOPTI
Apr 10 08:30:11 net kernel: [ 1877.879018] Modules linked in: snd_opl3_synth snd_seq_midi_emul snd_cmipci snd_mpu401_uart snd_opl3_lib snd_hwdep gameport snd_pcm edac_mce_amd snd_seq_midi snd_seq_midi_event kvm_amd snd_rawmidi kvm irqbypass crct10dif_pclmul snd_seq crc32_pclmul ghash_clmulni_intel pcbc snd_seq_device snd_timer aesni_intel aes_x86_64 crypto_simd glue_helper input_leds joydev cryptd snd wmi_bmof soundcore k10temp ccp mac_hid shpchp binfmt_misc sch_fq_codel parport_pc ppdev lp parport ip_tables x_tables autofs4 hid_generic usbhid hid amdkfd amd_iommu_v2 amdgpu chash gpu_sched radeon ttm drm_kms_helper syscopyarea igb sysfillrect sysimgblt fb_sys_fops dca ptp mxm_wmi drm i2c_piix4 alx pps_core ahci i2c_algo_bit mdio libahci gpio_amdpt gpio_generic wmi
Apr 10 08:30:11 net kernel: [ 1877.879061] CPU: 14 PID: 1140 Comm: Xorg Not tainted 4.16.1-041601-generic #201804081334
Apr 10 08:30:11 net kernel: [ 1877.879063] Hardware name: Gigabyte Technology Co., Ltd. AX370-Gaming K7/AX370-Gaming K7, BIOS F22 03/15/2018
Apr 10 08:30:11 net kernel: [ 1877.879068] RIP: 0010:kfree+0x16b/0x180
Apr 10 08:30:11 net kernel: [ 1877.879070] RSP: 0018:ffffbc7dc85abab0 EFLAGS: 00010246
Apr 10 08:30:11 net kernel: [ 1877.879072] RAX: ffff9e6b3331f000 RBX: ffff9e6b3331f000 RCX: ffff9e6b3331f000
Apr 10 08:30:11 net kernel: [ 1877.879074] RDX: 000000000002260d RSI: ffff9e6b5efa7160 RDI: ffff9e6b5e806e80
Apr 10 08:30:11 net kernel: [ 1877.879075] RBP: ffffbc7dc85abac8 R08: 00000000000024ad R09: ffffffffc06b08e4
Apr 10 08:30:11 net kernel: [ 1877.879077] R10: ffffedfa9fccc600 R11: 0000000000000000 R12: ffff9e6b3331f000
Apr 10 08:30:11 net kernel: [ 1877.879078] R13: ffffffffc06b08e4 R14: ffff9e6b2e3d0000 R15: ffff9e6b3029d900
Apr 10 08:30:11 net kernel: [ 1877.879081] FS: 00007f53e37f3580(0000) GS:ffff9e6b5ef80000(0000) knlGS:0000000000000000
Apr 10 08:30:11 net kernel: [ 1877.879082] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 10 08:30:11 net kernel: [ 1877.879084] CR2: 000055acfea32210 CR3: 00000007efb8e000 CR4: 00000000003406e0
Apr 10 08:30:11 net kernel: [ 1877.879085] Call Trace:
Apr 10 08:30:11 net kernel: [ 1877.879132] dc_sink_free+0x34/0x40 [amdgpu]
Apr 10 08:30:11 net kernel: [ 1877.879173] dc_sink_release+0x24/0x30 [amdgpu]
Apr 10 08:30:11 net kernel: [ 1877.879212] dc_stream_free+0x22/0x50 [amdgpu]
Apr 10 08:30:11 net kernel: [ 1877.879251] dc_stream_release+0x2c/0x30 [amdgpu]
Apr 10 08:30:11 net kernel: [ 1877.879294] amdgpu_dm_connector_mode_valid+0xd1/0x240 [amdgpu]
Apr 10 08:30:11 net kernel: [ 1877.879307] ? drm_mode_connector_list_update+0xec/0x180 [drm]
Apr 10 08:30:11 net kernel: [ 1877.879314] drm_helper_probe_single_connector_modes+0x418/0x710 [drm_kms_helper]
Apr 10 08:30:11 net kernel: [ 1877.879326] drm_mode_getconnector+0x15d/0x340 [drm]
Apr 10 08:30:11 net kernel: [ 1877.879330] ? netlink_recvmsg+0x244/0x420
Apr 10 08:30:11 net kernel: [ 1877.879341] ? drm_mode_connector_property_set_ioctl+0x60/0x60 [drm]
Apr 10 08:30:11 net kernel: [ 1877.879343] [drm] SADs count is: -2, don't need to read it
Apr 10 08:30:11 net kernel: [ 1877.879352] drm_ioctl_kernel+0x5f/0xb0 [drm]
Apr 10 08:30:11 net kernel: [ 1877.879362] drm_ioctl+0x31b/0x3d0 [drm]
Apr 10 08:30:11 net kernel: [ 1877.879372] ? drm_mode_connector_property_set_ioctl+0x60/0x60 [drm]
Apr 10 08:30:11 net kernel: [ 1877.879400] amdgpu_drm_ioctl+0x4f/0x90 [amdgpu]
Apr 10 08:30:11 net kernel: [ 1877.879404] do_vfs_ioctl+0xa8/0x620
Apr 10 08:30:11 net kernel: [ 1877.879408] ? handle_mm_fault+0xe3/0x220
Apr 10 08:30:11 net kernel: [ 1877.879411] ? __do_page_fault+0x270/0x4d0
Apr 10 08:30:11 net kernel: [ 1877.879414] SyS_ioctl+0x79/0x90
Apr 10 08:30:11 net kernel: [ 1877.879417] do_syscall_64+0x73/0x130
Apr 10 08:30:11 net kernel: [ 1877.879421] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
Apr 10 08:30:11 net kernel: [ 1877.879424] RIP: 0033:0x7f53e0bf45d7
Apr 10 08:30:11 net kernel: [ 1877.879425] RSP: 002b:00007fff82f5b1a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Apr 10 08:30:11 net kernel: [ 1877.879427] RAX: ffffffffffffffda RBX: 000055acfe8bdf10 RCX: 00007f53e0bf45d7
Apr 10 08:30:11 net kernel: [ 1877.879429] RDX: 00007fff82f5b1e0 RSI: 00000000c05064a7 RDI: 000000000000000d
Apr 10 08:30:11 net kernel: [ 1877.879430] RBP: 00007fff82f5b1e0 R08: 0000000000000008 R09: 0000000000000008
Apr 10 08:30:11 net kernel: [ 1877.879432] R10: 0000000000000001 R11: 0000000000000246 R12: 00000000c05064a7
Apr 10 08:30:11 net kernel: [ 1877.879433] R13: 000000000000000d R14: 000000000000000d R15: 00007fff82f5b1e0
Apr 10 08:30:11 net kernel: [ 1877.879435] Code: 80 74 05 41 0f b6 72 69 4c 89 d7 e8 00 1e f9 ff eb 85 41 b8 01 00 00 00 48 89 d9 48 89 da 4c 89 d6 e8 9a f6 ff ff e9 6c ff ff ff <0f> 0b 48 8b 3d 7c 97 1c 01 e9 c8 fe ff ff 0f 1f 80 00 00 00 00
Apr 10 08:30:11 net kernel: [ 1877.879466] RIP: kfree+0x16b/0x180 RSP: ffffbc7dc85abab0
Apr 10 08:30:11 net kernel: [ 1877.879468] ---[ end trace 480d2dfde7a7e9da ]---
Apr 10 08:30:11 net kernel: [ 1878.326140] BUG: unable to handle kernel paging request at ffffbc7dc85abbe0
Apr 10 08:30:11 net kernel: [ 1878.326151] IP: __ww_mutex_lock.isra.3+0x283/0x670
Apr 10 08:30:11 net kernel: [ 1878.326153] PGD 81e87e067 P4D 81e87e067 PUD 81e87f067 PMD 7ee091067 PTE 0
Apr 10 08:30:11 net kernel: [ 1878.326159] Oops: 0000 [#2] SMP NOPTI
Apr 10 08:30:11 net kernel: [ 1878.326162] Modules linked in: snd_opl3_synth snd_seq_midi_emul snd_cmipci snd_mpu401_uart snd_opl3_lib snd_hwdep gameport snd_pcm edac_mce_amd snd_seq_midi snd_seq_midi_event kvm_amd snd_rawmidi kvm irqbypass crct10dif_pclmul snd_seq crc32_pclmul ghash_clmulni_intel pcbc snd_seq_device snd_timer aesni_intel aes_x86_64 crypto_simd glue_helper input_leds joydev cryptd snd wmi_bmof soundcore k10temp ccp mac_hid shpchp binfmt_misc sch_fq_codel parport_pc ppdev lp parport ip_tables x_tables autofs4 hid_generic usbhid hid amdkfd amd_iommu_v2 amdgpu chash gpu_sched radeon ttm drm_kms_helper syscopyarea igb sysfillrect sysimgblt fb_sys_fops dca ptp mxm_wmi drm i2c_piix4 alx pps_core ahci i2c_algo_bit mdio libahci gpio_amdpt gpio_generic wmi
Apr 10 08:30:11 net kernel: [ 1878.326209] CPU: 12 PID: 1318 Comm: InputThread Tainted: G D 4.16.1-041601-generic #201804081334
Apr 10 08:30:11 net kernel: [ 1878.326211] Hardware name: Gigabyte Technology Co., Ltd. AX370-Gaming K7/AX370-Gaming K7, BIOS F22 03/15/2018
Apr 10 08:30:11 net kernel: [ 1878.326214] RIP: 0010:__ww_mutex_lock.isra.3+0x283/0x670
Apr 10 08:30:11 net kernel: [ 1878.326216] RSP: 0018:ffffbc7dc90ab870 EFLAGS: 00010286
Apr 10 08:30:11 net kernel: [ 1878.326218] RAX: ffffbc7dc85abbd8 RBX: ffff9e6b3b116a20 RCX: 000000000001c1be
Apr 10 08:30:11 net kernel: [ 1878.326220] RDX: ffff9e6b347a1701 RSI: ffff9e6b3601dc00 RDI: ffffbc7dc90ab890
Apr 10 08:30:11 net kernel: [ 1878.326222] RBP: ffffbc7dc90ab8f0 R08: 0000000000000000 R09: ffff9e6aafdf0100
Apr 10 08:30:11 net kernel: [ 1878.326223] R10: ffffbc7dc90ab908 R11: 0000000000000000 R12: 0000000000000001
Apr 10 08:30:11 net kernel: [ 1878.326225] R13: ffff9e6b3b116a18 R14: ffffbc7dc90abc48 R15: 0000000000000000
Apr 10 08:30:11 net kernel: [ 1878.326228] FS: 00007f53b3fff700(0000) GS:ffff9e6b5ef00000(0000) knlGS:0000000000000000
Apr 10 08:30:11 net kernel: [ 1878.326230] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 10 08:30:11 net kernel: [ 1878.326231] CR2: ffffbc7dc85abbe0 CR3: 00000007efb8e000 CR4: 00000000003406e0
Apr 10 08:30:11 net kernel: [ 1878.326233] Call Trace:
Apr 10 08:30:11 net kernel: [ 1878.326238] __ww_mutex_lock_interruptible_slowpath+0x16/0x20
Apr 10 08:30:11 net kernel: [ 1878.326241] ? __ww_mutex_lock_interruptible_slowpath+0x16/0x20
Apr 10 08:30:11 net kernel: [ 1878.326244] ww_mutex_lock_interruptible+0x5a/0x70
Apr 10 08:30:11 net kernel: [ 1878.326261] drm_modeset_lock+0x9a/0xb0 [drm]
Apr 10 08:30:11 net kernel: [ 1878.326274] drm_modeset_lock_all_ctx+0x24/0xb0 [drm]
Apr 10 08:30:11 net kernel: [ 1878.326327] amdgpu_dm_atomic_check+0x39f/0x4d0 [amdgpu]
Apr 10 08:30:11 net kernel: [ 1878.326340] drm_atomic_check_only+0x389/0x550 [drm]
Apr 10 08:30:11 net kernel: [ 1878.326352] drm_atomic_commit+0x18/0x60 [drm]
Apr 10 08:30:11 net kernel: [ 1878.326361] drm_atomic_helper_update_plane+0xe9/0x100 [drm_kms_helper]
Apr 10 08:30:11 net kernel: [ 1878.326374] __setplane_internal+0x1e5/0x270 [drm]
Apr 10 08:30:11 net kernel: [ 1878.326386] drm_mode_cursor_universal+0xfd/0x210 [drm]
Apr 10 08:30:11 net kernel: [ 1878.326399] drm_mode_cursor_common+0x187/0x200 [drm]
Apr 10 08:30:11 net kernel: [ 1878.326411] ? drm_mode_setplane+0x240/0x240 [drm]
Apr 10 08:30:11 net kernel: [ 1878.326422] drm_mode_cursor_ioctl+0x4a/0x60 [drm]
Apr 10 08:30:11 net kernel: [ 1878.326432] drm_ioctl_kernel+0x5f/0xb0 [drm]
Apr 10 08:30:11 net kernel: [ 1878.326443] drm_ioctl+0x31b/0x3d0 [drm]
Apr 10 08:30:11 net kernel: [ 1878.326454] ? drm_mode_setplane+0x240/0x240 [drm]
Apr 10 08:30:11 net kernel: [ 1878.326485] amdgpu_drm_ioctl+0x4f/0x90 [amdgpu]
Apr 10 08:30:11 net kernel: [ 1878.326490] do_vfs_ioctl+0xa8/0x620
Apr 10 08:30:11 net kernel: [ 1878.326493] ? vfs_read+0x8e/0x130
Apr 10 08:30:11 net kernel: [ 1878.326496] SyS_ioctl+0x79/0x90
Apr 10 08:30:11 net kernel: [ 1878.326500] do_syscall_64+0x73/0x130
Apr 10 08:30:11 net kernel: [ 1878.326504] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
Apr 10 08:30:11 net kernel: [ 1878.326506] RIP: 0033:0x7f53e0bf45d7
Apr 10 08:30:11 net kernel: [ 1878.326508] RSP: 002b:00007f53b3ffd318 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Apr 10 08:30:11 net kernel: [ 1878.326510] RAX: ffffffffffffffda RBX: 000055acfea35170 RCX: 00007f53e0bf45d7
Apr 10 08:30:11 net kernel: [ 1878.326512] RDX: 00007f53b3ffd350 RSI: 00000000c01c64a3 RDI: 000000000000000d
Apr 10 08:30:11 net kernel: [ 1878.326514] RBP: 00007f53b3ffd350 R08: 000055acfea35d30 R09: 0000000000000780
Apr 10 08:30:11 net kernel: [ 1878.326515] R10: 000055acfea6ce20 R11: 0000000000000246 R12: 00000000c01c64a3
Apr 10 08:30:11 net kernel: [ 1878.326517] R13: 000000000000000d R14: 00000000000009ef R15: 0000000000000001
Apr 10 08:30:11 net kernel: [ 1878.326519] Code: cd 03 00 00 45 84 ff 0f 85 43 02 00 00 48 89 df e8 43 34 00 00 e9 39 ff ff ff 49 8b 45 20 48 85 c0 0f 84 38 03 00 00 49 8b 4e 08 <48> 8b 50 08 48 39 d1 0f 88 27 03 00 00 48 39 d1 75 09 49 39 c6
Apr 10 08:30:11 net kernel: [ 1878.326554] RIP: __ww_mutex_lock.isra.3+0x283/0x670 RSP: ffffbc7dc90ab870
Apr 10 08:30:11 net kernel: [ 1878.326555] CR2: ffffbc7dc85abbe0
Apr 10 08:30:11 net kernel: [ 1878.326558] ---[ end trace 480d2dfde7a7e9db ]---
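As a rough cross-check of the "looks like AMDGPU" reading, the bracketed module tag on each Call Trace frame can be tallied. The following is only a sketch: the helper name `count_trace_modules` and the regular expression are my own, not part of any kernel tool, and it assumes the log has one entry per line (as `dmesg` or syslog normally writes it). Frames prefixed with `?` are skipped by default because the unwinder marks them as unreliable.

```python
import re
from collections import Counter

# Matches a call-trace entry such as
#   "... kernel: [ 3902.972438] dc_sink_free+0x34/0x40 [amdgpu]"
# Group 1: optional "? " marker, group 2: symbol, group 3: optional module.
# Anchoring on the timestamp's closing "]" keeps "RIP: ..." lines out.
FRAME_RE = re.compile(
    r"\]\s+(\?\s+)?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(\w+)\])?\s*$"
)

def count_trace_modules(log_text, include_unreliable=False):
    """Count call-trace frames per module (hypothetical helper)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if not match:
            continue
        unreliable, _symbol, module = match.groups()
        if unreliable and not include_unreliable:
            continue  # "?" frames are guesses by the stack unwinder
        # Frames without a "[module]" tag live in the core kernel image.
        counts[module or "kernel"] += 1
    return counts
```

Run over the traces above, this only shows which modules appear on the crashing call path. It does not prove where the bad free originated, so it is a hint for choosing a bug tracker, not a diagnosis.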
AMDGPU vs Radeon
For now I have blacklisted amdgpu and reverted my kernel parameters to use radeon instead. I also updated the initramfs to make sure amdgpu is not loaded. lsmod shows that I'm now using radeon and that amdgpu is no longer loaded, so the blacklist works. I will test this new configuration today to see if I can finally get a stable system, and I'll update this thread if it works.
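The lsmod check can also be scripted, since lsmod just formats the contents of /proc/modules, where the first field of each line is the module name. This is a minimal sketch under that assumption; `loaded_gpu_drivers` is a made-up helper name, and it takes the file contents as a string so it can be tried on any text.

```python
def loaded_gpu_drivers(proc_modules_text, candidates=("amdgpu", "radeon")):
    """Return which candidate drivers appear in /proc/modules-style text.

    Hypothetical helper: each non-empty line is expected to start with
    the module name, followed by size, use count, and so on.
    """
    loaded = {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }
    # Preserve the order of `candidates` in the result.
    return [name for name in candidates if name in loaded]
```

Passing it the text read from /proc/modules should return only `radeon` once the blacklist and the rebuilt initramfs have taken effect. If `amdgpu` still shows up, the module is being loaded from somewhere else.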
The question is: should I file this as an AMDGPU bug or as a Ryzen bug? And where?
############
Update: I left the PC for a few hours, and it didn't freeze. Now I don't get kernel panics. I'll test it a little bit more, but it looks like it's pretty much related to AMDGPU after all. So I'll just report the bug later.
############