From: bugzilla-daemon@bugzilla.kernel.org
To: dri-devel@lists.freedesktop.org
Subject: [Bug 200139] New: amdgpu lockup after resume from sleep
Date: Tue, 19 Jun 2018 13:05:24 +0000
Message-ID: <bug-200139-2300@https.bugzilla.kernel.org/>
https://bugzilla.kernel.org/show_bug.cgi?id=200139
Bug ID: 200139
Summary: amdgpu lockup after resume from sleep
Product: Drivers
Version: 2.5
Kernel Version: 4.17.2
Hardware: Intel
OS: Linux
Tree: Mainline
Status: NEW
Severity: normal
Priority: P1
Component: Video(DRI - non Intel)
Assignee: drivers_video-dri@kernel-bugs.osdl.org
Reporter: j.hoffmann@quapona.com
Regression: No
Created attachment 276689
--> https://bugzilla.kernel.org/attachment.cgi?id=276689&action=edit
HWInfo
I have observed a GPU lockup when the system resumes from sleep. The
duration of the sleep doesn't matter. The problem occurs every time I put the
system to sleep.
I was able to narrow the problem down a little. When I switch to the console
and then put the system to sleep, the system comes up properly (with a warning
trace in an amdgpu function). If I then switch back to the login manager or to
the desktop, the GPU faults and eventually hangs. See logs below.
I can also reproduce the problem with kernel 4.16.13. Furthermore, it doesn't
matter whether amdgpu.dc is enabled or disabled.
System
----------
Linux 4.17.2
Debian Unstable
X.Org 1.20
Mesa 18.1.1
Radeon RX 580 Series (POLARIS10, DRM 3.25.0, 4.17.2, LLVM 6.0.0)
CPU Intel Core i7-8700K
MB Asus Prime Z370-A
Kernel log after the resume from console:
-----------------------------------------
Jun 19 14:24:39 moc kernel: sd 0:0:0:0: [sda] Starting disk
Jun 19 14:24:39 moc kernel: [drm] PCIE GART of 256M enabled (table at
0x000000F400040000).
Jun 19 14:24:39 moc kernel: WARNING: CPU: 7 PID: 28047 at
drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:725
amdgpu_dm_display_resume+0x213/0x220 [amdgpu]
Jun 19 14:24:39 moc kernel: Modules linked in: vmnet(OE)
vmw_vsock_vmci_transport(E) vsock(E) vmw_vmci(E) vmmon(OE) fuse(E) joydev(E)
hid_cherry(E) hid_generic(E) usbhid(E) hid(E) intel_rapl(E)
x86_pkg_temp_thermal(E) intel_powerclamp(E) coretemp(E) kvm_intel(E) kvm(E) ir
Jun 19 14:24:39 moc kernel: asus_wmi(E) evdev(E) efi_pstore(E) intel_uncore(E)
sparse_keymap(E) wmi_bmof(E) mxm_wmi(E) i2c_algo_bit(E) rfkill(E) sg(E)
intel_rapl_perf(E) iTCO_wdt(E) efivars(E) snd(E) mei_me(E)
iTCO_vendor_support(E) soundcore(E) mei(E) shpchp(E) wmi(E) v
Jun 19 14:24:39 moc kernel: btrfs(E) zstd_decompress(E) zstd_compress(E)
xxhash(E) raid10(E) raid456(E) async_raid6_recov(E) async_memcpy(E) async_pq(E)
async_xor(E) async_tx(E) xor(E) raid6_pq(E) libcrc32c(E) crc32c_generic(E)
raid1(E) raid0(E) multipath(E) linear(E) md
Jun 19 14:24:39 moc kernel: CPU: 7 PID: 28047 Comm: kworker/u24:7 Tainted: G
OE 4.17.2 #1
Jun 19 14:24:39 moc kernel: Hardware name: System manufacturer System Product
Name/PRIME Z370-A, BIOS 0805 05/18/2018
Jun 19 14:24:39 moc kernel: Workqueue: events_unbound async_run_entry_fn
Jun 19 14:24:39 moc kernel: RIP: 0010:amdgpu_dm_display_resume+0x213/0x220
[amdgpu]
Jun 19 14:24:39 moc kernel: RSP: 0000:ffffaadd4447fd60 EFLAGS: 00010202
Jun 19 14:24:39 moc kernel: RAX: 0000000000000002 RBX: ffff96d7a48b0000 RCX:
0000000000000006
Jun 19 14:24:39 moc kernel: RDX: 0000000000000006 RSI: ffff96d6915a2c80 RDI:
ffff96d7898f7800
Jun 19 14:24:39 moc kernel: RBP: ffff96d79fb9d800 R08: 0000000000000000 R09:
ffffffffc14a7174
Jun 19 14:24:39 moc kernel: R10: ffffe4dea0a9a840 R11: 0000000000000001 R12:
0000000000000000
Jun 19 14:24:39 moc kernel: R13: ffff96d7a5e43800 R14: ffff96d7a9ca8d40 R15:
ffffffffb4695dbb
Jun 19 14:24:39 moc kernel: FS: 0000000000000000(0000)
GS:ffff96d7ae3c0000(0000) knlGS:0000000000000000
Jun 19 14:24:39 moc kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 19 14:24:39 moc kernel: CR2: 0000000000000000 CR3: 00000003aa80a001 CR4:
00000000003606e0
Jun 19 14:24:39 moc kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
Jun 19 14:24:39 moc kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
0000000000000400
Jun 19 14:24:39 moc kernel: Call Trace:
Jun 19 14:24:39 moc kernel: amdgpu_device_ip_resume_phase2+0x45/0xb0 [amdgpu]
Jun 19 14:24:39 moc kernel: amdgpu_device_resume+0xbf/0x380 [amdgpu]
Jun 19 14:24:39 moc kernel: ? pci_pm_freeze+0xd0/0xd0
Jun 19 14:24:39 moc kernel: ? pci_pm_freeze+0xd0/0xd0
Jun 19 14:24:39 moc kernel: dpm_run_callback+0x4d/0x130
Jun 19 14:24:39 moc kernel: device_resume+0x97/0x190
Jun 19 14:24:39 moc kernel: async_resume+0x19/0x40
Jun 19 14:24:39 moc kernel: async_run_entry_fn+0x39/0x160
Jun 19 14:24:39 moc kernel: process_one_work+0x17b/0x360
Jun 19 14:24:39 moc kernel: worker_thread+0x2e/0x390
Jun 19 14:24:39 moc kernel: ? process_one_work+0x360/0x360
Jun 19 14:24:39 moc kernel: kthread+0x113/0x130
Jun 19 14:24:39 moc kernel: ? kthread_create_worker_on_cpu+0x70/0x70
Jun 19 14:24:39 moc kernel: ret_from_fork+0x35/0x40
Jun 19 14:24:39 moc kernel: Code: 00 7f ac 48 89 ef e8 dd df a5 ff 48 c7 83 90
aa 00 00 00 00 00 00 89 c5 48 89 df e8 c8 17 00 00 89 e8 5b 5d 41 5c 41 5d 41
5e c3 <0f> 0b e9 48 ff ff ff 0f 0b eb a5 66 90 0f 1f 44 00 00 53 48 89
Jun 19 14:24:39 moc kernel: ---[ end trace c39336409cdb2ae3 ]---
Jun 19 14:24:39 moc kernel: [drm] UVD and UVD ENC initialized successfully.
Jun 19 14:24:39 moc kernel: ixgbe 0000:03:00.0: Multiqueue Enabled: Rx Queue
count = 12, Tx Queue count = 12 XDP Queue count = 0
Log after switching to X11
---------------------------
Jun 19 14:29:13 moc kernel: amdgpu 0000:01:00.0: GPU fault detected: 147
0x0a304401
Jun 19 14:29:13 moc kernel: amdgpu 0000:01:00.0:
VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x08404D46
Jun 19 14:29:13 moc kernel: amdgpu 0000:01:00.0:
VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x08044001
Jun 19 14:29:13 moc kernel: amdgpu 0000:01:00.0: VM fault (0x01, vmid 4, pasid
0) at page 138431814, read from 'TC5' (0x54433500) (68)
Jun 19 14:29:13 moc kernel: amdgpu 0000:01:00.0: GPU fault detected: 146
0x0000480c
Jun 19 14:29:13 moc kernel: amdgpu 0000:01:00.0:
VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x08404D46
Jun 19 14:29:13 moc kernel: amdgpu 0000:01:00.0:
VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x08044001
Jun 19 14:29:13 moc kernel: amdgpu 0000:01:00.0: VM fault (0x01, vmid 4, pasid
0) at page 138431814, read from 'TC5' (0x54433500) (68)
Jun 19 14:29:14 moc kernel: amdgpu 0000:01:00.0: GPU fault detected: 147
0x0a304401
Jun 19 14:29:14 moc kernel: amdgpu 0000:01:00.0:
VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x08404D46
Jun 19 14:29:14 moc kernel: amdgpu 0000:01:00.0:
VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x08044001
Jun 19 14:29:14 moc kernel: amdgpu 0000:01:00.0: VM fault (0x01, vmid 4, pasid
0) at page 138431814, read from 'TC5' (0x54433500) (68)
Jun 19 14:29:14 moc kernel: amdgpu 0000:01:00.0: GPU fault detected: 147
0x0a304401
Jun 19 14:29:14 moc kernel: amdgpu 0000:01:00.0:
VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x08404D46
Jun 19 14:29:14 moc kernel: amdgpu 0000:01:00.0:
VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x08044001
Jun 19 14:29:14 moc kernel: amdgpu 0000:01:00.0: VM fault (0x01, vmid 4, pasid
0) at page 138431814, read from 'TC5' (0x54433500) (68)
Jun 19 14:29:14 moc kernel: amdgpu 0000:01:00.0: GPU fault detected: 146
0x0000480c
Jun 19 14:29:14 moc kernel: amdgpu 0000:01:00.0:
VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x0E40C60C
Jun 19 14:29:14 moc kernel: amdgpu 0000:01:00.0:
VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x08048001
Jun 19 14:29:14 moc kernel: amdgpu 0000:01:00.0: VM fault (0x01, vmid 4, pasid
0) at page 239126028, read from 'TC4' (0x54433400) (72)
Jun 19 14:29:14 moc kernel: amdgpu 0000:01:00.0: GPU fault detected: 146
0x0000480c
Jun 19 14:29:14 moc kernel: amdgpu 0000:01:00.0:
VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x00000000
Jun 19 14:29:14 moc kernel: amdgpu 0000:01:00.0:
VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0804800C
Jun 19 14:29:14 moc kernel: amdgpu 0000:01:00.0: VM fault (0x0c, vmid 4, pasid
0) at page 0, read from 'TC4' (0x54433400) (72)
Jun 19 14:29:14 moc kernel: amdgpu 0000:01:00.0: GPU fault detected: 147
0x0a304401
Jun 19 14:29:14 moc kernel: amdgpu 0000:01:00.0:
VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x08404D46
Jun 19 14:29:14 moc kernel: amdgpu 0000:01:00.0:
VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x08044001
Jun 19 14:29:14 moc kernel: amdgpu 0000:01:00.0: VM fault (0x01, vmid 4, pasid
0) at page 138431814, read from 'TC5' (0x54433500) (68)
Jun 19 14:29:14 moc kernel: amdgpu 0000:01:00.0: GPU fault detected: 146
0x0000480c
Jun 19 14:29:14 moc kernel: amdgpu 0000:01:00.0:
VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x00000000
Jun 19 14:29:14 moc kernel: amdgpu 0000:01:00.0:
VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0804800C
Jun 19 14:29:14 moc kernel: amdgpu 0000:01:00.0: VM fault (0x0c, vmid 4, pasid
0) at page 0, read from 'TC4' (0x54433400) (72)
Jun 19 14:29:14 moc kernel: amdgpu 0000:01:00.0: GPU fault detected: 147
0x0a304401
Jun 19 14:29:14 moc kernel: amdgpu 0000:01:00.0:
VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x08404D46
Jun 19 14:29:14 moc kernel: amdgpu 0000:01:00.0:
VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x08044001
Jun 19 14:29:14 moc kernel: amdgpu 0000:01:00.0: VM fault (0x01, vmid 4, pasid
0) at page 138431814, read from 'TC5' (0x54433500) (68)
Jun 19 14:29:14 moc kernel: amdgpu 0000:01:00.0: GPU fault detected: 147
0x0a304401
Jun 19 14:29:14 moc kernel: amdgpu 0000:01:00.0:
VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x08404D46
Jun 19 14:29:14 moc kernel: amdgpu 0000:01:00.0:
VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x08044001
Jun 19 14:29:14 moc kernel: amdgpu 0000:01:00.0: VM fault (0x01, vmid 4, pasid
0) at page 138431814, read from 'TC5' (0x54433500) (68)
Jun 19 14:29:24 moc kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx
timeout, last signaled seq=384604, last emitted seq=384605
Jun 19 14:29:24 moc kernel: [drm] IP block:gfx_v8_0 is hung!
Jun 19 14:29:24 moc kernel: [drm] GPU recovery disabled.
-- Reboot --
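The repeated fault records above can be cross-checked with a short script. The
following is a minimal sketch, not part of the driver: it pairs each
VM_CONTEXT1_PROTECTION_FAULT_ADDR value with the STATUS value that follows it
and splits STATUS into fields. The bit layout used in decode_status() is an
assumption, chosen because it reproduces the values the kernel printed in the
"VM fault" lines (protections 0x01/0x0c, vmid 4, client 68/72, read access).
The names SAMPLE, decode_status, decode_faults and client_tag are hypothetical
helpers introduced for illustration only.

```python
import re

# Register lines copied from the log above, with the mail line-wrapping undone.
SAMPLE = """\
amdgpu 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x08404D46
amdgpu 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x08044001
amdgpu 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x0E40C60C
amdgpu 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x08048001
amdgpu 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x00000000
amdgpu 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0804800C
"""

REG_RE = re.compile(
    r"VM_CONTEXT1_PROTECTION_FAULT_(ADDR|STATUS)\s+0x([0-9A-Fa-f]{8})"
)


def decode_status(status):
    """Split a STATUS value into fields (bit layout is an assumption)."""
    return {
        "protections": status & 0xFF,       # low 8 bits
        "client_id": (status >> 12) & 0xFF,  # 8 bits starting at bit 12
        "write": bool((status >> 24) & 0x1),  # bit 24: False means read
        "vmid": (status >> 25) & 0xF,        # 4 bits starting at bit 25
    }


def decode_faults(text):
    """Pair each ADDR register with the STATUS register that follows it."""
    faults = []
    addr = None
    for kind, value in REG_RE.findall(text):
        if kind == "ADDR":
            addr = int(value, 16)
        elif addr is not None:
            fields = decode_status(int(value, 16))
            fields["page"] = addr  # the log prints this value in decimal
            faults.append(fields)
            addr = None
    return faults


def client_tag(raw):
    """Turn the packed client value, e.g. 0x54433500, into ASCII text."""
    return raw.to_bytes(4, "big").rstrip(b"\x00").decode("ascii")


if __name__ == "__main__":
    for fault in decode_faults(SAMPLE):
        print(fault)
    print(client_tag(0x54433500))
```

Under these assumptions, the ADDR register is the same number as the "page"
value in the log (0x08404D46 is 138431814 in decimal), and 0x54433500 is the
ASCII string "TC5" followed by a zero byte.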
--
You are receiving this mail because:
You are watching the assignee of the bug.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel