* [PATCH 1/2] drm/amdgpu: fix gpu reset crash
From: Chunming Zhou @ 2017-04-24 9:40 UTC
To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Chunming Zhou
[ 413.687439] BUG: unable to handle kernel NULL pointer dereference at 0000000000000548
[ 413.687479] IP: [<ffffffff8109b175>] to_live_kthread+0x5/0x60
[ 413.687507] PGD 1efd12067
[ 413.687519] PUD 1efd11067
[ 413.687531] PMD 0
[ 413.687543] Oops: 0000 [#1] SMP
[ 413.687557] Modules linked in: amdgpu(OE) ttm(OE) drm_kms_helper(E) drm(E) i2c_algo_bit(E) fb_sys_fops(E) syscopyarea(E) sysfillrect(E) sysimgblt(E) rpcsec_gss_krb5(E) nfsv4(E) nfs(E) fscache(E) snd_hda_codec_realtek(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) eeepc_wmi(E) snd_hda_codec(E) asus_wmi(E) snd_hda_core(E) sparse_keymap(E) snd_hwdep(E) video(E) snd_pcm(E) snd_seq_midi(E) joydev(E) snd_seq_midi_event(E) snd_rawmidi(E) snd_seq(E) snd_seq_device(E) snd_timer(E) kvm(E) irqbypass(E) crct10dif_pclmul(E) snd(E) crc32_pclmul(E) ghash_clmulni_intel(E) soundcore(E) aesni_intel(E) aes_x86_64(E) lrw(E) gf128mul(E) glue_helper(E) ablk_helper(E) cryptd(E) shpchp(E) serio_raw(E) i2c_piix4(E) 8250_dw(E) i2c_designware_platform(E) i2c_designware_core(E) mac_hid(E) binfmt_misc(E)
[ 413.687894] parport_pc(E) ppdev(E) lp(E) parport(E) nfsd(E) auth_rpcgss(E) nfs_acl(E) lockd(E) grace(E) sunrpc(E) autofs4(E) hid_generic(E) usbhid(E) hid(E) psmouse(E) ahci(E) r8169(E) mii(E) libahci(E) wmi(E)
[ 413.687989] CPU: 13 PID: 1134 Comm: kworker/13:2 Tainted: G OE 4.9.0-custom #4
[ 413.688019] Hardware name: System manufacturer System Product Name/PRIME B350-PLUS, BIOS 0606 04/06/2017
[ 413.688089] Workqueue: events amd_sched_job_timedout [amdgpu]
[ 413.688116] task: ffff88020f9657c0 task.stack: ffffc90001a88000
[ 413.688139] RIP: 0010:[<ffffffff8109b175>] [<ffffffff8109b175>] to_live_kthread+0x5/0x60
[ 413.688171] RSP: 0018:ffffc90001a8bd60 EFLAGS: 00010282
[ 413.688191] RAX: ffff88020f0073f8 RBX: ffff88020f000000 RCX: 0000000000000000
[ 413.688217] RDX: 0000000000000001 RSI: ffff88020f9670c0 RDI: 0000000000000000
[ 413.688243] RBP: ffffc90001a8bd78 R08: 0000000000000000 R09: 0000000000001000
[ 413.688269] R10: 0000006051b11a82 R11: 0000000000000001 R12: 0000000000000000
[ 413.688295] R13: ffff88020f002770 R14: ffff88020f004838 R15: ffff8801b23c2c60
[ 413.688321] FS: 0000000000000000(0000) GS:ffff88021ef40000(0000) knlGS:0000000000000000
[ 413.688352] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 413.688373] CR2: 0000000000000548 CR3: 00000001efd0f000 CR4: 00000000003406e0
[ 413.688399] Stack:
[ 413.688407] ffffffff8109b304 ffff88020f000000 0000000000000070 ffffc90001a8bdf0
[ 413.688439] ffffffffa05ce29d ffffffffa052feb7 ffffffffa07b5820 ffffc90001a8bda0
[ 413.688470] ffffffff00000018 ffff8801bb88f060 0000000001a8bdb8 ffff88021ef59280
[ 413.688502] Call Trace:
[ 413.688514] [<ffffffff8109b304>] ? kthread_park+0x14/0x60
[ 413.688555] [<ffffffffa05ce29d>] amdgpu_gpu_reset+0x7d/0x670 [amdgpu]
[ 413.688589] [<ffffffffa052feb7>] ? drm_printk+0x97/0xa0 [drm]
[ 413.688643] [<ffffffffa0698136>] amdgpu_job_timedout+0x46/0x50 [amdgpu]
[ 413.688700] [<ffffffffa06969e7>] amd_sched_job_timedout+0x17/0x20 [amdgpu]
[ 413.688727] [<ffffffff81095493>] process_one_work+0x153/0x3f0
[ 413.688751] [<ffffffff81095c5b>] worker_thread+0x12b/0x4b0
[ 413.688773] [<ffffffff8100392e>] ? do_syscall_64+0x6e/0x180
[ 413.688795] [<ffffffff81095b30>] ? rescuer_thread+0x350/0x350
[ 413.688818] [<ffffffff8100392e>] ? do_syscall_64+0x6e/0x180
[ 413.688839] [<ffffffff8109b423>] kthread+0xd3/0xf0
[ 413.688858] [<ffffffff8109b350>] ? kthread_park+0x60/0x60
[ 413.688881] [<ffffffff817e1ee5>] ret_from_fork+0x25/0x30
[ 413.688901] Code: 25 40 d3 00 00 48 8b 80 48 05 00 00 48 89 e5 5d 48 8b 40 c8 48 c1 e8 02 83 e0 01 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 <48> 8b b7 48 05 00 00 55 48 89 e5 48 85 f6 74 31 8b 97 f8 18 00
[ 413.689045] RIP [<ffffffff8109b175>] to_live_kthread+0x5/0x60
[ 413.689064] RSP <ffffc90001a8bd60>
[ 413.689076] CR2: 0000000000000548
[ 413.697985] ---[ end trace 0a314a64821f84e9 ]---
The root cause is that some rings, such as the KIQ ring, do not have a scheduler.
Change-Id: I420e84add9cdd9a7fd1f9921b8a5d0afa3dd2058
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 9993085..168a9de 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2675,7 +2675,7 @@ int amdgpu_gpu_reset(struct amdgpu_device *adev)
 	for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
 		struct amdgpu_ring *ring = adev->rings[i];
 
-		if (!ring)
+		if (!ring || !ring->sched.thread)
 			continue;
 		kcl_kthread_park(ring->sched.thread);
 		amd_sched_hw_job_reset(&ring->sched);
@@ -2770,7 +2770,8 @@ int amdgpu_gpu_reset(struct amdgpu_device *adev)
 		}
 		for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
 			struct amdgpu_ring *ring = adev->rings[i];
-			if (!ring)
+
+			if (!ring || !ring->sched.thread)
 				continue;
 
 			amd_sched_job_recovery(&ring->sched);
@@ -2779,7 +2780,7 @@ int amdgpu_gpu_reset(struct amdgpu_device *adev)
 	} else {
 		dev_err(adev->dev, "asic resume failed (%d).\n", r);
 		for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
-			if (adev->rings[i]) {
+			if (adev->rings[i] && adev->rings[i]->sched.thread) {
 				kcl_kthread_unpark(adev->rings[i]->sched.thread);
 			}
 		}
--
1.9.1
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
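For readers following the fix, the guard added by this patch can be modeled outside the kernel. The sketch below is a minimal user-space illustration, not the driver code: `struct ring`, `struct device`, `MAX_RINGS`, and `park_schedulers()` are hypothetical stand-ins for `struct amdgpu_ring`, `struct amdgpu_device`, `AMDGPU_MAX_RINGS`, and the parking loop in `amdgpu_gpu_reset()`. It assumes, as the commit message states, that a ring such as KIQ can exist without a scheduler thread. In the oops above, the faulting bytes `<48> 8b b7 48 05 00 00` decode to a load from offset 0x548 of RDI; RDI is 0 and CR2 is 0x548, which is consistent with a NULL thread pointer reaching the park call.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_RINGS 4 /* hypothetical stand-in for AMDGPU_MAX_RINGS */

/* Hypothetical, simplified stand-ins for the driver structures. */
struct sched {
	void *thread; /* NULL when the ring has no scheduler thread */
	int parked;
};

struct ring {
	struct sched sched;
};

struct device {
	struct ring *rings[MAX_RINGS];
};

/*
 * Park every ring that owns a scheduler thread and return how many
 * rings were parked. Rings that are absent, or that have no scheduler
 * thread (the KIQ case in the commit message), are skipped.
 */
int park_schedulers(struct device *dev)
{
	int i, parked = 0;

	assert(dev != NULL);
	for (i = 0; i < MAX_RINGS; ++i) {
		struct ring *ring = dev->rings[i];

		/* Same shape as the patched check in the diff. */
		if (!ring || !ring->sched.thread)
			continue;
		ring->sched.parked = 1; /* models the park call */
		++parked;
	}
	return parked;
}
```

With one ring that has a thread and one that does not, only the first is parked; without the second half of the condition, the model would hand a NULL thread to the park step, which is the situation the oops shows.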
* [PATCH 2/2] drm/amdgpu: fix NULL pointer error
From: Chunming Zhou @ 2017-04-24 9:40 UTC
To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Chunming Zhou
[ 141.420491] BUG: unable to handle kernel NULL pointer dereference at 0000000000000030
[ 141.420532] IP: [<ffffffff81579ee1>] fence_remove_callback+0x11/0x60
[ 141.420563] PGD 20a030067
[ 141.420575] PUD 2088ca067
[ 141.420587] PMD 0
[ 141.420599] Oops: 0000 [#1] SMP
[ 141.420612] Modules linked in: amdgpu(OE) ttm(OE) drm_kms_helper(E) drm(E) i2c_algo_bit(E) fb_sys_fops(E) syscopyarea(E) sysfillrect(E) sysimgblt(E) rpcsec_gss_krb5(E) nfsv4(E) nfs(E) fscache(E) eeepc_wmi(E) asus_wmi(E) sparse_keymap(E) snd_hda_codec_realtek(E) video(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) joydev(E) snd_hda_codec(E) snd_seq_midi(E) snd_seq_midi_event(E) snd_hda_core(E) snd_hwdep(E) snd_rawmidi(E) snd_pcm(E) kvm(E) irqbypass(E) crct10dif_pclmul(E) snd_seq(E) crc32_pclmul(E) ghash_clmulni_intel(E) snd_seq_device(E) snd_timer(E) aesni_intel(E) aes_x86_64(E) lrw(E) gf128mul(E) glue_helper(E) ablk_helper(E) cryptd(E) snd(E) soundcore(E) serio_raw(E) shpchp(E) i2c_piix4(E) i2c_designware_platform(E) 8250_dw(E) i2c_designware_core(E) mac_hid(E) binfmt_misc(E)
[ 141.420948] nfsd(E) auth_rpcgss(E) nfs_acl(E) lockd(E) grace(E) sunrpc(E) parport_pc(E) ppdev(E) lp(E) parport(E) autofs4(E) hid_generic(E) usbhid(E) hid(E) psmouse(E) r8169(E) ahci(E) mii(E) libahci(E) wmi(E)
[ 141.421042] CPU: 14 PID: 223 Comm: kworker/14:2 Tainted: G OE 4.9.0-custom #4
[ 141.421074] Hardware name: System manufacturer System Product Name/PRIME B350-PLUS, BIOS 0606 04/06/2017
[ 141.421146] Workqueue: events amd_sched_job_timedout [amdgpu]
[ 141.421169] task: ffff88020b03ba80 task.stack: ffffc900016f4000
[ 141.421193] RIP: 0010:[<ffffffff81579ee1>] [<ffffffff81579ee1>] fence_remove_callback+0x11/0x60
[ 141.421229] RSP: 0018:ffffc900016f7d30 EFLAGS: 00010202
[ 141.421250] RAX: ffff8801c049fc00 RBX: ffff8801d4d8dc00 RCX: 0000000000000000
[ 141.421278] RDX: 0000000000000001 RSI: ffff8801c049fcc0 RDI: 0000000000000000
[ 141.421307] RBP: ffffc900016f7d48 R08: 0000000000000000 R09: 0000000000000000
[ 141.421334] R10: 00000020ed512a30 R11: 0000000000000001 R12: 0000000000000000
[ 141.421362] R13: ffff880209ba4ba0 R14: ffff880209ba4c58 R15: ffff8801c055cc60
[ 141.421390] FS: 0000000000000000(0000) GS:ffff88021ef80000(0000) knlGS:0000000000000000
[ 141.421421] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 141.421443] CR2: 0000000000000030 CR3: 000000020b554000 CR4: 00000000003406e0
[ 141.421471] Stack:
[ 141.421480] ffff8801d4d8dc00 ffff880209ba4c48 ffff880209ba4ba0 ffffc900016f7d78
[ 141.421513] ffffffffa0697920 ffff880209ba0000 0000000000000000 ffff880209ba2770
[ 141.421549] ffff880209ba4b08 ffffc900016f7df0 ffffffffa05ce2ae ffffffffa0509eb7
[ 141.421583] Call Trace:
[ 141.421628] [<ffffffffa0697920>] amd_sched_hw_job_reset+0x50/0xb0 [amdgpu]
[ 141.421676] [<ffffffffa05ce2ae>] amdgpu_gpu_reset+0x8e/0x690 [amdgpu]
[ 141.421712] [<ffffffffa0509eb7>] ? drm_printk+0x97/0xa0 [drm]
[ 141.421770] [<ffffffffa0698156>] amdgpu_job_timedout+0x46/0x50 [amdgpu]
[ 141.421829] [<ffffffffa0696a07>] amd_sched_job_timedout+0x17/0x20 [amdgpu]
[ 141.421859] [<ffffffff81095493>] process_one_work+0x153/0x3f0
[ 141.421884] [<ffffffff81095c5b>] worker_thread+0x12b/0x4b0
[ 141.421907] [<ffffffff81095b30>] ? rescuer_thread+0x350/0x350
[ 141.421931] [<ffffffff8109b423>] kthread+0xd3/0xf0
[ 141.421951] [<ffffffff8109b350>] ? kthread_park+0x60/0x60
[ 141.421975] [<ffffffff817e1ee5>] ret_from_fork+0x25/0x30
[ 141.421996] Code: ac 81 e8 a3 1f b0 ff 48 c7 c0 ea ff ff ff e9 48 ff ff ff 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 55 41 54 49 89 fc 53 <48> 8b 7f 30 48 89 f3 e8 73 7c 26 00 48 8b 13 48 39 d3 41 0f 95
[ 141.422156] RIP [<ffffffff81579ee1>] fence_remove_callback+0x11/0x60
[ 141.422183] RSP <ffffc900016f7d30>
[ 141.422197] CR2: 0000000000000030
[ 141.433483] ---[ end trace bc0949bf7ddd6d4b ]---
If the job is reset twice, the parent fence could be NULL.
Change-Id: I234887f5c26cf1fb9c7bdec3fc6c25a75f6dd3c0
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
---
drivers/gpu/drm/amd/scheduler/gpu_scheduler.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c b/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c
index 8ab345d..80bc4f7 100644
--- a/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c
+++ b/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c
@@ -385,7 +385,9 @@ void amd_sched_hw_job_reset(struct amd_gpu_scheduler *sched)
 
 	spin_lock(&sched->job_list_lock);
 	list_for_each_entry_reverse(s_job, &sched->ring_mirror_list, node) {
-		if (fence_remove_callback(s_job->s_fence->parent, &s_job->s_fence->cb)) {
+		if (s_job->s_fence->parent &&
+		    fence_remove_callback(s_job->s_fence->parent,
+					  &s_job->s_fence->cb)) {
 			fence_put(s_job->s_fence->parent);
 			s_job->s_fence->parent = NULL;
 		}
--
1.9.1
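The double-reset case described in this patch can also be modeled in user space. The following is a minimal sketch under stated assumptions, not the scheduler code: `struct fence`, `struct sched_fence`, `remove_callback()`, and `hw_job_reset()` are hypothetical, simplified stand-ins for the kernel's fence, the scheduler fence, `fence_remove_callback()`, and the per-job body of `amd_sched_hw_job_reset()`. The reference count is a plain integer here purely for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a hardware fence. */
struct fence {
	int refcount;
	int has_cb; /* 1 while a callback is still installed */
};

/* Hypothetical stand-in for the scheduler fence that tracks its parent. */
struct sched_fence {
	struct fence *parent;
};

/* Models the callback removal: returns 1 only if a callback was removed. */
int remove_callback(struct fence *f)
{
	assert(f != NULL); /* the real call dereferences its argument */
	if (!f->has_cb)
		return 0;
	f->has_cb = 0;
	return 1;
}

/*
 * Models one iteration of the reset loop with the patched condition:
 * the parent is checked before it is handed to the removal helper.
 * Returns 1 if the parent was detached, 0 if there was nothing to do.
 */
int hw_job_reset(struct sched_fence *s_fence)
{
	if (s_fence->parent && remove_callback(s_fence->parent)) {
		s_fence->parent->refcount--; /* models dropping the reference */
		s_fence->parent = NULL;
		return 1;
	}
	return 0;
}
```

Calling `hw_job_reset()` twice on the same scheduler fence detaches the parent on the first call and does nothing on the second; without the `s_fence->parent` check, the second call would pass NULL to the removal helper, which corresponds to the fault at offset 0x30 in the oops above.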
* Re: [PATCH 1/2] drm/amdgpu: fix gpu reset crash
From: Christian König @ 2017-04-24 9:47 UTC
To: Chunming Zhou, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
On 24.04.2017 at 11:40, Chunming Zhou wrote:
> [ 413.687439] BUG: unable to handle kernel NULL pointer dereference at 0000000000000548
> [ 413.687479] IP: [<ffffffff8109b175>] to_live_kthread+0x5/0x60
> [ 413.687507] PGD 1efd12067
> [ 413.687519] PUD 1efd11067
> [ 413.687531] PMD 0
>
> [ 413.687543] Oops: 0000 [#1] SMP
> [ 413.687557] Modules linked in: amdgpu(OE) ttm(OE) drm_kms_helper(E) drm(E) i2c_algo_bit(E) fb_sys_fops(E) syscopyarea(E) sysfillrect(E) sysimgblt(E) rpcsec_gss_krb5(E) nfsv4(E) nfs(E) fscache(E) snd_hda_codec_realtek(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) eeepc_wmi(E) snd_hda_codec(E) asus_wmi(E) snd_hda_core(E) sparse_keymap(E) snd_hwdep(E) video(E) snd_pcm(E) snd_seq_midi(E) joydev(E) snd_seq_midi_event(E) snd_rawmidi(E) snd_seq(E) snd_seq_device(E) snd_timer(E) kvm(E) irqbypass(E) crct10dif_pclmul(E) snd(E) crc32_pclmul(E) ghash_clmulni_intel(E) soundcore(E) aesni_intel(E) aes_x86_64(E) lrw(E) gf128mul(E) glue_helper(E) ablk_helper(E) cryptd(E) shpchp(E) serio_raw(E) i2c_piix4(E) 8250_dw(E) i2c_designware_platform(E) i2c_designware_core(E) mac_hid(E) binfmt_misc(E)
> [ 413.687894] parport_pc(E) ppdev(E) lp(E) parport(E) nfsd(E) auth_rpcgss(E) nfs_acl(E) lockd(E) grace(E) sunrpc(E) autofs4(E) hid_generic(E) usbhid(E) hid(E) psmouse(E) ahci(E) r8169(E) mii(E) libahci(E) wmi(E)
> [ 413.687989] CPU: 13 PID: 1134 Comm: kworker/13:2 Tainted: G OE 4.9.0-custom #4
> [ 413.688019] Hardware name: System manufacturer System Product Name/PRIME B350-PLUS, BIOS 0606 04/06/2017
> [ 413.688089] Workqueue: events amd_sched_job_timedout [amdgpu]
> [ 413.688116] task: ffff88020f9657c0 task.stack: ffffc90001a88000
> [ 413.688139] RIP: 0010:[<ffffffff8109b175>] [<ffffffff8109b175>] to_live_kthread+0x5/0x60
> [ 413.688171] RSP: 0018:ffffc90001a8bd60 EFLAGS: 00010282
> [ 413.688191] RAX: ffff88020f0073f8 RBX: ffff88020f000000 RCX: 0000000000000000
> [ 413.688217] RDX: 0000000000000001 RSI: ffff88020f9670c0 RDI: 0000000000000000
> [ 413.688243] RBP: ffffc90001a8bd78 R08: 0000000000000000 R09: 0000000000001000
> [ 413.688269] R10: 0000006051b11a82 R11: 0000000000000001 R12: 0000000000000000
> [ 413.688295] R13: ffff88020f002770 R14: ffff88020f004838 R15: ffff8801b23c2c60
> [ 413.688321] FS: 0000000000000000(0000) GS:ffff88021ef40000(0000) knlGS:0000000000000000
> [ 413.688352] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 413.688373] CR2: 0000000000000548 CR3: 00000001efd0f000 CR4: 00000000003406e0
> [ 413.688399] Stack:
> [ 413.688407] ffffffff8109b304 ffff88020f000000 0000000000000070 ffffc90001a8bdf0
> [ 413.688439] ffffffffa05ce29d ffffffffa052feb7 ffffffffa07b5820 ffffc90001a8bda0
> [ 413.688470] ffffffff00000018 ffff8801bb88f060 0000000001a8bdb8 ffff88021ef59280
> [ 413.688502] Call Trace:
> [ 413.688514] [<ffffffff8109b304>] ? kthread_park+0x14/0x60
> [ 413.688555] [<ffffffffa05ce29d>] amdgpu_gpu_reset+0x7d/0x670 [amdgpu]
> [ 413.688589] [<ffffffffa052feb7>] ? drm_printk+0x97/0xa0 [drm]
> [ 413.688643] [<ffffffffa0698136>] amdgpu_job_timedout+0x46/0x50 [amdgpu]
> [ 413.688700] [<ffffffffa06969e7>] amd_sched_job_timedout+0x17/0x20 [amdgpu]
> [ 413.688727] [<ffffffff81095493>] process_one_work+0x153/0x3f0
> [ 413.688751] [<ffffffff81095c5b>] worker_thread+0x12b/0x4b0
> [ 413.688773] [<ffffffff8100392e>] ? do_syscall_64+0x6e/0x180
> [ 413.688795] [<ffffffff81095b30>] ? rescuer_thread+0x350/0x350
> [ 413.688818] [<ffffffff8100392e>] ? do_syscall_64+0x6e/0x180
> [ 413.688839] [<ffffffff8109b423>] kthread+0xd3/0xf0
> [ 413.688858] [<ffffffff8109b350>] ? kthread_park+0x60/0x60
> [ 413.688881] [<ffffffff817e1ee5>] ret_from_fork+0x25/0x30
> [ 413.688901] Code: 25 40 d3 00 00 48 8b 80 48 05 00 00 48 89 e5 5d 48 8b 40 c8 48 c1 e8 02 83 e0 01 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 <48> 8b b7 48 05 00 00 55 48 89 e5 48 85 f6 74 31 8b 97 f8 18 00
> [ 413.689045] RIP [<ffffffff8109b175>] to_live_kthread+0x5/0x60
> [ 413.689064] RSP <ffffc90001a8bd60>
> [ 413.689076] CR2: 0000000000000548
> [ 413.697985] ---[ end trace 0a314a64821f84e9 ]---
>
> The root cause is some ring doesn't have scheduler, like KIQ ring
>
> Change-Id: I420e84add9cdd9a7fd1f9921b8a5d0afa3dd2058
> Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>

Reviewed-by: Christian König <christian.koenig@amd.com> for both.

> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 7 ++++---
> 1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 9993085..168a9de 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2675,7 +2675,7 @@ int amdgpu_gpu_reset(struct amdgpu_device *adev)
>  	for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
>  		struct amdgpu_ring *ring = adev->rings[i];
>
> -		if (!ring)
> +		if (!ring || !ring->sched.thread)
>  			continue;
>  		kcl_kthread_park(ring->sched.thread);
>  		amd_sched_hw_job_reset(&ring->sched);
> @@ -2770,7 +2770,8 @@ int amdgpu_gpu_reset(struct amdgpu_device *adev)
>  		}
>  		for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
>  			struct amdgpu_ring *ring = adev->rings[i];
> -			if (!ring)
> +
> +			if (!ring || !ring->sched.thread)
>  				continue;
>
>  			amd_sched_job_recovery(&ring->sched);
> @@ -2779,7 +2780,7 @@ int amdgpu_gpu_reset(struct amdgpu_device *adev)
>  	} else {
>  		dev_err(adev->dev, "asic resume failed (%d).\n", r);
>  		for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
> -			if (adev->rings[i]) {
> +			if (adev->rings[i] && adev->rings[i]->sched.thread) {
>  				kcl_kthread_unpark(adev->rings[i]->sched.thread);
>  			}
>  		}
* Re: [PATCH 1/2] drm/amdgpu: fix gpu reset crash
From: zhoucm1 @ 2017-04-25 2:20 UTC
To: Christian König, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
On 2017-04-24 17:47, Christian König wrote:
>> The root cause is some ring doesn't have scheduler, like KIQ ring
>>
>> Change-Id: I420e84add9cdd9a7fd1f9921b8a5d0afa3dd2058
>> Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
>
> Reviewed-by: Christian König <christian.koenig@amd.com> for both.

I forgot to add the RB when pushing the patches. How can I add it now?

Sorry for that.
David Zhou
* Re: [PATCH 1/2] drm/amdgpu: fix gpu reset crash
From: Christian König @ 2017-04-25 8:07 UTC
To: zhoucm1, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
On 25.04.2017 at 04:20, zhoucm1 wrote:
>
>
> On 2017-04-24 17:47, Christian König wrote:
>>> The root cause is some ring doesn't have scheduler, like KIQ ring
>>>
>>> Change-Id: I420e84add9cdd9a7fd1f9921b8a5d0afa3dd2058
>>> Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
>>
>> Reviewed-by: Christian König <christian.koenig@amd.com> for both.
> I forgot to add RB when pushing patches, How can I add it again?

Does Gerrit actually accept the commit in this case? If not, you could add the RB and try to push it again.

If yes, you should ping Alex to add it when he upstreams the patch. (Not much of an issue; Alex is probably taking care of that anyway while upstreaming.)

Christian.

>
> Sorry for that.
> David Zhou
* RE: [PATCH 1/2] drm/amdgpu: fix gpu reset crash
From: Deucher, Alexander @ 2017-04-25 13:34 UTC
To: Zhou, David(ChunMing), Christian König, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
> -----Original Message-----
> From: amd-gfx [mailto:amd-gfx-bounces@lists.freedesktop.org] On Behalf
> Of zhoucm1
> Sent: Monday, April 24, 2017 10:20 PM
> To: Christian König; amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH 1/2] drm/amdgpu: fix gpu reset crash
>
>
>
> On 2017-04-24 17:47, Christian König wrote:
> >> The root cause is some ring doesn't have scheduler, like KIQ ring
> >>
> >> Change-Id: I420e84add9cdd9a7fd1f9921b8a5d0afa3dd2058
> >> Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
> >
> > Reviewed-by: Christian König <christian.koenig@amd.com> for both.
> I forgot to add RB when pushing patches, How can I add it again?

I'll add it before sending upstream.

Alex

>
> Sorry for that.
> David Zhou
* Re: [PATCH 1/2] drm/amdgpu: fix gpu reset crash
From: zhoucm1 @ 2017-04-26 4:57 UTC
To: Deucher, Alexander, Christian König, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
On 2017-04-25 21:34, Deucher, Alexander wrote:
> > -----Original Message-----
> > From: amd-gfx [mailto:amd-gfx-bounces-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org] On Behalf
> > Of zhoucm1
> > Sent: Monday, April 24, 2017 10:20 PM
> > To: Christian König; amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
> > Subject: Re: [PATCH 1/2] drm/amdgpu: fix gpu reset crash
> >
> >
> >
> > On 2017-04-24 17:47, Christian König wrote:
> > >> The root cause is some ring doesn't have scheduler, like KIQ ring
> > >>
> > >> Change-Id: I420e84add9cdd9a7fd1f9921b8a5d0afa3dd2058
> > >> Signed-off-by: Chunming Zhou <David1.Zhou-5C7GfCeVMHo@public.gmane.org>
> > >
> > > Reviewed-by: Christian König <christian.koenig-5C7GfCeVMHo@public.gmane.org> for both.
> > I forgot to add RB when pushing patches, How can I add it again?
>
> I'll add it before sending upstream.

Thanks.
David

>
> Alex
>
> >
> > Sorry for that.
> > David Zhou