* Need help for dead lock of nvme shutdown_lock
@ 2016-10-12 7:44 陈晓光
2016-10-12 9:15 ` Shan Hai
[not found] ` <57fdff29.c6efca0a.f2af0.9897SMTPIN_ADDED_BROKEN@mx.google.com>
0 siblings, 2 replies; 5+ messages in thread
From: 陈晓光 @ 2016-10-12 7:44 UTC (permalink / raw)
Dear maintainers of NVME
I met a dead lock issue when I was using NVME.
the shutdown_lock may cause dead lock when nvme_reset_work is working
and nvme_timeout happens.
I checked the latest mainline code, it seems still have such issue,
please help, thank you!
below is the log:
[ 122.266109] nvme 0000:01:00.0: I/O 4 QID 2 timeout, reset controller
[ 122.270260]
[ 122.270760] =================================
[ 122.271649] [ INFO: inconsistent lock state ]
[ 122.272586] 4.4.7-g7e84a8a #1403 Not tainted
[ 122.273457] ---------------------------------
[ 122.274378] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
[ 122.275592] swapper/1/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
[ 122.276600] (&dev->shutdown_lock){+.?...}, at: [<c032cb44>]
nvme_dev_shutdown+0x20/0x254
[ 122.278906] {SOFTIRQ-ON-W} state was registered at:
[ 122.279882] [<c0508640>] mutex_lock_nested+0x50/0x43c
[ 122.281187] [<c032cb44>] nvme_dev_shutdown+0x20/0x254
[ 122.282419] [<c032d808>] nvme_reset_work+0x40/0xe00
[ 122.283628] [<c003eb2c>] process_one_work+0x3a0/0x78c
[ 122.284938] [<c003fbec>] worker_thread+0x2b0/0x3ec
[ 122.286136] [<c0044ba4>] kthread+0xdc/0xf0
[ 122.287249] [<c000fa50>] ret_from_fork+0x14/0x24
[ 122.288505] irq event stamp: 293762
[ 122.289276] hardirqs last enabled at (293762): [<c050b8e4>]
_raw_spin_unlock_irqrestore+0x3c/0x74
[ 122.291149] hardirqs last disabled at (293761): [<c050b738>]
_raw_spin_lock_irqsave+0x18/0x54
[ 122.292919] softirqs last enabled at (293750): [<c0029fe8>]
irq_enter+0x4c/0x7c
[ 122.294533] softirqs last disabled at (293751): [<c002a09c>]
irq_exit+0x84/0x110
[ 122.296107]
[ 122.296107] other info that might help us debug this:
[ 122.297394] Possible unsafe locking scenario:
[ 122.297394]
[ 122.298567] CPU0
[ 122.299154] ----
[ 122.299731] lock(&dev->shutdown_lock);
[ 122.300805] <Interrupt>
[ 122.301409] lock(&dev->shutdown_lock);
[ 122.302508]
[ 122.302508] *** DEADLOCK ***
[ 122.302508]
[ 122.303774] 1 lock held by swapper/1/0:
[ 122.304571] #0: (((&q->timeout))){+.-...}, at: [<c008eb88>]
call_timer_fn+0x0/0x478
[ 122.306856]
[ 122.306856] stack backtrace:
[ 122.307937] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.4.7-g7e84a8a #1403
[ 122.310244] [<c0018b74>] (unwind_backtrace) from [<c0014318>]
(show_stack+0x10/0x14)
[ 122.311918] [<c0014318>] (show_stack) from [<c027acf4>]
(dump_stack+0xa0/0xd8)
[ 122.313507] [<c027acf4>] (dump_stack) from [<c00f2ed0>]
(print_usage_bug+0x25c/0x2d0)
[ 122.315196] [<c00f2ed0>] (print_usage_bug) from [<c006dff4>]
(mark_lock+0x3d0/0x6b8)
[ 122.316854] [<c006dff4>] (mark_lock) from [<c006efcc>]
(__lock_acquire+0x790/0x19f4)
[ 122.318498] [<c006efcc>] (__lock_acquire) from [<c0070b18>]
(lock_acquire+0x1d8/0x2a0)
[ 122.320179] [<c0070b18>] (lock_acquire) from [<c0508640>]
(mutex_lock_nested+0x50/0x43c)
[ 122.321879] [<c0508640>] (mutex_lock_nested) from [<c032cb44>]
(nvme_dev_shutdown+0x20/0x254)
[ 122.323617] [<c032cb44>] (nvme_dev_shutdown) from [<c032cf80>]
(nvme_timeout+0x6c/0x1a8)
[ 122.325291] [<c032cf80>] (nvme_timeout) from [<c0266b18>]
(blk_mq_rq_timed_out+0x2c/0x7c)
[ 122.326972] [<c0266b18>] (blk_mq_rq_timed_out) from [<c0269f70>]
(bt_for_each+0x84/0xb0)
[ 122.328665] [<c0269f70>] (bt_for_each) from [<c026a400>]
(blk_mq_queue_tag_busy_iter+0x8c/0x9c)
[ 122.330424] [<c026a400>] (blk_mq_queue_tag_busy_iter) from
[<c0265d3c>] (blk_mq_rq_timer+0x20/0xb8)
[ 122.332249] [<c0265d3c>] (blk_mq_rq_timer) from [<c008ed40>]
(call_timer_fn+0x1b8/0x478)
[ 122.333931] [<c008ed40>] (call_timer_fn) from [<c008f30c>]
(run_timer_softirq+0x30c/0x3c8)
[ 122.335653] [<c008f30c>] (run_timer_softirq) from [<c0029a28>]
(__do_softirq+0x23c/0x5c8)
[ 122.337342] [<c0029a28>] (__do_softirq) from [<c002a09c>]
(irq_exit+0x84/0x110)
[ 122.338909] [<c002a09c>] (irq_exit) from [<c007c548>]
(__handle_domain_irq+0x8c/0xb0)
[ 122.340561] [<c007c548>] (__handle_domain_irq) from [<c0009498>]
(gic_handle_irq+0x4c/0x90)
[ 122.342238] [<c0009498>] (gic_handle_irq) from [<c0014df8>]
(__irq_svc+0x58/0x98)
[ 122.343666] Exception stack(0xe00fbf80 to 0xe00fbfc8)
[ 122.344855] bf80: 00000001 00000004 00000000 e00f0c00 e00fa000
00000051 10c0387d c0bab424
[ 122.346507] bfa0: e00fbfe0 c050ee80 00000000 c0bb478c c0bab448
e00fbfd0 c006e4e8 c0010568
[ 122.348018] bfc0: 20000013 ffffffff
[ 122.349016] [<c0014df8>] (__irq_svc) from [<c0010568>]
(arch_cpu_idle+0x30/0x3c)
[ 122.350618] [<c0010568>] (arch_cpu_idle) from [<c00671d0>]
(default_idle_call+0x30/0x38)
[ 122.352298] [<c00671d0>] (default_idle_call) from [<c00673f4>]
(cpu_startup_entry+0x21c/0x380)
[ 122.354007] [<c00673f4>] (cpu_startup_entry) from [<0000956c>] (0x956c)
[ 122.357684] BUG: scheduling while atomic: swapper/1/0/0x00000102
[ 122.358856] INFO: lockdep is turned off.
[ 122.359672] Modules linked in:
[ 122.360606] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.4.7-g7e84a8a #1403
[ 122.362886] [<c0018b74>] (unwind_backtrace) from [<c0014318>]
(show_stack+0x10/0x14)
[ 122.364506] [<c0014318>] (show_stack) from [<c027acf4>]
(dump_stack+0xa0/0xd8)
[ 122.366056] [<c027acf4>] (dump_stack) from [<c00f2590>]
(__schedule_bug+0x60/0x7c)
[ 122.367682] [<c00f2590>] (__schedule_bug) from [<c0505e08>]
(__schedule+0x84/0x9c4)
[ 122.369297] [<c0505e08>] (__schedule) from [<c0506974>] (schedule+0xa8/0xc0)
[ 122.370853] [<c0506974>] (schedule) from [<c050aa88>]
(schedule_timeout+0x1c/0x3a8)
[ 122.372494] [<c050aa88>] (schedule_timeout) from [<c0507294>]
(wait_for_common+0x110/0x150)
[ 122.374239] [<c0507294>] (wait_for_common) from [<c0507328>]
(wait_for_completion_killable+0x10/0x28)
[ 122.376138] [<c0507328>] (wait_for_completion_killable) from
[<c0044a10>] (kthread_create_on_node+0xe0/0x198)
[ 122.378128] [<c0044a10>] (kthread_create_on_node) from [<c032c8f0>]
(nvme_disable_io_queues+0x74/0x2a8)
[ 122.380004] [<c032c8f0>] (nvme_disable_io_queues) from [<c032cc24>]
(nvme_dev_shutdown+0x100/0x254)
[ 122.381811] [<c032cc24>] (nvme_dev_shutdown) from [<c032cf80>]
(nvme_timeout+0x6c/0x1a8)
[ 122.383473] [<c032cf80>] (nvme_timeout) from [<c0266b18>]
(blk_mq_rq_timed_out+0x2c/0x7c)
[ 122.385166] [<c0266b18>] (blk_mq_rq_timed_out) from [<c0269f70>]
(bt_for_each+0x84/0xb0)
[ 122.386843] [<c0269f70>] (bt_for_each) from [<c026a400>]
(blk_mq_queue_tag_busy_iter+0x8c/0x9c)
[ 122.388610] [<c026a400>] (blk_mq_queue_tag_busy_iter) from
[<c0265d3c>] (blk_mq_rq_timer+0x20/0xb8)
[ 122.390407] [<c0265d3c>] (blk_mq_rq_timer) from [<c008ed40>]
(call_timer_fn+0x1b8/0x478)
[ 122.392099] [<c008ed40>] (call_timer_fn) from [<c008f30c>]
(run_timer_softirq+0x30c/0x3c8)
[ 122.393816] [<c008f30c>] (run_timer_softirq) from [<c0029a28>]
(__do_softirq+0x23c/0x5c8)
[ 122.395502] [<c0029a28>] (__do_softirq) from [<c002a09c>]
(irq_exit+0x84/0x110)
[ 122.397070] [<c002a09c>] (irq_exit) from [<c007c548>]
(__handle_domain_irq+0x8c/0xb0)
[ 122.398716] [<c007c548>] (__handle_domain_irq) from [<c0009498>]
(gic_handle_irq+0x4c/0x90)
[ 122.400395] [<c0009498>] (gic_handle_irq) from [<c0014df8>]
(__irq_svc+0x58/0x98)
^ permalink raw reply [flat|nested] 5+ messages in thread* Need help for dead lock of nvme shutdown_lock 2016-10-12 7:44 Need help for dead lock of nvme shutdown_lock 陈晓光 @ 2016-10-12 9:15 ` Shan Hai [not found] ` <57fdff29.c6efca0a.f2af0.9897SMTPIN_ADDED_BROKEN@mx.google.com> 1 sibling, 0 replies; 5+ messages in thread From: Shan Hai @ 2016-10-12 9:15 UTC (permalink / raw) On Wed, Oct 12, 2016@03:44:34PM +0800, ??? wrote: > Dear maintainers of NVME > I met a dead lock issue when I was using NVME. > the shutdown_lock may cause dead lock when nvme_reset_work is working > and nvme_timeout happens. > I checked the latest mainline code, it seems still have such issue, > please help, thank you! > below is the log: > Hi Xiaoguang, Would you please try the below patch? >From 930fde076dd982d10ee7613214c1e2745551f1dd Mon Sep 17 00:00:00 2001 From: Shan Hai <shan.hai@oracle.com> Date: Wed, 12 Oct 2016 16:52:51 +0800 Subject: [PATCH 1/1] drivers/nvme: remove redundant nvme_dev_disable calls from nvme_timeout Currently the nvme_dev_disable will be called from both process and softirq context respectively from nvme_reset_work and nvme_timeout and it will hold a mutex during its run, this is a <softirq-safe> to <softirq-unsafe> lockdep senario and could cause a dead lock. Fix it by removing redundant calls of the nvme_dev_disable from the nvme_timeout, it's feasible for below reasons: o it's not necessary to call nvme_dev_disable in the nvme_timeout when the controller state is NVME_CTRL_RESETTING, the nvme_reset_work has called nvme_dev_disable and then set the controller state to NVME_CTRL_RESETTING o it's not necessary to call nvme_dev_disable before scheduling reset work in the nvme_timeout, sine it will be called from nvme_reset_work at last Signed-off-by: Shan Hai <shan.hai at oracle.com> --- drivers/nvme/host/pci.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c index 0fc99f0..6cd855a 100644 --- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -878,7 +878,6 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved) dev_warn(dev->ctrl.device, "I/O %d QID %d timeout, disable controller\n", req->tag, nvmeq->qid); - nvme_dev_disable(dev, false); req->errors = NVME_SC_CANCELLED; return BLK_EH_HANDLED; } @@ -892,7 +891,6 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved) dev_warn(dev->ctrl.device, "I/O %d QID %d timeout, reset controller\n", req->tag, nvmeq->qid); - nvme_dev_disable(dev, false); queue_work(nvme_workq, &dev->reset_work); /* -- 2.7.4 Thanks Shan Hai > [ 122.266109] nvme 0000:01:00.0: I/O 4 QID 2 timeout, reset controller > [ 122.270260] > [ 122.270760] ================================= > [ 122.271649] [ INFO: inconsistent lock state ] > [ 122.272586] 4.4.7-g7e84a8a #1403 Not tainted > [ 122.273457] --------------------------------- > [ 122.274378] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. > [ 122.275592] swapper/1/0 [HC0[0]:SC1[1]:HE1:SE0] takes: > [ 122.276600] (&dev->shutdown_lock){+.?...}, at: [<c032cb44>] > nvme_dev_shutdown+0x20/0x254 > [ 122.278906] {SOFTIRQ-ON-W} state was registered at: > [ 122.279882] [<c0508640>] mutex_lock_nested+0x50/0x43c > [ 122.281187] [<c032cb44>] nvme_dev_shutdown+0x20/0x254 > [ 122.282419] [<c032d808>] nvme_reset_work+0x40/0xe00 > [ 122.283628] [<c003eb2c>] process_one_work+0x3a0/0x78c > [ 122.284938] [<c003fbec>] worker_thread+0x2b0/0x3ec > [ 122.286136] [<c0044ba4>] kthread+0xdc/0xf0 > [ 122.287249] [<c000fa50>] ret_from_fork+0x14/0x24 > [ 122.288505] irq event stamp: 293762 > [ 122.289276] hardirqs last enabled at (293762): [<c050b8e4>] > _raw_spin_unlock_irqrestore+0x3c/0x74 > [ 122.291149] hardirqs last disabled at (293761): [<c050b738>] > _raw_spin_lock_irqsave+0x18/0x54 > [ 122.292919] softirqs last enabled at (293750): [<c0029fe8>] > irq_enter+0x4c/0x7c > [ 122.294533] softirqs last disabled at (293751): [<c002a09c>] > irq_exit+0x84/0x110 > [ 122.296107] > [ 122.296107] other info that might help us debug this: > [ 122.297394] Possible unsafe locking scenario: > [ 122.297394] > [ 122.298567] CPU0 > [ 122.299154] ---- > [ 122.299731] lock(&dev->shutdown_lock); > [ 122.300805] <Interrupt> > [ 122.301409] lock(&dev->shutdown_lock); > [ 122.302508] > [ 122.302508] *** DEADLOCK *** > [ 122.302508] > [ 122.303774] 1 lock held by swapper/1/0: > [ 122.304571] #0: (((&q->timeout))){+.-...}, at: [<c008eb88>] > call_timer_fn+0x0/0x478 > [ 122.306856] > [ 122.306856] stack backtrace: > [ 122.307937] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.4.7-g7e84a8a #1403 > [ 122.310244] [<c0018b74>] (unwind_backtrace) from [<c0014318>] > (show_stack+0x10/0x14) > [ 122.311918] [<c0014318>] (show_stack) from [<c027acf4>] > (dump_stack+0xa0/0xd8) > [ 122.313507] [<c027acf4>] (dump_stack) from [<c00f2ed0>] > (print_usage_bug+0x25c/0x2d0) > [ 122.315196] [<c00f2ed0>] (print_usage_bug) from [<c006dff4>] > (mark_lock+0x3d0/0x6b8) > [ 122.316854] [<c006dff4>] (mark_lock) from [<c006efcc>] > (__lock_acquire+0x790/0x19f4) > [ 122.318498] [<c006efcc>] (__lock_acquire) from [<c0070b18>] > (lock_acquire+0x1d8/0x2a0) > [ 122.320179] [<c0070b18>] (lock_acquire) from [<c0508640>] > (mutex_lock_nested+0x50/0x43c) > [ 122.321879] [<c0508640>] (mutex_lock_nested) from [<c032cb44>] > (nvme_dev_shutdown+0x20/0x254) > [ 122.323617] [<c032cb44>] (nvme_dev_shutdown) from [<c032cf80>] > (nvme_timeout+0x6c/0x1a8) > [ 122.325291] [<c032cf80>] (nvme_timeout) from [<c0266b18>] > (blk_mq_rq_timed_out+0x2c/0x7c) > [ 122.326972] [<c0266b18>] (blk_mq_rq_timed_out) from [<c0269f70>] > (bt_for_each+0x84/0xb0) > [ 122.328665] [<c0269f70>] (bt_for_each) from [<c026a400>] > (blk_mq_queue_tag_busy_iter+0x8c/0x9c) > [ 122.330424] [<c026a400>] (blk_mq_queue_tag_busy_iter) from > [<c0265d3c>] (blk_mq_rq_timer+0x20/0xb8) > [ 122.332249] [<c0265d3c>] (blk_mq_rq_timer) from [<c008ed40>] > (call_timer_fn+0x1b8/0x478) > [ 122.333931] [<c008ed40>] (call_timer_fn) from [<c008f30c>] > (run_timer_softirq+0x30c/0x3c8) > [ 122.335653] [<c008f30c>] (run_timer_softirq) from [<c0029a28>] > (__do_softirq+0x23c/0x5c8) > [ 122.337342] [<c0029a28>] (__do_softirq) from [<c002a09c>] > (irq_exit+0x84/0x110) > [ 122.338909] [<c002a09c>] (irq_exit) from [<c007c548>] > (__handle_domain_irq+0x8c/0xb0) > [ 122.340561] [<c007c548>] (__handle_domain_irq) from [<c0009498>] > (gic_handle_irq+0x4c/0x90) > [ 122.342238] [<c0009498>] (gic_handle_irq) from [<c0014df8>] > (__irq_svc+0x58/0x98) > [ 122.343666] Exception stack(0xe00fbf80 to 0xe00fbfc8) > [ 122.344855] bf80: 00000001 00000004 00000000 e00f0c00 e00fa000 > 00000051 10c0387d c0bab424 > [ 122.346507] bfa0: e00fbfe0 c050ee80 00000000 c0bb478c c0bab448 > e00fbfd0 c006e4e8 c0010568 > [ 122.348018] bfc0: 20000013 ffffffff > [ 122.349016] [<c0014df8>] (__irq_svc) from [<c0010568>] > (arch_cpu_idle+0x30/0x3c) > [ 122.350618] [<c0010568>] (arch_cpu_idle) from [<c00671d0>] > (default_idle_call+0x30/0x38) > [ 122.352298] [<c00671d0>] (default_idle_call) from [<c00673f4>] > (cpu_startup_entry+0x21c/0x380) > [ 122.354007] [<c00673f4>] (cpu_startup_entry) from [<0000956c>] (0x956c) > [ 122.357684] BUG: scheduling while atomic: swapper/1/0/0x00000102 > [ 122.358856] INFO: lockdep is turned off. > [ 122.359672] Modules linked in: > [ 122.360606] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.4.7-g7e84a8a #1403 > [ 122.362886] [<c0018b74>] (unwind_backtrace) from [<c0014318>] > (show_stack+0x10/0x14) > [ 122.364506] [<c0014318>] (show_stack) from [<c027acf4>] > (dump_stack+0xa0/0xd8) > [ 122.366056] [<c027acf4>] (dump_stack) from [<c00f2590>] > (__schedule_bug+0x60/0x7c) > [ 122.367682] [<c00f2590>] (__schedule_bug) from [<c0505e08>] > (__schedule+0x84/0x9c4) > [ 122.369297] [<c0505e08>] (__schedule) from [<c0506974>] (schedule+0xa8/0xc0) > [ 122.370853] [<c0506974>] (schedule) from [<c050aa88>] > (schedule_timeout+0x1c/0x3a8) > [ 122.372494] [<c050aa88>] (schedule_timeout) from [<c0507294>] > (wait_for_common+0x110/0x150) > [ 122.374239] [<c0507294>] (wait_for_common) from [<c0507328>] > (wait_for_completion_killable+0x10/0x28) > [ 122.376138] [<c0507328>] (wait_for_completion_killable) from > [<c0044a10>] (kthread_create_on_node+0xe0/0x198) > [ 122.378128] [<c0044a10>] (kthread_create_on_node) from [<c032c8f0>] > (nvme_disable_io_queues+0x74/0x2a8) > [ 122.380004] [<c032c8f0>] (nvme_disable_io_queues) from [<c032cc24>] > (nvme_dev_shutdown+0x100/0x254) > [ 122.381811] [<c032cc24>] (nvme_dev_shutdown) from [<c032cf80>] > (nvme_timeout+0x6c/0x1a8) > [ 122.383473] [<c032cf80>] (nvme_timeout) from [<c0266b18>] > (blk_mq_rq_timed_out+0x2c/0x7c) > [ 122.385166] [<c0266b18>] (blk_mq_rq_timed_out) from [<c0269f70>] > (bt_for_each+0x84/0xb0) > [ 122.386843] [<c0269f70>] (bt_for_each) from [<c026a400>] > (blk_mq_queue_tag_busy_iter+0x8c/0x9c) > [ 122.388610] [<c026a400>] (blk_mq_queue_tag_busy_iter) from > [<c0265d3c>] (blk_mq_rq_timer+0x20/0xb8) > [ 122.390407] [<c0265d3c>] (blk_mq_rq_timer) from [<c008ed40>] > (call_timer_fn+0x1b8/0x478) > [ 122.392099] [<c008ed40>] (call_timer_fn) from [<c008f30c>] > (run_timer_softirq+0x30c/0x3c8) > [ 122.393816] [<c008f30c>] (run_timer_softirq) from [<c0029a28>] > (__do_softirq+0x23c/0x5c8) > [ 122.395502] [<c0029a28>] (__do_softirq) from [<c002a09c>] > (irq_exit+0x84/0x110) > [ 122.397070] [<c002a09c>] (irq_exit) from [<c007c548>] > (__handle_domain_irq+0x8c/0xb0) > [ 122.398716] [<c007c548>] (__handle_domain_irq) from [<c0009498>] > (gic_handle_irq+0x4c/0x90) > [ 122.400395] [<c0009498>] (gic_handle_irq) from [<c0014df8>] > (__irq_svc+0x58/0x98) > > _______________________________________________ > Linux-nvme mailing list > Linux-nvme at lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-nvme ^ permalink raw reply related [flat|nested] 5+ messages in thread
[parent not found: <57fdff29.c6efca0a.f2af0.9897SMTPIN_ADDED_BROKEN@mx.google.com>]
* Need help for dead lock of nvme shutdown_lock [not found] ` <57fdff29.c6efca0a.f2af0.9897SMTPIN_ADDED_BROKEN@mx.google.com> @ 2016-10-12 9:33 ` 陈晓光 2016-10-12 13:53 ` Keith Busch 0 siblings, 1 reply; 5+ messages in thread From: 陈晓光 @ 2016-10-12 9:33 UTC (permalink / raw) 2016-10-12 17:15 GMT+08:00 Shan Hai <shan.hai at oracle.com>: > > On Wed, Oct 12, 2016@03:44:34PM +0800, ??? wrote: > > Dear maintainers of NVME > > I met a dead lock issue when I was using NVME. > > the shutdown_lock may cause dead lock when nvme_reset_work is working > > and nvme_timeout happens. > > I checked the latest mainline code, it seems still have such issue, > > please help, thank you! > > below is the log: > > > > Hi Xiaoguang, > > Would you please try the below patch? > > Thank you very much for the quick fix, Shan This can absolutely fix the dead lock issue I met. Will this be submitted to mainline? Thanks Xiaoguang > From 930fde076dd982d10ee7613214c1e2745551f1dd Mon Sep 17 00:00:00 2001 > From: Shan Hai <shan.hai at oracle.com> > Date: Wed, 12 Oct 2016 16:52:51 +0800 > Subject: [PATCH 1/1] drivers/nvme: remove redundant nvme_dev_disable calls > from nvme_timeout > > Currently the nvme_dev_disable will be called from both process and > softirq context respectively from nvme_reset_work and nvme_timeout > and it will hold a mutex during its run, this is a <softirq-safe> to > <softirq-unsafe> lockdep senario and could cause a dead lock. > > Fix it by removing redundant calls of the nvme_dev_disable from the > nvme_timeout, it's feasible for below reasons: > o it's not necessary to call nvme_dev_disable in the nvme_timeout > when the controller state is NVME_CTRL_RESETTING, the > nvme_reset_work has called nvme_dev_disable and then > set the controller state to NVME_CTRL_RESETTING > o it's not necessary to call nvme_dev_disable before scheduling > reset work in the nvme_timeout, sine it will be called > from nvme_reset_work at last > > Signed-off-by: Shan Hai <shan.hai at oracle.com> > --- > drivers/nvme/host/pci.c | 2 -- > 1 file changed, 2 deletions(-) > > diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c > index 0fc99f0..6cd855a 100644 > --- a/drivers/nvme/host/pci.c > +++ b/drivers/nvme/host/pci.c > @@ -878,7 +878,6 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved) > dev_warn(dev->ctrl.device, > "I/O %d QID %d timeout, disable controller\n", > req->tag, nvmeq->qid); > - nvme_dev_disable(dev, false); > req->errors = NVME_SC_CANCELLED; > return BLK_EH_HANDLED; > } > @@ -892,7 +891,6 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved) > dev_warn(dev->ctrl.device, > "I/O %d QID %d timeout, reset controller\n", > req->tag, nvmeq->qid); > - nvme_dev_disable(dev, false); > queue_work(nvme_workq, &dev->reset_work); > > /* > -- > 2.7.4 > > Thanks > Shan Hai > > > [ 122.266109] nvme 0000:01:00.0: I/O 4 QID 2 timeout, reset controller > > [ 122.270260] > > [ 122.270760] ================================= > > [ 122.271649] [ INFO: inconsistent lock state ] > > [ 122.272586] 4.4.7-g7e84a8a #1403 Not tainted > > [ 122.273457] --------------------------------- > > [ 122.274378] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. > > [ 122.275592] swapper/1/0 [HC0[0]:SC1[1]:HE1:SE0] takes: > > [ 122.276600] (&dev->shutdown_lock){+.?...}, at: [<c032cb44>] > > nvme_dev_shutdown+0x20/0x254 > > [ 122.278906] {SOFTIRQ-ON-W} state was registered at: > > [ 122.279882] [<c0508640>] mutex_lock_nested+0x50/0x43c > > [ 122.281187] [<c032cb44>] nvme_dev_shutdown+0x20/0x254 > > [ 122.282419] [<c032d808>] nvme_reset_work+0x40/0xe00 > > [ 122.283628] [<c003eb2c>] process_one_work+0x3a0/0x78c > > [ 122.284938] [<c003fbec>] worker_thread+0x2b0/0x3ec > > [ 122.286136] [<c0044ba4>] kthread+0xdc/0xf0 > > [ 122.287249] [<c000fa50>] ret_from_fork+0x14/0x24 > > [ 122.288505] irq event stamp: 293762 > > [ 122.289276] hardirqs last enabled at (293762): [<c050b8e4>] > > _raw_spin_unlock_irqrestore+0x3c/0x74 > > [ 122.291149] hardirqs last disabled at (293761): [<c050b738>] > > _raw_spin_lock_irqsave+0x18/0x54 > > [ 122.292919] softirqs last enabled at (293750): [<c0029fe8>] > > irq_enter+0x4c/0x7c > > [ 122.294533] softirqs last disabled at (293751): [<c002a09c>] > > irq_exit+0x84/0x110 > > [ 122.296107] > > [ 122.296107] other info that might help us debug this: > > [ 122.297394] Possible unsafe locking scenario: > > [ 122.297394] > > [ 122.298567] CPU0 > > [ 122.299154] ---- > > [ 122.299731] lock(&dev->shutdown_lock); > > [ 122.300805] <Interrupt> > > [ 122.301409] lock(&dev->shutdown_lock); > > [ 122.302508] > > [ 122.302508] *** DEADLOCK *** > > [ 122.302508] > > [ 122.303774] 1 lock held by swapper/1/0: > > [ 122.304571] #0: (((&q->timeout))){+.-...}, at: [<c008eb88>] > > call_timer_fn+0x0/0x478 > > [ 122.306856] > > [ 122.306856] stack backtrace: > > [ 122.307937] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.4.7-g7e84a8a #1403 > > [ 122.310244] [<c0018b74>] (unwind_backtrace) from [<c0014318>] > > (show_stack+0x10/0x14) > > [ 122.311918] [<c0014318>] (show_stack) from [<c027acf4>] > > (dump_stack+0xa0/0xd8) > > [ 122.313507] [<c027acf4>] (dump_stack) from [<c00f2ed0>] > > (print_usage_bug+0x25c/0x2d0) > > [ 122.315196] [<c00f2ed0>] (print_usage_bug) from [<c006dff4>] > > (mark_lock+0x3d0/0x6b8) > > [ 122.316854] [<c006dff4>] (mark_lock) from [<c006efcc>] > > (__lock_acquire+0x790/0x19f4) > > [ 122.318498] [<c006efcc>] (__lock_acquire) from [<c0070b18>] > > (lock_acquire+0x1d8/0x2a0) > > [ 122.320179] [<c0070b18>] (lock_acquire) from [<c0508640>] > > (mutex_lock_nested+0x50/0x43c) > > [ 122.321879] [<c0508640>] (mutex_lock_nested) from [<c032cb44>] > > (nvme_dev_shutdown+0x20/0x254) > > [ 122.323617] [<c032cb44>] (nvme_dev_shutdown) from [<c032cf80>] > > (nvme_timeout+0x6c/0x1a8) > > [ 122.325291] [<c032cf80>] (nvme_timeout) from [<c0266b18>] > > (blk_mq_rq_timed_out+0x2c/0x7c) > > [ 122.326972] [<c0266b18>] (blk_mq_rq_timed_out) from [<c0269f70>] > > (bt_for_each+0x84/0xb0) > > [ 122.328665] [<c0269f70>] (bt_for_each) from [<c026a400>] > > (blk_mq_queue_tag_busy_iter+0x8c/0x9c) > > [ 122.330424] [<c026a400>] (blk_mq_queue_tag_busy_iter) from > > [<c0265d3c>] (blk_mq_rq_timer+0x20/0xb8) > > [ 122.332249] [<c0265d3c>] (blk_mq_rq_timer) from [<c008ed40>] > > (call_timer_fn+0x1b8/0x478) > > [ 122.333931] [<c008ed40>] (call_timer_fn) from [<c008f30c>] > > (run_timer_softirq+0x30c/0x3c8) > > [ 122.335653] [<c008f30c>] (run_timer_softirq) from [<c0029a28>] > > (__do_softirq+0x23c/0x5c8) > > [ 122.337342] [<c0029a28>] (__do_softirq) from [<c002a09c>] > > (irq_exit+0x84/0x110) > > [ 122.338909] [<c002a09c>] (irq_exit) from [<c007c548>] > > (__handle_domain_irq+0x8c/0xb0) > > [ 122.340561] [<c007c548>] (__handle_domain_irq) from [<c0009498>] > > (gic_handle_irq+0x4c/0x90) > > [ 122.342238] [<c0009498>] (gic_handle_irq) from [<c0014df8>] > > (__irq_svc+0x58/0x98) > > [ 122.343666] Exception stack(0xe00fbf80 to 0xe00fbfc8) > > [ 122.344855] bf80: 00000001 00000004 00000000 e00f0c00 e00fa000 > > 00000051 10c0387d c0bab424 > > [ 122.346507] bfa0: e00fbfe0 c050ee80 00000000 c0bb478c c0bab448 > > e00fbfd0 c006e4e8 c0010568 > > [ 122.348018] bfc0: 20000013 ffffffff > > [ 122.349016] [<c0014df8>] (__irq_svc) from [<c0010568>] > > (arch_cpu_idle+0x30/0x3c) > > [ 122.350618] [<c0010568>] (arch_cpu_idle) from [<c00671d0>] > > (default_idle_call+0x30/0x38) > > [ 122.352298] [<c00671d0>] (default_idle_call) from [<c00673f4>] > > (cpu_startup_entry+0x21c/0x380) > > [ 122.354007] [<c00673f4>] (cpu_startup_entry) from [<0000956c>] (0x956c) > > [ 122.357684] BUG: scheduling while atomic: swapper/1/0/0x00000102 > > [ 122.358856] INFO: lockdep is turned off. > > [ 122.359672] Modules linked in: > > [ 122.360606] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.4.7-g7e84a8a #1403 > > [ 122.362886] [<c0018b74>] (unwind_backtrace) from [<c0014318>] > > (show_stack+0x10/0x14) > > [ 122.364506] [<c0014318>] (show_stack) from [<c027acf4>] > > (dump_stack+0xa0/0xd8) > > [ 122.366056] [<c027acf4>] (dump_stack) from [<c00f2590>] > > (__schedule_bug+0x60/0x7c) > > [ 122.367682] [<c00f2590>] (__schedule_bug) from [<c0505e08>] > > (__schedule+0x84/0x9c4) > > [ 122.369297] [<c0505e08>] (__schedule) from [<c0506974>] (schedule+0xa8/0xc0) > > [ 122.370853] [<c0506974>] (schedule) from [<c050aa88>] > > (schedule_timeout+0x1c/0x3a8) > > [ 122.372494] [<c050aa88>] (schedule_timeout) from [<c0507294>] > > (wait_for_common+0x110/0x150) > > [ 122.374239] [<c0507294>] (wait_for_common) from [<c0507328>] > > (wait_for_completion_killable+0x10/0x28) > > [ 122.376138] [<c0507328>] (wait_for_completion_killable) from > > [<c0044a10>] (kthread_create_on_node+0xe0/0x198) > > [ 122.378128] [<c0044a10>] (kthread_create_on_node) from [<c032c8f0>] > > (nvme_disable_io_queues+0x74/0x2a8) > > [ 122.380004] [<c032c8f0>] (nvme_disable_io_queues) from [<c032cc24>] > > (nvme_dev_shutdown+0x100/0x254) > > [ 122.381811] [<c032cc24>] (nvme_dev_shutdown) from [<c032cf80>] > > (nvme_timeout+0x6c/0x1a8) > > [ 122.383473] [<c032cf80>] (nvme_timeout) from [<c0266b18>] > > (blk_mq_rq_timed_out+0x2c/0x7c) > > [ 122.385166] [<c0266b18>] (blk_mq_rq_timed_out) from [<c0269f70>] > > (bt_for_each+0x84/0xb0) > > [ 122.386843] [<c0269f70>] (bt_for_each) from [<c026a400>] > > (blk_mq_queue_tag_busy_iter+0x8c/0x9c) > > [ 122.388610] [<c026a400>] (blk_mq_queue_tag_busy_iter) from > > [<c0265d3c>] (blk_mq_rq_timer+0x20/0xb8) > > [ 122.390407] [<c0265d3c>] (blk_mq_rq_timer) from [<c008ed40>] > > (call_timer_fn+0x1b8/0x478) > > [ 122.392099] [<c008ed40>] (call_timer_fn) from [<c008f30c>] > > (run_timer_softirq+0x30c/0x3c8) > > [ 122.393816] [<c008f30c>] (run_timer_softirq) from [<c0029a28>] > > (__do_softirq+0x23c/0x5c8) > > [ 122.395502] [<c0029a28>] (__do_softirq) from [<c002a09c>] > > (irq_exit+0x84/0x110) > > [ 122.397070] [<c002a09c>] (irq_exit) from [<c007c548>] > > (__handle_domain_irq+0x8c/0xb0) > > [ 122.398716] [<c007c548>] (__handle_domain_irq) from [<c0009498>] > > (gic_handle_irq+0x4c/0x90) > > [ 122.400395] [<c0009498>] (gic_handle_irq) from [<c0014df8>] > > (__irq_svc+0x58/0x98) > > > > _______________________________________________ > > Linux-nvme mailing list > > Linux-nvme at lists.infradead.org > > http://lists.infradead.org/mailman/listinfo/linux-nvme ^ permalink raw reply [flat|nested] 5+ messages in thread
* Need help for dead lock of nvme shutdown_lock 2016-10-12 9:33 ` 陈晓光 @ 2016-10-12 13:53 ` Keith Busch 2016-10-13 1:19 ` 陈晓光 0 siblings, 1 reply; 5+ messages in thread From: Keith Busch @ 2016-10-12 13:53 UTC (permalink / raw) On Wed, Oct 12, 2016@05:33:01PM +0800, ??? wrote: > 2016-10-12 17:15 GMT+08:00 Shan Hai <shan.hai at oracle.com>: > > > > On Wed, Oct 12, 2016@03:44:34PM +0800, ??? wrote: > > > Dear maintainers of NVME > > > I met a dead lock issue when I was using NVME. > > > the shutdown_lock may cause dead lock when nvme_reset_work is working > > > and nvme_timeout happens. > > > I checked the latest mainline code, it seems still have such issue, > > > please help, thank you! > > > below is the log: > > > > > > > Hi Xiaoguang, > > > > Would you please try the below patch? > > > > > Thank you very much for the quick fix, Shan > This can absolutely fix the dead lock issue I met. > Will this be submitted to mainline? NAK! First of all, you can't complete a request while the controller still owns it. You're risking data and/or memory corruption from doing that. Secondly, there is no such mainline kernel that tries to disable the controller in a softirq context. Whoever built your kernel needs to cherry pick 287922eb, "block: defer timeouts to a workqueue". ^ permalink raw reply [flat|nested] 5+ messages in thread
* Need help for dead lock of nvme shutdown_lock 2016-10-12 13:53 ` Keith Busch @ 2016-10-13 1:19 ` 陈晓光 0 siblings, 0 replies; 5+ messages in thread From: 陈晓光 @ 2016-10-13 1:19 UTC (permalink / raw) 2016-10-12 21:53 GMT+08:00 Keith Busch <keith.busch at intel.com>: > On Wed, Oct 12, 2016@05:33:01PM +0800, ??? wrote: >> 2016-10-12 17:15 GMT+08:00 Shan Hai <shan.hai at oracle.com>: >> > >> > On Wed, Oct 12, 2016@03:44:34PM +0800, ??? wrote: >> > > Dear maintainers of NVME >> > > I met a dead lock issue when I was using NVME. >> > > the shutdown_lock may cause dead lock when nvme_reset_work is working >> > > and nvme_timeout happens. >> > > I checked the latest mainline code, it seems still have such issue, >> > > please help, thank you! >> > > below is the log: >> > > >> > >> > Hi Xiaoguang, >> > >> > Would you please try the below patch? >> > >> > >> Thank you very much for the quick fix, Shan >> This can absolutely fix the dead lock issue I met. >> Will this be submitted to mainline? > > NAK! > > First of all, you can't complete a request while the controller still > owns it. You're risking data and/or memory corruption from doing that. > > Secondly, there is no such mainline kernel that tries to disable the > controller in a softirq context. Whoever built your kernel needs to > cherry pick 287922eb, "block: defer timeouts to a workqueue". Thank you Keith, this patch works for me :) ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2016-10-13 1:19 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-10-12 7:44 Need help for dead lock of nvme shutdown_lock 陈晓光
2016-10-12 9:15 ` Shan Hai
[not found] ` <57fdff29.c6efca0a.f2af0.9897SMTPIN_ADDED_BROKEN@mx.google.com>
2016-10-12 9:33 ` 陈晓光
2016-10-12 13:53 ` Keith Busch
2016-10-13 1:19 ` 陈晓光
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.