All of lore.kernel.org
 help / color / mirror / Atom feed
From: mlin@kernel.org (Ming Lin)
Subject: [PATCH v2] nvme: add missing lock nesting notation
Date: Tue, 05 Apr 2016 10:32:04 -0700	[thread overview]
Message-ID: <1459877524-15471-1-git-send-email-mlin@kernel.org> (raw)

From: Ming Lin <ming.l@ssi.samsung.com>

When unloading driver, nvme_disable_io_queues() calls nvme_delete_queue()
that sends nvme_admin_delete_cq command to admin sq. So when the command
completed, the lock acquired by nvme_irq() actually belongs to admin queue.

While the lock that nvme_del_cq_end() trying to acquire belongs to io queue.
So it will not deadlock.

This patch adds lock nesting notation to fix following report.

[  109.840952] =============================================
[  109.846379] [ INFO: possible recursive locking detected ]
[  109.851806] 4.5.0+ #180 Tainted: G            E
[  109.856533] ---------------------------------------------
[  109.861958] swapper/0/0 is trying to acquire lock:
[  109.866771]  (&(&nvmeq->q_lock)->rlock){-.....}, at: [<ffffffffc0820bc6>] nvme_del_cq_end+0x26/0x70 [nvme]
[  109.876535]
[  109.876535] but task is already holding lock:
[  109.882398]  (&(&nvmeq->q_lock)->rlock){-.....}, at: [<ffffffffc0820c2b>] nvme_irq+0x1b/0x50 [nvme]
[  109.891547]
[  109.891547] other info that might help us debug this:
[  109.898107]  Possible unsafe locking scenario:
[  109.898107]
[  109.904056]        CPU0
[  109.906515]        ----
[  109.908974]   lock(&(&nvmeq->q_lock)->rlock);
[  109.913381]   lock(&(&nvmeq->q_lock)->rlock);
[  109.917787]
[  109.917787]  *** DEADLOCK ***
[  109.917787]
[  109.923738]  May be due to missing lock nesting notation
[  109.923738]
[  109.930558] 1 lock held by swapper/0/0:
[  109.934413]  #0:  (&(&nvmeq->q_lock)->rlock){-.....}, at: [<ffffffffc0820c2b>] nvme_irq+0x1b/0x50 [nvme]
[  109.944010]
[  109.944010] stack backtrace:
[  109.948389] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G            E   4.5.0+ #180
[  109.955734] Hardware name: Dell Inc. OptiPlex 7010/0YXT71, BIOS A15 08/12/2013
[  109.962989]  0000000000000000 ffff88011e203c38 ffffffff81383d9c ffffffff81c13540
[  109.970478]  ffffffff826711d0 ffff88011e203ce8 ffffffff810bb429 0000000000000046
[  109.977964]  0000000000000046 0000000000000000 0000000000b2e597 ffffffff81f4cb00
[  109.985453] Call Trace:
[  109.987911]  <IRQ>  [<ffffffff81383d9c>] dump_stack+0x85/0xc9
[  109.993711]  [<ffffffff810bb429>] __lock_acquire+0x19b9/0x1c60
[  109.999575]  [<ffffffff810b6d1d>] ? trace_hardirqs_off+0xd/0x10
[  110.005524]  [<ffffffff810b386d>] ? complete+0x3d/0x50
[  110.010688]  [<ffffffff810bb760>] lock_acquire+0x90/0xf0
[  110.016029]  [<ffffffffc0820bc6>] ? nvme_del_cq_end+0x26/0x70 [nvme]
[  110.022418]  [<ffffffff81772afb>] _raw_spin_lock_irqsave+0x4b/0x60
[  110.028632]  [<ffffffffc0820bc6>] ? nvme_del_cq_end+0x26/0x70 [nvme]
[  110.035019]  [<ffffffffc0820bc6>] nvme_del_cq_end+0x26/0x70 [nvme]
[  110.041232]  [<ffffffff8135b485>] blk_mq_end_request+0x35/0x60
[  110.047095]  [<ffffffffc0821ad8>] nvme_complete_rq+0x68/0x190 [nvme]
[  110.053481]  [<ffffffff8135b53f>] __blk_mq_complete_request+0x8f/0x130
[  110.060043]  [<ffffffff8135b611>] blk_mq_complete_request+0x31/0x40
[  110.066343]  [<ffffffffc08209e3>] __nvme_process_cq+0x83/0x240 [nvme]
[  110.072818]  [<ffffffffc0820c35>] nvme_irq+0x25/0x50 [nvme]
[  110.078419]  [<ffffffff810cdb66>] handle_irq_event_percpu+0x36/0x110
[  110.084804]  [<ffffffff810cdc77>] handle_irq_event+0x37/0x60
[  110.090491]  [<ffffffff810d0ea3>] handle_edge_irq+0x93/0x150
[  110.096180]  [<ffffffff81012306>] handle_irq+0xa6/0x130
[  110.101431]  [<ffffffff81011abe>] do_IRQ+0x5e/0x120
[  110.106333]  [<ffffffff8177384c>] common_interrupt+0x8c/0x8c

Reviewed-by: Johannes Thumshirn <jthumshirn at suse.de>
Signed-off-by: Ming Lin <ming.l at ssi.samsung.com>
---
 drivers/nvme/host/pci.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

v2:
  - add a little comment
  - add Johannes's review tag

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index e578362..d141692 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1571,7 +1571,13 @@ static void nvme_del_cq_end(struct request *req, int error)
 	if (!error) {
 		unsigned long flags;
 
-		spin_lock_irqsave(&nvmeq->q_lock, flags);
+		/*
+		 * We might be called with the AQ q_lock held
+		 * and the I/O queue q_lock should always
+		 * nest inside the AQ one.
+		 */
+		spin_lock_irqsave_nested(&nvmeq->q_lock, flags,
+					SINGLE_DEPTH_NESTING);
 		nvme_process_cq(nvmeq);
 		spin_unlock_irqrestore(&nvmeq->q_lock, flags);
 	}
-- 
1.9.1

             reply	other threads:[~2016-04-05 17:32 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-04-05 17:32 Ming Lin [this message]
2016-04-05 17:33 ` [PATCH v2] nvme: add missing lock nesting notation Christoph Hellwig
2016-04-06  8:29 ` sagig
2016-04-12 19:01 ` Christoph Hellwig
2016-04-12 19:04   ` Jens Axboe
2016-04-12 19:08     ` Christoph Hellwig
2016-04-12 19:13       ` Jens Axboe
2016-04-12 19:15         ` Christoph Hellwig
2016-04-12 19:17           ` Jens Axboe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1459877524-15471-1-git-send-email-mlin@kernel.org \
    --to=mlin@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.