From: Balbir Singh <sblbir@amzn.com>
To: <linux-nvme@lists.infradead.org>
Cc: kbusch@kernel.org, axboe@fb.com, Balbir Singh <sblbir@amzn.com>,
hch@lst.de, sagi@grimberg.me
Subject: [PATCH v2 1/2] nvme/host/pci: Fix a race in controller removal
Date: Fri, 13 Sep 2019 23:36:30 +0000 [thread overview]
Message-ID: <20190913233631.15352-1-sblbir@amzn.com> (raw)
This race is hard to hit in general, now that we
have the shutdown_lock in both nvme_reset_work() and
nvme_dev_disable()
The real issue is that after doing all the setup work
in nvme_reset_work(), when get another timeout (nvme_timeout()),
then we proceed to disable the controller. This causes
the reset work to only partially progress and then fail.
Depending on the progress made, we call into
nvme_remove_dead_ctrl(), which does another
nvme_dev_disable() freezing the block-mq queues.
I've noticed a race with udevd with udevd trying to re-read
the partition table, it ends up with the bd_mutex held and
it ends up waiting in blk_queue_enter(), since we froze
the queues in nvme_dev_disable(). nvme_kill_queues() calls
revalidate_disk() and ends up waiting on the bd_mutex
resulting in a deadlock.
Allow the hung tasks a chance by unfreezing the queues after
setting dying bits on the queue, then call revalidate_disk()
to update the disk size.
NOTE: I've seen this race when the controller does not
respond to IOs or abort requests, but responds to other
commands and even signals it's ready after its reset,
but still drops IO. I've tested this by emulating the
behaviour in the driver.
Signed-off-by: Balbir Singh <sblbir@amzn.com>
---
Changelog:
- Rely on blk_set_queue_dying to do the wake_all()
drivers/nvme/host/core.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index b45f82d58be8..f6ddb58a7013 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -103,10 +103,16 @@ static void nvme_set_queue_dying(struct nvme_ns *ns)
*/
if (!ns->disk || test_and_set_bit(NVME_NS_DEAD, &ns->flags))
return;
- revalidate_disk(ns->disk);
blk_set_queue_dying(ns->queue);
/* Forcibly unquiesce queues to avoid blocking dispatch */
blk_mq_unquiesce_queue(ns->queue);
+ /*
+ * revalidate_disk, after all pending IO is cleaned up
+ * by blk_set_queue_dying, largely any races with blk parittion
+ * reads that might come in after freezing the queues, otherwise
+ * we'll end up waiting up on bd_mutex, creating a deadlock.
+ */
+ revalidate_disk(ns->disk);
}
static void nvme_queue_scan(struct nvme_ctrl *ctrl)
--
2.16.5
_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme
next reply other threads:[~2019-09-13 23:36 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-09-13 23:36 Balbir Singh [this message]
2019-09-13 23:36 ` [PATCH v2 2/2] nvme/host/core: Allow overriding of wait_ready timeout Balbir Singh
2019-09-16 7:41 ` Christoph Hellwig
2019-09-16 12:33 ` Singh, Balbir
2019-09-16 16:01 ` hch
2019-09-16 21:04 ` Singh, Balbir
2019-09-17 1:14 ` Keith Busch
2019-09-17 2:56 ` Singh, Balbir
2019-09-17 3:17 ` Bart Van Assche
2019-09-17 5:02 ` Singh, Balbir
2019-09-17 17:21 ` James Smart
2019-09-17 20:08 ` James Smart
2019-09-17 3:54 ` Keith Busch
2019-09-16 7:49 ` [PATCH v2 1/2] nvme/host/pci: Fix a race in controller removal Christoph Hellwig
2019-09-16 12:07 ` Singh, Balbir
2019-09-16 15:40 ` Bart Van Assche
2019-09-16 19:38 ` Singh, Balbir
2019-09-16 19:56 ` Bart Van Assche
2019-09-16 20:40 ` Singh, Balbir
2019-09-17 17:55 ` Bart Van Assche
2019-09-17 20:30 ` Keith Busch
2019-09-17 20:44 ` Singh, Balbir
2019-09-16 20:07 ` Keith Busch
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190913233631.15352-1-sblbir@amzn.com \
--to=sblbir@amzn.com \
--cc=axboe@fb.com \
--cc=hch@lst.de \
--cc=kbusch@kernel.org \
--cc=linux-nvme@lists.infradead.org \
--cc=sagi@grimberg.me \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox