public inbox for linux-nvme@lists.infradead.org
From: Hannes Reinecke <hare@suse.de>
To: Christoph Hellwig <hch@lst.de>
Cc: linux-nvme@lists.infradead.org, Daniel Wagner <dwagner@suse.de>,
	Sagi Grimberg <sagi@grimberg.me>,
	Keith Busch <keith.busch@wdc.com>, Hannes Reinecke <hare@suse.de>
Subject: [PATCH 1/2] nvme: fixup kato deadlock
Date: Tue, 23 Feb 2021 13:07:27 +0100
Message-ID: <20210223120728.104699-2-hare@suse.de>
In-Reply-To: <20210223120728.104699-1-hare@suse.de>

A customer of ours has run into this deadlock with RDMA:
- The ka_work workqueue item is executed
- A new ka_work workqueue item is scheduled just after that
- Now both the kato request timeout _and_ the workqueue delay
  expire at roughly the same time
- If the timing is right, the workqueue item executes _before_
  the kato request timeout triggers
- The kato request timeout triggers and starts error recovery
- Error recovery deadlocks, as it needs to flush the kato
  workqueue item, which is stuck in nvme_alloc_request() because
  all reserved tags are in use

The reserved tags would have been freed up later when cancelling all
outstanding requests in the queue:

	nvme_stop_keep_alive(&ctrl->ctrl);
	nvme_rdma_teardown_io_queues(ctrl, false);
	nvme_start_queues(&ctrl->ctrl);
	nvme_rdma_teardown_admin_queue(ctrl, false);
	blk_mq_unquiesce_queue(ctrl->ctrl.admin_q);

but as we're stuck in nvme_stop_keep_alive() we'll never get this far.

To fix this, add a new controller flag 'NVME_CTRL_KATO_RUNNING'
which short-circuits nvme_keep_alive() if a keep-alive command
is already in flight.
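
The guard can be illustrated outside the kernel. What follows is only a
minimal userspace sketch, not the driver code: struct fake_ctrl,
fake_keep_alive() and fake_keep_alive_end_io() are hypothetical names,
C11 atomic_exchange() stands in for test_and_set_bit(), and a plain
counter stands in for the reserved tag pool. Unlike nvme_alloc_request(),
the model returns an error instead of sleeping when no tag is free.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct nvme_ctrl. */
struct fake_ctrl {
	atomic_bool kato_running;	/* models NVME_CTRL_KATO_RUNNING */
	int free_reserved_tags;		/* models the reserved tag pool */
};

/* Models nvme_keep_alive(): returns 0 when sent or when skipped. */
static int fake_keep_alive(struct fake_ctrl *c)
{
	/* Like test_and_set_bit(): set the flag and get the old value. */
	if (atomic_exchange(&c->kato_running, true))
		return 0;	/* a keep-alive is already in flight */

	if (c->free_reserved_tags == 0) {
		/* Allocation failed: drop the flag again. */
		atomic_store(&c->kato_running, false);
		return -EBUSY;
	}
	c->free_reserved_tags--;
	return 0;
}

/* Models nvme_keep_alive_end_io(): free the tag, then clear the flag. */
static void fake_keep_alive_end_io(struct fake_ctrl *c)
{
	assert(atomic_load(&c->kato_running));
	c->free_reserved_tags++;
	atomic_store(&c->kato_running, false);
}
```

With a single reserved tag, a second fake_keep_alive() call made before
fake_keep_alive_end_io() returns 0 without touching the tag pool, so the
caller never gets as far as waiting for a tag.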

Cc: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/nvme/host/core.c | 8 +++++++-
 drivers/nvme/host/nvme.h | 1 +
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index ea40a3c511da..9b8596eb4047 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1211,6 +1211,7 @@ static void nvme_keep_alive_end_io(struct request *rq, blk_status_t status)
 	bool startka = false;
 
 	blk_mq_free_request(rq);
+	clear_bit(NVME_CTRL_KATO_RUNNING, &ctrl->flags);
 
 	if (status) {
 		dev_err(ctrl->device,
@@ -1233,10 +1234,15 @@ static int nvme_keep_alive(struct nvme_ctrl *ctrl)
 {
 	struct request *rq;
 
+	if (test_and_set_bit(NVME_CTRL_KATO_RUNNING, &ctrl->flags))
+		return 0;
+
 	rq = nvme_alloc_request(ctrl->admin_q, &ctrl->ka_cmd,
 			BLK_MQ_REQ_RESERVED);
-	if (IS_ERR(rq))
+	if (IS_ERR(rq)) {
+		clear_bit(NVME_CTRL_KATO_RUNNING, &ctrl->flags);
 		return PTR_ERR(rq);
+	}
 
 	rq->timeout = ctrl->kato * HZ;
 	rq->end_io_data = ctrl;
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index e6efa085f08a..e00e3400c8b6 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -344,6 +344,7 @@ struct nvme_ctrl {
 	int nr_reconnects;
 	unsigned long flags;
 #define NVME_CTRL_FAILFAST_EXPIRED	0
+#define NVME_CTRL_KATO_RUNNING		1
 	struct nvmf_ctrl_options *opts;
 
 	struct page *discard_page;
-- 
2.29.2



Thread overview: 10+ messages
2021-02-23 12:07 [PATCH 0/2] nvme: sanitize KATO handling Hannes Reinecke
2021-02-23 12:07 ` Hannes Reinecke [this message]
2021-02-24 16:22   ` [PATCH 1/2] nvme: fixup kato deadlock Christoph Hellwig
2021-02-23 12:07 ` [PATCH 2/2] nvme: sanitize KATO setting Hannes Reinecke
2021-02-24 16:23   ` Christoph Hellwig
2021-02-24  6:42 ` [PATCH 0/2] nvme: sanitize KATO handling Chao Leng
2021-02-24  7:06   ` Hannes Reinecke
2021-02-24  7:20     ` Chao Leng
2021-02-24  7:27       ` Chao Leng
2021-02-24  7:59       ` Hannes Reinecke
