[PATCH 2/2] nvme: make keep-alive synchronous operation

From: Nilay Shroff <nilay@linux.ibm.com>
To: linux-nvme@lists.infradead.org
Cc: kbusch@kernel.org, hch@lst.de, sagi@grimberg.me, axboe@fb.com,
	chaitanyak@nvidia.com, gjoyce@linux.ibm.com,
	Nilay Shroff <nilay@linux.ibm.com>
Subject: [PATCH 2/2] nvme: make keep-alive synchronous operation
Date: Fri,  4 Oct 2024 17:16:57 +0530
Message-ID: <20241004114711.780809-3-nilay@linux.ibm.com>
In-Reply-To: <20241004114711.780809-1-nilay@linux.ibm.com>

The nvme keep-alive operation, which runs at a periodic interval, can
sneak in while a fabric controller is being shut down. This may lead to
a race between the fabric controller admin queue destroy code path
(invoked while shutting down the controller) and the blk-mq hw/hctx
queuing from the keep-alive thread.

Avoid this race by making keep-alive a synchronous operation, so that
the admin queue-usage reference counter is decremented only after the
keep-alive command finishes execution and returns its status. This
ensures that the fabric admin queue is not inadvertently destroyed
before the keep-alive request and its status have been fully processed;
only after that is it safe to delete the queue.

Also, while we are at it, instead of first acquiring the ctrl lock and
then accessing the NVMe controller state, let's use the helper function
nvme_ctrl_state() in nvme_keep_alive_end_io() (renamed to
nvme_keep_alive_finish() by this patch) and get rid of the lock.

Signed-off-by: Nilay Shroff <nilay@linux.ibm.com>
---
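Note for reviewers (not part of the commit message): the ordering
argument above can be illustrated with a small userspace model. This is
only a sketch under stated assumptions, not kernel code: struct
model_queue and every model_*() function are hypothetical names invented
for this note, the usage field merely stands in for the admin queue-usage
reference, and the race is replayed single-threaded with the teardown
attempt placed at the worst possible moment.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the admin queue; not a kernel structure. */
struct model_queue {
	int usage;			/* models the queue-usage reference count */
	bool destroyed;			/* set once teardown has destroyed the queue */
	bool touched_after_destroy;	/* set if the queue is used after destroy */
};

static void model_get(struct model_queue *q)
{
	q->usage++;
}

static void model_put(struct model_queue *q)
{
	assert(q->usage > 0);
	q->usage--;
}

/* Teardown may destroy the queue only when no reference is held. */
static bool model_try_destroy(struct model_queue *q)
{
	if (q->usage > 0)
		return false;
	q->destroyed = true;
	return true;
}

/* Models any further keep-alive processing that still needs the queue. */
static void model_touch_queue(struct model_queue *q)
{
	if (q->destroyed)
		q->touched_after_destroy = true;
}

/*
 * Asynchronous ordering (assumed simplification of the old code): the
 * reference is dropped before keep-alive processing has finished, so a
 * concurrent shutdown can destroy the queue in between.
 */
static bool model_async_order(struct model_queue *q)
{
	model_get(q);
	model_put(q);		/* request freed first */
	model_try_destroy(q);	/* shutdown sneaks in here and succeeds */
	model_touch_queue(q);	/* queue used after destroy */
	return q->touched_after_destroy;
}

/*
 * Synchronous ordering (assumed simplification of the new code): the
 * reference is held until processing is complete, so the same shutdown
 * attempt is refused.
 */
static bool model_sync_order(struct model_queue *q)
{
	model_get(q);
	model_try_destroy(q);	/* refused: reference still held */
	model_touch_queue(q);	/* queue is still valid */
	model_put(q);		/* request freed last */
	return q->touched_after_destroy;
}
```

Calling model_async_order() on a zero-initialized struct model_queue
reports a use after destroy, while model_sync_order() does not, and a
later model_try_destroy() on that queue then succeeds. The real race
involves concurrent threads and the blk-mq hctx, which this model
deliberately leaves out.
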
 drivers/nvme/host/core.c | 25 ++++++++++---------------
 1 file changed, 10 insertions(+), 15 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 02897f0564a3..5a690cf16e5e 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1292,14 +1292,14 @@ static void nvme_queue_keep_alive_work(struct nvme_ctrl *ctrl)
 	queue_delayed_work(nvme_wq, &ctrl->ka_work, delay);
 }
 
-static enum rq_end_io_ret nvme_keep_alive_end_io(struct request *rq,
-						 blk_status_t status)
+static void nvme_keep_alive_finish(struct request *rq,
+				blk_status_t status,
+				struct nvme_ctrl *ctrl)
 {
-	struct nvme_ctrl *ctrl = rq->end_io_data;
-	unsigned long flags;
 	bool startka = false;
 	unsigned long rtt = jiffies - (rq->deadline - rq->timeout);
 	unsigned long delay = nvme_keep_alive_work_period(ctrl);
+	enum nvme_ctrl_state state = nvme_ctrl_state(ctrl);
 
 	/*
 	 * Subtract off the keepalive RTT so nvme_keep_alive_work runs
@@ -1313,25 +1313,19 @@ static enum rq_end_io_ret nvme_keep_alive_end_io(struct request *rq,
 		delay = 0;
 	}
 
-	blk_mq_free_request(rq);
-
 	if (status) {
 		dev_err(ctrl->device,
 			"failed nvme_keep_alive_end_io error=%d\n",
 				status);
-		return RQ_END_IO_NONE;
+		return;
 	}
 
 	ctrl->ka_last_check_time = jiffies;
 	ctrl->comp_seen = false;
-	spin_lock_irqsave(&ctrl->lock, flags);
-	if (ctrl->state == NVME_CTRL_LIVE ||
-	    ctrl->state == NVME_CTRL_CONNECTING)
+	if (state == NVME_CTRL_LIVE || state == NVME_CTRL_CONNECTING)
 		startka = true;
-	spin_unlock_irqrestore(&ctrl->lock, flags);
 	if (startka)
 		queue_delayed_work(nvme_wq, &ctrl->ka_work, delay);
-	return RQ_END_IO_NONE;
 }
 
 static void nvme_keep_alive_work(struct work_struct *work)
@@ -1340,6 +1334,7 @@ static void nvme_keep_alive_work(struct work_struct *work)
 			struct nvme_ctrl, ka_work);
 	bool comp_seen = ctrl->comp_seen;
 	struct request *rq;
+	blk_status_t status;
 
 	ctrl->ka_last_check_time = jiffies;
 
@@ -1362,9 +1357,9 @@ static void nvme_keep_alive_work(struct work_struct *work)
 	nvme_init_request(rq, &ctrl->ka_cmd);
 
 	rq->timeout = ctrl->kato * HZ;
-	rq->end_io = nvme_keep_alive_end_io;
-	rq->end_io_data = ctrl;
-	blk_execute_rq_nowait(rq, false);
+	status = blk_execute_rq(rq, false);
+	nvme_keep_alive_finish(rq, status, ctrl);
+	blk_mq_free_request(rq);
 }
 
 static void nvme_start_keep_alive(struct nvme_ctrl *ctrl)
-- 
2.45.2

Thread overview: 6+ messages
2024-10-04 11:46 [PATCH 0/2] nvme: system fault while shutting down fabric controller Nilay Shroff
2024-10-04 11:46 ` [PATCH 1/2] nvme-loop: flush off pending I/O while shutting down loop controller Nilay Shroff
2024-10-07  6:37   ` Christoph Hellwig
2024-10-04 11:46 ` Nilay Shroff [this message]
2024-10-07  6:41   ` [PATCH 2/2] nvme: make keep-alive synchronous operation Christoph Hellwig
2024-10-07  7:55     ` Nilay Shroff
