From: Hannes Reinecke <hare@kernel.org>
To: Christoph Hellwig <hch@lst.de>
Cc: Sagi Grimberg <sagi@grimberg.me>, Keith Busch <kbusch@kernel.org>,
	linux-nvme@lists.infradead.org, Hannes Reinecke <hare@kernel.org>
Subject: [PATCH 2/3] nvme-multipath: cannot disconnect controller on stuck partition scan
Date: Mon,  7 Oct 2024 12:01:33 +0200
Message-ID: <20241007100134.21104-3-hare@kernel.org>
In-Reply-To: <20241007100134.21104-1-hare@kernel.org>

When a namespace's state changes during a partition scan triggered via
nvme_scan_ns()->nvme_mpath_set_live()->device_add_disk(), I/O might be
returned with a path error, causing it to be retried on other paths.
But if this happens to be the last path, the scanning process will be
stuck.
Trying to disconnect this controller will call

nvme_unquiesce_io_queues()
flush_work(&ctrl->scan_work)

where the first should abort/retry all I/O pending during the scan
so that the following flush_work() can succeed.
However, we explicitly do _not_ ignore paths from deleted controllers
in nvme_path_is_disabled(), so I/O on these devices will be
_retried_, not aborted, and the scanning process remains stuck.
The process disconnecting the controller is therefore stuck in
flush_work(), and that controller and all its namespaces become
unusable until the system is rebooted.

Fix this by treating a path as disabled when its controller is in
NVME_CTRL_DELETING state and the controller's scan_lock is held.

Fixes: ecca390e8056 ("nvme: fix deadlock in disconnect during scan_work and/or ana_work")
Signed-off-by: Hannes Reinecke <hare@kernel.org>
---
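For illustration only, the decision described above can be modelled in
user space. This is a minimal sketch, not the kernel code: every
model_* name is hypothetical, the boolean scan_lock_held stands in for
mutex_is_locked(&ns->ctrl->scan_lock), and the pre-patch rule (only LIVE
and DELETING controllers count as usable) is an assumption inferred from
the existing comment in nvme_path_is_disabled(). Any other checks the
real function performs are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
enum model_ctrl_state {
	MODEL_CTRL_LIVE,
	MODEL_CTRL_RESETTING,
	MODEL_CTRL_DELETING,
};

struct model_ctrl {
	enum model_ctrl_state state;
	bool scan_lock_held;	/* models mutex_is_locked(&ctrl->scan_lock) */
};

/*
 * Assumed pre-patch rule: a DELETING controller is still a usable path,
 * so I/O sent to it is retried rather than failed.
 */
static bool model_path_is_disabled_old(const struct model_ctrl *ctrl)
{
	return ctrl->state != MODEL_CTRL_LIVE &&
	       ctrl->state != MODEL_CTRL_DELETING;
}

/*
 * Rule with the added check: a DELETING controller whose scan lock is
 * held is reported as disabled; everything else follows the old rule.
 */
static bool model_path_is_disabled_new(const struct model_ctrl *ctrl)
{
	if (ctrl->state == MODEL_CTRL_DELETING && ctrl->scan_lock_held)
		return true;
	return model_path_is_disabled_old(ctrl);
}

/* Walk through the three cases the commit message is concerned with. */
static void model_self_check(void)
{
	struct model_ctrl c = { MODEL_CTRL_DELETING, true };

	/* Old rule: path stays enabled, so the scan I/O keeps retrying. */
	assert(!model_path_is_disabled_old(&c));
	/* New rule: path is disabled while the scan lock is held. */
	assert(model_path_is_disabled_new(&c));

	/* Without the scan lock, a DELETING path is still usable. */
	c.scan_lock_held = false;
	assert(!model_path_is_disabled_new(&c));
}
```

Calling model_self_check() from a small main() exercises the three
cases. Note that the model, like mutex_is_locked(), only records that
the lock is held by someone, not which task holds it.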
 drivers/nvme/host/multipath.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index 61f8ae199288..f03ef983a75f 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -239,6 +239,13 @@ static bool nvme_path_is_disabled(struct nvme_ns *ns)
 {
 	enum nvme_ctrl_state state = nvme_ctrl_state(ns->ctrl);
 
+	/*
+	 * Skip deleted controllers for I/O from partition scan
+	 */
+	if (state == NVME_CTRL_DELETING &&
+	    mutex_is_locked(&ns->ctrl->scan_lock))
+		return true;
+
 	/*
 	 * We don't treat NVME_CTRL_DELETING as a disabled path as I/O should
 	 * still be able to complete assuming that the controller is connected.
-- 
2.35.3


