linux-nvme.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
From: Hannes Reinecke <hare@kernel.org>
To: Christoph Hellwig <hch@lst.de>
Cc: Sagi Grimberg <sagi@grimberg.me>, Keith Busch <kbusch@kernel.org>,
	linux-nvme@lists.infradead.org, Hannes Reinecke <hare@suse.de>,
	Hannes Reinecke <hare@kernel.org>
Subject: [PATCH 3/3] nvme-multipath: skip failed paths during partition scan
Date: Mon,  7 Oct 2024 12:01:34 +0200	[thread overview]
Message-ID: <20241007100134.21104-4-hare@kernel.org> (raw)
In-Reply-To: <20241007100134.21104-1-hare@kernel.org>

From: Hannes Reinecke <hare@suse.de>

When an I/O error is encountered during scanning (ie when the
scan_lock is held) we should avoid using this path until scanning
is finished to avoid deadlocks with device_add_disk().
So set a new flag NVME_NS_SCAN_FAILED if a failover happened during
scanning, and skip this path in nvme_available_paths().
Then we can check if that bit is set after device_add_disk() returned,
and remove the disk again if no available paths are found.
That allows the device to be recreated via the 'rescan' sysfs attribute
once no I/O errors occur anymore.

Signed-off-by: Hannes Reinecke <hare@kernel.org>
---
 drivers/nvme/host/multipath.c | 26 ++++++++++++++++++++++++++
 drivers/nvme/host/nvme.h      |  1 +
 2 files changed, 27 insertions(+)

diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index f03ef983a75f..4113d38606a4 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -102,6 +102,13 @@ void nvme_failover_req(struct request *req)
 		queue_work(nvme_wq, &ns->ctrl->ana_work);
 	}
 
+	/*
+	 * Do not use this path during scanning
+	 * to avoid deadlocks in device_add_disk()
+	 */
+	if (mutex_is_locked(&ns->ctrl->scan_lock))
+		set_bit(NVME_NS_SCAN_FAILED, &ns->flags);
+
 	spin_lock_irqsave(&ns->head->requeue_lock, flags);
 	for (bio = req->bio; bio; bio = bio->bi_next) {
 		bio_set_dev(bio, ns->head->disk->part0);
@@ -434,6 +441,10 @@ static bool nvme_available_path(struct nvme_ns_head *head)
 	list_for_each_entry_rcu(ns, &head->list, siblings) {
 		if (test_bit(NVME_CTRL_FAILFAST_EXPIRED, &ns->ctrl->flags))
 			continue;
+		if (test_bit(NVME_NS_SCAN_FAILED, &ns->flags) &&
+		    mutex_is_locked(&ns->ctrl->scan_lock))
+			continue;
+
 		switch (nvme_ctrl_state(ns->ctrl)) {
 		case NVME_CTRL_LIVE:
 		case NVME_CTRL_RESETTING:
@@ -659,6 +670,20 @@ static void nvme_mpath_set_live(struct nvme_ns *ns)
 			clear_bit(NVME_NSHEAD_DISK_LIVE, &head->flags);
 			return;
 		}
+		/*
+		 * If there is no available path and NVME_NS_SCAN_FAILED is
+		 * set an error occurred during partition scan triggered
+		 * by device_add_disk(), and the disk is most certainly
+		 * not live.
+		 */
+		if (!nvme_available_path(head) &&
+		    test_and_clear_bit(NVME_NS_SCAN_FAILED, &ns->flags)) {
+			dev_dbg(ns->ctrl->device, "delete gendisk for nsid %d\n",
+				head->ns_id);
+			clear_bit(NVME_NSHEAD_DISK_LIVE, &head->flags);
+			del_gendisk(head->disk);
+			return;
+		}
 		nvme_add_ns_head_cdev(head);
 	}
 
@@ -732,6 +757,7 @@ static void nvme_update_ns_ana_state(struct nvme_ana_group_desc *desc,
 	ns->ana_grpid = le32_to_cpu(desc->grpid);
 	ns->ana_state = desc->state;
 	clear_bit(NVME_NS_ANA_PENDING, &ns->flags);
+	clear_bit(NVME_NS_SCAN_FAILED, &ns->flags);
 	/*
 	 * nvme_mpath_set_live() will trigger I/O to the multipath path device
 	 * and in turn to this path device.  However we cannot accept this I/O
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 50515ad0f9d6..a4f99873ecb7 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -527,6 +527,7 @@ struct nvme_ns {
 #define NVME_NS_ANA_PENDING	2
 #define NVME_NS_FORCE_RO	3
 #define NVME_NS_READY		4
+#define NVME_NS_SCAN_FAILED	5
 
 	struct cdev		cdev;
 	struct device		cdev_device;
-- 
2.35.3



  parent reply	other threads:[~2024-10-07 10:02 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-10-07 10:01 [PATCH 0/3] nvme-multipath: fix deadlock in device_add_disk() Hannes Reinecke
2024-10-07 10:01 ` [PATCH 1/3] nvme-multipath: simplify loop in nvme_update_ana_state() Hannes Reinecke
2024-10-07 15:46   ` Keith Busch
2024-10-08  6:39     ` Christoph Hellwig
2024-10-20 23:33   ` Sagi Grimberg
2024-10-07 10:01 ` [PATCH 2/3] nvme-multipath: cannot disconnect controller on stuck partition scan Hannes Reinecke
2024-10-07 18:19   ` Keith Busch
2024-10-08  6:43     ` Christoph Hellwig
2024-10-08  7:17     ` Hannes Reinecke
2024-10-08 20:41       ` Keith Busch
2024-10-09  6:23         ` Hannes Reinecke
2024-10-09 16:33           ` Keith Busch
2024-10-09 17:10           ` Keith Busch
2024-10-09 17:32           ` Keith Busch
2024-10-10  6:16             ` Hannes Reinecke
2024-10-10  7:18               ` Hannes Reinecke
2024-10-10  8:17             ` Christoph Hellwig
2024-10-10  8:57             ` Hannes Reinecke
2024-10-15 14:33               ` Keith Busch
2024-10-15 14:56                 ` Hannes Reinecke
2024-10-15 15:10                   ` Keith Busch
2024-10-20 23:37                     ` Sagi Grimberg
2024-10-07 10:01 ` Hannes Reinecke [this message]
2024-10-08  6:40   ` [PATCH 3/3] nvme-multipath: skip failed paths during " Christoph Hellwig
2024-10-20 23:38     ` Sagi Grimberg

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20241007100134.21104-4-hare@kernel.org \
    --to=hare@kernel.org \
    --cc=hare@suse.de \
    --cc=hch@lst.de \
    --cc=kbusch@kernel.org \
    --cc=linux-nvme@lists.infradead.org \
    --cc=sagi@grimberg.me \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).