From: Hannes Reinecke <hare@kernel.org>
To: Christoph Hellwig <hch@lst.de>
Cc: Sagi Grimberg <sagi@grimberg.me>, Keith Busch <kbusch@kernel.org>,
linux-nvme@lists.infradead.org, Hannes Reinecke <hare@kernel.org>
Subject: [PATCH 3/3] nvme: 'nvme disconnect' hangs after remapping namespaces
Date: Mon, 9 Sep 2024 09:19:30 +0200
Message-ID: <20240909071930.146343-4-hare@kernel.org>
In-Reply-To: <20240909071930.146343-1-hare@kernel.org>
During repeated namespace map and unmap operations on the target
(disabling the namespace, changing the UUID, enabling it again)
the initial scan hangs because the target keeps returning
PATH_ERROR and the I/O is retried indefinitely:
[<0>] folio_wait_bit_common+0x12a/0x310
[<0>] filemap_read_folio+0x97/0xd0
[<0>] do_read_cache_folio+0x108/0x390
[<0>] read_part_sector+0x31/0xa0
[<0>] read_lba+0xc5/0x160
[<0>] efi_partition+0xd9/0x8f0
[<0>] bdev_disk_changed+0x23d/0x6d0
[<0>] blkdev_get_whole+0x78/0xc0
[<0>] bdev_open+0x2c6/0x3b0
[<0>] bdev_file_open_by_dev+0xcb/0x120
[<0>] disk_scan_partitions+0x5d/0x100
[<0>] device_add_disk+0x402/0x420
[<0>] nvme_mpath_set_live+0x4f/0x1f0 [nvme_core]
[<0>] nvme_mpath_add_disk+0x107/0x120 [nvme_core]
[<0>] nvme_alloc_ns+0xac6/0xe60 [nvme_core]
[<0>] nvme_scan_ns+0x2dd/0x3e0 [nvme_core]
[<0>] nvme_scan_work+0x1a3/0x490 [nvme_core]
Calling 'nvme disconnect' on controllers with these namespaces
will hang as the disconnect operation tries to flush scan_work:
[<0>] __flush_work+0x389/0x4b0
[<0>] nvme_remove_namespaces+0x4b/0x130 [nvme_core]
[<0>] nvme_do_delete_ctrl+0x72/0x90 [nvme_core]
[<0>] nvme_delete_ctrl_sync+0x2e/0x40 [nvme_core]
[<0>] nvme_sysfs_delete+0x35/0x40 [nvme_core]
[<0>] kernfs_fop_write_iter+0x13d/0x1b0
[<0>] vfs_write+0x404/0x510
The flush happens before the namespaces are removed, and the
controller state DELETING_NOIO (which would abort any pending I/O)
is only set afterwards.
Call nvme_kick_requeue_lists() when a controller enters the
DELETING state so that all pending I/O is flushed, and disable
failover for any multipath command that completes with an error
afterwards, breaking the infinite retry loop.
Signed-off-by: Hannes Reinecke <hare@kernel.org>
---
drivers/nvme/host/core.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
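For illustration only, a minimal userspace model of the completion
decision touched by this patch. The types, the decide() and
submit_until_done() helpers and the 'patched' flag are hypothetical
stand-ins, not the kernel definitions, and the real
nvme_decide_disposition() has further checks (DNR, retry limit,
authentication) that are left out here. The sketch assumes a target
that fails every command with a path error, and that both FAILOVER
and RETRY lead to the command being submitted again:

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for the kernel types; NOT the real definitions. */
enum ctrl_state { CTRL_LIVE, CTRL_DELETING, CTRL_DELETING_NOIO };
enum disposition { COMPLETE, RETRY, FAILOVER };

struct model_req {
	int status;			/* 0 means success */
	bool path_error;		/* status is a path-related error */
	bool mpath;			/* issued via the multipath node */
	enum ctrl_state ctrl_state;	/* state of the owning controller */
};

/*
 * Follows the order of checks in the multipath branch of the diff.
 * 'patched' selects whether the new DELETING check is applied.
 */
static enum disposition decide(const struct model_req *req, bool patched)
{
	if (req->status == 0)
		return COMPLETE;

	if (req->mpath) {
		if (patched && req->ctrl_state == CTRL_DELETING)
			return COMPLETE;
		if (req->path_error)
			return FAILOVER;
	}
	return RETRY;
}

/*
 * Submit a request that always fails with a path error until it is
 * completed or max_rounds submissions were made. Returns the number
 * of submissions.
 */
int submit_until_done(enum ctrl_state state, bool patched, int max_rounds)
{
	struct model_req req = {
		.status = 1,
		.path_error = true,
		.mpath = true,
		.ctrl_state = state,
	};
	int rounds = 0;

	assert(max_rounds > 0);
	while (rounds < max_rounds) {
		rounds++;
		if (decide(&req, patched) == COMPLETE)
			break;
	}
	return rounds;
}
```

Under these assumptions submit_until_done(CTRL_DELETING, false, 1000)
only stops at the 1000-round cap, while
submit_until_done(CTRL_DELETING, true, 1000) stops after the first
submission. A controller that is still live keeps failing over in
both variants.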
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 651073280f6f..142babce1963 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -381,6 +381,8 @@ enum nvme_disposition {
 static inline enum nvme_disposition nvme_decide_disposition(struct request *req)
 {
+	struct nvme_ctrl *ctrl = nvme_req(req)->ctrl;
+
 	if (likely(nvme_req(req)->status == 0))
 		return COMPLETE;
@@ -393,6 +395,8 @@ static inline enum nvme_disposition nvme_decide_disposition(struct request *req)
 		return AUTHENTICATE;
 	if (req->cmd_flags & REQ_NVME_MPATH) {
+		if (nvme_ctrl_state(ctrl) == NVME_CTRL_DELETING)
+			return COMPLETE;
 		if (nvme_is_path_error(nvme_req(req)->status) ||
 		    blk_queue_dying(req->q))
 			return FAILOVER;
@@ -629,7 +633,8 @@ bool nvme_change_ctrl_state(struct nvme_ctrl *ctrl,
 	} else if (new_state == NVME_CTRL_CONNECTING &&
 		old_state == NVME_CTRL_RESETTING) {
 		nvme_start_failfast_work(ctrl);
-	}
+	} else if (new_state == NVME_CTRL_DELETING)
+		nvme_kick_requeue_lists(ctrl);
 	return changed;
 }
 EXPORT_SYMBOL_GPL(nvme_change_ctrl_state);
--
2.35.3