From: Hannes Reinecke <hare@kernel.org>
To: Christoph Hellwig <hch@lst.de>
Cc: Sagi Grimberg <sagi@grimberg.me>, Keith Busch <kbusch@kernel.org>,
linux-nvme@lists.infradead.org, Hannes Reinecke <hare@suse.de>
Subject: [PATCHv2 0/2] nvme-multipath: fix deadlock in device_add_disk()
Date: Tue, 8 Oct 2024 15:57:27 +0200
Message-ID: <20241008135729.68810-1-hare@kernel.org>
From: Hannes Reinecke <hare@suse.de>
Hi all,
I have a testcase which repeatedly disables namespaces on the target,
assigns a new UUID (to simulate namespace remapping), and enables the
namespace again.
To add to the fun, these namespaces also have their ANA group ID changed
to simulate namespaces moving around in a cluster, where only the paths
local to the cluster node are active and all other paths are inaccessible.
Essentially it's doing something like:
    echo 0 > ${ns}/enable
    <random delay>
    echo "<dev>" > ${ns}/device_path
    echo "<grpid>" > ${ns}/ana_grpid
    uuidgen > ${ns}/device_uuid
    echo 1 > ${ns}/enable
i.e. a testcase similar to the one from the previous patchset, only this
time I'm just toggling the 'enable' bit without removing the namespace
from the target.
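For anyone wanting to reproduce this, the sequence above could be wrapped
in a small helper along the following lines. This is only a sketch, not
the actual test script: the function name remap_ns, its arguments and the
delay bound are made up for illustration.

```shell
#!/bin/sh
# Sketch only: run one disable/remap/enable cycle for a single nvmet
# namespace. remap_ns and its arguments are hypothetical names.
#   $1: namespace configfs directory
#   $2: backing device path
#   $3: ANA group ID
#   $4: maximum random delay in seconds (optional, default 5)
remap_ns() {
    ns=$1
    dev=$2
    grpid=$3
    max_delay=${4:-5}

    echo 0 > "${ns}/enable"

    # Random delay between 0 and max_delay seconds (assumed bound).
    delay=$(awk -v max="$max_delay" \
        'BEGIN { srand(); print int(rand() * (max + 1)) }')
    sleep "$delay"

    echo "$dev" > "${ns}/device_path"
    echo "$grpid" > "${ns}/ana_grpid"
    uuidgen > "${ns}/device_uuid"
    echo 1 > "${ns}/enable"
}
```

It would then be called in a loop on the target, e.g.
'while true; do remap_ns "$ns" "$dev" "$grpid"; done', with $ns pointing
at the namespace directory in the nvmet configfs tree and $dev and $grpid
picked per iteration by the caller.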
This is causing lockups in device_add_disk(), as the partition scan is
constantly retrying I/O and never completes.
With this patchset, I/O errors during partition scan are no longer
retried; instead they cause nvme_mpath_set_live() to fail.
This allows us to retry nvme_mpath_set_live() on the next rescan
to fix up the situation.
As usual, comments and reviews are welcome.
Changes to the original submission:
- Drop patch to simplify the loop in nvme_update_ana_state()
- Rework patches to return I/O errors during partition scan
Hannes Reinecke (2):
  nvme: propagate I/O errors during partition scan
  nvme-multipath: retry partition scan on errors

 drivers/nvme/host/core.c      | 26 ++++++++++++++++++------
 drivers/nvme/host/multipath.c | 38 +++++++++++++++++++++++++++++++++++
 drivers/nvme/host/nvme.h      |  2 ++
 3 files changed, 60 insertions(+), 6 deletions(-)
--
2.35.3