All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] nvme: fix possible io failures when removing multipathed ns
@ 2019-06-10 21:50 Sagi Grimberg
  2019-06-20  8:56 ` Christoph Hellwig
  0 siblings, 1 reply; 2+ messages in thread
From: Sagi Grimberg @ 2019-06-10 21:50 UTC (permalink / raw)


From: Anton Eidelman <anton@lightbitslabs.com>

When a shared namespace is removed, we call blk_cleanup_queue()
when the device can still be accessed as the current path and this can
result in submission to a dying queue. Hence, direct_make_request()
called by our mpath device may fail (propagating the failure to userspace).
Instead, we want to failover this I/O to a different path if one exists.
Thus, before we cleanup the request queue, we make sure that the device is
cleared from the current path nor it can be selected again as such.

Fix this by:
- clear the ns from the head->list and synchronize rcu to make sure there is
  no concurrent path search that restores it as the current path
- clear the mpath current path in order to trigger a subsequent path search
  and sync srcu to wait for any ongoing request submissions
- safely continue to namespace removal and blk_cleanup_queue

Signed-off-by: Anton Eidelman <anton at lightbitslabs.com>
Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
---
 drivers/nvme/host/core.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index b4e69a3fb845..024fb219de17 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -3365,6 +3365,14 @@ static void nvme_ns_remove(struct nvme_ns *ns)
 		return;
 
 	nvme_fault_inject_fini(ns);
+
+	mutex_lock(&ns->ctrl->subsys->lock);
+	list_del_rcu(&ns->siblings);
+	mutex_unlock(&ns->ctrl->subsys->lock);
+	synchronize_rcu(); /* guarantee not available in head->list */
+	nvme_mpath_clear_current_path(ns);
+	synchronize_srcu(&ns->head->srcu); /* wait for concurrent submissions */
+
 	if (ns->disk && ns->disk->flags & GENHD_FL_UP) {
 		del_gendisk(ns->disk);
 		blk_cleanup_queue(ns->queue);
@@ -3372,16 +3380,10 @@ static void nvme_ns_remove(struct nvme_ns *ns)
 			blk_integrity_unregister(ns->disk);
 	}
 
-	mutex_lock(&ns->ctrl->subsys->lock);
-	list_del_rcu(&ns->siblings);
-	nvme_mpath_clear_current_path(ns);
-	mutex_unlock(&ns->ctrl->subsys->lock);
-
 	down_write(&ns->ctrl->namespaces_rwsem);
 	list_del_init(&ns->list);
 	up_write(&ns->ctrl->namespaces_rwsem);
 
-	synchronize_srcu(&ns->head->srcu);
 	nvme_mpath_check_last_path(ns);
 	nvme_put_ns(ns);
 }
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 2+ messages in thread

* [PATCH] nvme: fix possible io failures when removing multipathed ns
  2019-06-10 21:50 [PATCH] nvme: fix possible io failures when removing multipathed ns Sagi Grimberg
@ 2019-06-20  8:56 ` Christoph Hellwig
  0 siblings, 0 replies; 2+ messages in thread
From: Christoph Hellwig @ 2019-06-20  8:56 UTC (permalink / raw)


Thanks,

applied to nvme-5.3.

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2019-06-20  8:56 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2019-06-10 21:50 [PATCH] nvme: fix possible io failures when removing multipathed ns Sagi Grimberg
2019-06-20  8:56 ` Christoph Hellwig

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.