* [PATCH] nvme: fix nvme_ns_remove() deadlock
From: Ming Lin @ 2016-04-18 18:21 UTC

From: Ming Lin <ming.l@ssi.samsung.com>

On receipt of a namespace attribute changed AER, we acquire the
namespace mutex lock before proceeding to scan and validate the
namespace list. In case of namespace detach/delete command,
nvme_ns_remove function deadlocks trying to acquire the already held
lock.

All callers, except nvme_remove_namespaces(), of nvme_ns_remove()
already held namespaces_mutex. So we can simply fix the deadlock by
not acquiring the mutex in nvme_ns_remove() and acquiring it in
nvme_remove_namespaces().

Reported-by: Sunad Bhandary S <sunad.s at samsung.com>
Signed-off-by: Ming Lin <ming.l at ssi.samsung.com>
---
 drivers/nvme/host/core.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 7dd9905..ab00f3d 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1410,6 +1410,8 @@ static void nvme_alloc_ns(struct nvme_ctrl *ctrl, unsigned nsid)
 
 static void nvme_ns_remove(struct nvme_ns *ns)
 {
+	lockdep_assert_held(&ns->ctrl->namespaces_mutex);
+
 	if (test_and_set_bit(NVME_NS_REMOVING, &ns->flags))
 		return;
 
@@ -1422,9 +1424,7 @@ static void nvme_ns_remove(struct nvme_ns *ns)
 		blk_mq_abort_requeue_list(ns->queue);
 		blk_cleanup_queue(ns->queue);
 	}
-	mutex_lock(&ns->ctrl->namespaces_mutex);
 	list_del_init(&ns->list);
-	mutex_unlock(&ns->ctrl->namespaces_mutex);
 	nvme_put_ns(ns);
 }
 
@@ -1519,8 +1519,10 @@ void nvme_remove_namespaces(struct nvme_ctrl *ctrl)
 {
 	struct nvme_ns *ns, *next;
 
+	mutex_lock(&ctrl->namespaces_mutex);
 	list_for_each_entry_safe(ns, next, &ctrl->namespaces, list)
 		nvme_ns_remove(ns);
+	mutex_unlock(&ctrl->namespaces_mutex);
 }
 EXPORT_SYMBOL_GPL(nvme_remove_namespaces);
 
-- 
1.9.1
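
The self-deadlock described in the commit message can be illustrated outside the kernel. The following is only a minimal userspace sketch, assuming POSIX threads and an error-checking mutex standing in for the kernel's namespaces_mutex, so that a second lock attempt by the owner returns EDEADLK instead of hanging. The names demo_init(), ns_remove_locking() and scan_path() are hypothetical and do not appear in the driver.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for ctrl->namespaces_mutex. */
static pthread_mutex_t namespaces_mutex;

/*
 * Set the mutex up as PTHREAD_MUTEX_ERRORCHECK: relocking by the owner
 * then fails with EDEADLK rather than blocking forever.
 */
static int demo_init(void)
{
	pthread_mutexattr_t attr;
	int ret;

	ret = pthread_mutexattr_init(&attr);
	if (ret)
		return ret;
	ret = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	if (!ret)
		ret = pthread_mutex_init(&namespaces_mutex, &attr);
	pthread_mutexattr_destroy(&attr);
	return ret;
}

/* Pre-patch shape: the removal helper takes the lock itself. */
static int ns_remove_locking(void)
{
	int ret = pthread_mutex_lock(&namespaces_mutex);

	if (ret)
		return ret;	/* EDEADLK when the caller already holds it */
	/* the list unlink would happen here */
	pthread_mutex_unlock(&namespaces_mutex);
	return 0;
}

/* Scan path: already holds the lock when it calls the helper. */
static int scan_path(void)
{
	int ret;

	pthread_mutex_lock(&namespaces_mutex);
	ret = ns_remove_locking();
	pthread_mutex_unlock(&namespaces_mutex);
	return ret;
}
```

After demo_init(), calling scan_path() reports EDEADLK, while calling ns_remove_locking() directly with the lock not held succeeds. With the kernel's non-recursive mutex the first case blocks instead of returning an error.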
* [PATCH] nvme: fix nvme_ns_remove() deadlock
From: Keith Busch @ 2016-04-18 18:32 UTC

On Mon, Apr 18, 2016 at 11:21:43AM -0700, Ming Lin wrote:
> From: Ming Lin <ming.l at ssi.samsung.com>
>
> On receipt of a namespace attribute changed AER, we acquire the
> namespace mutex lock before proceeding to scan and validate the
> namespace list. In case of namespace detach/delete command,
> nvme_ns_remove function deadlocks trying to acquire the already held
> lock.
>
> All callers, except nvme_remove_namespaces(), of nvme_ns_remove()
> already held namespaces_mutex. So we can simply fix the deadlock by
> not acquiring the mutex in nvme_ns_remove() and acquiring it in
> nvme_remove_namespaces().

This fixes one deadlock by introducing another.

The reason we don't hold the namespace lock during nvme_ns_remove is
because del_gendisk blocks until all IO is flushed. If the controller
fails during this, you'll deadlock nvme_dev_disable as it tries to
recover.
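
The second deadlock raised here, holding the lock across a blocking flush while the recovery path needs that same lock, can be sketched in userspace as well. This is a simplified illustration under stated assumptions: the recovery path, which runs in a separate context in the driver, is collapsed into the same thread, and pthread_mutex_trylock() is used so the conflict shows up as EBUSY rather than a hang. recovery_stop_queues(), flush_io() and remove_all() are hypothetical names.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for ctrl->namespaces_mutex. */
static pthread_mutex_t namespaces_mutex = PTHREAD_MUTEX_INITIALIZER;

/*
 * Stand-in for the recovery path, which must take the namespaces lock
 * to stop the queues. Returns 0 if it could proceed, EBUSY otherwise.
 */
static int recovery_stop_queues(void)
{
	int ret = pthread_mutex_trylock(&namespaces_mutex);

	if (ret)
		return ret;
	pthread_mutex_unlock(&namespaces_mutex);
	return 0;
}

/*
 * Stand-in for the blocking flush: if the controller has failed, the
 * flush can only finish once recovery has run.
 */
static int flush_io(bool controller_failed)
{
	return controller_failed ? recovery_stop_queues() : 0;
}

/* Removal path, with or without the lock held across the flush. */
static int remove_all(bool hold_lock, bool controller_failed)
{
	int ret;

	if (hold_lock)
		pthread_mutex_lock(&namespaces_mutex);
	ret = flush_io(controller_failed);
	if (hold_lock)
		pthread_mutex_unlock(&namespaces_mutex);
	return ret;
}
```

Only the combination of holding the lock and a failed controller gets stuck in this sketch, which matches the objection: the lock is harmless until recovery needs it while the flush is still waiting.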
* [PATCH] nvme: fix nvme_ns_remove() deadlock
From: Christoph Hellwig @ 2016-04-18 20:37 UTC

On Mon, Apr 18, 2016 at 06:32:15PM +0000, Keith Busch wrote:
> This fixes one deadlock by introducing another.
>
> The reason we don't hold the namespace lock during nvme_ns_remove is
> because del_gendisk blocks until all IO is flushed. If the controller
> fails during this, you'll deadlock nvme_dev_disable as it tries to
> recover.

Should we switch to RCU freeing the namespace structure? If we do
that, nvme_start_queues, nvme_stop_queues and nvme_kill_queues would
be able to get away with only an RCU read-side critical section,
and Ming's patch should be fine on top of that.
* [PATCH] nvme: fix nvme_ns_remove() deadlock
From: Ming Lin @ 2016-04-18 22:56 UTC

On Mon, 2016-04-18 at 22:37 +0200, Christoph Hellwig wrote:
> On Mon, Apr 18, 2016 at 06:32:15PM +0000, Keith Busch wrote:
> > This fixes one deadlock by introducing another.
> >
> > The reason we don't hold the namespace lock during nvme_ns_remove is
> > because del_gendisk blocks until all IO is flushed. If the controller
> > fails during this, you'll deadlock nvme_dev_disable as it tries to
> > recover.
>
> Should we switch to RCU freeing the namespace structure? If we do
> that nvme_start_queues, nvme_stop_queues and nvme_kill_queues would
> be able to get away with only a RCU read side critical section,
> and Mings should be fine on top of that.

You mean below changes(un-tested)?

 drivers/nvme/host/core.c | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 97a15e1..6942b04 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1569,7 +1569,7 @@ static void nvme_alloc_ns(struct nvme_ctrl *ctrl, unsigned nsid)
 	if (nvme_revalidate_disk(ns->disk))
 		goto out_free_disk;
 
-	list_add_tail(&ns->list, &ctrl->namespaces);
+	list_add_tail_rcu(&ns->list, &ctrl->namespaces);
 	kref_get(&ctrl->kref);
 	if (ns->type == NVME_NS_LIGHTNVM)
 		return;
@@ -1607,6 +1607,7 @@ static void nvme_ns_remove(struct nvme_ns *ns)
 		blk_cleanup_queue(ns->queue);
 	}
 	list_del_init(&ns->list);
+	synchronize_rcu();
 	nvme_put_ns(ns);
 }
 
@@ -1819,8 +1820,8 @@ void nvme_kill_queues(struct nvme_ctrl *ctrl)
 {
 	struct nvme_ns *ns;
 
-	mutex_lock(&ctrl->namespaces_mutex);
-	list_for_each_entry(ns, &ctrl->namespaces, list) {
+	rcu_read_lock();
+	list_for_each_entry_rcu(ns, &ctrl->namespaces, list) {
 		if (!kref_get_unless_zero(&ns->kref))
 			continue;
 
@@ -1837,7 +1838,7 @@ void nvme_kill_queues(struct nvme_ctrl *ctrl)
 
 		nvme_put_ns(ns);
 	}
-	mutex_unlock(&ctrl->namespaces_mutex);
+	rcu_read_unlock();
 }
 EXPORT_SYMBOL_GPL(nvme_kill_queues);
 
@@ -1845,8 +1846,8 @@ void nvme_stop_queues(struct nvme_ctrl *ctrl)
 {
 	struct nvme_ns *ns;
 
-	mutex_lock(&ctrl->namespaces_mutex);
-	list_for_each_entry(ns, &ctrl->namespaces, list) {
+	rcu_read_lock();
+	list_for_each_entry_rcu(ns, &ctrl->namespaces, list) {
 		spin_lock_irq(ns->queue->queue_lock);
 		queue_flag_set(QUEUE_FLAG_STOPPED, ns->queue);
 		spin_unlock_irq(ns->queue->queue_lock);
@@ -1854,7 +1855,7 @@ void nvme_stop_queues(struct nvme_ctrl *ctrl)
 		blk_mq_cancel_requeue_work(ns->queue);
 		blk_mq_stop_hw_queues(ns->queue);
 	}
-	mutex_unlock(&ctrl->namespaces_mutex);
+	rcu_read_unlock();
 }
 EXPORT_SYMBOL_GPL(nvme_stop_queues);
 
@@ -1862,13 +1863,13 @@ void nvme_start_queues(struct nvme_ctrl *ctrl)
 {
 	struct nvme_ns *ns;
 
-	mutex_lock(&ctrl->namespaces_mutex);
-	list_for_each_entry(ns, &ctrl->namespaces, list) {
+	rcu_read_lock();
+	list_for_each_entry_rcu(ns, &ctrl->namespaces, list) {
 		queue_flag_clear_unlocked(QUEUE_FLAG_STOPPED, ns->queue);
 		blk_mq_start_stopped_hw_queues(ns->queue, true);
 		blk_mq_kick_requeue_list(ns->queue);
 	}
-	mutex_unlock(&ctrl->namespaces_mutex);
+	rcu_read_unlock();
 }
 EXPORT_SYMBOL_GPL(nvme_start_queues);
* [PATCH] nvme: fix nvme_ns_remove() deadlock
From: Christoph Hellwig @ 2016-04-19 18:25 UTC

On Mon, Apr 18, 2016 at 03:56:18PM -0700, Ming Lin wrote:
> You mean below changes(un-tested)?

Yes, that's exactly what I had in mind.