From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Israel Rukshin <israelr@nvidia.com>,
Sasha Levin <sashal@kernel.org>,
linux-nvme@lists.infradead.org, Christoph Hellwig <hch@lst.de>,
Chao Leng <lengchao@huawei.com>
Subject: [PATCH AUTOSEL 5.10 23/41] nvme-rdma: avoid request double completion for concurrent nvme_rdma_timeout
Date: Fri, 29 Jan 2021 10:36:54 -0500
Message-ID: <20210129153713.1592185-23-sashal@kernel.org>
In-Reply-To: <20210129153713.1592185-1-sashal@kernel.org>
From: Chao Leng <lengchao@huawei.com>
[ Upstream commit 7674073b2ed35ac951a49c425dec6b39d5a57140 ]
A crash happens when request completion is delayed for a long time
(nearly 30s) by fault injection.
Each namespace has its own request queue. When completions are delayed
this long, multiple request queues may have timed-out requests at the
same time, so nvme_rdma_timeout executes concurrently. Requests from
different request queues may be queued on the same rdma queue, so
multiple nvme_rdma_timeout instances may call nvme_rdma_stop_queue at
the same time.
The first nvme_rdma_timeout clears NVME_RDMA_Q_LIVE and continues
stopping the rdma queue (draining the qp), but the others see that
NVME_RDMA_Q_LIVE is already cleared and complete their requests
directly. Completing a request before the qp is fully drained may lead
to a use-after-free condition.
Add a mutex to serialize nvme_rdma_stop_queue.
Signed-off-by: Chao Leng <lengchao@huawei.com>
Tested-by: Israel Rukshin <israelr@nvidia.com>
Reviewed-by: Israel Rukshin <israelr@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
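The serialization described in the commit message can be sketched in
userspace as follows. This is only an illustration under stated
assumptions, not the driver code: struct demo_queue and every demo_*
function are hypothetical names, pthreads and C11 atomics stand in for
the kernel mutex and bit operations, and a short sleep stands in for
draining the qp.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical userspace stand-in for struct nvme_rdma_queue. */
struct demo_queue {
	atomic_bool live;	/* plays the role of NVME_RDMA_Q_LIVE */
	atomic_bool drained;	/* set once the simulated drain has finished */
	atomic_int early;	/* handlers that returned before the drain ended */
	pthread_mutex_t lock;	/* plays the role of queue_lock */
};

/* Simulated slow teardown, standing in for __nvme_rdma_stop_queue(). */
static void demo_drain(struct demo_queue *q)
{
	struct timespec ts = { .tv_sec = 0, .tv_nsec = 10 * 1000 * 1000 };

	nanosleep(&ts, NULL);
	atomic_store(&q->drained, true);
}

/*
 * Serialized stop: a caller that loses the race blocks on the mutex
 * until the winner has finished draining, instead of returning early.
 */
static void demo_stop_queue(struct demo_queue *q)
{
	pthread_mutex_lock(&q->lock);
	if (atomic_exchange(&q->live, false))
		demo_drain(q);
	pthread_mutex_unlock(&q->lock);
}

/* Stand-in for a timeout handler: stop the queue, then "complete". */
static void *demo_timeout(void *arg)
{
	struct demo_queue *q = arg;

	demo_stop_queue(q);
	/* Completing here is only safe if the drain has already finished. */
	if (!atomic_load(&q->drained))
		atomic_fetch_add(&q->early, 1);
	return NULL;
}

/* Run up to 16 concurrent handlers; return how many completed too early. */
int demo_run(int nthreads)
{
	struct demo_queue q;
	pthread_t tids[16];
	int i, started = 0;

	if (nthreads > 16)
		nthreads = 16;
	atomic_init(&q.live, true);
	atomic_init(&q.drained, false);
	atomic_init(&q.early, 0);
	pthread_mutex_init(&q.lock, NULL);

	for (i = 0; i < nthreads; i++)
		if (pthread_create(&tids[started], NULL, demo_timeout, &q) == 0)
			started++;
	for (i = 0; i < started; i++)
		pthread_join(tids[i], NULL);

	pthread_mutex_destroy(&q.lock);
	return atomic_load(&q.early);
}
```

With the lock held around the test-and-clear and the drain, demo_run()
returns 0 for any thread count. If the lock/unlock pair is removed from
demo_stop_queue(), the losing handlers can return while the drain is
still in progress, which is the window the patch closes.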
drivers/nvme/host/rdma.c | 15 +++++++++++----
1 file changed, 11 insertions(+), 4 deletions(-)
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 65e3d0ef36e1a..493ed7ba86ed2 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -97,6 +97,7 @@ struct nvme_rdma_queue {
struct completion cm_done;
bool pi_support;
int cq_size;
+ struct mutex queue_lock;
};
struct nvme_rdma_ctrl {
@@ -579,6 +580,7 @@ static int nvme_rdma_alloc_queue(struct nvme_rdma_ctrl *ctrl,
int ret;
queue = &ctrl->queues[idx];
+ mutex_init(&queue->queue_lock);
queue->ctrl = ctrl;
if (idx && ctrl->ctrl.max_integrity_segments)
queue->pi_support = true;
@@ -598,7 +600,8 @@ static int nvme_rdma_alloc_queue(struct nvme_rdma_ctrl *ctrl,
if (IS_ERR(queue->cm_id)) {
dev_info(ctrl->ctrl.device,
"failed to create CM ID: %ld\n", PTR_ERR(queue->cm_id));
- return PTR_ERR(queue->cm_id);
+ ret = PTR_ERR(queue->cm_id);
+ goto out_destroy_mutex;
}
if (ctrl->ctrl.opts->mask & NVMF_OPT_HOST_TRADDR)
@@ -628,6 +631,8 @@ static int nvme_rdma_alloc_queue(struct nvme_rdma_ctrl *ctrl,
out_destroy_cm_id:
rdma_destroy_id(queue->cm_id);
nvme_rdma_destroy_queue_ib(queue);
+out_destroy_mutex:
+ mutex_destroy(&queue->queue_lock);
return ret;
}
@@ -639,9 +644,10 @@ static void __nvme_rdma_stop_queue(struct nvme_rdma_queue *queue)
static void nvme_rdma_stop_queue(struct nvme_rdma_queue *queue)
{
- if (!test_and_clear_bit(NVME_RDMA_Q_LIVE, &queue->flags))
- return;
- __nvme_rdma_stop_queue(queue);
+ mutex_lock(&queue->queue_lock);
+ if (test_and_clear_bit(NVME_RDMA_Q_LIVE, &queue->flags))
+ __nvme_rdma_stop_queue(queue);
+ mutex_unlock(&queue->queue_lock);
}
static void nvme_rdma_free_queue(struct nvme_rdma_queue *queue)
@@ -651,6 +657,7 @@ static void nvme_rdma_free_queue(struct nvme_rdma_queue *queue)
nvme_rdma_destroy_queue_ib(queue);
rdma_destroy_id(queue->cm_id);
+ mutex_destroy(&queue->queue_lock);
}
static void nvme_rdma_free_io_queues(struct nvme_rdma_ctrl *ctrl)
--
2.27.0