From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Chao Leng <lengchao@huawei.com>,
	Israel Rukshin <israelr@nvidia.com>,
	Christoph Hellwig <hch@lst.de>, Sasha Levin <sashal@kernel.org>
Subject: [PATCH 5.10 40/57] nvme-rdma: avoid request double completion for concurrent nvme_rdma_timeout
Date: Fri,  5 Feb 2021 15:07:06 +0100
Message-ID: <20210205140657.694285324@linuxfoundation.org>
In-Reply-To: <20210205140655.982616732@linuxfoundation.org>

From: Chao Leng <lengchao@huawei.com>

[ Upstream commit 7674073b2ed35ac951a49c425dec6b39d5a57140 ]

A crash occurs when request completion is delayed by fault injection for
a long time (nearly 30s). Each namespace has its own request queue, so
when completions are delayed that long, multiple request queues may have
timed-out requests at the same time and nvme_rdma_timeout will execute
concurrently. Requests from different request queues may be queued on
the same rdma queue, so multiple nvme_rdma_timeout instances may call
nvme_rdma_stop_queue at the same time. The first nvme_rdma_timeout
clears NVME_RDMA_Q_LIVE and continues stopping the rdma queue (draining
the qp), but the others see that NVME_RDMA_Q_LIVE is already cleared and
complete their requests directly. Completing a request before the qp is
fully drained may lead to a use-after-free condition.

Add a mutex to serialize nvme_rdma_stop_queue.

Signed-off-by: Chao Leng <lengchao@huawei.com>
Tested-by: Israel Rukshin <israelr@nvidia.com>
Reviewed-by: Israel Rukshin <israelr@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
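A minimal userspace sketch of the serialization described in the
changelog, assuming pthreads and C11 atomics in place of the kernel's
struct mutex and test_and_clear_bit(). All demo_* names are hypothetical
and do not exist in drivers/nvme/host/rdma.c; the sched_yield() call only
stands in for the time spent draining the qp.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

#define DEMO_MAX_THREADS 16

/* Hypothetical userspace stand-in for struct nvme_rdma_queue. */
struct demo_queue {
	pthread_mutex_t	queue_lock;	/* serializes demo_stop_queue() */
	atomic_bool	live;		/* models NVME_RDMA_Q_LIVE */
	atomic_bool	drained;	/* set once the teardown has finished */
	atomic_int	teardown_runs;	/* times the teardown body ran */
	atomic_int	early_completions; /* completions seen before drain */
};

/* Same shape as the patched nvme_rdma_stop_queue(). */
static void demo_stop_queue(struct demo_queue *q)
{
	pthread_mutex_lock(&q->queue_lock);
	if (atomic_exchange(&q->live, false)) {
		/* Stand-in for __nvme_rdma_stop_queue() draining the qp. */
		sched_yield();
		atomic_fetch_add(&q->teardown_runs, 1);
		atomic_store(&q->drained, true);
	}
	pthread_mutex_unlock(&q->queue_lock);
}

/* Models one timeout handler: stop the queue, then complete the request. */
static void *demo_timeout(void *arg)
{
	struct demo_queue *q = arg;

	demo_stop_queue(q);
	/* Completing before the drain finished is the bug being modeled. */
	if (!atomic_load(&q->drained))
		atomic_fetch_add(&q->early_completions, 1);
	return NULL;
}

/*
 * Run nthreads concurrent handlers against one queue. Returns the number
 * of early completions, or -1 if a thread could not be created or the
 * teardown did not run exactly once.
 */
static int demo_run(int nthreads)
{
	struct demo_queue q;
	pthread_t tids[DEMO_MAX_THREADS];
	int i, started = 0;

	assert(nthreads > 0 && nthreads <= DEMO_MAX_THREADS);
	pthread_mutex_init(&q.queue_lock, NULL);
	atomic_init(&q.live, true);
	atomic_init(&q.drained, false);
	atomic_init(&q.teardown_runs, 0);
	atomic_init(&q.early_completions, 0);

	for (i = 0; i < nthreads; i++) {
		if (pthread_create(&tids[i], NULL, demo_timeout, &q) != 0)
			break;
		started++;
	}
	for (i = 0; i < started; i++)
		pthread_join(tids[i], NULL);
	pthread_mutex_destroy(&q.queue_lock);

	if (started != nthreads || atomic_load(&q.teardown_runs) != 1)
		return -1;
	return atomic_load(&q.early_completions);
}
```

With the lock held across both the flag test and the teardown,
demo_run(8) is expected to return 0: a handler that finds the flag
already cleared can only get that far after the first handler has
finished draining and released the lock.
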
 drivers/nvme/host/rdma.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 65e3d0ef36e1a..493ed7ba86ed2 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -97,6 +97,7 @@ struct nvme_rdma_queue {
 	struct completion	cm_done;
 	bool			pi_support;
 	int			cq_size;
+	struct mutex		queue_lock;
 };
 
 struct nvme_rdma_ctrl {
@@ -579,6 +580,7 @@ static int nvme_rdma_alloc_queue(struct nvme_rdma_ctrl *ctrl,
 	int ret;
 
 	queue = &ctrl->queues[idx];
+	mutex_init(&queue->queue_lock);
 	queue->ctrl = ctrl;
 	if (idx && ctrl->ctrl.max_integrity_segments)
 		queue->pi_support = true;
@@ -598,7 +600,8 @@ static int nvme_rdma_alloc_queue(struct nvme_rdma_ctrl *ctrl,
 	if (IS_ERR(queue->cm_id)) {
 		dev_info(ctrl->ctrl.device,
 			"failed to create CM ID: %ld\n", PTR_ERR(queue->cm_id));
-		return PTR_ERR(queue->cm_id);
+		ret = PTR_ERR(queue->cm_id);
+		goto out_destroy_mutex;
 	}
 
 	if (ctrl->ctrl.opts->mask & NVMF_OPT_HOST_TRADDR)
@@ -628,6 +631,8 @@ static int nvme_rdma_alloc_queue(struct nvme_rdma_ctrl *ctrl,
 out_destroy_cm_id:
 	rdma_destroy_id(queue->cm_id);
 	nvme_rdma_destroy_queue_ib(queue);
+out_destroy_mutex:
+	mutex_destroy(&queue->queue_lock);
 	return ret;
 }
 
@@ -639,9 +644,10 @@ static void __nvme_rdma_stop_queue(struct nvme_rdma_queue *queue)
 
 static void nvme_rdma_stop_queue(struct nvme_rdma_queue *queue)
 {
-	if (!test_and_clear_bit(NVME_RDMA_Q_LIVE, &queue->flags))
-		return;
-	__nvme_rdma_stop_queue(queue);
+	mutex_lock(&queue->queue_lock);
+	if (test_and_clear_bit(NVME_RDMA_Q_LIVE, &queue->flags))
+		__nvme_rdma_stop_queue(queue);
+	mutex_unlock(&queue->queue_lock);
 }
 
 static void nvme_rdma_free_queue(struct nvme_rdma_queue *queue)
@@ -651,6 +657,7 @@ static void nvme_rdma_free_queue(struct nvme_rdma_queue *queue)
 
 	nvme_rdma_destroy_queue_ib(queue);
 	rdma_destroy_id(queue->cm_id);
+	mutex_destroy(&queue->queue_lock);
 }
 
 static void nvme_rdma_free_io_queues(struct nvme_rdma_ctrl *ctrl)
-- 
2.27.0



