All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org,
	Jack Morgenstein <jackm@dev.mellanox.co.il>,
	Matan Barak <matanb@mellanox.com>,
	Tariq Toukan <tariqt@mellanox.com>,
	"David S. Miller" <davem@davemloft.net>,
	Sumit Semwal <sumit.semwal@linaro.org>
Subject: [PATCH 4.4 16/18] net/mlx4_core: Fix racy CQ (Completion Queue) free
Date: Sun, 16 Apr 2017 10:02:39 +0200	[thread overview]
Message-ID: <20170416080213.522683010@linuxfoundation.org> (raw)
In-Reply-To: <20170416080212.713469777@linuxfoundation.org>

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jack Morgenstein <jackm@dev.mellanox.co.il>

commit 291c566a28910614ce42d0ffe82196eddd6346f4 upstream.

In function mlx4_cq_completion() and mlx4_cq_event(), the
radix_tree_lookup requires a rcu_read_lock.
This is mandatory: if another core frees the CQ, it could
run the radix_tree_node_rcu_free() call_rcu() callback while
its being used by the radix tree lookup function.

Additionally, in function mlx4_cq_event(), since we are adding
the rcu lock around the radix-tree lookup, we no longer need to take
the spinlock. Also, the synchronize_irq() call for the async event
eliminates the need for incrementing the cq reference count in
mlx4_cq_event().

Other changes:
1. In function mlx4_cq_free(), replace spin_lock_irq with spin_lock:
   we no longer take this spinlock in the interrupt context.
   The spinlock here, therefore, simply protects against different
   threads simultaneously invoking mlx4_cq_free() for different cq's.

2. In function mlx4_cq_free(), we move the radix tree delete to before
   the synchronize_irq() calls. This guarantees that we will not
   access this cq during any subsequent interrupts, and therefore can
   safely free the CQ after the synchronize_irq calls. The rcu_read_lock
   in the interrupt handlers only needs to protect against corrupting the
   radix tree; the interrupt handlers may access the cq outside the
   rcu_read_lock due to the synchronize_irq calls which protect against
   premature freeing of the cq.

3. In function mlx4_cq_event(), we change the mlx_warn message to mlx4_dbg.

4. We leave the cq reference count mechanism in place, because it is
   still needed for the cq completion tasklet mechanism.

Fixes: 6d90aa5cf17b ("net/mlx4_core: Make sure there are no pending async events when freeing CQ")
Fixes: 225c7b1feef1 ("IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/net/ethernet/mellanox/mlx4/cq.c |   38 ++++++++++++++++----------------
 1 file changed, 20 insertions(+), 18 deletions(-)

--- a/drivers/net/ethernet/mellanox/mlx4/cq.c
+++ b/drivers/net/ethernet/mellanox/mlx4/cq.c
@@ -101,13 +101,19 @@ void mlx4_cq_completion(struct mlx4_dev
 {
 	struct mlx4_cq *cq;
 
+	rcu_read_lock();
 	cq = radix_tree_lookup(&mlx4_priv(dev)->cq_table.tree,
 			       cqn & (dev->caps.num_cqs - 1));
+	rcu_read_unlock();
+
 	if (!cq) {
 		mlx4_dbg(dev, "Completion event for bogus CQ %08x\n", cqn);
 		return;
 	}
 
+	/* Acessing the CQ outside of rcu_read_lock is safe, because
+	 * the CQ is freed only after interrupt handling is completed.
+	 */
 	++cq->arm_sn;
 
 	cq->comp(cq);
@@ -118,23 +124,19 @@ void mlx4_cq_event(struct mlx4_dev *dev,
 	struct mlx4_cq_table *cq_table = &mlx4_priv(dev)->cq_table;
 	struct mlx4_cq *cq;
 
-	spin_lock(&cq_table->lock);
-
+	rcu_read_lock();
 	cq = radix_tree_lookup(&cq_table->tree, cqn & (dev->caps.num_cqs - 1));
-	if (cq)
-		atomic_inc(&cq->refcount);
-
-	spin_unlock(&cq_table->lock);
+	rcu_read_unlock();
 
 	if (!cq) {
-		mlx4_warn(dev, "Async event for bogus CQ %08x\n", cqn);
+		mlx4_dbg(dev, "Async event for bogus CQ %08x\n", cqn);
 		return;
 	}
 
+	/* Acessing the CQ outside of rcu_read_lock is safe, because
+	 * the CQ is freed only after interrupt handling is completed.
+	 */
 	cq->event(cq, event_type);
-
-	if (atomic_dec_and_test(&cq->refcount))
-		complete(&cq->free);
 }
 
 static int mlx4_SW2HW_CQ(struct mlx4_dev *dev, struct mlx4_cmd_mailbox *mailbox,
@@ -301,9 +303,9 @@ int mlx4_cq_alloc(struct mlx4_dev *dev,
 	if (err)
 		return err;
 
-	spin_lock_irq(&cq_table->lock);
+	spin_lock(&cq_table->lock);
 	err = radix_tree_insert(&cq_table->tree, cq->cqn, cq);
-	spin_unlock_irq(&cq_table->lock);
+	spin_unlock(&cq_table->lock);
 	if (err)
 		goto err_icm;
 
@@ -347,9 +349,9 @@ int mlx4_cq_alloc(struct mlx4_dev *dev,
 	return 0;
 
 err_radix:
-	spin_lock_irq(&cq_table->lock);
+	spin_lock(&cq_table->lock);
 	radix_tree_delete(&cq_table->tree, cq->cqn);
-	spin_unlock_irq(&cq_table->lock);
+	spin_unlock(&cq_table->lock);
 
 err_icm:
 	mlx4_cq_free_icm(dev, cq->cqn);
@@ -368,15 +370,15 @@ void mlx4_cq_free(struct mlx4_dev *dev,
 	if (err)
 		mlx4_warn(dev, "HW2SW_CQ failed (%d) for CQN %06x\n", err, cq->cqn);
 
+	spin_lock(&cq_table->lock);
+	radix_tree_delete(&cq_table->tree, cq->cqn);
+	spin_unlock(&cq_table->lock);
+
 	synchronize_irq(priv->eq_table.eq[MLX4_CQ_TO_EQ_VECTOR(cq->vector)].irq);
 	if (priv->eq_table.eq[MLX4_CQ_TO_EQ_VECTOR(cq->vector)].irq !=
 	    priv->eq_table.eq[MLX4_EQ_ASYNC].irq)
 		synchronize_irq(priv->eq_table.eq[MLX4_EQ_ASYNC].irq);
 
-	spin_lock_irq(&cq_table->lock);
-	radix_tree_delete(&cq_table->tree, cq->cqn);
-	spin_unlock_irq(&cq_table->lock);
-
 	if (atomic_dec_and_test(&cq->refcount))
 		complete(&cq->free);
 	wait_for_completion(&cq->free);

  parent reply	other threads:[~2017-04-16  8:21 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-04-16  8:02 [PATCH 4.4 00/18] 4.4.62-stable review Greg Kroah-Hartman
2017-04-16  8:02 ` [PATCH 4.4 02/18] drm/i915: Stop using RP_DOWN_EI on Baytrail Greg Kroah-Hartman
2017-04-16  8:02 ` [PATCH 4.4 03/18] usb: dwc3: gadget: delay unmap of bounced requests Greg Kroah-Hartman
2017-04-16  8:02 ` [PATCH 4.4 05/18] MIPS: Introduce irq_stack Greg Kroah-Hartman
2017-04-16  8:02 ` [PATCH 4.4 06/18] MIPS: Stack unwinding while on IRQ stack Greg Kroah-Hartman
2017-04-16  8:02 ` [PATCH 4.4 07/18] MIPS: Only change $28 to thread_info if coming from user mode Greg Kroah-Hartman
2017-04-16  8:02 ` [PATCH 4.4 08/18] MIPS: Switch to the irq_stack in interrupts Greg Kroah-Hartman
2017-04-16  8:02 ` [PATCH 4.4 09/18] MIPS: Select HAVE_IRQ_EXIT_ON_IRQ_STACK Greg Kroah-Hartman
2017-04-16  8:02 ` [PATCH 4.4 10/18] MIPS: IRQ Stack: Fix erroneous jal to plat_irq_dispatch Greg Kroah-Hartman
2017-04-16  8:02 ` [PATCH 4.4 12/18] net/packet: fix overflow in check for priv area size Greg Kroah-Hartman
2017-04-16  8:02 ` [PATCH 4.4 13/18] blk-mq: Avoid memory reclaim when remapping queues Greg Kroah-Hartman
2017-04-16  8:02 ` [PATCH 4.4 14/18] usb: hub: Wait for connection to be reestablished after port reset Greg Kroah-Hartman
2017-04-16  8:02 ` [PATCH 4.4 15/18] net/mlx4_en: Fix bad WQE issue Greg Kroah-Hartman
2017-04-16  8:02 ` Greg Kroah-Hartman [this message]
2017-04-16  8:02 ` [PATCH 4.4 17/18] net/mlx4_core: Fix when to save some qp context flags for dynamic VST to VGT transitions Greg Kroah-Hartman
2017-04-16  8:02 ` [PATCH 4.4 18/18] ibmveth: set correct gso_size and gso_type Greg Kroah-Hartman
2017-04-16 21:28 ` [PATCH 4.4 00/18] 4.4.62-stable review Guenter Roeck
2017-04-17 18:19 ` Shuah Khan
     [not found] ` <58f36a37.cc9bdf0a.86c97.bb8a@mx.google.com>
2017-04-17 18:48   ` Shuah Khan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170416080213.522683010@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=davem@davemloft.net \
    --cc=jackm@dev.mellanox.co.il \
    --cc=linux-kernel@vger.kernel.org \
    --cc=matanb@mellanox.com \
    --cc=stable@vger.kernel.org \
    --cc=sumit.semwal@linaro.org \
    --cc=tariqt@mellanox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.