public inbox for linux-rdma@vger.kernel.org
From: Prathamesh Deshpande <prathameshdeshpande7@gmail.com>
To: linux-rdma@vger.kernel.org
Cc: prathameshdeshpande7@gmail.com, dledford@redhat.com,
	haggaie@mellanox.com, jgg@ziepe.ca, leon@kernel.org,
	linux-kernel@vger.kernel.org
Subject: [PATCH v8 2/2] IB/mlx5: Fix loopback refcounting leaks and premature disable
Date: Sun,  5 Apr 2026 22:57:18 +0100
Message-ID: <20260405215718.19301-3-prathameshdeshpande7@gmail.com>
In-Reply-To: <20260405215718.19301-1-prathameshdeshpande7@gmail.com>

Update mlx5_ib_enable_lb() and mlx5_ib_disable_lb() to ensure
symmetric updates of user_td and qps counters.

Software state leaks can occur because the force_enable flag is checked
before the counters are updated: while the flag is set, both functions
return early and leave user_td and qps untouched. Furthermore, the
original hardware deactivation condition (user_td == 1 && qps == 0) can
fail to disable loopback if user_td remains 0, or cause premature
deactivation in multi-user scenarios.

Fix these by:
- Updating counters prior to checking the force_enable gate.
- Enabling hardware when either counter is non-zero and disabling it
  only when both user_td and qps reach zero.
- Rolling back the counters if the hardware enable command fails.

Signed-off-by: Prathamesh Deshpande <prathameshdeshpande7@gmail.com>
---
v8:
- Resubmitted as a fresh, independent thread per maintainer request.
- No functional changes since v7.
v7:
- Split into a separate patch for better bisection.
- Moved force_enable check after increments/decrements to fix leaks [Sashiko].
- Updated hardware disable condition to a strict zero-check.
v6:
- Always update software counters regardless of force_enable to prevent
  underflows during dynamic unbinding [Sashiko].
- Updated disable condition to user_td <= 1 to prevent HW state leaks
  on systems without transport domains [Sashiko].
- Rebased on rdma/for-next to resolve baseline application failures.
v5:
- Moved mutex_init to stage_init_init to prevent crashes on non-ETH hardware.
- Implemented 'goto unlock' for concurrency safety in enable/disable paths.
- Added atomic rollback and fixed tdn leak.
v4:
- Moved rollback logic into mlx5_ib_enable_lb() to ensure atomicity
  within the mutex and prevent race conditions [Sashiko].
v3:
- Also call mlx5_ib_disable_lb() on failure to roll back software state/counters
  [Sashiko].
v2:
- Added deallocation of tdn if mlx5_ib_enable_lb() fails [Sashiko].
- Reworded commit message to reflect the functional fix and credit the tool.
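
Illustration (not part of the patch): the counter handling described in
the commit message can be exercised in user space with a small model.
This is a minimal sketch under stated assumptions. struct lb_model, the
model_* functions and the hw_fail knob are hypothetical stand-ins for
dev->lb and mlx5_nic_vport_update_local_lb(), locking is omitted, and
the assert() in the disable path is a model-only check that callers are
balanced. model_disable_lb_old() keeps only the pre-patch disable
condition, for contrast.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for dev->lb; not the kernel struct. */
struct lb_model {
	int user_td;
	int qps;
	bool enabled;
	bool force_enable;
	bool hw_fail;	/* model-only knob: make the HW command fail */
};

/* Stand-in for mlx5_nic_vport_update_local_lb(); 0 on success, -1 on failure. */
static int model_hw_update(struct lb_model *lb, bool enable)
{
	(void)enable;
	return lb->hw_fail ? -1 : 0;
}

/* Mirrors the patched enable path (locking omitted in this model). */
static int model_enable_lb(struct lb_model *lb, bool td, bool qp)
{
	int err = 0;

	if (td)
		lb->user_td++;
	if (qp)
		lb->qps++;

	if (lb->force_enable)
		return 0;

	if (!lb->enabled && (lb->user_td >= 1 || lb->qps >= 1)) {
		err = model_hw_update(lb, true);
		if (err) {
			/* Roll back so a failed call leaves no count behind. */
			if (td)
				lb->user_td--;
			if (qp)
				lb->qps--;
		} else {
			lb->enabled = true;
		}
	}
	return err;
}

/* Mirrors the patched disable path. */
static void model_disable_lb(struct lb_model *lb, bool td, bool qp)
{
	if (td)
		lb->user_td--;
	if (qp)
		lb->qps--;
	/* Model-only sanity check: callers are assumed to be balanced. */
	assert(lb->user_td >= 0 && lb->qps >= 0);

	if (lb->force_enable)
		return;

	if (lb->enabled && lb->user_td == 0 && lb->qps == 0) {
		model_hw_update(lb, false);
		lb->enabled = false;
	}
}

/* Pre-patch disable condition, transcribed from the removed lines. */
static void model_disable_lb_old(struct lb_model *lb, bool td, bool qp)
{
	if (td)
		lb->user_td--;
	if (qp)
		lb->qps--;

	if (lb->user_td == 1 && lb->qps == 0 && lb->enabled) {
		model_hw_update(lb, false);
		lb->enabled = false;
	}
}
```

In this model, a QP-only user (td false, qp true) that enables and then
disables loopback leaves enabled set with model_disable_lb_old(), since
user_td never reaches 1, while model_disable_lb() clears it once both
counters are back at zero.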

 drivers/infiniband/hw/mlx5/main.c | 37 ++++++++++++++++++-------------
 1 file changed, 21 insertions(+), 16 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index f49f746bc5bd..fde72ebe721a 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -2009,23 +2009,29 @@ int mlx5_ib_enable_lb(struct mlx5_ib_dev *dev, bool td, bool qp)
 {
 	int err = 0;
 
-	if (dev->lb.force_enable)
-		return 0;
-
 	mutex_lock(&dev->lb.mutex);
+
 	if (td)
 		dev->lb.user_td++;
 	if (qp)
 		dev->lb.qps++;
 
-	if (dev->lb.user_td == 2 ||
-	    dev->lb.qps == 1) {
-		if (!dev->lb.enabled) {
-			err = mlx5_nic_vport_update_local_lb(dev->mdev, true);
+	if (dev->lb.force_enable)
+		goto unlock;
+
+	if (!dev->lb.enabled && (dev->lb.user_td >= 1 || dev->lb.qps >= 1)) {
+		err = mlx5_nic_vport_update_local_lb(dev->mdev, true);
+		if (err) {
+			if (td)
+				dev->lb.user_td--;
+			if (qp)
+				dev->lb.qps--;
+		} else {
 			dev->lb.enabled = true;
 		}
 	}
 
+unlock:
 	mutex_unlock(&dev->lb.mutex);
 
 	return err;
@@ -2033,23 +2039,22 @@ int mlx5_ib_enable_lb(struct mlx5_ib_dev *dev, bool td, bool qp)
 
 void mlx5_ib_disable_lb(struct mlx5_ib_dev *dev, bool td, bool qp)
 {
-	if (dev->lb.force_enable)
-		return;
-
 	mutex_lock(&dev->lb.mutex);
+
 	if (td)
 		dev->lb.user_td--;
 	if (qp)
 		dev->lb.qps--;
 
-	if (dev->lb.user_td == 1 &&
-	    dev->lb.qps == 0) {
-		if (dev->lb.enabled) {
-			mlx5_nic_vport_update_local_lb(dev->mdev, false);
-			dev->lb.enabled = false;
-		}
+	if (dev->lb.force_enable)
+		goto unlock;
+
+	if (dev->lb.enabled && (dev->lb.user_td == 0 && dev->lb.qps == 0)) {
+		mlx5_nic_vport_update_local_lb(dev->mdev, false);
+		dev->lb.enabled = false;
 	}
 
+unlock:
 	mutex_unlock(&dev->lb.mutex);
 }
 
-- 
2.43.0


Thread overview:
2026-04-05 21:57 [PATCH v8 0/2] Fix loopback leaks and return paths Prathamesh Deshpande
2026-04-05 21:57 ` [PATCH v8 1/2] IB/mlx5: Fix success return path and mutex initialization Prathamesh Deshpande
2026-04-05 21:57 ` [PATCH v8 2/2] IB/mlx5: Fix loopback refcounting leaks and premature disable Prathamesh Deshpande [this message]
