From: Prathamesh Deshpande <prathameshdeshpande7@gmail.com>
To: prathameshdeshpande7@gmail.com
Cc: dledford@redhat.com, haggaie@mellanox.com, jgg@ziepe.ca,
	leon@kernel.org, linux-kernel@vger.kernel.org,
	linux-rdma@vger.kernel.org
Subject: [PATCH v4] IB/mlx5: Fix state corruption and resource leaks in loopback enablement
Date: Thu,  2 Apr 2026 00:52:32 +0100
Message-ID: <20260401235232.21155-1-prathameshdeshpande7@gmail.com>
In-Reply-To: <20260401223550.20040-1-prathameshdeshpande7@gmail.com>

In mlx5_ib_alloc_transport_domain(), an early success path returns
'err', which is always 0 at that point, instead of a literal 0. Return
0 explicitly so the success path is obvious to the reader.

Additionally, as identified by Sashiko, if mlx5_ib_enable_lb() fails
to update the hardware, it leaves the software state inconsistent: the
reference counters are incremented and 'enabled' is set, but loopback
remains disabled in hardware. Rolling this back from the caller left a
race window, because the mutex is released between enablement and
rollback.

Update mlx5_ib_enable_lb() to roll back the reference counters while
still holding the mutex, and to set the 'enabled' flag only if the
hardware command succeeds.
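
For illustration only, the rollback can be modelled in userspace
roughly as follows. This is a sketch, not the driver code: struct
lb_state, fake_hw_update() and lb_enable() are hypothetical names, a
pthread mutex stands in for the kernel mutex, and the user_td == 2
threshold is an assumption, since the hunk below only shows the
qps == 1 half of the condition.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace model; field names mirror dev->lb in the patch. */
struct lb_state {
	pthread_mutex_t mutex;
	int user_td;
	int qps;
	bool enabled;
};

/* Stand-in for mlx5_nic_vport_update_local_lb(); fails on demand. */
static int fake_hw_update(bool fail)
{
	return fail ? -EIO : 0;
}

static int lb_enable(struct lb_state *lb, bool td, bool qp, bool hw_fail)
{
	int err = 0;

	pthread_mutex_lock(&lb->mutex);
	if (td)
		lb->user_td++;
	if (qp)
		lb->qps++;

	/* Assumed thresholds: second user TD or first QP triggers enablement. */
	if ((lb->user_td == 2 || lb->qps == 1) && !lb->enabled) {
		err = fake_hw_update(hw_fail);
		if (err) {
			/* Undo the increments before the lock is dropped. */
			if (td)
				lb->user_td--;
			if (qp)
				lb->qps--;
		} else {
			lb->enabled = true;
		}
	}
	pthread_mutex_unlock(&lb->mutex);

	return err;
}
```

Because the decrement happens before pthread_mutex_unlock(), no other
caller can observe the incremented counters of a failed attempt.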

Also, add error handling in mlx5_ib_alloc_transport_domain() to
deallocate the transport domain (tdn) if loopback enablement fails.
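
The allocate-then-unwind shape of that error path can be sketched in
userspace as below. All names here (fake_alloc_td(), fake_dealloc_td(),
fake_enable_lb(), live_domains) are hypothetical stand-ins for the
mlx5 command helpers, and the needs_lb flag replaces the capability
checks in the real function.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical count of transport domains currently allocated. */
static int live_domains;

static int fake_alloc_td(uint32_t *tdn)
{
	*tdn = 7;		/* arbitrary illustrative handle */
	live_domains++;
	return 0;
}

static void fake_dealloc_td(uint32_t tdn)
{
	(void)tdn;
	live_domains--;
}

/* Stand-in for mlx5_ib_enable_lb(); fails on demand. */
static int fake_enable_lb(bool fail)
{
	return fail ? -EIO : 0;
}

static int alloc_transport_domain(uint32_t *tdn, bool needs_lb, bool lb_fail)
{
	int err;

	err = fake_alloc_td(tdn);
	if (err)
		return err;

	/* Early success path: nothing further to enable. */
	if (!needs_lb)
		return 0;

	err = fake_enable_lb(lb_fail);
	if (err)
		fake_dealloc_td(*tdn);	/* do not leak the domain on failure */

	return err;
}
```

In this model a failed enablement leaves live_domains unchanged, which
is the property the tdn deallocation is meant to provide.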

Signed-off-by: Prathamesh Deshpande <prathameshdeshpande7@gmail.com>
---
v4:
- Moved rollback logic into mlx5_ib_enable_lb() to ensure atomicity
  within the mutex and prevent race conditions [Sashiko].
v3:
- Also call mlx5_ib_disable_lb() on failure to roll back software state/counters
  [Sashiko].
v2:
- Added deallocation of tdn if mlx5_ib_enable_lb() fails [Sashiko].
- Reworded commit message to reflect the functional fix and credit the tool.

 drivers/infiniband/hw/mlx5/main.c | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index 635002e684a5..877b02e98033 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -2022,7 +2022,14 @@ int mlx5_ib_enable_lb(struct mlx5_ib_dev *dev, bool td, bool qp)
 	    dev->lb.qps == 1) {
 		if (!dev->lb.enabled) {
 			err = mlx5_nic_vport_update_local_lb(dev->mdev, true);
-			dev->lb.enabled = true;
+			if (err) {
+				if (td)
+					dev->lb.user_td--;
+				if (qp)
+					dev->lb.qps--;
+			} else {
+				dev->lb.enabled = true;
+			}
 		}
 	}
 
@@ -2068,9 +2075,13 @@ static int mlx5_ib_alloc_transport_domain(struct mlx5_ib_dev *dev, u32 *tdn,
 	if ((MLX5_CAP_GEN(dev->mdev, port_type) != MLX5_CAP_PORT_TYPE_ETH) ||
 	    (!MLX5_CAP_GEN(dev->mdev, disable_local_lb_uc) &&
 	     !MLX5_CAP_GEN(dev->mdev, disable_local_lb_mc)))
-		return err;
+		return 0;
 
-	return mlx5_ib_enable_lb(dev, true, false);
+	err = mlx5_ib_enable_lb(dev, true, false);
+	if (err)
+		mlx5_cmd_dealloc_transport_domain(dev->mdev, *tdn, uid);
+
+	return err;
 }
 
 static void mlx5_ib_dealloc_transport_domain(struct mlx5_ib_dev *dev, u32 tdn,
-- 
2.43.0


Thread overview: 12+ messages
2026-03-31  1:28 [PATCH] IB/mlx5: Clarify success return path in mlx5_ib_alloc_transport_domain Prathamesh Deshpande
2026-03-31 13:48 ` Leon Romanovsky
2026-03-31 23:04   ` [PATCH v2] IB/mlx5: Fix tdn leak " Prathamesh Deshpande
2026-04-01 22:35     ` [PATCH v3] IB/mlx5: Fix tdn leak and state corruption " Prathamesh Deshpande
2026-04-01 23:52       ` Prathamesh Deshpande [this message]
2026-04-04 21:51         ` [PATCH v5] IB/mlx5: Fix loopback enablement state and resource leaks Prathamesh Deshpande
2026-04-04 23:07           ` [PATCH v6] " Prathamesh Deshpande
2026-04-05 13:09             ` [PATCH v7 0/2] Fix loopback leaks and return paths Prathamesh Deshpande
2026-04-05 13:09               ` [PATCH v7 1/2] IB/mlx5: Fix success return path and mutex initialization Prathamesh Deshpande
2026-04-05 13:09               ` [PATCH v7 2/2] IB/mlx5: Fix loopback refcounting leaks and premature disable Prathamesh Deshpande
2026-04-05 19:22               ` [PATCH v7 0/2] Fix loopback leaks and return paths Leon Romanovsky
2026-04-05 22:09                 ` Prathamesh Deshpande
