Linux RDMA and InfiniBand development
* [PATCH v12 0/2] IB/mlx5: Fix loopback rollback and threshold accounting
@ 2026-05-10 22:22 Prathamesh Deshpande
  2026-05-10 22:22 ` [PATCH v12 1/2] IB/mlx5: Fix transport-domain rollback and initialize lb mutex earlier Prathamesh Deshpande
  2026-05-10 22:22 ` [PATCH v12 2/2] IB/mlx5: Fix loopback threshold/accounting in regular path Prathamesh Deshpande
  0 siblings, 2 replies; 3+ messages in thread
From: Prathamesh Deshpande @ 2026-05-10 22:22 UTC
  To: Leon Romanovsky, Jason Gunthorpe
  Cc: Patrisious Haddad, Mark Bloch, Doug Ledford, Haggai Eran,
	Majd Dibbiny, linux-rdma, linux-kernel, Prathamesh Deshpande

This series fixes a transport-domain (TD) leak on the error path and
inconsistent loopback accounting in the regular (non-multiport, non-MP)
device paths.

Patch 1 fixes TD rollback on mlx5_ib_enable_lb() failure, makes the
success return path explicit, initializes lb.mutex earlier, and destroys
it on cleanup/failure paths.
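
For readers outside the driver, the rollback in Patch 1 can be sketched
as a small userspace model. This is only an illustration under stated
assumptions: struct fake_dev and every fake_* function are hypothetical
stand-ins, not the mlx5 API, and the failure is injected by a flag.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the device; not the real struct mlx5_ib_dev. */
struct fake_dev {
	int live_tds;        /* transport domains currently allocated */
	bool fail_enable_lb; /* inject a failure in the loopback-enable step */
};

static int fake_alloc_td(struct fake_dev *dev)
{
	dev->live_tds++;
	return 0;
}

static void fake_dealloc_td(struct fake_dev *dev)
{
	dev->live_tds--;
}

static int fake_enable_lb(struct fake_dev *dev)
{
	return dev->fail_enable_lb ? -EIO : 0;
}

/* Allocate a TD, then enable loopback; undo the allocation on failure. */
int fake_alloc_transport_domain(struct fake_dev *dev)
{
	int err;

	err = fake_alloc_td(dev);
	if (err)
		return err;

	err = fake_enable_lb(dev);
	if (err)
		fake_dealloc_td(dev); /* without this call the TD would leak */

	return err;
}
```

In this model a failed enable step leaves live_tds at 0, which is the
property the patch restores for the real transport domain.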

Patch 2 corrects the loopback threshold logic to use a capability-aware
baseline rather than a hardcoded value and ensures that
user_td/qps counters are rolled back if the hardware command fails.
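
The threshold and rollback logic of Patch 2 can likewise be sketched as
a userspace model. It is a simplification under stated assumptions:
struct lb_model, model_enable_lb() and model_disable_lb() are
hypothetical names, has_td stands in for a non-zero
log_max_transport_domain capability, and the force_enable check and
lb.mutex locking are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the loopback state; not the driver's structs. */
struct lb_model {
	unsigned int user_td; /* transport domains counted so far */
	unsigned int qps;     /* QPs that requested loopback */
	bool enabled;         /* software view of the loopback state */
	bool has_td;          /* assumed: device reports TD capability */
	bool hw_fail;         /* inject a hardware command failure */
	bool hw_lb;           /* what the modelled hardware was told */
};

/* Baseline TD count: 1 on TD-capable hardware, 0 otherwise. */
static unsigned int td_base(const struct lb_model *m)
{
	return m->has_td ? 1 : 0;
}

static int hw_update(struct lb_model *m, bool enable)
{
	if (m->hw_fail)
		return -EIO;
	m->hw_lb = enable;
	return 0;
}

int model_enable_lb(struct lb_model *m, bool td, bool qp)
{
	int err = 0;

	if (td)
		m->user_td++;
	if (qp)
		m->qps++;

	if ((m->user_td == td_base(m) + 1 || m->qps == 1) && !m->enabled) {
		err = hw_update(m, true);
		if (err) {
			/* Roll the counters back so they match the hardware. */
			if (td)
				m->user_td--;
			if (qp)
				m->qps--;
		} else {
			m->enabled = true;
		}
	}
	return err;
}

void model_disable_lb(struct lb_model *m, bool td, bool qp)
{
	if (td)
		m->user_td--;
	if (qp)
		m->qps--;

	if (m->user_td <= td_base(m) && m->qps == 0 && m->enabled) {
		hw_update(m, false);
		m->enabled = false;
	}
}
```

With has_td set, the first TD only reaches the baseline and the second
one triggers the hardware enable; with has_td clear, the baseline is 0.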

v12:
- Add mutex_destroy() for lb.mutex in mlx5_ib_stage_init_cleanup() and
  mlx5_ib_stage_init_init() failure paths.
- Keep MP helper behavior unchanged.

v11:
- Dropped the MP locking changes per review feedback
  to keep the logic unchanged.
- Narrowed the scope of Patch 2/2 to focus solely on regular-path
  threshold and accounting fixes.

v10:
- Initialize lb.mutex before multiport master init to avoid a race.
- Use <= td_base in disable paths to handle idle/no-TD cases.

v9:
- Address race/state issues around force_enable and enabled.
- Fix TD leak on failure after successful allocation.
- Implement hardware-aware thresholds via mlx5_ib_lb_td_base() to
  handle both TD-capable and no-TD hardware correctly.
- Serialize MP force-enable transitions under lb.mutex.

v8:
- Resubmitted as a fresh, independent thread per maintainer request.
- No functional changes since v7.

v7:
- Split the series into two patches to isolate the return-value/mutex
  initialization fix from the refcounting logic.
- Moved force_enable check after increments/decrements to fix leaks.
- Updated hardware disable condition to a strict zero-check.

v1-v6:
- Initial combined versions.
- Added deallocation of tdn on failure.
- Moved mutex_init to stage_init_init to prevent crashes on non-ETH.
- Implemented atomic rollback in enable/disable paths.

Prathamesh Deshpande (2):
  IB/mlx5: Fix transport-domain rollback and initialize lb mutex earlier
  IB/mlx5: Fix loopback threshold/accounting in regular path

 drivers/infiniband/hw/mlx5/main.c | 47 ++++++++++++++++++++++---------
 1 file changed, 33 insertions(+), 14 deletions(-)

-- 
2.43.0



* [PATCH v12 1/2] IB/mlx5: Fix transport-domain rollback and initialize lb mutex earlier
  2026-05-10 22:22 [PATCH v12 0/2] IB/mlx5: Fix loopback rollback and threshold accounting Prathamesh Deshpande
@ 2026-05-10 22:22 ` Prathamesh Deshpande
  2026-05-10 22:22 ` [PATCH v12 2/2] IB/mlx5: Fix loopback threshold/accounting in regular path Prathamesh Deshpande
  1 sibling, 0 replies; 3+ messages in thread
From: Prathamesh Deshpande @ 2026-05-10 22:22 UTC
  To: Leon Romanovsky, Jason Gunthorpe
  Cc: Patrisious Haddad, Mark Bloch, Doug Ledford, Haggai Eran,
	Majd Dibbiny, linux-rdma, linux-kernel, Prathamesh Deshpande

mlx5_ib_alloc_transport_domain() allocates a transport domain and then
may fail in mlx5_ib_enable_lb(). In that case, the allocated TD is leaked.

Fix this by deallocating the TD when mlx5_ib_enable_lb() returns an
error. Also return 0 explicitly in the no-loopback-capability success
branch, and move dev->lb.mutex initialization to mlx5_ib_stage_init_init().

Destroy dev->lb.mutex in the matching cleanup path and in init failure
paths after the mutex is initialized.

Fixes: 146d2f1af324 ("IB/mlx5: Allocate a Transport Domain for each ucontext")
Signed-off-by: Prathamesh Deshpande <prathameshdeshpande7@gmail.com>
---
 drivers/infiniband/hw/mlx5/main.c | 26 +++++++++++++++-----------
 1 file changed, 15 insertions(+), 11 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index e02bfb1479f5..f6d9841c2bcf 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -2068,9 +2068,13 @@ static int mlx5_ib_alloc_transport_domain(struct mlx5_ib_dev *dev, u32 *tdn,
 	if ((MLX5_CAP_GEN(dev->mdev, port_type) != MLX5_CAP_PORT_TYPE_ETH) ||
 	    (!MLX5_CAP_GEN(dev->mdev, disable_local_lb_uc) &&
 	     !MLX5_CAP_GEN(dev->mdev, disable_local_lb_mc)))
-		return err;
+		return 0;
+
+	err = mlx5_ib_enable_lb(dev, true, false);
+	if (err)
+		mlx5_cmd_dealloc_transport_domain(dev->mdev, *tdn, uid);
 
-	return mlx5_ib_enable_lb(dev, true, false);
+	return err;
 }
 
 static void mlx5_ib_dealloc_transport_domain(struct mlx5_ib_dev *dev, u32 tdn,
@@ -4464,6 +4468,7 @@ static void mlx5_ib_stage_init_cleanup(struct mlx5_ib_dev *dev)
 	mlx5_ib_cleanup_multiport_master(dev);
 	WARN_ON(!xa_empty(&dev->odp_mkeys));
 	mutex_destroy(&dev->cap_mask_mutex);
+	mutex_destroy(&dev->lb.mutex);
 	WARN_ON(!xa_empty(&dev->sig_mrs));
 	WARN_ON(!bitmap_empty(dev->dm.memic_alloc_pages, MLX5_MAX_MEMIC_PAGES));
 	mlx5r_macsec_dealloc_gids(dev);
@@ -4486,17 +4491,19 @@ static int mlx5_ib_stage_init_init(struct mlx5_ib_dev *dev)
 		dev->port[i].roce.last_port_state = IB_PORT_DOWN;
 	}
 
+	mutex_init(&dev->lb.mutex);
+
 	err = mlx5r_cmd_query_special_mkeys(dev);
 	if (err)
-		return err;
+		goto err_lb_mutex;
 
 	err = mlx5r_macsec_init_gids_and_devlist(dev);
 	if (err)
-		return err;
+		goto err_lb_mutex;
 
 	err = mlx5_ib_init_multiport_master(dev);
 	if (err)
-		goto err;
+		goto err_gids;
 
 	err = set_has_smi_cap(dev);
 	if (err)
@@ -4536,8 +4543,10 @@ static int mlx5_ib_stage_init_init(struct mlx5_ib_dev *dev)
 	mlx5_ib_data_direct_cleanup(dev);
 err_mp:
 	mlx5_ib_cleanup_multiport_master(dev);
-err:
+err_gids:
 	mlx5r_macsec_dealloc_gids(dev);
+err_lb_mutex:
+	mutex_destroy(&dev->lb.mutex);
 	return err;
 }
 
@@ -4786,11 +4795,6 @@ static int mlx5_ib_stage_caps_init(struct mlx5_ib_dev *dev)
 	if (err)
 		return err;
 
-	if ((MLX5_CAP_GEN(dev->mdev, port_type) == MLX5_CAP_PORT_TYPE_ETH) &&
-	    (MLX5_CAP_GEN(dev->mdev, disable_local_lb_uc) ||
-	     MLX5_CAP_GEN(dev->mdev, disable_local_lb_mc)))
-		mutex_init(&dev->lb.mutex);
-
 	if (MLX5_CAP_GEN_64(dev->mdev, general_obj_types) &
 			MLX5_GENERAL_OBJ_TYPES_CAP_VIRTIO_NET_Q) {
 		err = mlx5_ib_init_var_region(dev);
-- 
2.43.0



* [PATCH v12 2/2] IB/mlx5: Fix loopback threshold/accounting in regular path
  2026-05-10 22:22 [PATCH v12 0/2] IB/mlx5: Fix loopback rollback and threshold accounting Prathamesh Deshpande
  2026-05-10 22:22 ` [PATCH v12 1/2] IB/mlx5: Fix transport-domain rollback and initialize lb mutex earlier Prathamesh Deshpande
@ 2026-05-10 22:22 ` Prathamesh Deshpande
  1 sibling, 0 replies; 3+ messages in thread
From: Prathamesh Deshpande @ 2026-05-10 22:22 UTC
  To: Leon Romanovsky, Jason Gunthorpe
  Cc: Patrisious Haddad, Mark Bloch, Doug Ledford, Haggai Eran,
	Majd Dibbiny, linux-rdma, linux-kernel, Prathamesh Deshpande

In the regular (non-MP) loopback enable/disable paths, the threshold logic
uses a hardcoded user_td baseline and does not roll back the counters when
the HW enable fails.

Use a TD-capability-aware baseline for user_td transitions, and roll back
user_td/qps accounting if mlx5_nic_vport_update_local_lb() fails.

Per review, keep MP helper behavior unchanged.

Fixes: 08aae7860450 ("RDMA/mlx5: Fix vport loopback forcing for MPV device")
Signed-off-by: Prathamesh Deshpande <prathameshdeshpande7@gmail.com>
---
 drivers/infiniband/hw/mlx5/main.c | 21 ++++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index f6d9841c2bcf..eda578029d28 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -2005,8 +2005,14 @@ static void mlx5_ib_disable_lb_mp(struct mlx5_core_dev *master,
 	lb_state->force_enable = false;
 }
 
+static inline u32 mlx5_ib_lb_td_base(struct mlx5_core_dev *mdev)
+{
+	return MLX5_CAP_GEN(mdev, log_max_transport_domain) ? 1 : 0;
+}
+
 int mlx5_ib_enable_lb(struct mlx5_ib_dev *dev, bool td, bool qp)
 {
+	u32 td_base = mlx5_ib_lb_td_base(dev->mdev);
 	int err = 0;
 
 	if (dev->lb.force_enable)
@@ -2018,11 +2024,18 @@ int mlx5_ib_enable_lb(struct mlx5_ib_dev *dev, bool td, bool qp)
 	if (qp)
 		dev->lb.qps++;
 
-	if (dev->lb.user_td == 2 ||
+	if (dev->lb.user_td == td_base + 1 ||
 	    dev->lb.qps == 1) {
 		if (!dev->lb.enabled) {
 			err = mlx5_nic_vport_update_local_lb(dev->mdev, true);
-			dev->lb.enabled = true;
+			if (err) {
+				if (td)
+					dev->lb.user_td--;
+				if (qp)
+					dev->lb.qps--;
+			} else {
+				dev->lb.enabled = true;
+			}
 		}
 	}
 
@@ -2033,6 +2046,8 @@ int mlx5_ib_enable_lb(struct mlx5_ib_dev *dev, bool td, bool qp)
 
 void mlx5_ib_disable_lb(struct mlx5_ib_dev *dev, bool td, bool qp)
 {
+	u32 td_base = mlx5_ib_lb_td_base(dev->mdev);
+
 	if (dev->lb.force_enable)
 		return;
 
@@ -2042,7 +2057,7 @@ void mlx5_ib_disable_lb(struct mlx5_ib_dev *dev, bool td, bool qp)
 	if (qp)
 		dev->lb.qps--;
 
-	if (dev->lb.user_td == 1 &&
+	if (dev->lb.user_td <= td_base &&
 	    dev->lb.qps == 0) {
 		if (dev->lb.enabled) {
 			mlx5_nic_vport_update_local_lb(dev->mdev, false);
-- 
2.43.0


