Linux RDMA and InfiniBand development
* [PATCH v9 0/2] IB/mlx5: Fix loopback rollback and locking
@ 2026-04-10  0:52 Prathamesh Deshpande
  2026-04-10  0:52 ` [PATCH v9 1/2] IB/mlx5: Fix transport-domain rollback and initialize lb mutex earlier Prathamesh Deshpande
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Prathamesh Deshpande @ 2026-04-10  0:52 UTC
  To: Leon Romanovsky, Jason Gunthorpe
  Cc: linux-rdma, linux-kernel, dledford, haggaie, Prathamesh Deshpande

This series fixes transport-domain rollback and loopback state
consistency in mlx5 IB.

Patch 1 fixes TD rollback on mlx5_ib_enable_lb() failure, makes the
success return path explicit, and initializes lb.mutex earlier.

Patch 2 serializes MP force-enable state updates with lb.mutex and
implements capability-aware thresholds (td_base) to ensure correct
loopback behavior on both TD-capable and no-TD hardware.

v9:
- Address race/state issues around force_enable and enabled.
- Fix TD leak on failure after successful allocation.
- Implement hardware-aware thresholds via mlx5_ib_lb_td_base() to
  handle both TD-capable and no-TD hardware correctly.
- Serialize MP force-enable transitions under lb.mutex.

v8:
- Resubmitted as a fresh, independent thread per maintainer request.
- No functional changes since v7.

v7:
- Split the series into two patches to isolate the return-value/mutex 
  initialization fix from the refcounting logic.
- Moved force_enable check after increments/decrements to fix leaks.
- Updated hardware disable condition to a strict zero-check.

v1-v6:
- Initial combined versions.
- Added deallocation of tdn on failure.
- Moved mutex_init to stage_init_init to prevent crashes on non-ETH.
- Implemented atomic rollback in enable/disable paths.

Prathamesh Deshpande (2):
  IB/mlx5: Fix transport-domain rollback and initialize lb mutex earlier
  IB/mlx5: Serialize force-enable state and preserve loopback accounting

 drivers/infiniband/hw/mlx5/main.c | 81 +++++++++++++++++++++++--------
 1 file changed, 62 insertions(+), 19 deletions(-)

-- 
2.43.0



* [PATCH v9 1/2] IB/mlx5: Fix transport-domain rollback and initialize lb mutex earlier
  2026-04-10  0:52 [PATCH v9 0/2] IB/mlx5: Fix loopback rollback and locking Prathamesh Deshpande
@ 2026-04-10  0:52 ` Prathamesh Deshpande
  2026-05-10 10:56   ` Leon Romanovsky
  2026-04-10  0:52 ` [PATCH v9 2/2] IB/mlx5: Serialize force-enable state and preserve loopback accounting Prathamesh Deshpande
  2026-05-10 10:55 ` [PATCH v9 0/2] IB/mlx5: Fix loopback rollback and locking Leon Romanovsky
  2 siblings, 1 reply; 6+ messages in thread
From: Prathamesh Deshpande @ 2026-04-10  0:52 UTC
  To: Leon Romanovsky, Jason Gunthorpe
  Cc: linux-rdma, linux-kernel, dledford, haggaie, Prathamesh Deshpande

mlx5_ib_alloc_transport_domain() allocates a transport domain and then
may fail in mlx5_ib_enable_lb(). In that case, the allocated TD is leaked.

Fix this by deallocating the TD when mlx5_ib_enable_lb() returns an
error. Also return 0 explicitly in the no-loopback-capability success
branch, and move dev->lb.mutex initialization to mlx5_ib_stage_init_init().

Fixes: 146d2f1af324 ("IB/mlx5: Allocate a Transport Domain for each ucontext")
Signed-off-by: Prathamesh Deshpande <prathameshdeshpande7@gmail.com>
---
 drivers/infiniband/hw/mlx5/main.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index e02bfb1479f5..6be198c0651c 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -2068,9 +2068,13 @@ static int mlx5_ib_alloc_transport_domain(struct mlx5_ib_dev *dev, u32 *tdn,
 	if ((MLX5_CAP_GEN(dev->mdev, port_type) != MLX5_CAP_PORT_TYPE_ETH) ||
 	    (!MLX5_CAP_GEN(dev->mdev, disable_local_lb_uc) &&
 	     !MLX5_CAP_GEN(dev->mdev, disable_local_lb_mc)))
-		return err;
+		return 0;
+
+	err = mlx5_ib_enable_lb(dev, true, false);
+	if (err)
+		mlx5_cmd_dealloc_transport_domain(dev->mdev, *tdn, uid);
 
-	return mlx5_ib_enable_lb(dev, true, false);
+	return err;
 }
 
 static void mlx5_ib_dealloc_transport_domain(struct mlx5_ib_dev *dev, u32 tdn,
@@ -4513,6 +4517,7 @@ static int mlx5_ib_stage_init_init(struct mlx5_ib_dev *dev)
 
 	mutex_init(&dev->cap_mask_mutex);
 	mutex_init(&dev->data_direct_lock);
+	mutex_init(&dev->lb.mutex);
 	INIT_LIST_HEAD(&dev->qp_list);
 	spin_lock_init(&dev->reset_flow_resource_lock);
 	xa_init(&dev->odp_mkeys);
@@ -4786,11 +4791,6 @@ static int mlx5_ib_stage_caps_init(struct mlx5_ib_dev *dev)
 	if (err)
 		return err;
 
-	if ((MLX5_CAP_GEN(dev->mdev, port_type) == MLX5_CAP_PORT_TYPE_ETH) &&
-	    (MLX5_CAP_GEN(dev->mdev, disable_local_lb_uc) ||
-	     MLX5_CAP_GEN(dev->mdev, disable_local_lb_mc)))
-		mutex_init(&dev->lb.mutex);
-
 	if (MLX5_CAP_GEN_64(dev->mdev, general_obj_types) &
 			MLX5_GENERAL_OBJ_TYPES_CAP_VIRTIO_NET_Q) {
 		err = mlx5_ib_init_var_region(dev);
-- 
2.43.0


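The rollback added in patch 1 follows the usual acquire-then-undo shape: if a later step fails, release what the earlier step acquired before returning the error. A minimal userspace sketch of that shape follows. The helpers fake_alloc_td(), fake_dealloc_td() and fake_enable_lb(), the live_tds counter, and the enable_lb_fails knob are hypothetical stand-ins for illustration, not the mlx5 API.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the driver calls; not the mlx5 API. */
static int live_tds;          /* number of TDs currently allocated */
static bool enable_lb_fails;  /* knob: make the enable step fail */

static int fake_alloc_td(unsigned int *tdn)
{
	*tdn = 42;  /* arbitrary handle value */
	live_tds++;
	return 0;
}

static void fake_dealloc_td(unsigned int tdn)
{
	(void)tdn;
	assert(live_tds > 0);
	live_tds--;
}

static int fake_enable_lb(void)
{
	return enable_lb_fails ? -EIO : 0;
}

/* Allocate a TD, then enable loopback; undo the allocation on failure. */
static int alloc_td_with_lb(unsigned int *tdn)
{
	int err;

	err = fake_alloc_td(tdn);
	if (err)
		return err;

	err = fake_enable_lb();
	if (err)
		fake_dealloc_td(*tdn);  /* without this, the TD would leak */

	return err;
}
```

With enable_lb_fails set, alloc_td_with_lb() returns the error and live_tds stays at zero, which is the property the patch is after.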

* [PATCH v9 2/2] IB/mlx5: Serialize force-enable state and preserve loopback accounting
  2026-04-10  0:52 [PATCH v9 0/2] IB/mlx5: Fix loopback rollback and locking Prathamesh Deshpande
  2026-04-10  0:52 ` [PATCH v9 1/2] IB/mlx5: Fix transport-domain rollback and initialize lb mutex earlier Prathamesh Deshpande
@ 2026-04-10  0:52 ` Prathamesh Deshpande
  2026-05-10 10:55 ` [PATCH v9 0/2] IB/mlx5: Fix loopback rollback and locking Leon Romanovsky
  2 siblings, 0 replies; 6+ messages in thread
From: Prathamesh Deshpande @ 2026-04-10  0:52 UTC
  To: Leon Romanovsky, Jason Gunthorpe
  Cc: linux-rdma, linux-kernel, dledford, haggaie, Prathamesh Deshpande

force_enable is shared between MP bind/unbind flows and regular loopback
enable/disable flows. MP helpers updated force_enable without lb.mutex,
while regular paths read it under lb.mutex, allowing races and state
mismatches.

Serialize MP force-enable transitions under lb.mutex. In regular loopback
paths, update the counters before checking force_enable and roll them back
if the HW enable fails. Also keep a pre-existing master loopback enabled
when MP enable fails on the slave side.

Use a TD-capability-aware baseline for user_td transitions so threshold
checks are correct on both TD-capable and no-TD hardware.

Fixes: 08aae7860450 ("RDMA/mlx5: Fix vport loopback forcing for MPV device")
Signed-off-by: Prathamesh Deshpande <prathameshdeshpande7@gmail.com>
---
 drivers/infiniband/hw/mlx5/main.c | 67 +++++++++++++++++++++++++------
 1 file changed, 55 insertions(+), 12 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index 6be198c0651c..5038053cc9cc 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -1973,25 +1973,45 @@ static void deallocate_uars(struct mlx5_ib_dev *dev,
 					     context->devx_uid);
 }
 
+static inline u32 mlx5_ib_lb_td_base(struct mlx5_core_dev *mdev)
+{
+	return MLX5_CAP_GEN(mdev, log_max_transport_domain) ? 1 : 0;
+}
+
 static int mlx5_ib_enable_lb_mp(struct mlx5_core_dev *master,
 				struct mlx5_core_dev *slave,
 				struct mlx5_ib_lb_state *lb_state)
 {
+	bool user_enabled;
 	int err;
 
+	lockdep_assert_held(&mlx5_ib_multiport_mutex);
+
+	mutex_lock(&lb_state->mutex);
+	if (lb_state->force_enable) {
+		mutex_unlock(&lb_state->mutex);
+		return 0;
+	}
+	user_enabled = lb_state->enabled;
+
 	err = mlx5_nic_vport_update_local_lb(master, true);
 	if (err)
-		return err;
+		goto unlock;
 
 	err = mlx5_nic_vport_update_local_lb(slave, true);
 	if (err)
 		goto out;
 
 	lb_state->force_enable = true;
+	lb_state->enabled = true;
+	mutex_unlock(&lb_state->mutex);
 	return 0;
 
 out:
-	mlx5_nic_vport_update_local_lb(master, false);
+	if (!user_enabled)
+		mlx5_nic_vport_update_local_lb(master, false);
+unlock:
+	mutex_unlock(&lb_state->mutex);
 	return err;
 }
 
@@ -1999,33 +2019,53 @@ static void mlx5_ib_disable_lb_mp(struct mlx5_core_dev *master,
 				  struct mlx5_core_dev *slave,
 				  struct mlx5_ib_lb_state *lb_state)
 {
-	mlx5_nic_vport_update_local_lb(slave, false);
-	mlx5_nic_vport_update_local_lb(master, false);
+	u32 td_base = mlx5_ib_lb_td_base(master);
+
+	lockdep_assert_held(&mlx5_ib_multiport_mutex);
+
+	mutex_lock(&lb_state->mutex);
 
+	mlx5_nic_vport_update_local_lb(slave, false);
 	lb_state->force_enable = false;
+	if (lb_state->enabled &&
+	    lb_state->user_td == td_base && lb_state->qps == 0) {
+		mlx5_nic_vport_update_local_lb(master, false);
+		lb_state->enabled = false;
+	}
+
+	mutex_unlock(&lb_state->mutex);
 }
 
 int mlx5_ib_enable_lb(struct mlx5_ib_dev *dev, bool td, bool qp)
 {
+	u32 td_base = mlx5_ib_lb_td_base(dev->mdev);
 	int err = 0;
 
-	if (dev->lb.force_enable)
-		return 0;
-
 	mutex_lock(&dev->lb.mutex);
 	if (td)
 		dev->lb.user_td++;
 	if (qp)
 		dev->lb.qps++;
 
-	if (dev->lb.user_td == 2 ||
+	if (dev->lb.force_enable)
+		goto unlock;
+
+	if (dev->lb.user_td == td_base + 1 ||
 	    dev->lb.qps == 1) {
 		if (!dev->lb.enabled) {
 			err = mlx5_nic_vport_update_local_lb(dev->mdev, true);
-			dev->lb.enabled = true;
+			if (err) {
+				if (td)
+					dev->lb.user_td--;
+				if (qp)
+					dev->lb.qps--;
+			} else {
+				dev->lb.enabled = true;
+			}
 		}
 	}
 
+unlock:
 	mutex_unlock(&dev->lb.mutex);
 
 	return err;
@@ -2033,8 +2073,7 @@ int mlx5_ib_enable_lb(struct mlx5_ib_dev *dev, bool td, bool qp)
 
 void mlx5_ib_disable_lb(struct mlx5_ib_dev *dev, bool td, bool qp)
 {
-	if (dev->lb.force_enable)
-		return;
+	u32 td_base = mlx5_ib_lb_td_base(dev->mdev);
 
 	mutex_lock(&dev->lb.mutex);
 	if (td)
@@ -2042,7 +2081,10 @@ void mlx5_ib_disable_lb(struct mlx5_ib_dev *dev, bool td, bool qp)
 	if (qp)
 		dev->lb.qps--;
 
-	if (dev->lb.user_td == 1 &&
+	if (dev->lb.force_enable)
+		goto unlock;
+
+	if (dev->lb.user_td == td_base &&
 	    dev->lb.qps == 0) {
 		if (dev->lb.enabled) {
 			mlx5_nic_vport_update_local_lb(dev->mdev, false);
@@ -2050,6 +2092,7 @@ void mlx5_ib_disable_lb(struct mlx5_ib_dev *dev, bool td, bool qp)
 		}
 	}
 
+unlock:
 	mutex_unlock(&dev->lb.mutex);
 }
 
-- 
2.43.0


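The threshold and rollback accounting in the regular enable/disable paths of patch 2 can be modelled in userspace. The sketch below is a simplification under stated assumptions: locking is omitted, struct lb_model and the model_* functions are hypothetical names, td_base is stored as a field rather than derived from a capability bit, and hw_fails is a knob standing in for a failing HW command.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Simplified model of the loopback state; field names mirror the patch. */
struct lb_model {
	unsigned int user_td;
	unsigned int qps;
	bool enabled;
	bool force_enable;
	unsigned int td_base;  /* 1 on TD-capable HW, 0 otherwise (per the patch) */
	bool hw_fails;         /* knob standing in for a HW command failure */
};

static int model_hw_set_lb(struct lb_model *lb, bool on)
{
	(void)on;
	return lb->hw_fails ? -EIO : 0;
}

static int model_enable_lb(struct lb_model *lb, bool td, bool qp)
{
	int err = 0;

	/* Count first, so a later disable stays balanced. */
	if (td)
		lb->user_td++;
	if (qp)
		lb->qps++;

	if (lb->force_enable)
		return 0;

	if ((lb->user_td == lb->td_base + 1 || lb->qps == 1) && !lb->enabled) {
		err = model_hw_set_lb(lb, true);
		if (err) {
			/* Roll the counters back on HW failure. */
			if (td)
				lb->user_td--;
			if (qp)
				lb->qps--;
		} else {
			lb->enabled = true;
		}
	}
	return err;
}

static void model_disable_lb(struct lb_model *lb, bool td, bool qp)
{
	if (td) {
		assert(lb->user_td > 0);
		lb->user_td--;
	}
	if (qp) {
		assert(lb->qps > 0);
		lb->qps--;
	}

	if (lb->force_enable)
		return;

	if (lb->user_td == lb->td_base && lb->qps == 0 && lb->enabled) {
		(void)model_hw_set_lb(lb, false);
		lb->enabled = false;
	}
}
```

In this model, with td_base = 1 the first user TD leaves loopback off and the second turns it on; with td_base = 0 the first user TD already turns it on.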

* Re: [PATCH v9 0/2] IB/mlx5: Fix loopback rollback and locking
  2026-04-10  0:52 [PATCH v9 0/2] IB/mlx5: Fix loopback rollback and locking Prathamesh Deshpande
  2026-04-10  0:52 ` [PATCH v9 1/2] IB/mlx5: Fix transport-domain rollback and initialize lb mutex earlier Prathamesh Deshpande
  2026-04-10  0:52 ` [PATCH v9 2/2] IB/mlx5: Serialize force-enable state and preserve loopback accounting Prathamesh Deshpande
@ 2026-05-10 10:55 ` Leon Romanovsky
  2026-05-10 22:35   ` Prathamesh Deshpande
  2 siblings, 1 reply; 6+ messages in thread
From: Leon Romanovsky @ 2026-05-10 10:55 UTC
  To: Prathamesh Deshpande
  Cc: Jason Gunthorpe, linux-rdma, linux-kernel, dledford, haggaie

On Fri, Apr 10, 2026 at 01:52:16AM +0100, Prathamesh Deshpande wrote:
> This series fixes transport-domain rollback and loopback state
> consistency in mlx5 IB.
> 
> Patch 1 fixes TD rollback on mlx5_ib_enable_lb() failure, makes the
> success return path explicit, and initializes lb.mutex earlier.
> 
> Patch 2 serializes MP force-enable state updates with lb.mutex and
> implements capability-aware thresholds (td_base) to ensure correct
> loopback behavior on both TD-capable and no-TD hardware.
> 
> v9:
> - Address race/state issues around force_enable and enabled.
> - Fix TD leak on failure after successful allocation.
> - Implement hardware-aware thresholds via mlx5_ib_lb_td_base() to
>   handle both TD-capable and no-TD hardware correctly.
> - Serialize MP force-enable transitions under lb.mutex.
> 
> v8:
> - Resubmitted as a fresh, independent thread per maintainer request.
> - No functional changes since v7.
> 
> v7:
> - Split the series into two patches to isolate the return-value/mutex 
>   initialization fix from the refcounting logic.
> - Moved force_enable check after increments/decrements to fix leaks.
> - Updated hardware disable condition to a strict zero-check.
> 
> v1-v6:
> - Initial combined versions.
> - Added deallocation of tdn on failure.
> - Moved mutex_init to stage_init_init to prevent crashes on non-ETH.
> - Implemented atomic rollback in enable/disable paths.
> 
> Prathamesh Deshpande (2):
>   IB/mlx5: Fix transport-domain rollback and initialize lb mutex earlier

I agree that this patch is needed.

>   IB/mlx5: Serialize force-enable state and preserve loopback accounting

This change does not appear to be justified. The commit message provides no
clear explanation of why it is needed.

Thanks

> 
>  drivers/infiniband/hw/mlx5/main.c | 81 +++++++++++++++++++++++--------
>  1 file changed, 62 insertions(+), 19 deletions(-)
> 
> -- 
> 2.43.0
> 


* Re: [PATCH v9 1/2] IB/mlx5: Fix transport-domain rollback and initialize lb mutex earlier
  2026-04-10  0:52 ` [PATCH v9 1/2] IB/mlx5: Fix transport-domain rollback and initialize lb mutex earlier Prathamesh Deshpande
@ 2026-05-10 10:56   ` Leon Romanovsky
  0 siblings, 0 replies; 6+ messages in thread
From: Leon Romanovsky @ 2026-05-10 10:56 UTC
  To: Prathamesh Deshpande
  Cc: Jason Gunthorpe, linux-rdma, linux-kernel, dledford, haggaie

On Fri, Apr 10, 2026 at 01:52:17AM +0100, Prathamesh Deshpande wrote:
> mlx5_ib_alloc_transport_domain() allocates a transport domain and then
> may fail in mlx5_ib_enable_lb(). In that case, the allocated TD is leaked.
> 
> Fix this by deallocating the TD when mlx5_ib_enable_lb() returns an
> error. Also return 0 explicitly in the no-loopback-capability success
> branch, and move dev->lb.mutex initialization to mlx5_ib_stage_init_init().
> 
> Fixes: 146d2f1af324 ("IB/mlx5: Allocate a Transport Domain for each ucontext")
> Signed-off-by: Prathamesh Deshpande <prathameshdeshpande7@gmail.com>
> ---
>  drivers/infiniband/hw/mlx5/main.c | 14 +++++++-------
>  1 file changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
> index e02bfb1479f5..6be198c0651c 100644
> --- a/drivers/infiniband/hw/mlx5/main.c
> +++ b/drivers/infiniband/hw/mlx5/main.c
> @@ -2068,9 +2068,13 @@ static int mlx5_ib_alloc_transport_domain(struct mlx5_ib_dev *dev, u32 *tdn,
>  	if ((MLX5_CAP_GEN(dev->mdev, port_type) != MLX5_CAP_PORT_TYPE_ETH) ||
>  	    (!MLX5_CAP_GEN(dev->mdev, disable_local_lb_uc) &&
>  	     !MLX5_CAP_GEN(dev->mdev, disable_local_lb_mc)))
> -		return err;
> +		return 0;
> +
> +	err = mlx5_ib_enable_lb(dev, true, false);
> +	if (err)
> +		mlx5_cmd_dealloc_transport_domain(dev->mdev, *tdn, uid);
>  
> -	return mlx5_ib_enable_lb(dev, true, false);
> +	return err;
>  }
>  
>  static void mlx5_ib_dealloc_transport_domain(struct mlx5_ib_dev *dev, u32 tdn,
> @@ -4513,6 +4517,7 @@ static int mlx5_ib_stage_init_init(struct mlx5_ib_dev *dev)
>  
>  	mutex_init(&dev->cap_mask_mutex);
>  	mutex_init(&dev->data_direct_lock);
> +	mutex_init(&dev->lb.mutex);

There is also a need to call mutex_destroy() to ensure proper resource cleanup.

Thanks

>  	INIT_LIST_HEAD(&dev->qp_list);
>  	spin_lock_init(&dev->reset_flow_resource_lock);
>  	xa_init(&dev->odp_mkeys);
> @@ -4786,11 +4791,6 @@ static int mlx5_ib_stage_caps_init(struct mlx5_ib_dev *dev)
>  	if (err)
>  		return err;
>  
> -	if ((MLX5_CAP_GEN(dev->mdev, port_type) == MLX5_CAP_PORT_TYPE_ETH) &&
> -	    (MLX5_CAP_GEN(dev->mdev, disable_local_lb_uc) ||
> -	     MLX5_CAP_GEN(dev->mdev, disable_local_lb_mc)))
> -		mutex_init(&dev->lb.mutex);
> -
>  	if (MLX5_CAP_GEN_64(dev->mdev, general_obj_types) &
>  			MLX5_GENERAL_OBJ_TYPES_CAP_VIRTIO_NET_Q) {
>  		err = mlx5_ib_init_var_region(dev);
> -- 
> 2.43.0
> 

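The review comment above asks for every mutex initialization to be paired with a matching destroy in the cleanup path. A userspace analogue of that pairing using pthreads is sketched below. It is not the kernel mutex API, and struct dev_model, dev_model_init() and dev_model_cleanup() are hypothetical names.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical device model holding one lock. */
struct dev_model {
	pthread_mutex_t lb_mutex;
	int lb_mutex_ready;  /* tracks whether the mutex was initialized */
};

/* Init stage: set up the lock unconditionally. */
static int dev_model_init(struct dev_model *d)
{
	int err = pthread_mutex_init(&d->lb_mutex, NULL);

	if (err)
		return err;
	d->lb_mutex_ready = 1;
	return 0;
}

/* Cleanup stage: tear down exactly what the init stage set up. */
static void dev_model_cleanup(struct dev_model *d)
{
	if (d->lb_mutex_ready) {
		pthread_mutex_destroy(&d->lb_mutex);
		d->lb_mutex_ready = 0;
	}
}
```

Keeping init and destroy in matching stages means the cleanup path does not need to know which capability checks ran in between.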

* Re: [PATCH v9 0/2] IB/mlx5: Fix loopback rollback and locking
  2026-05-10 10:55 ` [PATCH v9 0/2] IB/mlx5: Fix loopback rollback and locking Leon Romanovsky
@ 2026-05-10 22:35   ` Prathamesh Deshpande
  0 siblings, 0 replies; 6+ messages in thread
From: Prathamesh Deshpande @ 2026-05-10 22:35 UTC
  To: leon; +Cc: dledford, haggaie, jgg, linux-kernel, linux-rdma,
	prathameshdeshpande7

On Sun, May 10, 2026 at 13:55:31 +0300, Leon Romanovsky wrote:
> On Fri, Apr 10, 2026 at 01:52:16AM +0100, Prathamesh Deshpande wrote:
> > Prathamesh Deshpande (2):
> >   IB/mlx5: Fix transport-domain rollback and initialize lb mutex earlier
> 
> I agree that this patch is needed.
> 
> >   IB/mlx5: Serialize force-enable state and preserve loopback accounting
> 
> This change does not appear to be justified. The commit message provides no
> clear explanation of why it is needed.
> 
> Thanks
 
Thanks, Leon.
 
v11 dropped the MP force-enable locking changes and kept MP helper behavior
unchanged. Patch 2 is now limited to the regular-path threshold/accounting
fixes.

I have also just sent a fresh v12 series that addresses your Patch 1 
review regarding the missing mutex cleanups. You can find the updated 
series here: https://lore.kernel.org/all/20260510222258.6654-1-prathameshdeshpande7@gmail.com/ 

