* [PATCH v8 0/2] Fix loopback leaks and return paths
From: Prathamesh Deshpande @ 2026-04-05 21:57 UTC
To: linux-rdma
Cc: prathameshdeshpande7, dledford, haggaie, jgg, leon, linux-kernel
This series fixes a return-value bug in the transport domain allocation
path and refactors the loopback enablement logic to resolve reference
count leaks and premature hardware deactivation.
Since v7, the patchset has been split into two parts:
1. A direct fix for the return-value bug and mutex initialization.
2. A refactor of the loopback state machine to ensure symmetric counter
   updates and correct hardware toggling at zero-count transitions.
The split allows for cleaner bisection and separates the immediate
bug fixes from the lifecycle improvements identified during review.
Prathamesh Deshpande (2):
IB/mlx5: Fix success return path and mutex initialization
IB/mlx5: Fix loopback refcounting leaks and premature disable
drivers/infiniband/hw/mlx5/main.c | 45 ++++++++++++++++---------------
1 file changed, 23 insertions(+), 22 deletions(-)
--
2.43.0
* [PATCH v8 1/2] IB/mlx5: Fix success return path and mutex initialization
From: Prathamesh Deshpande @ 2026-04-05 21:57 UTC
To: linux-rdma
Cc: prathameshdeshpande7, dledford, haggaie, jgg, leon, linux-kernel
Fix an incorrect return path in mlx5_ib_alloc_transport_domain() where
a success case could return an uninitialized error value instead of 0.
Additionally, move dev->lb.mutex initialization to
mlx5_ib_stage_init_init(). This ensures the mutex is initialized
before potential access by create_raw_packet_qp_tir(), preventing
a null pointer dereference.
Signed-off-by: Prathamesh Deshpande <prathameshdeshpande7@gmail.com>
---
v8:
- Resubmitted as a fresh, independent thread per maintainer request.
- No functional changes since v7.
v7:
- Split from the main loopback refactor into a standalone patch to
improve bisection and isolate the return-value fix.
v1-v6:
- Part of the combined "IB/mlx5: Fix loopback enablement state and
resource leaks" patch.
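For readers following the commit message above, here is a minimal userspace sketch of the capability gate at the end of mlx5_ib_alloc_transport_domain(). It is not driver code: struct caps, lb_control_supported(), alloc_td_tail(), sample_enable_lb() and enable_calls are hypothetical names standing in for the MLX5_CAP_GEN() queries and for mlx5_ib_enable_lb(). It is only meant to show that the gate returns an explicit 0 when loopback control does not apply, and that the condition formerly guarding mutex_init() is the logical complement of that gate.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical capability flags standing in for the MLX5_CAP_GEN() queries. */
struct caps {
	bool port_is_eth;
	bool disable_local_lb_uc;
	bool disable_local_lb_mc;
};

/*
 * True when loopback control applies to the device. This matches the
 * condition that used to guard mutex_init() in the caps stage; its
 * negation is the early-return gate in the transport domain path.
 */
static bool lb_control_supported(const struct caps *c)
{
	return c->port_is_eth &&
	       (c->disable_local_lb_uc || c->disable_local_lb_mc);
}

/* Counts invocations of the sample hook below (illustration only). */
static int enable_calls;

/* Hypothetical hook standing in for mlx5_ib_enable_lb(dev, true, false). */
static int sample_enable_lb(void)
{
	enable_calls++;
	return 0;
}

/* Models only the tail of the allocation path, after the domain exists. */
static int alloc_td_tail(const struct caps *c, int (*enable_lb)(void))
{
	assert(c && enable_lb);

	if (!lb_control_supported(c))
		return 0;	/* nothing to enable: explicit success */

	return enable_lb();	/* propagate the hook's result */
}
```

In this model a non-Ethernet port, or an Ethernet port with neither disable_local_lb capability, returns 0 without calling the hook. The sketch does not model the mutex; it only illustrates why an initialization tied to lb_control_supported() would be skipped on exactly the devices that take the early return.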
drivers/infiniband/hw/mlx5/main.c | 8 ++------
1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index b74bf2697655..f49f746bc5bd 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -2068,7 +2068,7 @@ static int mlx5_ib_alloc_transport_domain(struct mlx5_ib_dev *dev, u32 *tdn,
if ((MLX5_CAP_GEN(dev->mdev, port_type) != MLX5_CAP_PORT_TYPE_ETH) ||
(!MLX5_CAP_GEN(dev->mdev, disable_local_lb_uc) &&
!MLX5_CAP_GEN(dev->mdev, disable_local_lb_mc)))
- return err;
+ return 0;
return mlx5_ib_enable_lb(dev, true, false);
}
@@ -4515,6 +4515,7 @@ static int mlx5_ib_stage_init_init(struct mlx5_ib_dev *dev)
mutex_init(&dev->data_direct_lock);
INIT_LIST_HEAD(&dev->qp_list);
spin_lock_init(&dev->reset_flow_resource_lock);
+ mutex_init(&dev->lb.mutex);
xa_init(&dev->odp_mkeys);
xa_init(&dev->sig_mrs);
atomic_set(&dev->mkey_var, 0);
@@ -4786,11 +4787,6 @@ static int mlx5_ib_stage_caps_init(struct mlx5_ib_dev *dev)
if (err)
return err;
- if ((MLX5_CAP_GEN(dev->mdev, port_type) == MLX5_CAP_PORT_TYPE_ETH) &&
- (MLX5_CAP_GEN(dev->mdev, disable_local_lb_uc) ||
- MLX5_CAP_GEN(dev->mdev, disable_local_lb_mc)))
- mutex_init(&dev->lb.mutex);
-
if (MLX5_CAP_GEN_64(dev->mdev, general_obj_types) &
MLX5_GENERAL_OBJ_TYPES_CAP_VIRTIO_NET_Q) {
err = mlx5_ib_init_var_region(dev);
--
2.43.0
* [PATCH v8 2/2] IB/mlx5: Fix loopback refcounting leaks and premature disable
From: Prathamesh Deshpande @ 2026-04-05 21:57 UTC
To: linux-rdma
Cc: prathameshdeshpande7, dledford, haggaie, jgg, leon, linux-kernel
Update mlx5_ib_enable_lb() and mlx5_ib_disable_lb() to ensure
symmetric updates of user_td and qps counters.
Software state can leak when the force_enable flag is checked before
the counters are updated, because the early return skips the update.
Furthermore, the hardware deactivation condition in the original code
(user_td == 1 && qps == 0) can fail to disable loopback if user_td
remains 0, or cause premature deactivation in multi-user scenarios.
Fix these by:
- Updating counters prior to checking the force_enable gate.
- Disabling hardware only when both user_td and qps reach zero.
- Implementing a counter rollback if the hardware command fails.
Signed-off-by: Prathamesh Deshpande <prathameshdeshpande7@gmail.com>
---
v8:
- Resubmitted as a fresh, independent thread per maintainer request.
- No functional changes since v7.
v7:
- Split into a separate patch for better bisection.
- Moved force_enable check after increments/decrements to fix leaks [Sashiko].
- Updated hardware disable condition to a strict zero-check.
v6:
- Always update software counters regardless of force_enable to prevent
underflows during dynamic unbinding [Sashiko].
- Updated disable condition to user_td <= 1 to prevent HW state leaks
on systems without transport domains [Sashiko].
- Rebased on rdma/for-next to resolve baseline application failures.
v5:
- Moved mutex_init to stage_init_init to prevent crashes on non-ETH hardware.
- Implemented 'goto unlock' for concurrency safety in enable/disable paths.
- Added atomic rollback and fixed tdn leak.
v4:
- Moved rollback logic into mlx5_ib_enable_lb() to ensure atomicity
within the mutex and prevent race conditions [Sashiko].
v3:
- Also call mlx5_ib_disable_lb() on failure to roll back software state/counters
[Sashiko].
v2:
- Added deallocation of tdn if mlx5_ib_enable_lb() fails [Sashiko].
- Reworded commit message to reflect the functional fix and credit the tool.
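The counter handling described in the commit message above can be modelled in userspace roughly as follows. This is a sketch under stated assumptions, not the driver implementation: struct lb_state, fake_hw_update(), lb_enable(), lb_disable() and the hw_fail/hw_calls test knobs are hypothetical, fake_hw_update() stands in for mlx5_nic_vport_update_local_lb(), and the dev->lb.mutex locking is omitted because the model is single-threaded.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of dev->lb; not the driver's real struct. */
struct lb_state {
	int user_td;		/* transport-domain users */
	int qps;		/* QP users */
	bool enabled;		/* last state written to the fake hardware */
	bool force_enable;	/* when set, the hardware is never toggled here */
	int hw_fail;		/* test knob: value the fake command returns */
	int hw_calls;		/* number of fake hardware commands issued */
};

/* Stand-in for mlx5_nic_vport_update_local_lb(): 0 or a negative error. */
static int fake_hw_update(struct lb_state *lb, bool enable)
{
	(void)enable;
	lb->hw_calls++;
	return lb->hw_fail;
}

static int lb_enable(struct lb_state *lb, bool td, bool qp)
{
	int err = 0;

	/* Counters are updated first, even when force_enable is set. */
	if (td)
		lb->user_td++;
	if (qp)
		lb->qps++;

	if (lb->force_enable)
		return 0;

	if (!lb->enabled && (lb->user_td >= 1 || lb->qps >= 1)) {
		err = fake_hw_update(lb, true);
		if (err) {
			/* Roll back so a failed enable leaves no stale count. */
			if (td)
				lb->user_td--;
			if (qp)
				lb->qps--;
		} else {
			lb->enabled = true;
		}
	}
	return err;
}

static void lb_disable(struct lb_state *lb, bool td, bool qp)
{
	if (td)
		lb->user_td--;
	if (qp)
		lb->qps--;
	/* A negative count would indicate an unbalanced disable call. */
	assert(lb->user_td >= 0 && lb->qps >= 0);

	if (lb->force_enable)
		return;

	/* Touch the hardware only when the last user of either kind is gone. */
	if (lb->enabled && lb->user_td == 0 && lb->qps == 0) {
		fake_hw_update(lb, false);
		lb->enabled = false;
	}
}
```

With this model, one transport domain plus one QP issue a single enable command, and the disable command is issued only after both have been released. A failing enable leaves both counters at their previous values, and with force_enable set the counters still move while no command is issued.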
drivers/infiniband/hw/mlx5/main.c | 37 ++++++++++++++++++-------------
1 file changed, 21 insertions(+), 16 deletions(-)
diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index f49f746bc5bd..fde72ebe721a 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -2009,23 +2009,29 @@ int mlx5_ib_enable_lb(struct mlx5_ib_dev *dev, bool td, bool qp)
{
int err = 0;
- if (dev->lb.force_enable)
- return 0;
-
mutex_lock(&dev->lb.mutex);
+
if (td)
dev->lb.user_td++;
if (qp)
dev->lb.qps++;
- if (dev->lb.user_td == 2 ||
- dev->lb.qps == 1) {
- if (!dev->lb.enabled) {
- err = mlx5_nic_vport_update_local_lb(dev->mdev, true);
+ if (dev->lb.force_enable)
+ goto unlock;
+
+ if (!dev->lb.enabled && (dev->lb.user_td >= 1 || dev->lb.qps >= 1)) {
+ err = mlx5_nic_vport_update_local_lb(dev->mdev, true);
+ if (err) {
+ if (td)
+ dev->lb.user_td--;
+ if (qp)
+ dev->lb.qps--;
+ } else {
dev->lb.enabled = true;
}
}
+unlock:
mutex_unlock(&dev->lb.mutex);
return err;
@@ -2033,23 +2039,22 @@ int mlx5_ib_enable_lb(struct mlx5_ib_dev *dev, bool td, bool qp)
void mlx5_ib_disable_lb(struct mlx5_ib_dev *dev, bool td, bool qp)
{
- if (dev->lb.force_enable)
- return;
-
mutex_lock(&dev->lb.mutex);
+
if (td)
dev->lb.user_td--;
if (qp)
dev->lb.qps--;
- if (dev->lb.user_td == 1 &&
- dev->lb.qps == 0) {
- if (dev->lb.enabled) {
- mlx5_nic_vport_update_local_lb(dev->mdev, false);
- dev->lb.enabled = false;
- }
+ if (dev->lb.force_enable)
+ goto unlock;
+
+ if (dev->lb.enabled && (dev->lb.user_td == 0 && dev->lb.qps == 0)) {
+ mlx5_nic_vport_update_local_lb(dev->mdev, false);
+ dev->lb.enabled = false;
}
+unlock:
mutex_unlock(&dev->lb.mutex);
}
--
2.43.0