* Re: [PATCH v9 0/2] IB/mlx5: Fix loopback rollback and locking
       [not found] <20260410005219.5197-1-prathameshdeshpande7@gmail.com>
@ 2026-05-10 10:55 ` Leon Romanovsky
  2026-05-10 22:35   ` Prathamesh Deshpande
       [not found] ` <20260410005219.5197-2-prathameshdeshpande7@gmail.com>
  1 sibling, 1 reply; 3+ messages in thread
From: Leon Romanovsky @ 2026-05-10 10:55 UTC
  To: Prathamesh Deshpande
  Cc: Jason Gunthorpe, linux-rdma, linux-kernel, dledford, haggaie

On Fri, Apr 10, 2026 at 01:52:16AM +0100, Prathamesh Deshpande wrote:
> This series fixes transport-domain rollback and loopback state
> consistency in mlx5 IB.
>
> Patch 1 fixes TD rollback on mlx5_ib_enable_lb() failure, makes the
> success return path explicit, and initializes lb.mutex earlier.
>
> Patch 2 serializes MP force-enable state updates with lb.mutex and
> implements capability-aware thresholds (td_base) to ensure correct
> loopback behavior on both TD-capable and no-TD hardware.
>
> v9:
> - Address race/state issues around force_enable and enabled.
> - Fix TD leak on failure after successful allocation.
> - Implement hardware-aware thresholds via mlx5_ib_lb_td_base() to
>   handle both TD-capable and no-TD hardware correctly.
> - Serialize MP force-enable transitions under lb.mutex.
>
> v8:
> - Resubmitted as a fresh, independent thread per maintainer request.
> - No functional changes since v7.
>
> v7:
> - Split the series into two patches to isolate the return-value/mutex
>   initialization fix from the refcounting logic.
> - Moved force_enable check after increments/decrements to fix leaks.
> - Updated hardware disable condition to a strict zero-check.
>
> v1-v6:
> - Initial combined versions.
> - Added deallocation of tdn on failure.
> - Moved mutex_init to stage_init_init to prevent crashes on non-ETH.
> - Implemented atomic rollback in enable/disable paths.
>
> Prathamesh Deshpande (2):
>   IB/mlx5: Fix transport-domain rollback and initialize lb mutex earlier

I agree that this patch is needed.

>   IB/mlx5: Serialize force-enable state and preserve loopback accounting

This change does not appear to be justified. The commit message provides no
clear explanation of why it is needed.

Thanks

>
>  drivers/infiniband/hw/mlx5/main.c | 81 +++++++++++++++++++++++--------
>  1 file changed, 62 insertions(+), 19 deletions(-)
>
> --
> 2.43.0
>

* Re: [PATCH v9 0/2] IB/mlx5: Fix loopback rollback and locking
  2026-05-10 10:55 ` [PATCH v9 0/2] IB/mlx5: Fix loopback rollback and locking Leon Romanovsky
@ 2026-05-10 22:35   ` Prathamesh Deshpande
  0 siblings, 0 replies; 3+ messages in thread
From: Prathamesh Deshpande @ 2026-05-10 22:35 UTC
  To: leon; +Cc: dledford, haggaie, jgg, linux-kernel, linux-rdma, prathameshdeshpande7

On Sun, May 10, 2026 at 13:55:31 +0300, Leon Romanovsky wrote:
> On Fri, Apr 10, 2026 at 01:52:16AM +0100, Prathamesh Deshpande wrote:
> > Prathamesh Deshpande (2):
> >   IB/mlx5: Fix transport-domain rollback and initialize lb mutex earlier
>
> I agree that this patch is needed.
>
> >   IB/mlx5: Serialize force-enable state and preserve loopback accounting
>
> This change does not appear to be justified. The commit message provides no
> clear explanation of why it is needed.
>
> Thanks

Thanks, Leon.

v11 dropped the MP force-enable locking changes and kept MP helper
behavior unchanged. Patch 2 is now limited to the regular-path
threshold/accounting fixes.

I have also just sent a fresh v12 series that addresses your Patch 1
review regarding the missing mutex cleanups.

You can find the updated series here:
https://lore.kernel.org/all/20260510222258.6654-1-prathameshdeshpande7@gmail.com/

[parent not found: <20260410005219.5197-2-prathameshdeshpande7@gmail.com>]
* Re: [PATCH v9 1/2] IB/mlx5: Fix transport-domain rollback and initialize lb mutex earlier
       [not found] ` <20260410005219.5197-2-prathameshdeshpande7@gmail.com>
@ 2026-05-10 10:56   ` Leon Romanovsky
  0 siblings, 0 replies; 3+ messages in thread
From: Leon Romanovsky @ 2026-05-10 10:56 UTC
  To: Prathamesh Deshpande
  Cc: Jason Gunthorpe, linux-rdma, linux-kernel, dledford, haggaie

On Fri, Apr 10, 2026 at 01:52:17AM +0100, Prathamesh Deshpande wrote:
> mlx5_ib_alloc_transport_domain() allocates a transport domain and then
> may fail in mlx5_ib_enable_lb(). In that case, the allocated TD is leaked.
>
> Fix this by deallocating the TD when mlx5_ib_enable_lb() returns an
> error. Also return 0 explicitly in the no-loopback-capability success
> branch, and move dev->lb.mutex initialization to mlx5_ib_stage_init_init().
>
> Fixes: 146d2f1af324 ("IB/mlx5: Allocate a Transport Domain for each ucontext")
> Signed-off-by: Prathamesh Deshpande <prathameshdeshpande7@gmail.com>
> ---
>  drivers/infiniband/hw/mlx5/main.c | 14 +++++++-------
>  1 file changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
> index e02bfb1479f5..6be198c0651c 100644
> --- a/drivers/infiniband/hw/mlx5/main.c
> +++ b/drivers/infiniband/hw/mlx5/main.c
> @@ -2068,9 +2068,13 @@ static int mlx5_ib_alloc_transport_domain(struct mlx5_ib_dev *dev, u32 *tdn,
>  	if ((MLX5_CAP_GEN(dev->mdev, port_type) != MLX5_CAP_PORT_TYPE_ETH) ||
>  	    (!MLX5_CAP_GEN(dev->mdev, disable_local_lb_uc) &&
>  	     !MLX5_CAP_GEN(dev->mdev, disable_local_lb_mc)))
> -		return err;
> +		return 0;
> +
> +	err = mlx5_ib_enable_lb(dev, true, false);
> +	if (err)
> +		mlx5_cmd_dealloc_transport_domain(dev->mdev, *tdn, uid);
>
> -	return mlx5_ib_enable_lb(dev, true, false);
> +	return err;
>  }
>
>  static void mlx5_ib_dealloc_transport_domain(struct mlx5_ib_dev *dev, u32 tdn,
> @@ -4513,6 +4517,7 @@ static int mlx5_ib_stage_init_init(struct mlx5_ib_dev *dev)
>
>  	mutex_init(&dev->cap_mask_mutex);
>  	mutex_init(&dev->data_direct_lock);
> +	mutex_init(&dev->lb.mutex);

There is also a need to call mutex_destroy() to ensure proper resource cleanup.

Thanks

>  	INIT_LIST_HEAD(&dev->qp_list);
>  	spin_lock_init(&dev->reset_flow_resource_lock);
>  	xa_init(&dev->odp_mkeys);
> @@ -4786,11 +4791,6 @@ static int mlx5_ib_stage_caps_init(struct mlx5_ib_dev *dev)
>  	if (err)
>  		return err;
>
> -	if ((MLX5_CAP_GEN(dev->mdev, port_type) == MLX5_CAP_PORT_TYPE_ETH) &&
> -	    (MLX5_CAP_GEN(dev->mdev, disable_local_lb_uc) ||
> -	     MLX5_CAP_GEN(dev->mdev, disable_local_lb_mc)))
> -		mutex_init(&dev->lb.mutex);
> -
>  	if (MLX5_CAP_GEN_64(dev->mdev, general_obj_types) &
>  	    MLX5_GENERAL_OBJ_TYPES_CAP_VIRTIO_NET_Q) {
>  		err = mlx5_ib_init_var_region(dev);
> --
> 2.43.0
>
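
Editorial illustration (not part of the thread): the two points raised in this review, rolling back the transport domain when the loopback step fails and pairing every lock initialization with a teardown, can be sketched in plain userspace C. Everything below is a hypothetical stand-in: struct toy_dev, its fields, and all toy_* functions are invented for illustration and are not the mlx5 driver's types or API. A boolean flag stands in for the initialized state of lb.mutex, and -1 stands in for a kernel error code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the device; NOT the real struct mlx5_ib_dev. */
struct toy_dev {
	int  live_tds;        /* transport domains currently allocated */
	bool lb_capable;      /* models the ETH + disable_local_lb_* check */
	bool enable_lb_fails; /* knob: force the loopback step to fail */
	bool lb_lock_ready;   /* models whether lb.mutex is initialized */
};

/* Models mutex_init() being done unconditionally in the init stage. */
static void toy_stage_init(struct toy_dev *dev)
{
	dev->lb_lock_ready = true;
}

/* Models the matching mutex_destroy() the review asks for. */
static void toy_stage_cleanup(struct toy_dev *dev)
{
	dev->lb_lock_ready = false;
}

static int toy_alloc_td(struct toy_dev *dev)
{
	dev->live_tds++;
	return 0;
}

static void toy_dealloc_td(struct toy_dev *dev)
{
	dev->live_tds--;
}

static int toy_enable_lb(struct toy_dev *dev)
{
	/* The lock must already be initialized when this path runs. */
	assert(dev->lb_lock_ready);
	return dev->enable_lb_fails ? -1 : 0;
}

/* Mirrors the control flow of the patched function in the diff above. */
static int toy_alloc_transport_domain(struct toy_dev *dev)
{
	int err = toy_alloc_td(dev);

	if (err)
		return err;

	/* No loopback capability: explicit success, TD stays allocated. */
	if (!dev->lb_capable)
		return 0;

	err = toy_enable_lb(dev);
	if (err)
		toy_dealloc_td(dev); /* roll back so the TD is not leaked */

	return err;
}
```

With enable_lb_fails set, toy_alloc_transport_domain() returns an error and live_tds returns to its previous value, which is the behaviour the commit message describes for the real function. The init/cleanup pair only mirrors the symmetry being requested; where the real mutex_destroy() call belongs in the driver is not shown in this thread.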