* [PATCH net V3 0/4] net/mlx5: Fixes for Socket-Direct
@ 2026-04-23 12:31 Tariq Toukan
From: Tariq Toukan @ 2026-04-23 12:31 UTC
  To: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Andrew Lunn,
	David S. Miller
  Cc: Saeed Mahameed, Tariq Toukan, Mark Bloch, Leon Romanovsky,
	Shay Drory, Simon Horman, Kees Cook, Patrisious Haddad,
	Parav Pandit, Gal Pressman, netdev, linux-rdma, linux-kernel,
	Dragos Tatulea

Hi,

This series fixes several race conditions and bugs in the mlx5
Socket-Direct (SD) single netdev flow.

Patch 1 serializes mlx5_sd_init()/mlx5_sd_cleanup() with
mlx5_devcom_comp_lock() and tracks the SD group state on the primary
device, preventing concurrent or duplicate bring-up/tear-down.
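The idea behind patch 1 can be modeled in userspace as below. This is
only a sketch under assumptions: a pthread mutex stands in for
mlx5_devcom_comp_lock(), and struct sd_group, enum sd_state,
sd_group_init() and sd_group_cleanup() are hypothetical names, not the
driver's API.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical userspace model; names are illustrative, not mlx5 code. */
enum sd_state { SD_STATE_DOWN, SD_STATE_UP };

struct sd_group {
	pthread_mutex_t lock;	/* stands in for mlx5_devcom_comp_lock() */
	enum sd_state state;	/* kept in one place, on the primary */
	int bringups;		/* counts real bring-ups, for illustration */
	int teardowns;		/* counts real tear-downs, for illustration */
};

static struct sd_group demo_group = {
	.lock = PTHREAD_MUTEX_INITIALIZER,
	.state = SD_STATE_DOWN,
};

/* Returns 1 if this call performed the bring-up, 0 if already up. */
static int sd_group_init(struct sd_group *g)
{
	int did = 0;

	pthread_mutex_lock(&g->lock);
	if (g->state == SD_STATE_DOWN) {
		g->bringups++;
		g->state = SD_STATE_UP;
		did = 1;
	}
	pthread_mutex_unlock(&g->lock);
	return did;
}

/* Returns 1 if this call performed the tear-down, 0 if already down. */
static int sd_group_cleanup(struct sd_group *g)
{
	int did = 0;

	pthread_mutex_lock(&g->lock);
	if (g->state == SD_STATE_UP) {
		g->teardowns++;
		g->state = SD_STATE_DOWN;
		did = 1;
	}
	pthread_mutex_unlock(&g->lock);
	return did;
}
```

Because the state check and the state change happen under the same
lock, a second caller sees the updated state and skips the work.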

Patch 2 fixes the debugfs "multi-pf" directory being stored on the
calling device's sd struct instead of the primary's, which caused
memory leaks and recreation errors when cleanup ran from a different PF.
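A rough userspace illustration of the debugfs fix, assuming the handle
is always resolved through the primary regardless of which PF makes
the call. struct demo_sd and the demo_* helpers are hypothetical, and a
string pointer stands in for the debugfs dentry.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model; a string pointer stands in for the dentry. */
struct demo_sd {
	struct demo_sd *primary;	/* NULL when this PF is the primary */
	const char *dfs;		/* only meaningful on the primary */
};

/* Resolve the primary's sd struct from any PF in the group. */
static struct demo_sd *demo_sd_primary(struct demo_sd *sd)
{
	return sd->primary ? sd->primary : sd;
}

/* Create the directory; fails if the primary already holds one. */
static int demo_dfs_create(struct demo_sd *sd)
{
	struct demo_sd *p = demo_sd_primary(sd);

	if (p->dfs)
		return -1;
	p->dfs = "multi-pf";
	return 0;
}

/* Remove the directory, whichever PF runs the cleanup. */
static void demo_dfs_remove(struct demo_sd *sd)
{
	demo_sd_primary(sd)->dfs = NULL;
}
```

With the handle stored in one place, cleanup from a different PF than
the one that ran init still finds and releases the same entry.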

Patch 3 fixes missing cleanup on ETH probe/resume errors.

Patch 4 fixes a race where a secondary PF could access the primary's
auxiliary device after it had been unbound, by holding the primary's
device lock while operating on its auxiliary device.
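For the missing cleanup on ETH probe/resume errors, the general unwind
pattern looks roughly like the sketch below. It is a userspace model
with hypothetical demo_* names and does not reproduce the actual
en_main.c change: once SD init has succeeded, any later failure has to
undo it before returning.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a probe path; not the mlx5e code. */
struct demo_dev {
	bool sd_ready;			/* set by init, cleared by cleanup */
	bool fail_netdev_setup;		/* injects a failure in a later step */
};

static int demo_sd_init(struct demo_dev *d)
{
	d->sd_ready = true;
	return 0;
}

static void demo_sd_cleanup(struct demo_dev *d)
{
	d->sd_ready = false;
}

static int demo_netdev_setup(struct demo_dev *d)
{
	return d->fail_netdev_setup ? -1 : 0;
}

static int demo_probe(struct demo_dev *d)
{
	int err;

	err = demo_sd_init(d);
	if (err)
		return err;

	err = demo_netdev_setup(d);
	if (err)
		goto err_sd_cleanup;

	return 0;

err_sd_cleanup:
	/* Undo SD init so a failed probe leaves no state behind. */
	demo_sd_cleanup(d);
	return err;
}
```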

Regards,
Tariq

V3:
- Link to V2:
  https://lore.kernel.org/all/20260413105323.186411-1-tariqt@nvidia.com/
- Added "net/mlx5e: SD, Fix missing cleanup on probe/resume error"
  patch to fix the missing cleanup bug. (Sashiko)
- Removed MLX5_SD_STATE_DESTROYING, moved
  mlx5_devcom_comp_set_ready(false) to mlx5_sd_cleanup(), and
  simplified the locking around SD state. (Sashiko)

Shay Drory (4):
  net/mlx5: SD: Serialize init/cleanup
  net/mlx5: SD, Keep multi-pf debugfs entries on primary
  net/mlx5e: SD, Fix missing cleanup on probe/resume error
  net/mlx5e: SD, Fix race condition in secondary device probe/remove

 .../net/ethernet/mellanox/mlx5/core/en_main.c | 32 +++++++--
 .../net/ethernet/mellanox/mlx5/core/lib/sd.c  | 68 +++++++++++++++----
 .../net/ethernet/mellanox/mlx5/core/lib/sd.h  |  2 +
 3 files changed, 86 insertions(+), 16 deletions(-)


base-commit: d40831b016b4986e70d20d0ad14e6a0c62318986
-- 
2.44.0

