public inbox for netdev@vger.kernel.org
* [PATCH net-next V2 0/3] net/mlx5: Fix E-Switch work queue deadlock with devlink lock
@ 2026-04-28  5:10 Tariq Toukan
From: Tariq Toukan @ 2026-04-28  5:10 UTC
  To: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Andrew Lunn,
	David S. Miller
  Cc: Saeed Mahameed, Leon Romanovsky, Tariq Toukan, Mark Bloch, netdev,
	linux-rdma, linux-kernel, Gal Pressman, Dragos Tatulea

Hi,

See detailed description by Mark below [1].

Regards,
Tariq

[1]
mlx5_eswitch_cleanup() calls destroy_workqueue() while holding the
devlink lock through mlx5_uninit_one(). E-Switch workqueue workers also
need the devlink lock, but previously took it before checking whether
their work item was stale. Cleanup can therefore wait for a worker that
is blocked on the same devlink lock.

Mode changes have the same ordering hazard: the mode-change path holds
devlink lock while tearing down the current mode, and old work may still
be pending on the E-Switch workqueue.

Fix this by making esw_wq_handler() check the generation counter before
attempting to take devlink lock. The worker uses devl_trylock(); if the
lock is busy and the work is still current, it sleeps on an E-Switch wait
queue with a short timeout. Invalidation increments the generation
counter and wakes the wait queue, so stale workers exit without spinning
or blocking cleanup.

The generation counter already existed but was buried in
mlx5_esw_functions and only covered function-change events. The three
patches get from there to the fix in small steps.

Patch 1 moves the counter up to mlx5_eswitch. Pure refactor,
no behavior change.

Patch 2 cleans up the work queue plumbing: factors out the repeated
lock/check/dispatch boilerplate into a single esw_wq_handler() and
adds mlx5_esw_add_work() as the one place to enqueue work.
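
As a rough illustration of the shape this refactor aims for, the sketch
below stamps each work item with the current generation at enqueue time
and funnels every item through one handler that checks the stamp before
dispatching. It is a single-threaded userspace model with no locking or
real workqueue, and the names (wq_model, wq_work, wq_make_work,
wq_handle) are hypothetical, not the signatures used in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; none of these names are driver symbols. */
struct wq_model {
	unsigned int generation; /* bumped to invalidate pending work */
	int events_handled;
};

typedef void (*wq_fn)(struct wq_model *esw, void *data);

struct wq_work {
	struct wq_model *esw;
	unsigned int generation; /* captured at enqueue time */
	wq_fn fn;                /* per-event callback */
	void *data;
};

/* The one place that builds work: it always stamps the generation. */
static struct wq_work wq_make_work(struct wq_model *esw, wq_fn fn, void *data)
{
	struct wq_work w = {
		.esw = esw,
		.generation = esw->generation,
		.fn = fn,
		.data = data,
	};

	return w;
}

/* The one handler: drop stale work, otherwise dispatch the callback. */
static bool wq_handle(const struct wq_work *w)
{
	if (w->generation != w->esw->generation)
		return false;
	w->fn(w->esw, w->data);
	return true;
}

/* Example callback used to observe whether dispatch happened. */
static void wq_count_event(struct wq_model *esw, void *data)
{
	(void)data;
	esw->events_handled++;
}
```

Keeping the stamp and the check in one place each means a new event type
only has to supply a callback, and cannot forget the stale check.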

Patch 3 is the actual fix: check the generation before the lock, use
devl_trylock() instead of devl_lock(), add a wait queue so lock retries
do not spin, and invalidate pending work at the earliest safe operation
boundary. Cleanup invalidates before destroy_workqueue(), and mode
teardown unregisters the work-producing notifiers before invalidating so
new notifier work cannot capture the new generation.
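
The ordering argument for mode teardown can be seen in a small
single-threaded model. This is a sketch with hypothetical names
(teardown_model, notifier_event, live_work), not the driver code; it only
shows why bumping the generation while the notifier is still registered
lets a late event produce work that looks current.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only; none of these names are driver symbols. */
struct teardown_model {
	bool notifier_registered;
	unsigned int generation;
	unsigned int pending_gen[8]; /* generation stamped on each queued work */
	int pending;
};

/* A notifier event queues work stamped with the current generation. */
static void notifier_event(struct teardown_model *m)
{
	if (!m->notifier_registered || m->pending == 8)
		return;
	m->pending_gen[m->pending++] = m->generation;
}

/* Count queued work that would still pass the generation check. */
static int live_work(const struct teardown_model *m)
{
	int i, live = 0;

	for (i = 0; i < m->pending; i++)
		if (m->pending_gen[i] == m->generation)
			live++;
	return live;
}

/* Order described above: unregister first, then invalidate. */
static int teardown_unregister_first(void)
{
	struct teardown_model m = { .notifier_registered = true };

	notifier_event(&m);            /* old work, generation 0 */
	m.notifier_registered = false; /* no new work can be produced */
	m.generation++;                /* old work becomes stale */
	notifier_event(&m);            /* late event is ignored */
	return live_work(&m);
}

/* Reversed order: a late event captures the new generation. */
static int teardown_invalidate_first(void)
{
	struct teardown_model m = { .notifier_registered = true };

	notifier_event(&m);            /* old work, generation 0 */
	m.generation++;                /* old work becomes stale */
	notifier_event(&m);            /* late event stamps generation 1 */
	m.notifier_registered = false;
	return live_work(&m);
}
```

With the notifier unregistered first, no work survives invalidation in
this model; with the order reversed, one item still passes the stale
check and would run against a mode that is being torn down.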

V2:

Split out from a larger series. The representor lifecycle improvements
will be sent separately.

Patch 3:
- Move generation invalidation after notifier unregister but before
  teardown, so old work is discarded early without allowing new notifier
  work to use the new generation.
- Replace cond_resched() polling with a wait queue to avoid CPU spinning
  while devlink lock is held by a long operation.

Link to V1:
https://lore.kernel.org/all/20260409115550.156419-1-tariqt@nvidia.com/

Mark Bloch (3):
  net/mlx5: E-Switch, move work queue generation counter
  net/mlx5: E-Switch, introduce generic work queue dispatch helper
  net/mlx5: E-Switch, fix deadlock between devlink lock and esw->wq

 .../net/ethernet/mellanox/mlx5/core/eswitch.c | 20 +++-
 .../net/ethernet/mellanox/mlx5/core/eswitch.h |  5 +-
 .../mellanox/mlx5/core/eswitch_offloads.c     | 96 ++++++++++++-------
 3 files changed, 82 insertions(+), 39 deletions(-)


base-commit: 254f49634ee16a731174d2ae34bc50bd5f45e731
-- 
2.44.0

