* [PATCH v2 0/8] fix multi-process VF hotplug
From: longli @ 2026-02-21  2:44 UTC (permalink / raw)
  To: dev, Wei Hu, Stephen Hemminger, stable, Dariusz Sosnowski,
	Viacheslav Ovsiienko, Bing Zhao, Ori Kam, Suanming Mou,
	Matan Azrad
  Cc: Long Li

From: Long Li <longli@microsoft.com>

This series fixes multi-process support for DPDK drivers used on
Azure VMs with Accelerated Networking (AN). When AN is toggled, the
VF device is hot-removed and hot-added, which can crash secondary
processes due to stale fast-path pointers and race conditions.

Patches 1-3 fix the netvsc PMD:
- Prevent secondary processes from calling unsupported promiscuous ops
- Fix rwlock misuse and race conditions on VF add/remove events
- Add multi-process VF device removal support via IPC

Patches 4-5 fix resource leaks:
- MANA PD resource leak on device close
- netvsc devargs memory leak on hotplug

Patches 6-8 fix a common bug across MANA, MLX5, and MLX4 drivers
where the secondary process START_RXTX/STOP_RXTX IPC handlers
update dev->rx_pkt_burst/tx_pkt_burst but do not update the
process-local rte_eth_fp_ops[] array. Since rte_eth_rx_burst()
uses rte_eth_fp_ops (not dev->rx_pkt_burst), the secondary retains
stale queue data pointers after VF hot-add, causing a segfault.
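
The failure mode described for patches 6-8 can be sketched with a toy model.
This is only an illustration under stated assumptions: every toy_* name below
is hypothetical and stands in for the real ethdev structures, and the layout
is simplified (one queue pointer per port). It shows a handler that updates
only the device struct, and a variant that also refreshes the process-local
table the fast path actually reads.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the ethdev types; not the real DPDK layout. */
typedef uint16_t (*toy_burst_fn)(void *queue, void **pkts, uint16_t n);

struct toy_eth_dev {            /* plays the role of the device struct */
	toy_burst_fn rx_pkt_burst;
	void *rx_queue;
};

struct toy_fp_ops {             /* plays the role of the per-process fast-path table */
	toy_burst_fn rx_pkt_burst;
	void *rxq;
};

#define TOY_MAX_PORTS 4
static struct toy_eth_dev toy_devices[TOY_MAX_PORTS];
static struct toy_fp_ops toy_fp_ops[TOY_MAX_PORTS];

/* Placeholder burst used while the port is stopped: receives nothing. */
static uint16_t toy_dummy_burst(void *q, void **pkts, uint16_t n)
{
	(void)q; (void)pkts; (void)n;
	return 0;
}

/* "Real" burst: pretends to receive n packets when given a valid queue. */
static uint16_t toy_real_burst(void *q, void **pkts, uint16_t n)
{
	(void)pkts;
	return q != NULL ? n : 0;
}

/* Buggy START_RXTX-style handler: only the device struct is updated. */
static void toy_start_rxtx_buggy(uint16_t port, void *new_queue)
{
	toy_devices[port].rx_pkt_burst = toy_real_burst;
	toy_devices[port].rx_queue = new_queue;
}

/* Fixed handler: also copy the new state into the process-local table. */
static void toy_start_rxtx_fixed(uint16_t port, void *new_queue)
{
	toy_start_rxtx_buggy(port, new_queue);
	toy_fp_ops[port].rx_pkt_burst = toy_devices[port].rx_pkt_burst;
	toy_fp_ops[port].rxq = toy_devices[port].rx_queue;
}

/* Fast path: consults only the fast-path table, never the device struct. */
static uint16_t toy_rx_burst(uint16_t port, void **pkts, uint16_t n)
{
	assert(port < TOY_MAX_PORTS);
	const struct toy_fp_ops *p = &toy_fp_ops[port];
	return p->rx_pkt_burst(p->rxq, pkts, n);
}
```

In this model the stale entry is a harmless dummy function. In the case the
cover letter describes, the stale entry holds queue pointers from before the
hot-add, which is why the result there is a segfault rather than an empty
burst.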

v2:
- Patch 2: rename __hn_vf_add/__hn_vf_remove to
  hn_vf_add_unlocked/hn_vf_remove_unlocked to avoid C-reserved
  double-underscore prefix (C99 7.1.3)
- Patch 2: add hn_vf_detach() cleanup path when VF configure/start
  fails after hn_vf_attach() succeeds, preventing half-attached
  VF state
- Patch 2: unconditionally clear vf_vsc_switched on VF remove
  regardless of the hn_nvs_set_datapath() result, since the VF is
  being removed anyway
- Patch 3: add rte_eth_dev_is_valid_port() check before accessing
  rte_eth_devices[] in secondary VF removal handler
- Patch 3: rename netvsc_mp_req_VF to netvsc_mp_req_vf per DPDK
  lowercase naming convention
- Patch 3: use rte_memory_order_acquire/release instead of relaxed
  for secondary_cnt to ensure visibility on ARM
- Patch 3: initialize ret = 0 in netvsc_init_once()
- Patch 4: use local 'err' variable for ibv_dealloc_pd() return
  value to avoid shadowing outer 'ret'
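
The cleanup path listed for patch 2 (detach when configure/start fails after
a successful attach) follows a common error-unwinding pattern. A minimal
sketch, assuming a hypothetical toy_vf_state and toy_* helpers that are not
the driver's actual functions or signatures:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical VF bookkeeping; the real driver state is more involved. */
struct toy_vf_state {
	bool attached;
	bool started;
};

static int toy_vf_attach(struct toy_vf_state *vf)
{
	vf->attached = true;
	return 0;
}

static void toy_vf_detach(struct toy_vf_state *vf)
{
	vf->attached = false;
	vf->started = false;
}

/* The 'fail' flag injects a start failure so the unwind path can be exercised. */
static int toy_vf_start(struct toy_vf_state *vf, bool fail)
{
	if (fail)
		return -1;
	vf->started = true;
	return 0;
}

/* Attach, then start; undo the attach if the later step fails. */
static int toy_vf_add(struct toy_vf_state *vf, bool inject_start_failure)
{
	int ret;

	assert(vf != NULL);

	ret = toy_vf_attach(vf);
	if (ret != 0)
		return ret;

	ret = toy_vf_start(vf, inject_start_failure);
	if (ret != 0) {
		/* Without this detach the VF would stay half-attached. */
		toy_vf_detach(vf);
		return ret;
	}

	return 0;
}
```

Calling toy_vf_add() with the failure flag set returns an error and leaves
both flags cleared, so a later retry starts from a clean state.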

Long Li (8):
  net/netvsc: secondary ignore promiscuous enable/disable
  net/netvsc: fix race conditions on VF add/remove events
  net/netvsc: add multi-process VF device removal support
  net/mana: fix PD resource leak on device close
  net/netvsc: fix devargs memory leak on hotplug
  net/mana: fix fast-path ops setup in secondary process
  net/mlx5: fix fast-path ops setup in secondary process
  net/mlx4: fix fast-path ops setup in secondary process

 drivers/net/mana/mana.c             |  14 ++
 drivers/net/mana/mp.c               |   6 +
 drivers/net/mlx4/mlx4_mp.c          |   4 +
 drivers/net/mlx5/linux/mlx5_mp_os.c |   4 +
 drivers/net/netvsc/hn_ethdev.c      | 293 +++++++++++++++++++++++++++-
 drivers/net/netvsc/hn_nvs.h         |   5 +
 drivers/net/netvsc/hn_rxtx.c        |  40 ++--
 drivers/net/netvsc/hn_var.h         |   1 +
 drivers/net/netvsc/hn_vf.c          | 144 ++++++++------
 9 files changed, 423 insertions(+), 88 deletions(-)

-- 
2.43.0

