From: longli@linux.microsoft.com
To: dev@dpdk.org, Wei Hu <weh@microsoft.com>,
Stephen Hemminger <stephen@networkplumber.org>,
stable@dpdk.org, Dariusz Sosnowski <dsosnowski@nvidia.com>,
Viacheslav Ovsiienko <viacheslavo@nvidia.com>,
Bing Zhao <bingz@nvidia.com>, Ori Kam <orika@nvidia.com>,
Suanming Mou <suanmingm@nvidia.com>,
Matan Azrad <matan@nvidia.com>
Cc: Long Li <longli@microsoft.com>
Subject: [PATCH v3 0/7] fix multi-process VF hotplug
Date: Tue, 24 Feb 2026 18:02:32 -0800
Message-ID: <20260225020246.890306-1-longli@linux.microsoft.com>
From: Long Li <longli@microsoft.com>
This series fixes multi-process support for DPDK drivers used on
Azure VMs with Accelerated Networking (AN). When AN is toggled, the
VF device is hot-removed and hot-added, which can crash secondary
processes due to stale fast-path pointers and race conditions.
Patches 1-2 fix the netvsc PMD:
- Fix rwlock misuse and race conditions on VF add/remove events
- Add multi-process VF device removal support via IPC
Patches 3-4 fix resource leaks:
- MANA PD resource leak on device close
- netvsc devargs memory leak on hotplug
Patches 5-7 fix a common bug across MANA, MLX5, and MLX4 drivers
where the secondary process START_RXTX/STOP_RXTX IPC handlers
update dev->rx_pkt_burst/tx_pkt_burst but do not update the
process-local rte_eth_fp_ops[] array. Since rte_eth_rx_burst()
uses rte_eth_fp_ops (not dev->rx_pkt_burst), the secondary retains
stale queue data pointers after VF hot-add, causing a segfault.
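
The stale-pointer problem behind patches 5-7 can be illustrated with a
small self-contained model. This is only a sketch: the mock_* types,
queue_generation() and the start_rxtx_* helpers are hypothetical
stand-ins for struct rte_eth_dev, rte_eth_fp_ops[] and the driver IPC
handlers, and the actual patches may refresh the fast-path table
differently.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are NOT the real DPDK definitions. */
typedef uint16_t (*burst_fn)(void *queue);

struct mock_dev {        /* plays the role of struct rte_eth_dev */
	burst_fn rx_pkt_burst;
	void *rx_queue;
};

struct mock_fp_ops {     /* plays the role of one rte_eth_fp_ops[] entry */
	burst_fn rx_pkt_burst;
	void *rxq;
};

/* Process-local fast-path table, one entry per port. */
static struct mock_fp_ops mock_fp_ops[1];

/* Toy burst function: reports which queue "generation" it was given. */
static uint16_t queue_generation(void *queue)
{
	return *(uint16_t *)queue;
}

/* Models rte_eth_rx_burst(): it reads only the fast-path table. */
static uint16_t mock_rx_burst(uint16_t port)
{
	struct mock_fp_ops *ops = &mock_fp_ops[port];

	return ops->rx_pkt_burst(ops->rxq);
}

/* Handler that updates the dev structure only (the behavior being fixed). */
static void start_rxtx_dev_only(struct mock_dev *dev, void *new_queue)
{
	dev->rx_pkt_burst = queue_generation;
	dev->rx_queue = new_queue;
}

/* Handler that also refreshes the process-local fast-path entry. */
static void start_rxtx_with_fp_ops(struct mock_dev *dev, uint16_t port,
				   void *new_queue)
{
	start_rxtx_dev_only(dev, new_queue);
	mock_fp_ops[port].rx_pkt_burst = dev->rx_pkt_burst;
	mock_fp_ops[port].rxq = dev->rx_queue;
}

/*
 * Simulate VF hot-remove/hot-add in a secondary and return the queue
 * generation the fast path ends up using. The old queue is kept alive
 * here so the model stays well-defined; in the real case it would
 * already be freed, hence the segfault.
 */
static uint16_t rx_after_hotplug(int refresh_fp_ops)
{
	static uint16_t old_queue = 1, new_queue = 2;
	struct mock_dev dev;

	start_rxtx_with_fp_ops(&dev, 0, &old_queue);  /* initial attach */
	if (refresh_fp_ops)
		start_rxtx_with_fp_ops(&dev, 0, &new_queue);
	else
		start_rxtx_dev_only(&dev, &new_queue);

	assert(dev.rx_queue == &new_queue);  /* dev is current either way */
	return mock_rx_burst(0);
}
```

In this model rx_after_hotplug(0) still reaches the old queue even
though the dev structure already points at the new one, while
rx_after_hotplug(1) reaches the new queue.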
Tested on Azure D8s_v3 (mlx5) with symmetric_mp running as primary
and secondary. Disabling and re-enabling AN correctly hot-removes and
re-attaches the VF in both processes without a crash.
v3:
- Drop patch 1 from v2 (secondary ignore promiscuous enable/disable)
  as it is no longer needed with the VF race condition fixes
- Patch 2: use #define for MZ_NETVSC_SHARED_DATA instead of const
  char pointer
- Patch 2: simplify netvsc_secondary_handle_device_remove() to take
  vf_port directly instead of struct hn_data pointer
- Patch 2: return 0 (not error) when VF port is not present in
  secondary, as this is a normal condition during startup
- Patch 2: pass vf_port as parameter to netvsc_mp_req_vf() instead
  of reading from hv->vf_ctx internally
- Patch 2: protect netvsc_init_once() and secondary_cnt increment
  under same spinlock to prevent race between MP handler registration
  and secondary count visibility
- Patch 2: add secondary_cnt decrement in error and cleanup paths
- Patch 2: fix misleading comment about cross-process locking
v2:
- Patch 1: rename __hn_vf_add/__hn_vf_remove to
  hn_vf_add_unlocked/hn_vf_remove_unlocked to avoid C-reserved
  double-underscore prefix (C99 7.1.3)
- Patch 1: add hn_vf_detach() cleanup path when VF configure/start
  fails after hn_vf_attach() succeeds, preventing half-attached
  VF state
- Patch 1: unconditionally clear vf_vsc_switched on VF remove
  regardless of hn_nvs_set_datapath() result, since VF is being
  removed anyway
- Patch 2: add rte_eth_dev_is_valid_port() check before accessing
  rte_eth_devices[] in secondary VF removal handler
- Patch 2: rename netvsc_mp_req_VF to netvsc_mp_req_vf per DPDK
  lowercase naming convention
- Patch 2: use rte_memory_order_acquire/release instead of relaxed
  for secondary_cnt to ensure visibility on ARM
- Patch 2: initialize ret = 0 in netvsc_init_once()
- Patch 3: use local 'err' variable for ibv_dealloc_pd() return
  value to avoid shadowing outer 'ret'
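
For readers less familiar with the acquire/release point in the v2
notes, here is a minimal single-process sketch using plain C11 atomics
as a stand-in. It assumes rte_memory_order_acquire/release behave like
the C11 orders of the same name; handler_registered and the function
names are hypothetical, and the real counter lives in memory shared
between processes (and, as of v3, is also covered by a spinlock).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Plain (non-atomic) state that the counter is meant to publish. */
static bool handler_registered;

/* Number of attached secondaries; hypothetical model of secondary_cnt. */
static atomic_uint secondary_cnt;

/* Secondary side: set up state first, then publish it via the counter. */
static void secondary_attach(void)
{
	handler_registered = true;
	/* Release: the write above is visible to any acquire load that
	 * observes the incremented count. */
	atomic_fetch_add_explicit(&secondary_cnt, 1, memory_order_release);
}

/* Secondary side: cleanup and error paths drop the count again. */
static void secondary_detach(void)
{
	unsigned int prev = atomic_fetch_sub_explicit(&secondary_cnt, 1,
						      memory_order_release);

	assert(prev > 0);  /* a detach must match an earlier attach */
	(void)prev;
}

/* Primary side: decide whether a VF removal request should be sent. */
static bool primary_should_notify(void)
{
	/* Acquire pairs with the release in secondary_attach(). With a
	 * relaxed load, a weakly ordered CPU could observe the count
	 * without the state written before it. */
	if (atomic_load_explicit(&secondary_cnt, memory_order_acquire) == 0)
		return false;
	return handler_registered;
}
```

The ordering only matters when the two sides run concurrently on
different cores; the sketch shows which operation carries which order,
not a stress test of the race.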
Long Li (7):
  net/netvsc: fix race conditions on VF add/remove events
  net/netvsc: add multi-process VF device removal support
  net/mana: fix PD resource leak on device close
  net/netvsc: fix devargs memory leak on hotplug
  net/mana: fix fast-path ops setup in secondary process
  net/mlx5: fix fast-path ops setup in secondary process
  net/mlx4: fix fast-path ops setup in secondary process
 drivers/net/mana/mana.c             |  14 ++
 drivers/net/mana/mp.c               |   6 +
 drivers/net/mlx4/mlx4_mp.c          |   4 +
 drivers/net/mlx5/linux/mlx5_mp_os.c |   4 +
 drivers/net/netvsc/hn_ethdev.c      | 287 +++++++++++++++++++++++++++-
 drivers/net/netvsc/hn_nvs.h         |   5 +
 drivers/net/netvsc/hn_rxtx.c        |  40 ++--
 drivers/net/netvsc/hn_var.h         |   1 +
 drivers/net/netvsc/hn_vf.c          | 144 ++++++++------
 9 files changed, 417 insertions(+), 88 deletions(-)
--
2.43.0