public inbox for dev@dpdk.org
 help / color / mirror / Atom feed
From: Stephen Hemminger <stephen@networkplumber.org>
To: longli@linux.microsoft.com
Cc: dev@dpdk.org, Wei Hu <weh@microsoft.com>,
	stable@dpdk.org, Dariusz Sosnowski <dsosnowski@nvidia.com>,
	Viacheslav Ovsiienko <viacheslavo@nvidia.com>,
	Bing Zhao <bingz@nvidia.com>, Ori Kam <orika@nvidia.com>,
	Suanming Mou <suanmingm@nvidia.com>,
	Matan Azrad <matan@nvidia.com>, Long Li <longli@microsoft.com>
Subject: Re: [PATCH v2 3/8] net/netvsc: add multi-process VF device removal support
Date: Mon, 23 Feb 2026 09:45:19 -0800	[thread overview]
Message-ID: <20260223094519.5764d5fd@phoenix.local> (raw)
In-Reply-To: <20260221024540.659098-3-longli@linux.microsoft.com>

On Fri, 20 Feb 2026 18:45:22 -0800
longli@linux.microsoft.com wrote:

> From: Long Li <longli@microsoft.com>
> 
> When a VF device is hot-removed by the primary process, secondary
> processes must be notified to release their references to the VF port.
> Without this, secondary processes retain stale port references leading
> to crashes or undefined behavior when accessing the removed device.
> 
> This patch adds multi-process communication infrastructure to coordinate
> VF removal across all processes:
> 
> - Shared memory (netvsc_shared_data) to track secondary process count
> - Multi-process message handlers (NETVSC_MP_REQ_VF_REMOVE) to notify
>   secondaries when primary removes a VF device
> - Secondary handler calls rte_eth_dev_release_port() to cleanly release
>   the VF port in its own process space
> - Primary waits for all secondaries to acknowledge removal before
>   proceeding
> 
> The implementation uses rte_mp_request_sync() to ensure all secondary
> processes respond within NETVSC_MP_REQ_TIMEOUT_SEC (5 seconds) before
> the primary completes the VF removal sequence.
> 
> Fixes: 7fc4c0997b04 ("net/netvsc: fix hot adding multiple VF PCI devices")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Long Li <longli@microsoft.com>

AI review feedback:

**Patch 3 (net/netvsc: add multi-process VF device removal support)** — adds MP infrastructure to coordinate VF removal across processes.

**Three concerns:**

1. **Race window on `secondary_cnt` during probe (~50% confidence).** The secondary increments `secondary_cnt` *after* `rte_eth_dev_probing_finish()`, but `netvsc_init_once()` and device setup happen before that. A primary removing a VF during this window sees `secondary_cnt == 0`, skips `rte_mp_request_sync()`, and the secondary never gets notified — leaving it with a stale VF port reference.

2. **Misleading "VF is already locked by primary" comment.** In `netvsc_secondary_handle_device_remove()`, the code reads `hv->vf_ctx.vf_port` from shared memory with a comment saying the primary's lock protects it. But `rte_rwlock_t` is process-local — it doesn't work cross-process. The actual synchronization comes from the MP message exchange itself (the primary sends the message after setting state, the secondary handles it after receiving). The comment should reflect that.

3. **`netvsc_init_once()` not protected by the spinlock.** It's called from `eth_hn_probe()` without `netvsc_shared_data_lock`, while `netvsc_uninit_once()` is called *inside* the lock. If two netvsc devices probe concurrently in the same process, the `init_done` flag check could race. Low risk since DPDK probe is typically single-threaded, but inconsistent with the uninit path.

**Minor style notes:** `MZ_NETVSC_SHARED_DATA` uses macro-style naming but is a `const char *` variable — could be a `#define` for consistency with `NETVSC_MP_NAME`. The stub `netvsc_mp_primary_handle()` that always returns 0 is benign but could mask future protocol errors.

**Overall:** Sound infrastructure, suitable for merging with the comment fix and awareness of the probe-time race window.

  reply	other threads:[~2026-02-23 17:45 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-02-21  2:45 [PATCH v2 1/8] net/netvsc: secondary ignore promiscuous enable/disable longli
2026-02-21  2:45 ` [PATCH v2 2/8] net/netvsc: fix race conditions on VF add/remove events longli
2026-02-23 17:40   ` Stephen Hemminger
2026-02-23 17:43   ` Stephen Hemminger
2026-02-23 22:31     ` [EXTERNAL] " Long Li
2026-02-21  2:45 ` [PATCH v2 3/8] net/netvsc: add multi-process VF device removal support longli
2026-02-23 17:45   ` Stephen Hemminger [this message]
2026-02-23 22:32     ` [EXTERNAL] " Long Li
2026-02-21  2:45 ` [PATCH v2 4/8] net/mana: fix PD resource leak on device close longli
2026-02-21  2:45 ` [PATCH v2 5/8] net/netvsc: fix devargs memory leak on hotplug longli
2026-02-21  2:45 ` [PATCH v2 6/8] net/mana: fix fast-path ops setup in secondary process longli
2026-02-21  2:45 ` [PATCH v2 7/8] net/mlx5: " longli
2026-02-21  2:45 ` [PATCH v2 8/8] net/mlx4: " longli
2026-02-21  5:16 ` [PATCH v2 1/8] net/netvsc: secondary ignore promiscuous enable/disable Stephen Hemminger
2026-02-23 23:25   ` [EXTERNAL] " Long Li

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260223094519.5764d5fd@phoenix.local \
    --to=stephen@networkplumber.org \
    --cc=bingz@nvidia.com \
    --cc=dev@dpdk.org \
    --cc=dsosnowski@nvidia.com \
    --cc=longli@linux.microsoft.com \
    --cc=longli@microsoft.com \
    --cc=matan@nvidia.com \
    --cc=orika@nvidia.com \
    --cc=stable@dpdk.org \
    --cc=suanmingm@nvidia.com \
    --cc=viacheslavo@nvidia.com \
    --cc=weh@microsoft.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox