DPDK-dev Archive on lore.kernel.org
From: Stephen Hemminger <stephen@networkplumber.org>
To: Long Li <longli@microsoft.com>
Cc: dev@dpdk.org, Wei Hu <weh@microsoft.com>, stable@dpdk.org
Subject: Re: [PATCH v2 1/7] net/netvsc: retry VF hotplug indefinitely until PCI device disappears
Date: Wed, 6 May 2026 19:49:48 -0700
Message-ID: <20260506194948.2508fa6b@phoenix.local>
In-Reply-To: <20260506020529.281654-1-longli@microsoft.com>

On Tue,  5 May 2026 19:05:22 -0700
Long Li <longli@microsoft.com> wrote:

> After PCI rescan on Azure, the MANA kernel driver can take over 100
> seconds to probe and create the /sys/bus/pci/devices/<dev>/net directory.
> The previous fixed retry limit (NETVSC_MAX_HOTADD_RETRY=10, ~12 seconds)
> was insufficient, causing VF re-attach to fail with 'Failed to parse PCI
> device' on systems with slow MANA driver initialization.
> 
> Replace the fixed retry limit with an indefinite retry that only gives up
> when the PCI device itself disappears from sysfs. This is safe because:
> 
> - The retry uses rte_eal_alarm callbacks which are serialized on the EAL
>   interrupt thread, preventing races with VF remove or device close paths.
> - Device close (eth_hn_dev_uninit) explicitly cancels all pending hotplug
>   alarms via rte_eal_alarm_cancel and frees the context.
> - If the PCI device is removed while retrying, access() detects the
>   missing sysfs path and stops immediately.
> 
> A periodic NOTICE log every 30 retries (~30s) provides visibility into
> long waits without flooding the log at DEBUG level.
> 
> Fixes: a2a23a794b3a ("net/netvsc: support VF device hot add/remove")
> Cc: stable@dpdk.org
> Signed-off-by: Long Li <longli@microsoft.com>
> ---
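
For reference, the retry policy described in the commit message above can be sketched roughly as follows. This is a minimal, standalone illustration, not the patch's code: the names hotadd_action, hotadd_check and hotadd_should_log are hypothetical, and sysfs_root is a parameter only so the logic can be run against a scratch directory instead of /sys/bus/pci/devices.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical outcome of one hot-add retry tick; not a netvsc type. */
enum hotadd_action {
	HOTADD_GIVE_UP, /* PCI device vanished from sysfs: stop retrying */
	HOTADD_RETRY,   /* device present, net directory not created yet */
	HOTADD_PROCEED, /* net directory exists: continue with VF attach */
};

/*
 * Decide what to do on one retry tick.
 * Assumption: the real root would be "/sys/bus/pci/devices".
 */
enum hotadd_action
hotadd_check(const char *sysfs_root, const char *pci_addr)
{
	char path[512];

	/* The only give-up condition: the PCI device itself is gone. */
	snprintf(path, sizeof(path), "%s/%s", sysfs_root, pci_addr);
	if (access(path, F_OK) != 0)
		return HOTADD_GIVE_UP;

	/* The kernel driver may still be probing; keep waiting. */
	snprintf(path, sizeof(path), "%s/%s/net", sysfs_root, pci_addr);
	if (access(path, F_OK) != 0)
		return HOTADD_RETRY;

	return HOTADD_PROCEED;
}

/* Log at NOTICE only on every 30th retry (about 30 s at one retry/s). */
int
hotadd_should_log(unsigned int retry_count)
{
	return retry_count > 0 && retry_count % 30 == 0;
}
```

In the driver, the HOTADD_RETRY case would re-arm the alarm rather than loop, so the wait never blocks the interrupt thread.
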
Better, but I am still seeing AI review warnings.

Reviewed the v2 7-patch series against upstream drivers/net/netvsc/. Patches 1, 2, 3, and 5 are clean. Findings on the rest:
Patch 4: the new "retry loop exiting" NOTICE fires on every termination, including the success path, so every successful VF re-attach produces a spurious NOTICE.
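
One possible way to address the Patch 4 finding is to pick the log level from the exit reason, so the success path stays at DEBUG. This is only a sketch: the enum values and function names below are hypothetical and are not netvsc or DPDK identifiers, and treating a cancelled retry as NOTICE-worthy is an assumption.

```c
#include <stdio.h>

/* Hypothetical reasons for the retry loop to stop. */
enum retry_exit {
	RETRY_EXIT_ATTACHED,  /* VF re-attached: the normal outcome */
	RETRY_EXIT_DEV_GONE,  /* PCI device disappeared from sysfs */
	RETRY_EXIT_CANCELLED, /* device close cancelled the pending retry */
};

/* Hypothetical log levels, ordered like syslog: lower is more severe. */
enum demo_level {
	DEMO_NOTICE = 6,
	DEMO_DEBUG = 8,
};

/* Success is routine and stays at DEBUG; anything else is a NOTICE. */
enum demo_level
retry_exit_level(enum retry_exit why)
{
	return why == RETRY_EXIT_ATTACHED ? DEMO_DEBUG : DEMO_NOTICE;
}

/* Emit the exit message at the level chosen above. */
void
log_retry_exit(enum retry_exit why, unsigned int retries)
{
	fprintf(stderr, "<%d> retry loop exiting after %u retries (reason %d)\n",
		(int)retry_exit_level(why), retries, (int)why);
}
```
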
Patch 6: two warnings:
  (a) reaching directly into vf_dev->dev_ops->stats_get works only because eth_stats_qstats_get() has already memset the buffers before invoking netvsc's callback, an undocumented dependency on the caller;
  (b) the else fallback to rte_eth_stats_get() is dead code: it returns -ENOTSUP for the same reason as the direct call.
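
To illustrate warning (a): if a stats callback adds into the buffer it is given, the result is only correct when someone has zeroed that buffer first. The sketch below uses stand-in types and functions (demo_stats, demo_vf_stats_get, demo_forward_vf_stats are all hypothetical, and the accumulating callback is an assumption, not the real VF driver's behaviour). It shows one way to make the zeroing local to the forwarder instead of relying on the caller.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stats structure. */
struct demo_stats {
	uint64_t ipackets;
	uint64_t opackets;
};

/* Stand-in for a driver callback that accumulates into the buffer. */
int
demo_vf_stats_get(struct demo_stats *stats)
{
	stats->ipackets += 10;
	stats->opackets += 20;
	return 0;
}

/*
 * Forward VF counters into 'out'. The scratch buffer is zeroed here,
 * so correctness no longer depends on what the caller did to 'out'.
 */
int
demo_forward_vf_stats(struct demo_stats *out)
{
	struct demo_stats vf;
	int ret;

	memset(&vf, 0, sizeof(vf));
	ret = demo_vf_stats_get(&vf);
	if (ret != 0)
		return ret;

	out->ipackets += vf.ipackets;
	out->opackets += vf.opackets;
	return 0;
}
```

Calling demo_vf_stats_get() directly on a buffer that was never cleared would fold whatever was already there into the totals, which is the hidden dependency the warning describes.
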
Patch 7: the recovering and recovery_success callbacks acquire vf_lock directly from event-callback context, departing from the existing INTR_RMV pattern, which defers work via rte_eal_alarm_set precisely to avoid cross-driver lock-order assumptions. The unlocked vf_attached read in recovery_failed is a benign race, and the code can be simplified by dropping the guard.
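
The deferral pattern referred to for Patch 7 can be sketched without DPDK as below. Everything here is a stand-in: defer_work() plays the role that rte_eal_alarm_set() plays in the driver, run_deferred() plays the role of the EAL interrupt thread running expired alarms, and demo_dev with its vf_lock_held flag is a hypothetical model of the device and its lock. The point is only that the event callback itself never touches the lock.

```c
#include <stdbool.h>
#include <stddef.h>

typedef void (*deferred_fn)(void *arg);

struct deferred_work {
	deferred_fn fn;
	void *arg;
};

#define MAX_DEFERRED 8
static struct deferred_work deferred_queue[MAX_DEFERRED];
static size_t deferred_len;

/* Stand-in for scheduling an alarm: record work to run later. */
int
defer_work(deferred_fn fn, void *arg)
{
	if (deferred_len == MAX_DEFERRED)
		return -1;
	deferred_queue[deferred_len].fn = fn;
	deferred_queue[deferred_len].arg = arg;
	deferred_len++;
	return 0;
}

/* Stand-in for the single thread that runs all deferred work in order. */
void
run_deferred(void)
{
	size_t i;

	for (i = 0; i < deferred_len; i++)
		deferred_queue[i].fn(deferred_queue[i].arg);
	deferred_len = 0;
}

/* Hypothetical device model; vf_lock_held stands in for a real lock. */
struct demo_dev {
	bool vf_lock_held;
	int recoveries;
};

/* Runs in deferred context, where taking the lock is safe. */
void
recovery_work(void *arg)
{
	struct demo_dev *dev = arg;

	dev->vf_lock_held = true;  /* lock */
	dev->recoveries++;
	dev->vf_lock_held = false; /* unlock */
}

/* Event callback: takes no lock, only schedules the work. */
int
on_recovering_event(struct demo_dev *dev)
{
	return defer_work(recovery_work, dev);
}
```

Because all deferred work runs on one thread, the handler is serialized with other deferred handlers and does not inherit whatever locks the event's originator happened to hold.
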

Thread overview: 17+ messages
2026-05-06  2:05 [PATCH v2 1/7] net/netvsc: retry VF hotplug indefinitely until PCI device disappears Long Li
2026-05-06  2:05 ` [PATCH v2 2/7] net/netvsc: retry on SIOCGIFHWADDR failure during VF hotplug Long Li
2026-05-06  2:05 ` [PATCH v2 3/7] net/netvsc: retry full probe when IB device not ready during hotplug Long Li
2026-05-06  2:05 ` [PATCH v2 4/7] net/netvsc: add debug logging for VF hotplug retry Long Li
2026-05-06  2:05 ` [PATCH v2 5/7] net/netvsc: retry when no matching MAC found in net directory Long Li
2026-05-06  2:05 ` [PATCH v2 6/7] net/netvsc: forward per-queue stats from VF device Long Li
2026-05-06  2:05 ` [PATCH v2 7/7] net/netvsc: handle VF recovery events for service reset Long Li
2026-05-07  2:49 ` Stephen Hemminger [this message]
2026-05-15 19:45   ` [EXTERNAL] Re: [PATCH v2 1/7] net/netvsc: retry VF hotplug indefinitely until PCI device disappears Long Li
2026-05-15 19:28 ` [PATCH v3 0/7] net/netvsc: fix VF hotplug and service reset handling Long Li
2026-05-15 19:28   ` [PATCH v3 1/7] net/netvsc: retry VF hotplug indefinitely until PCI device disappears Long Li
2026-05-15 19:28   ` [PATCH v3 2/7] net/netvsc: retry on SIOCGIFHWADDR failure during VF hotplug Long Li
2026-05-15 19:28   ` [PATCH v3 3/7] net/netvsc: retry full probe when IB device not ready during hotplug Long Li
2026-05-15 19:28   ` [PATCH v3 4/7] net/netvsc: add debug logging for VF hotplug retry Long Li
2026-05-15 19:28   ` [PATCH v3 5/7] net/netvsc: retry when no matching MAC found in net directory Long Li
2026-05-15 19:28   ` [PATCH v3 6/7] net/netvsc: forward per-queue stats from VF device Long Li
2026-05-15 19:28   ` [PATCH v3 7/7] net/netvsc: handle VF recovery events for service reset Long Li
