From: Stephen Hemminger <stephen@networkplumber.org>
To: Wei Hu <weh@linux.microsoft.com>
Cc: dev@dpdk.org, longli@microsoft.com, weh@microsoft.com
Subject: Re: [PATCH v10 1/1] net/mana: add device reset support
Date: Tue, 16 Jun 2026 14:22:50 -0700
Message-ID: <20260616142250.29603185@phoenix.local>
In-Reply-To: <20260616123158.43583-2-weh@linux.microsoft.com>

On Tue, 16 Jun 2026 05:31:58 -0700
Wei Hu <weh@linux.microsoft.com> wrote:
> teardown immediately (dev_stop, secondary IPC, dev_close, MR cache
> free) before waiting for the hardware recovery timer to fire. This
> avoids blocking the EAL interrupt thread on multi-second IPC
> timeouts and ibverbs calls. After the recovery delay, the thread
> unregisters the interrupt handler, re-probes the PCI device,
> reinitializes MR caches, and restarts queues. Each function owns
> its own lock scope with no lock hand-off between threads.
>
> Each queue has an atomic burst_state variable where bit 0 is the
> in-burst flag and bit 1 is a blocked flag. The data path uses a
> single compare-and-swap (0 to 1) to enter a burst, which fails
> immediately if the blocked bit is set. The reset path sets the
> blocked bit via atomic fetch-or and polls bit 0 to wait for
> in-flight bursts to drain. This single-variable design avoids the
> need for sequential consistency ordering.
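For readers following the thread, the burst_state scheme described above could be sketched roughly as below with C11 atomics. This is only an illustration under assumptions: the struct and function names (queue_gate, burst_enter, burst_exit, burst_block, burst_drained, burst_unblock) are hypothetical and not taken from the patch, and the acquire/release orderings are one plausible choice rather than what the driver necessarily uses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define BURST_IN_FLIGHT 0x1u	/* bit 0: a burst is currently running */
#define BURST_BLOCKED   0x2u	/* bit 1: the reset path has blocked the queue */

/* Hypothetical per-queue gate; the patch keeps burst_state in the queue. */
struct queue_gate {
	_Atomic uint32_t burst_state;
};

/* Data path: one CAS from 0 to 1. It fails if the blocked bit is set. */
static bool burst_enter(struct queue_gate *g)
{
	uint32_t expected = 0;

	return atomic_compare_exchange_strong_explicit(&g->burst_state,
			&expected, BURST_IN_FLIGHT,
			memory_order_acquire, memory_order_relaxed);
}

/* Data path: clear bit 0 when the burst is done. */
static void burst_exit(struct queue_gate *g)
{
	uint32_t prev = atomic_fetch_and_explicit(&g->burst_state,
			~BURST_IN_FLIGHT, memory_order_release);

	assert(prev & BURST_IN_FLIGHT);
	(void)prev;
}

/* Reset path: set bit 1; returns true if no burst was in flight. */
static bool burst_block(struct queue_gate *g)
{
	uint32_t prev = atomic_fetch_or_explicit(&g->burst_state,
			BURST_BLOCKED, memory_order_acq_rel);

	return (prev & BURST_IN_FLIGHT) == 0;
}

/* Reset path: polled until the in-flight burst has drained. */
static bool burst_drained(struct queue_gate *g)
{
	return (atomic_load_explicit(&g->burst_state,
			memory_order_acquire) & BURST_IN_FLIGHT) == 0;
}

/* Reset path: clear bit 1 once the queue may be used again. */
static void burst_unblock(struct queue_gate *g)
{
	atomic_fetch_and_explicit(&g->burst_state, ~BURST_BLOCKED,
			memory_order_release);
}
```

Because both flags live in one word, the data path needs only the single CAS: any nonzero state, including "blocked", makes it fail without a second load.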
>
> A per-device mutex serializes the reset path with ethdev
> operations. The mutex uses PTHREAD_PROCESS_SHARED for multi-process
> support and is held across blocking IB verbs calls. A trylock
> helper encapsulates the lock acquisition and device state check
> for all ethdev operation wrappers. Operations that cannot wait
> (configure, queue setup) return -EBUSY during reset, while
> dev_stop and dev_close join the reset thread before acquiring
> the lock to ensure proper sequencing.
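As a rough illustration of the trylock helper pattern described above, here is a minimal sketch using plain pthreads. The names (dev_priv, ops_lock, resetting, ops_trylock, dev_configure_wrapper) are hypothetical, and a simple boolean stands in for the driver's device state; the real wrappers may differ.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-device private data. */
struct dev_priv {
	pthread_mutex_t ops_lock;	/* serializes reset with ethdev ops */
	bool resetting;			/* stand-in for the device state */
};

/* Initialize the mutex as process-shared for multi-process use. */
static int dev_priv_init(struct dev_priv *p)
{
	pthread_mutexattr_t attr;
	int rc;

	assert(p != NULL);
	rc = pthread_mutexattr_init(&attr);
	if (rc != 0)
		return -rc;
	rc = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
	if (rc == 0)
		rc = pthread_mutex_init(&p->ops_lock, &attr);
	pthread_mutexattr_destroy(&attr);
	p->resetting = false;
	return -rc;
}

/* Returns 0 with the lock held, or -EBUSY with the lock not held. */
static int ops_trylock(struct dev_priv *p)
{
	if (pthread_mutex_trylock(&p->ops_lock) != 0)
		return -EBUSY;
	if (p->resetting) {
		pthread_mutex_unlock(&p->ops_lock);
		return -EBUSY;
	}
	return 0;
}

/* An operation that cannot wait, such as configure or queue setup. */
static int dev_configure_wrapper(struct dev_priv *p)
{
	int rc = ops_trylock(p);

	if (rc != 0)
		return rc;
	/* ... the real configure work would run here ... */
	pthread_mutex_unlock(&p->ops_lock);
	return 0;
}
```

Folding the state check into the helper keeps every wrapper to the same three steps: trylock, do the work, unlock.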
>
> The reset thread keeps reset_thread_active true throughout its
> lifetime. mana_join_reset_thread uses rte_thread_equal to detect
> the self-join case (when a recovery callback calls dev_stop or
> dev_close from the reset thread itself) and calls
> rte_thread_detach instead of join, so thread resources are freed
> on exit. External callers join normally.
>
> The condvar wait in the reset thread uses a predicate loop that
> checks dev_state under reset_cond_mutex, so a PCI remove signal
> that arrives before the thread enters the wait is not lost. The
> PCI remove callback sets dev_state to RESET_FAILED under the
> same mutex before signaling. A lock/unlock barrier on
> reset_ops_lock in the PCI remove path ensures teardown has
> completed before emitting the INTR_RMV event.
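The predicate-loop wait described above can be sketched as follows. This assumes a timed condvar wait for the recovery delay, which the commit message does not spell out; the names reset_ctx, signal_remove and wait_recovery_delay, the two-value state enum, and the -ECANCELED return are all hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical subset of the device states. */
enum dev_state { DEV_RESETTING, DEV_RESET_FAILED };

struct reset_ctx {
	pthread_mutex_t reset_cond_mutex;
	pthread_cond_t reset_cond;
	enum dev_state dev_state;
};

static void reset_ctx_init(struct reset_ctx *c)
{
	pthread_mutex_init(&c->reset_cond_mutex, NULL);
	pthread_cond_init(&c->reset_cond, NULL);
	c->dev_state = DEV_RESETTING;
}

/* Remove path: change the state under the mutex, then signal. */
static void signal_remove(struct reset_ctx *c)
{
	pthread_mutex_lock(&c->reset_cond_mutex);
	c->dev_state = DEV_RESET_FAILED;
	pthread_cond_signal(&c->reset_cond);
	pthread_mutex_unlock(&c->reset_cond_mutex);
}

/*
 * Reset thread: wait up to delay_ms. Returns 0 if the delay elapsed
 * and -ECANCELED if a remove was signaled, even when the signal was
 * sent before this function was called.
 */
static int wait_recovery_delay(struct reset_ctx *c, long delay_ms)
{
	struct timespec deadline;
	int rc = 0;

	assert(delay_ms >= 0);
	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += delay_ms / 1000;
	deadline.tv_nsec += (delay_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	pthread_mutex_lock(&c->reset_cond_mutex);
	/* The predicate is checked before the first wait and after each wakeup. */
	while (c->dev_state != DEV_RESET_FAILED && rc == 0)
		rc = pthread_cond_timedwait(&c->reset_cond,
				&c->reset_cond_mutex, &deadline);
	rc = (c->dev_state == DEV_RESET_FAILED) ? -ECANCELED : 0;
	pthread_mutex_unlock(&c->reset_cond_mutex);
	return rc;
}
```

Since the state is written and read under the same mutex, a signal sent before the thread reaches the wait is still observed through the predicate, and spurious wakeups simply re-enter the loop.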
>
> Multi-process support is included: secondary processes unmap and
> remap doorbell pages via IPC during the reset enter and exit
> phases. The secondary RESET_EXIT handler closes the received fd
> unconditionally after processing, even when the doorbell page is
> already mapped. Data path functions in both primary and secondary
> processes check the device state atomically and return early when
> the device is not active.
>
> The driver emits RTE_ETH_EVENT_ERR_RECOVERING before entering the
> reset path so that upper layers (e.g. netvsc) can switch their
> data path before queues are stopped. The event is emitted outside
> the reset lock to avoid deadlock if the callback calls dev_stop or
> dev_close. On completion, the driver emits RECOVERY_SUCCESS or
> RECOVERY_FAILED after releasing the lock. If a recovery callback
> triggers dev_stop or dev_close, the self-join detection in
> mana_join_reset_thread detaches the thread to avoid deadlock. If
> the enter phase fails internally, RECOVERY_FAILED is sent
> immediately so the application receives a terminal event. A PCI
> device removal event callback distinguishes hot-remove from
> service reset.
>
> Documentation for the device reset feature is added in the MANA
> NIC guide and the 26.07 release notes.
>
> Signed-off-by: Wei Hu <weh@microsoft.com>
> ---

Applied to next-net