From: Paolo Abeni <pabeni@redhat.com>
To: jiawenwu@trustnetic.com
Cc: netdev@vger.kernel.org, mengyuanlou@net-swift.com,
andrew+netdev@lunn.ch, davem@davemloft.net, edumazet@google.com,
kuba@kernel.org, pabeni@redhat.com, richardcochran@gmail.com,
linux@armlinux.org.uk, horms@kernel.org, kees@kernel.org,
larysa.zaremba@intel.com, leitao@debian.org, joe@dama.to,
jacob.e.keller@intel.com, fabio.baltieri@gmail.com
Subject: Re: [PATCH net-next v3 5/5] net: wangxun: implement pci_error_handlers ops
Date: Thu, 14 May 2026 11:24:00 +0200 [thread overview]
Message-ID: <20260514092415.90852-1-pabeni@redhat.com> (raw)
In-Reply-To: <20260509100540.32612-6-jiawenwu@trustnetic.com>
From: AI Reviewer <ai@example.com>
This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://netdev-ai.bots.linux.dev/sashiko/
---
net: wangxun: implement pci_error_handlers ops
This patch adds AER error_detected, slot_reset, and resume callbacks
for the libwx, ngbe, and txgbe drivers, gates pci_disable_device()
behind a new WX_STATE_DISABLED bit, and adds pci_save_state() at the
end of probe.
> diff --git a/drivers/net/ethernet/wangxun/libwx/wx_err.c b/drivers/net/ethernet/wangxun/libwx/wx_err.c
> index deb9fac2e1a9..64d43bf6e9da 100644
> --- a/drivers/net/ethernet/wangxun/libwx/wx_err.c
> +++ b/drivers/net/ethernet/wangxun/libwx/wx_err.c
> @@ -4,11 +4,119 @@
>
> #include <linux/netdevice.h>
> #include <linux/pci.h>
> +#include <linux/aer.h>
>
> #include "wx_type.h"
> #include "wx_lib.h"
> #include "wx_err.h"
>
> +/**
> + * wx_io_error_detected - called when PCI error is detected
> + * @pdev: Pointer to PCI device
> + * @state: The current pci connection state
> + *
> + * Return: pci_ers_result_t.
> + *
> + * This function is called after a PCI bus error affecting
> + * this device has been detected.
> + */
> +static pci_ers_result_t wx_io_error_detected(struct pci_dev *pdev,
> + pci_channel_state_t state)
> +{
> + struct wx *wx = pci_get_drvdata(pdev);
> + struct net_device *netdev;
> +
> + if (!wx)
> + return PCI_ERS_RESULT_DISCONNECT;
> +
> + netdev = wx->netdev;
> + if (!netif_device_present(netdev))
> + return PCI_ERS_RESULT_DISCONNECT;
> +
> + if (state == pci_channel_io_perm_failure)
> + return PCI_ERS_RESULT_DISCONNECT;
> +
> + rtnl_lock();
> + netif_device_detach(netdev);
> +
> + if (netif_running(netdev))
> + wx->close_suspend(wx);
[High]
Should this path consult pci_channel_offline(pdev) (or the state
argument) before invoking close_suspend()? When state ==
pci_channel_io_frozen, MMIO is unreliable: reads return all-ones and
writes are dropped.
close_suspend() walks into ngbe_down()/txgbe_down(), which call
wx_disable_pcie_master() and wx_disable_rx_queue() in
drivers/net/ethernet/wangxun/libwx/wx_hw.c. Both use read_poll_timeout
on MMIO:
wx_disable_pcie_master() polls WX_PX_TRANSACTION_PENDING with
WX_PCI_MASTER_DISABLE_TIMEOUT (around 80 ms)
wx_disable_rx_queue() polls WX_PX_RR_CFG (~100 us per queue)
When the channel is frozen, those polls cannot succeed and will spin
the full timeout for every queue. wx_update_stats() at the tail of
disable_device additionally accumulates 0xFFFFFFFF reads as garbage
counters.
Documentation/PCI/pci-error-recovery.rst suggests error_detected
should avoid MMIO when the channel is frozen and only quiesce (stop
queues, free IRQs, disable the device). Would skipping close_suspend
or substituting an MMIO-free quiesce path be appropriate when
pci_channel_offline(pdev) is true?
> +
> + if (!test_and_set_bit(WX_STATE_DISABLED, wx->state))
> + pci_disable_device(pdev);
> + rtnl_unlock();
> +
> + /* Request a slot reset. */
> + return PCI_ERS_RESULT_NEED_RESET;
> +}
> +
> +/**
> + * wx_io_slot_reset - called after the pci bus has been reset.
> + * @pdev: Pointer to PCI device
> + *
> + * Return: pci_ers_result_t.
> + *
> + * Restart the card from scratch, as if from a cold-boot.
> + */
> +static pci_ers_result_t wx_io_slot_reset(struct pci_dev *pdev)
> +{
> + struct wx *wx = pci_get_drvdata(pdev);
> + pci_ers_result_t result;
> +
> + if (pci_enable_device_mem(pdev)) {
> + wx_err(wx, "Cannot re-enable PCI device after reset.\n");
> + result = PCI_ERS_RESULT_DISCONNECT;
[Medium]
What unwinds the resources released by close_suspend() in
wx_io_error_detected() when this path returns
PCI_ERS_RESULT_DISCONNECT?
netif_device_detach() only clears __LINK_STATE_PRESENT; it does not
clear __LINK_STATE_START, so netif_running() still reports true.
After this failure, the AER core calls error_detected again with
pci_channel_io_perm_failure and the driver short-circuits, leaving the
netdev in a half-torn-down state until ngbe_remove/txgbe_remove
eventually runs.
In the meantime, any rtnl path gated on netif_running() (for example
wx_set_features and wx_update_rsc, which call wx->do_reset ->
reinit_locked -> *_down -> *_disable_device touching freed
wx->rx_ring[i]/tx_ring[i]) would dereference freed rings. Should this
failure path also clear LINK_STATE_START (e.g., via dev_close()) so
those callers see the netdev as down?
> + } else {
> + /* make all bar access done before reset. */
> + smp_mb__before_atomic();
[Low]
Is the comment "make all bar access done before reset." accurate?
The PCI bus reset happens before slot_reset is invoked by the AER
core, so "before reset" reads temporally backwards. There is also no
BAR/MMIO access in this function prior to the barrier — only
pci_enable_device_mem(), which touches config space.
smp_mb__before_atomic() is a CPU barrier paired with the following
non-RMW atomic (clear_bit) per Documentation/atomic_t.txt and
memory-barriers.txt; it does not order MMIO/BAR accesses (which would
require a readback or wmb()-class primitive). Could the comment be
updated to reflect what the barrier actually does?
> + clear_bit(WX_STATE_DISABLED, wx->state);
> + pci_set_master(pdev);
> + pci_restore_state(pdev);
[Medium]
Should pci_save_state(pdev) be called again here after
pci_restore_state(pdev)?
pci_restore_state() ends with dev->state_saved = false, and the
canonical pattern in other AER slot_reset implementations
(ixgbe_io_slot_reset, igb, e1000e) is restore-then-save so that
subsequent recovery cycles start from a known-good baseline. Without
re-saving, repeated AER events on the same device would rely on stale
or partially-cleared per-cap save state.
> + pci_wake_from_d3(pdev, false);
> +
> + wx->do_reset(wx->netdev, false);
[Medium]
Should rtnl_lock be held around this do_reset() call?
Every other call site of wx->do_reset() runs under rtnl_lock:
wx_reset_subtask() in wx_err.c takes it explicitly, wx_set_features
and wx_update_rsc are invoked under rtnl by netdev_ops, and
wx_io_resume() takes rtnl_lock around ndo_open.
txgbe_do_reset -> txgbe_reset writes wx->mac_table[]
(wx_flush_sw_mac_table + wx_mac_set_default_filter) and reprograms HW
(txgbe_reset_hw, wx_start_hw); ngbe_reset also writes mac_table[].
netif_device_detach() in error_detected only clears
__LINK_STATE_PRESENT and stops TX queues — it does not block
rtnl-protected ndo callbacks (ndo_set_mac_address, ndo_set_features,
ndo_vlan_rx_add_vid, ethtool_ops). Those userspace-triggered paths
run under rtnl_lock and would concurrently modify mac_table[] / HW
registers, racing with do_reset.
> + result = PCI_ERS_RESULT_RECOVERED;
> + }
> +
> + pci_aer_clear_nonfatal_status(pdev);
> +
> + return result;
> +}
> +
> +/**
> + * wx_io_resume - called when traffic can start flowing again.
> + * @pdev: Pointer to PCI device
> + *
> + * This callback is called when the error recovery driver tells us that
> + * its OK to resume normal operation.
> + */
> +static void wx_io_resume(struct pci_dev *pdev)
> +{
> + struct wx *wx = pci_get_drvdata(pdev);
> + struct net_device *netdev;
> + int err;
> +
> + netdev = wx->netdev;
> + rtnl_lock();
> + if (netif_running(netdev)) {
> + err = netdev->netdev_ops->ndo_open(netdev);
> + if (err) {
> + wx_err(wx, "Failed to open netdev after reset\n");
> + goto out;
> + }
> + }
> + netif_device_attach(netdev);
> +out:
> + rtnl_unlock();
> +}
[High]
When ndo_open() fails here, what state is the netdev left in?
ngbe_open()/txgbe_open() free their resources on failure paths and
return without clearing __LINK_STATE_START (only dev_close() does
that, and ndo_open is being called directly here). netif_device_attach
is then skipped, so __LINK_STATE_PRESENT stays cleared.
The result is a netdev where netif_running() returns true (LINK_START
set) but PRESENT is clear, with rings/IRQs/phylink already freed by
close_suspend in error_detected and only partially re-allocated. AER
treats recovery as complete (resume returns void).
Code paths gated only on netif_running() — wx_reset_subtask in
wx_err.c, wx_tx_timeout, wx_set_features -> wx->do_reset, ethtool
callbacks — would then dereference freed Tx/Rx rings, freed IRQ
handlers, or a disconnected phylink. For example:
void wx_reset_subtask(struct wx *wx)
{
...
if (!netif_running(wx->netdev) ||
test_bit(WX_STATE_RESETTING, wx->state))
goto out;
...
if (wx->do_reset)
wx->do_reset(wx->netdev, true);
}
would call into reinit_locked -> *_down -> *_disable_device on freed
rings. Should the failure path here force the netdev fully down (e.g.,
dev_close_many or equivalent) so netif_running() reports false?
--
This is an AI-generated review.
prev parent reply other threads:[~2026-05-14 9:24 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-09 10:05 [PATCH net-next v3 0/5] net: wangxun: timeout and error Jiawen Wu
2026-05-09 10:05 ` [PATCH net-next v3 1/5] net: ngbe: implement libwx reset ops Jiawen Wu
2026-05-09 10:05 ` [PATCH net-next v3 2/5] net: wangxun: add Tx timeout process Jiawen Wu
2026-05-14 9:23 ` Paolo Abeni
2026-05-09 10:05 ` [PATCH net-next v3 3/5] net: wangxun: add reinit parameter to wx->do_reset callback Jiawen Wu
2026-05-09 10:05 ` [PATCH net-next v3 4/5] net: wangxun: extract the close_suspend sequence Jiawen Wu
2026-05-09 10:05 ` [PATCH net-next v3 5/5] net: wangxun: implement pci_error_handlers ops Jiawen Wu
2026-05-14 9:24 ` Paolo Abeni [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260514092415.90852-1-pabeni@redhat.com \
--to=pabeni@redhat.com \
--cc=andrew+netdev@lunn.ch \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=fabio.baltieri@gmail.com \
--cc=horms@kernel.org \
--cc=jacob.e.keller@intel.com \
--cc=jiawenwu@trustnetic.com \
--cc=joe@dama.to \
--cc=kees@kernel.org \
--cc=kuba@kernel.org \
--cc=larysa.zaremba@intel.com \
--cc=leitao@debian.org \
--cc=linux@armlinux.org.uk \
--cc=mengyuanlou@net-swift.com \
--cc=netdev@vger.kernel.org \
--cc=richardcochran@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox