* [PATCH net-next v5 0/5] net: wangxun: timeout and error
@ 2026-06-04 8:56 Jiawen Wu
2026-06-04 8:56 ` [PATCH net-next v5 1/5] net: ngbe: implement libwx reset ops Jiawen Wu
` (4 more replies)
0 siblings, 5 replies; 9+ messages in thread
From: Jiawen Wu @ 2026-06-04 8:56 UTC
To: netdev
Cc: Mengyuan Lou, Andrew Lunn, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Richard Cochran, Russell King,
Jacob Keller, Michal Swiatkowski, Simon Horman, Kees Cook,
Larysa Zaremba, Joe Damato, Breno Leitao, Aleksandr Loktionov,
Uwe Kleine-König (The Capable Hub), Fabio Baltieri,
Thomas Gleixner, Greg Kroah-Hartman, Jiawen Wu
This series adds Tx timeout handling and pci_error_handlers.
When a PCIe error occurs, the txgbe device can recover on platforms
that support AER interrupts. On a Tx timeout, the txgbe driver can
recover the device through the reset process.
ngbe devices currently lack the required functionality, so they cannot
be fully recovered after a PCIe error or Tx timeout. This support
will be completed in the future.
Changelog:
v5:
- Avoid the same name on two functions.
- Encode the device identity into the name of reset work queue.
- Change pr_err() to wx_err().
- Check WX_STATE_DOWN and WX_STATE_RESETTING at the entry of every work item.
- Implement wx_ptp_quiesce().
- Add netif_carrier_off() and netif_tx_disable() in soft_quiesce.
- Move resource free operations after PCIe recovery.
- Return error code in down path.
v4: https://lore.kernel.org/all/20260601072221.2952-1-jiawenwu@trustnetic.com
- Create a separate work queue for the reset task.
- Gate wx_watchdog_flush_tx() on netif_running().
- Add rtnl_lock() around wx->do_reset() in wx_io_slot_reset().
- Change .close_suspend() to .soft_quiesce() to avoid MMIO when PCI
channel is frozen.
v3: https://lore.kernel.org/all/20260509100540.32612-1-jiawenwu@trustnetic.com
- Merge the multi-line string into a single line in wx_handle_tx_hang().
- Remove the redundant warn messages.
- Use test_and_clear_bit() instead of checking the flag bit and then clearing it.
- Drop the Tx hang check in tx_timeout.
- Call wx_update_stats() before wx_check_tx_hang().
- Add a Tx flush when the link is lost.
- Move wx_ptp_stop() into wx->close_suspend().
- Drop V2 patch 5/6 because WOL packets are handled before DMA ring.
- Check wx NULL pointer in wx_io_error_detected().
- Check perm failure before hardware teardown.
v2: https://lore.kernel.org/all/20260430082517.19612-1-jiawenwu@trustnetic.com
- Add the missing rtnl_unlock() at early return in wx_reset_subtask().
- Replace ngbe_close() with ngbe_close_suspend() in ngbe_dev_shutdown().
- Add a patch to clear stored DMA addresses.
v1: https://lore.kernel.org/r/20260428021156.13564-1-jiawenwu@trustnetic.com
Jiawen Wu (5):
net: ngbe: implement libwx reset ops
net: wangxun: add Tx timeout process
net: wangxun: add reinit parameter to wx->do_reset callback
net: wangxun: implement soft quiesce for PCIe error recovery
net: wangxun: add pcie error handler
drivers/net/ethernet/wangxun/libwx/Makefile | 2 +-
drivers/net/ethernet/wangxun/libwx/wx_err.c | 289 ++++++++++++++++++
drivers/net/ethernet/wangxun/libwx/wx_err.h | 18 ++
.../net/ethernet/wangxun/libwx/wx_ethtool.c | 2 +-
drivers/net/ethernet/wangxun/libwx/wx_hw.c | 17 +-
drivers/net/ethernet/wangxun/libwx/wx_lib.c | 55 +++-
drivers/net/ethernet/wangxun/libwx/wx_lib.h | 1 +
drivers/net/ethernet/wangxun/libwx/wx_ptp.c | 21 ++
drivers/net/ethernet/wangxun/libwx/wx_ptp.h | 1 +
drivers/net/ethernet/wangxun/libwx/wx_type.h | 23 +-
.../net/ethernet/wangxun/ngbe/ngbe_ethtool.c | 1 -
drivers/net/ethernet/wangxun/ngbe/ngbe_main.c | 74 ++++-
drivers/net/ethernet/wangxun/ngbe/ngbe_type.h | 1 +
.../ethernet/wangxun/txgbe/txgbe_ethtool.c | 6 +-
.../net/ethernet/wangxun/txgbe/txgbe_main.c | 79 ++++-
.../net/ethernet/wangxun/txgbe/txgbe_type.h | 4 +-
16 files changed, 566 insertions(+), 28 deletions(-)
create mode 100644 drivers/net/ethernet/wangxun/libwx/wx_err.c
create mode 100644 drivers/net/ethernet/wangxun/libwx/wx_err.h
--
2.51.0
* [PATCH net-next v5 1/5] net: ngbe: implement libwx reset ops
2026-06-04 8:56 [PATCH net-next v5 0/5] net: wangxun: timeout and error Jiawen Wu
@ 2026-06-04 8:56 ` Jiawen Wu
2026-06-08 8:47 ` Loktionov, Aleksandr
2026-06-04 8:56 ` [PATCH net-next v5 2/5] net: wangxun: add Tx timeout process Jiawen Wu
` (3 subsequent siblings)
4 siblings, 1 reply; 9+ messages in thread
From: Jiawen Wu @ 2026-06-04 8:56 UTC
To: netdev
Implement wx->do_reset() for ngbe so that the libwx library module can call it.
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com>
---
.../net/ethernet/wangxun/ngbe/ngbe_ethtool.c | 1 -
drivers/net/ethernet/wangxun/ngbe/ngbe_main.c | 37 ++++++++++++++++++-
drivers/net/ethernet/wangxun/ngbe/ngbe_type.h | 1 +
3 files changed, 36 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/wangxun/ngbe/ngbe_ethtool.c b/drivers/net/ethernet/wangxun/ngbe/ngbe_ethtool.c
index b2e191982803..1960f7154151 100644
--- a/drivers/net/ethernet/wangxun/ngbe/ngbe_ethtool.c
+++ b/drivers/net/ethernet/wangxun/ngbe/ngbe_ethtool.c
@@ -59,7 +59,6 @@ static int ngbe_set_ringparam(struct net_device *netdev,
wx_set_ring(wx, new_tx_count, new_rx_count, temp_ring);
kvfree(temp_ring);
- wx_configure(wx);
ngbe_up(wx);
clear_reset:
diff --git a/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c b/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c
index 8678c49b892a..dea6dfb043f3 100644
--- a/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c
+++ b/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c
@@ -133,6 +133,7 @@ static int ngbe_sw_init(struct wx *wx)
wx->mbx.size = WX_VXMAILBOX_SIZE;
wx->setup_tc = ngbe_setup_tc;
+ wx->do_reset = ngbe_do_reset;
set_bit(0, &wx->fwd_bitmask);
return 0;
@@ -423,7 +424,7 @@ void ngbe_down(struct wx *wx)
wx_clean_all_rx_rings(wx);
}
-void ngbe_up(struct wx *wx)
+static void ngbe_up_complete(struct wx *wx)
{
wx_configure_vectors(wx);
@@ -490,7 +491,7 @@ static int ngbe_open(struct net_device *netdev)
wx_ptp_init(wx);
- ngbe_up(wx);
+ ngbe_up_complete(wx);
return 0;
err_dis_phy:
@@ -503,6 +504,12 @@ static int ngbe_open(struct net_device *netdev)
return err;
}
+void ngbe_up(struct wx *wx)
+{
+ wx_configure(wx);
+ ngbe_up_complete(wx);
+}
+
/**
* ngbe_close - Disables a network interface
* @netdev: network interface device structure
@@ -590,6 +597,8 @@ int ngbe_setup_tc(struct net_device *dev, u8 tc)
*/
if (netif_running(dev))
ngbe_close(dev);
+ else
+ ngbe_reset(wx);
wx_clear_interrupt_scheme(wx);
@@ -606,6 +615,30 @@ int ngbe_setup_tc(struct net_device *dev, u8 tc)
return 0;
}
+static void ngbe_reinit_locked(struct wx *wx)
+{
+ netif_trans_update(wx->netdev);
+
+ mutex_lock(&wx->reset_lock);
+ set_bit(WX_STATE_RESETTING, wx->state);
+
+ ngbe_down(wx);
+ ngbe_up(wx);
+
+ clear_bit(WX_STATE_RESETTING, wx->state);
+ mutex_unlock(&wx->reset_lock);
+}
+
+void ngbe_do_reset(struct net_device *netdev)
+{
+ struct wx *wx = netdev_priv(netdev);
+
+ if (netif_running(netdev))
+ ngbe_reinit_locked(wx);
+ else
+ ngbe_reset(wx);
+}
+
static const struct net_device_ops ngbe_netdev_ops = {
.ndo_open = ngbe_open,
.ndo_stop = ngbe_close,
diff --git a/drivers/net/ethernet/wangxun/ngbe/ngbe_type.h b/drivers/net/ethernet/wangxun/ngbe/ngbe_type.h
index 7077a0da4c98..4f648f272c08 100644
--- a/drivers/net/ethernet/wangxun/ngbe/ngbe_type.h
+++ b/drivers/net/ethernet/wangxun/ngbe/ngbe_type.h
@@ -125,5 +125,6 @@ extern char ngbe_driver_name[];
void ngbe_down(struct wx *wx);
void ngbe_up(struct wx *wx);
int ngbe_setup_tc(struct net_device *dev, u8 tc);
+void ngbe_do_reset(struct net_device *netdev);
#endif /* _NGBE_TYPE_H_ */
--
2.51.0
* [PATCH net-next v5 2/5] net: wangxun: add Tx timeout process
2026-06-04 8:56 [PATCH net-next v5 0/5] net: wangxun: timeout and error Jiawen Wu
2026-06-04 8:56 ` [PATCH net-next v5 1/5] net: ngbe: implement libwx reset ops Jiawen Wu
@ 2026-06-04 8:56 ` Jiawen Wu
2026-06-04 8:56 ` [PATCH net-next v5 3/5] net: wangxun: add reinit parameter to wx->do_reset callback Jiawen Wu
` (2 subsequent siblings)
4 siblings, 0 replies; 9+ messages in thread
From: Jiawen Wu @ 2026-06-04 8:56 UTC
To: netdev
Implement .ndo_tx_timeout to handle Tx timeout events. When a Tx
timeout occurs, the driver is put into the reset process. A separate
workqueue is allocated for the reset process.
The WX_HANG_CHECK_ARMED bit is set to indicate a potential hang. It will
be cleared if a pause frame is received to avoid false hang detection
caused by pause frames.
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
---
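The two-checks-in-a-row hang detection described in the commit message can be modelled in user space roughly as below. This is only a sketch: struct ring_model and its fields are hypothetical stand-ins for the per-ring state, and plain booleans replace the atomic bit operations the driver uses.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space model of one Tx ring; not the driver's struct wx_ring. */
struct ring_model {
	uint32_t packets_done;	/* completed packets so far */
	uint32_t tx_done_old;	/* packets_done as seen by the previous check */
	uint32_t pending;	/* descriptors queued but not yet completed */
	bool detect_hang;	/* models WX_TX_DETECT_HANG */
	bool check_armed;	/* models WX_HANG_CHECK_ARMED */
};

/* Returns true only when two consecutive checks saw no Tx progress. */
static bool ring_check_hang(struct ring_model *r)
{
	bool was_armed;

	/* Only run when the watchdog has requested a check. */
	if (!r->detect_hang)
		return false;
	r->detect_hang = false;

	if (r->tx_done_old == r->packets_done && r->pending) {
		/* No progress: arm on the first miss, report on the second. */
		was_armed = r->check_armed;
		r->check_armed = true;
		return was_armed;
	}

	/* Progress was made: take a new snapshot and disarm. */
	r->tx_done_old = r->packets_done;
	r->check_armed = false;
	return false;
}

/* Models the pause-frame path: a received XOFF disarms the check. */
static void ring_xoff_received(struct ring_model *r)
{
	r->check_armed = false;
}
```

In this model a stalled ring is reported on the second consecutive check, and a received XOFF between the two checks pushes the report out by one more check, which is the false-positive case the commit message mentions.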
drivers/net/ethernet/wangxun/libwx/Makefile | 2 +-
drivers/net/ethernet/wangxun/libwx/wx_err.c | 175 ++++++++++++++++++
drivers/net/ethernet/wangxun/libwx/wx_err.h | 16 ++
drivers/net/ethernet/wangxun/libwx/wx_hw.c | 17 +-
drivers/net/ethernet/wangxun/libwx/wx_lib.c | 37 ++++
drivers/net/ethernet/wangxun/libwx/wx_type.h | 19 +-
drivers/net/ethernet/wangxun/ngbe/ngbe_main.c | 14 ++
.../net/ethernet/wangxun/txgbe/txgbe_main.c | 14 ++
8 files changed, 289 insertions(+), 5 deletions(-)
create mode 100644 drivers/net/ethernet/wangxun/libwx/wx_err.c
create mode 100644 drivers/net/ethernet/wangxun/libwx/wx_err.h
diff --git a/drivers/net/ethernet/wangxun/libwx/Makefile b/drivers/net/ethernet/wangxun/libwx/Makefile
index a71b0ad77de3..c8724bb129aa 100644
--- a/drivers/net/ethernet/wangxun/libwx/Makefile
+++ b/drivers/net/ethernet/wangxun/libwx/Makefile
@@ -4,5 +4,5 @@
obj-$(CONFIG_LIBWX) += libwx.o
-libwx-objs := wx_hw.o wx_lib.o wx_ethtool.o wx_ptp.o wx_mbx.o wx_sriov.o
+libwx-objs := wx_hw.o wx_lib.o wx_ethtool.o wx_ptp.o wx_mbx.o wx_sriov.o wx_err.o
libwx-objs += wx_vf.o wx_vf_lib.o wx_vf_common.o
diff --git a/drivers/net/ethernet/wangxun/libwx/wx_err.c b/drivers/net/ethernet/wangxun/libwx/wx_err.c
new file mode 100644
index 000000000000..b6e2d16d4a16
--- /dev/null
+++ b/drivers/net/ethernet/wangxun/libwx/wx_err.c
@@ -0,0 +1,175 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2015 - 2026 Beijing WangXun Technology Co., Ltd. */
+/* Copyright (c) 1999 - 2026 Intel Corporation. */
+
+#include <linux/netdevice.h>
+#include <linux/pci.h>
+
+#include "wx_type.h"
+#include "wx_lib.h"
+#include "wx_err.h"
+
+static void wx_pf_reset_subtask(struct wx *wx)
+{
+ if (!test_and_clear_bit(WX_FLAG_NEED_PF_RESET, wx->flags))
+ return;
+
+ wx_warn(wx, "Reset adapter.\n");
+ if (wx->do_reset)
+ wx->do_reset(wx->netdev);
+}
+
+static void wx_reset_task(struct work_struct *work)
+{
+ struct wx *wx = container_of(work, struct wx, reset_task);
+
+ rtnl_lock();
+
+ if (test_bit(WX_STATE_DOWN, wx->state) ||
+ test_bit(WX_STATE_RESETTING, wx->state))
+ goto out;
+
+ wx_pf_reset_subtask(wx);
+
+out:
+ rtnl_unlock();
+}
+
+void wx_check_err_subtask(struct wx *wx)
+{
+ if (test_bit(WX_FLAG_NEED_PF_RESET, wx->flags))
+ queue_work(wx->reset_wq, &wx->reset_task);
+}
+EXPORT_SYMBOL(wx_check_err_subtask);
+
+int wx_init_err_task(struct wx *wx)
+{
+ wx->reset_wq = alloc_workqueue("%s_reset_wq_%x", WQ_UNBOUND | WQ_HIGHPRI,
+ 1, wx->driver_name, pci_dev_id(wx->pdev));
+ if (!wx->reset_wq) {
+ wx_err(wx, "Failed to create wx_reset_wq workqueue\n");
+ return -ENOMEM;
+ }
+
+ INIT_WORK(&wx->reset_task, wx_reset_task);
+ return 0;
+}
+EXPORT_SYMBOL(wx_init_err_task);
+
+static bool wx_ring_tx_pending(struct wx *wx)
+{
+ int i;
+
+ for (i = 0; i < wx->num_tx_queues; i++) {
+ struct wx_ring *tx_ring = wx->tx_ring[i];
+
+ if (tx_ring->next_to_use != tx_ring->next_to_clean)
+ return true;
+ }
+
+ return false;
+}
+
+static bool wx_vf_tx_pending(struct wx *wx)
+{
+ struct wx_ring_feature *vmdq = &wx->ring_feature[RING_F_VMDQ];
+ u32 q_per_pool = __ALIGN_MASK(1, ~vmdq->mask);
+ u32 i, j;
+
+ if (!wx->num_vfs)
+ return false;
+
+ for (i = 0; i < wx->num_vfs; i++) {
+ for (j = 0; j < q_per_pool; j++) {
+ u32 h, t;
+
+ h = rd32(wx, WX_PX_TR_RP_PV(q_per_pool, i, j));
+ t = rd32(wx, WX_PX_TR_WP_PV(q_per_pool, i, j));
+
+ if (h != t)
+ return true;
+ }
+ }
+
+ return false;
+}
+
+static void wx_watchdog_flush_tx(struct wx *wx)
+{
+ if (!netif_running(wx->netdev))
+ return;
+ if (netif_carrier_ok(wx->netdev))
+ return;
+
+ if (wx_ring_tx_pending(wx) || wx_vf_tx_pending(wx)) {
+ /* We've lost link, so the controller stops DMA,
+ * but we've got queued Tx work that's never going
+ * to get done, so reset controller to flush Tx.
+ * (Do the reset outside of interrupt context).
+ */
+ wx_warn(wx, "initiating reset due to lost link with pending Tx work\n");
+ set_bit(WX_FLAG_NEED_PF_RESET, wx->flags);
+ }
+}
+
+static void wx_detect_tx_hang(struct wx *wx)
+{
+ int i;
+
+ /* If we're down or resetting, just bail */
+ if (!netif_running(wx->netdev) ||
+ test_bit(WX_STATE_RESETTING, wx->state))
+ return;
+
+ /* Force detection of hung controller */
+ if (netif_carrier_ok(wx->netdev)) {
+ for (i = 0; i < wx->num_tx_queues; i++)
+ set_bit(WX_TX_DETECT_HANG, wx->tx_ring[i]->state);
+ }
+}
+
+void wx_check_hang_subtask(struct wx *wx)
+{
+ if (test_bit(WX_STATE_DOWN, wx->state) ||
+ test_bit(WX_STATE_RESETTING, wx->state))
+ return;
+
+ wx_watchdog_flush_tx(wx);
+ wx_detect_tx_hang(wx);
+}
+EXPORT_SYMBOL(wx_check_hang_subtask);
+
+static void wx_tx_timeout_reset(struct wx *wx)
+{
+ if (test_bit(WX_STATE_DOWN, wx->state))
+ return;
+
+ set_bit(WX_FLAG_NEED_PF_RESET, wx->flags);
+ wx_warn(wx, "initiating reset due to tx timeout\n");
+ wx_service_event_schedule(wx);
+}
+
+void wx_tx_timeout(struct net_device *netdev, unsigned int __always_unused txqueue)
+{
+ struct wx *wx = netdev_priv(netdev);
+
+ wx_tx_timeout_reset(wx);
+}
+EXPORT_SYMBOL(wx_tx_timeout);
+
+void wx_handle_tx_hang(struct wx_ring *tx_ring, unsigned int next)
+{
+ struct wx *wx = netdev_priv(tx_ring->netdev);
+
+ wx_warn(wx,
+ "Detected Tx Unit Hang: Queue %d, TDH %x, TDT %x, ntu %x, ntc %x, ntc.time_stamp %lx, jiffies %lx\n",
+ tx_ring->queue_index,
+ rd32(wx, WX_PX_TR_RP(tx_ring->reg_idx)),
+ rd32(wx, WX_PX_TR_WP(tx_ring->reg_idx)),
+ tx_ring->next_to_use, next,
+ tx_ring->tx_buffer_info[next].time_stamp, jiffies);
+
+ netif_stop_subqueue(tx_ring->netdev, tx_ring->queue_index);
+
+ wx_tx_timeout_reset(wx);
+}
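A note on the q_per_pool computation in wx_vf_tx_pending() above: __ALIGN_MASK(1, ~mask) evaluates to (1 + ~mask) & mask, which in unsigned arithmetic is (-mask) & mask, i.e. the lowest set bit of the VMDq mask. The sketch below reproduces that arithmetic in user space. The macro is redefined locally, and the mask values mentioned afterwards are illustrative assumptions, not values taken from the wangxun headers.

```c
#include <stdint.h>

/* Same arithmetic as the kernel's __ALIGN_MASK(x, mask); redefined for user space. */
#define ALIGN_MASK(x, mask)	(((x) + (mask)) & ~(mask))

/* Hypothetical helper: queues per pool derived from a VMDq mask. */
static uint32_t queues_per_pool(uint32_t vmdq_mask)
{
	/* (1 + ~m) & m == (-m) & m, which isolates the lowest set bit of m. */
	return ALIGN_MASK(1u, ~vmdq_mask);
}
```

For example, a mask of 0x7C gives 4 queues per pool and a mask of 0x78 gives 8, so the inner loop over j visits each queue of a VF's pool exactly once.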
diff --git a/drivers/net/ethernet/wangxun/libwx/wx_err.h b/drivers/net/ethernet/wangxun/libwx/wx_err.h
new file mode 100644
index 000000000000..1eed13e48095
--- /dev/null
+++ b/drivers/net/ethernet/wangxun/libwx/wx_err.h
@@ -0,0 +1,16 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * WangXun Gigabit PCI Express Linux driver
+ * Copyright (c) 2015 - 2026 Beijing WangXun Technology Co., Ltd.
+ */
+
+#ifndef _WX_ERR_H_
+#define _WX_ERR_H_
+
+void wx_check_err_subtask(struct wx *wx);
+int wx_init_err_task(struct wx *wx);
+void wx_check_hang_subtask(struct wx *wx);
+void wx_tx_timeout(struct net_device *netdev, unsigned int txqueue);
+void wx_handle_tx_hang(struct wx_ring *tx_ring, unsigned int next);
+
+#endif /* _WX_ERR_H_ */
diff --git a/drivers/net/ethernet/wangxun/libwx/wx_hw.c b/drivers/net/ethernet/wangxun/libwx/wx_hw.c
index 260e14d5d541..122c4952d203 100644
--- a/drivers/net/ethernet/wangxun/libwx/wx_hw.c
+++ b/drivers/net/ethernet/wangxun/libwx/wx_hw.c
@@ -1932,6 +1932,7 @@ static void wx_configure_tx_ring(struct wx *wx,
else
ring->atr_sample_rate = 0;
+ bitmap_zero(ring->state, WX_RING_STATE_NBITS);
/* reinitialize tx_buffer_info */
memset(ring->tx_buffer_info, 0,
sizeof(struct wx_tx_buffer) * ring->count);
@@ -2851,16 +2852,26 @@ EXPORT_SYMBOL(wx_fc_enable);
static void wx_update_xoff_rx_lfc(struct wx *wx)
{
struct wx_hw_stats *hwstats = &wx->stats;
+ u64 data;
+ int i;
if (wx->fc.mode != wx_fc_full &&
wx->fc.mode != wx_fc_rx_pause)
return;
if (wx->mac.type >= wx_mac_aml)
- hwstats->lxoffrxc += rd32_wrap(wx, WX_MAC_LXOFFRXC_AML,
- &wx->last_stats.lxoffrxc);
+ data = rd32_wrap(wx, WX_MAC_LXOFFRXC_AML,
+ &wx->last_stats.lxoffrxc);
else
- hwstats->lxoffrxc += rd64(wx, WX_MAC_LXOFFRXC);
+ data = rd64(wx, WX_MAC_LXOFFRXC);
+ hwstats->lxoffrxc += data;
+
+ /* refill credits (no tx hang) if we received xoff */
+ if (!data)
+ return;
+
+ for (i = 0; i < wx->num_tx_queues; i++)
+ clear_bit(WX_HANG_CHECK_ARMED, wx->tx_ring[i]->state);
}
/**
diff --git a/drivers/net/ethernet/wangxun/libwx/wx_lib.c b/drivers/net/ethernet/wangxun/libwx/wx_lib.c
index d042567b8128..da4d9e229c9e 100644
--- a/drivers/net/ethernet/wangxun/libwx/wx_lib.c
+++ b/drivers/net/ethernet/wangxun/libwx/wx_lib.c
@@ -14,6 +14,7 @@
#include "wx_type.h"
#include "wx_lib.h"
+#include "wx_err.h"
#include "wx_ptp.h"
#include "wx_hw.h"
#include "wx_vf_lib.h"
@@ -742,6 +743,37 @@ static struct netdev_queue *wx_txring_txq(const struct wx_ring *ring)
return netdev_get_tx_queue(ring->netdev, ring->queue_index);
}
+static u32 wx_get_tx_pending(struct wx_ring *ring)
+{
+ unsigned int head, tail;
+
+ head = ring->next_to_clean;
+ tail = ring->next_to_use;
+
+ return ((head <= tail) ? tail : tail + ring->count) - head;
+}
+
+static bool wx_check_tx_hang(struct wx_ring *ring)
+{
+ u32 tx_done_old = ring->tx_stats.tx_done_old;
+ u32 tx_pending = wx_get_tx_pending(ring);
+ u32 tx_done = ring->stats.packets;
+
+ if (!test_and_clear_bit(WX_TX_DETECT_HANG, ring->state))
+ return false;
+
+ if (tx_done_old == tx_done && tx_pending)
+ /* make sure it is true for two checks in a row */
+ return test_and_set_bit(WX_HANG_CHECK_ARMED, ring->state);
+
+ /* update completed stats and continue */
+ ring->tx_stats.tx_done_old = tx_done;
+ /* reset the countdown */
+ clear_bit(WX_HANG_CHECK_ARMED, ring->state);
+
+ return false;
+}
+
/**
* wx_clean_tx_irq - Reclaim resources after transmit completes
* @q_vector: structure containing interrupt and ring information
@@ -866,6 +898,11 @@ static bool wx_clean_tx_irq(struct wx_q_vector *q_vector,
netdev_tx_completed_queue(wx_txring_txq(tx_ring),
total_packets, total_bytes);
+ if (wx_check_tx_hang(tx_ring)) {
+ wx_handle_tx_hang(tx_ring, i);
+ return true;
+ }
+
#define TX_WAKE_THRESHOLD (DESC_NEEDED * 2)
if (unlikely(total_packets && netif_carrier_ok(tx_ring->netdev) &&
(wx_desc_unused(tx_ring) >= TX_WAKE_THRESHOLD))) {
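The pending-descriptor arithmetic in wx_get_tx_pending() above handles the case where next_to_use has wrapped around the end of the ring while next_to_clean has not. A minimal user-space sketch of the same expression follows; the function name and the precondition check are assumptions added for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of descriptors between head (next_to_clean) and tail (next_to_use)
 * on a ring of 'count' entries, accounting for tail wrap-around.
 */
static uint32_t tx_pending(uint32_t head, uint32_t tail, uint32_t count)
{
	/* Both indices are assumed to be valid ring positions. */
	assert(count > 0 && head < count && tail < count);

	return ((head <= tail) ? tail : tail + count) - head;
}
```

On an 8-entry ring, head 2 and tail 5 give 3 pending descriptors, and so do head 6 and tail 1, where the tail has wrapped. Equal indices give 0, which is why the hang check treats the ring as idle in that case.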
diff --git a/drivers/net/ethernet/wangxun/libwx/wx_type.h b/drivers/net/ethernet/wangxun/libwx/wx_type.h
index c7befe4cdfe9..75d74ca2e259 100644
--- a/drivers/net/ethernet/wangxun/libwx/wx_type.h
+++ b/drivers/net/ethernet/wangxun/libwx/wx_type.h
@@ -450,6 +450,11 @@ enum WX_MSCA_CMD_value {
#define WX_PX_TR_CFG_THRE_SHIFT 8
#define WX_PX_TR_CFG_HEAD_WB BIT(27)
+#define WX_PX_TR_RP_PV(q_per_pool, vf_number, vf_q_index) \
+ (WX_PX_TR_RP((q_per_pool) * (vf_number) + (vf_q_index)))
+#define WX_PX_TR_WP_PV(q_per_pool, vf_number, vf_q_index) \
+ (WX_PX_TR_WP((q_per_pool) * (vf_number) + (vf_q_index)))
+
/* Receive DMA Registers */
#define WX_PX_RR_BAL(_i) (0x01000 + ((_i) * 0x40))
#define WX_PX_RR_BAH(_i) (0x01004 + ((_i) * 0x40))
@@ -1039,6 +1044,7 @@ struct wx_queue_stats {
struct wx_tx_queue_stats {
u64 restart_queue;
u64 tx_busy;
+ u32 tx_done_old;
};
struct wx_rx_queue_stats {
@@ -1054,6 +1060,12 @@ struct wx_rx_queue_stats {
#define wx_for_each_ring(posm, headm) \
for (posm = (headm).ring; posm; posm = posm->next)
+enum wx_ring_state {
+ WX_TX_DETECT_HANG,
+ WX_HANG_CHECK_ARMED,
+ WX_RING_STATE_NBITS
+};
+
struct wx_ring_container {
struct wx_ring *ring; /* pointer to linked list of rings */
unsigned int total_bytes; /* total bytes processed this int */
@@ -1073,6 +1085,7 @@ struct wx_ring {
struct wx_tx_buffer *tx_buffer_info;
struct wx_rx_buffer *rx_buffer_info;
};
+ DECLARE_BITMAP(state, WX_RING_STATE_NBITS);
u8 __iomem *tail;
dma_addr_t dma; /* phys. address of descriptor ring */
dma_addr_t headwb_dma;
@@ -1274,6 +1287,7 @@ enum wx_pf_flags {
WX_FLAG_NEED_DO_RESET,
WX_FLAG_RX_MERGE_ENABLED,
WX_FLAG_TXHEAD_WB_ENABLED,
+ WX_FLAG_NEED_PF_RESET,
WX_PF_FLAGS_NBITS /* must be last */
};
@@ -1422,6 +1436,8 @@ struct wx {
struct timer_list service_timer;
struct work_struct service_task;
+ struct work_struct reset_task;
+ struct workqueue_struct *reset_wq;
struct mutex reset_lock; /* mutex for reset */
};
@@ -1504,7 +1520,8 @@ rd32_wrap(struct wx *wx, u32 reg, u32 *last)
#define wx_err(wx, fmt, arg...) \
dev_err(&(wx)->pdev->dev, fmt, ##arg)
-
+#define wx_warn(wx, fmt, arg...) \
+ dev_warn(&(wx)->pdev->dev, fmt, ##arg)
#define wx_dbg(wx, fmt, arg...) \
dev_dbg(&(wx)->pdev->dev, fmt, ##arg)
diff --git a/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c b/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c
index dea6dfb043f3..4bcef967e992 100644
--- a/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c
+++ b/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c
@@ -14,6 +14,7 @@
#include "../libwx/wx_type.h"
#include "../libwx/wx_hw.h"
#include "../libwx/wx_lib.h"
+#include "../libwx/wx_err.h"
#include "../libwx/wx_ptp.h"
#include "../libwx/wx_mbx.h"
#include "../libwx/wx_sriov.h"
@@ -148,6 +149,8 @@ static void ngbe_service_task(struct work_struct *work)
struct wx *wx = container_of(work, struct wx, service_task);
wx_update_stats(wx);
+ wx_check_hang_subtask(wx);
+ wx_check_err_subtask(wx);
wx_service_event_complete(wx);
}
@@ -393,6 +396,7 @@ static void ngbe_disable_device(struct wx *wx)
netif_tx_stop_all_queues(netdev);
netif_tx_disable(netdev);
+ clear_bit(WX_FLAG_NEED_PF_RESET, wx->flags);
timer_delete_sync(&wx->service_timer);
cancel_work_sync(&wx->service_task);
@@ -644,6 +648,7 @@ static const struct net_device_ops ngbe_netdev_ops = {
.ndo_stop = ngbe_close,
.ndo_change_mtu = wx_change_mtu,
.ndo_start_xmit = wx_xmit_frame,
+ .ndo_tx_timeout = wx_tx_timeout,
.ndo_set_rx_mode = wx_set_rx_mode,
.ndo_set_features = wx_set_features,
.ndo_fix_features = wx_fix_features,
@@ -733,6 +738,7 @@ static int ngbe_probe(struct pci_dev *pdev,
wx->driver_name = ngbe_driver_name;
ngbe_set_ethtool_ops(netdev);
netdev->netdev_ops = &ngbe_netdev_ops;
+ netdev->watchdog_timeo = 5 * HZ;
netdev->features = NETIF_F_SG | NETIF_F_IP_CSUM |
NETIF_F_TSO | NETIF_F_TSO6 |
@@ -830,6 +836,10 @@ static int ngbe_probe(struct pci_dev *pdev,
eth_hw_addr_set(netdev, wx->mac.perm_addr);
wx_mac_set_default_filter(wx, wx->mac.perm_addr);
+ err = wx_init_err_task(wx);
+ if (err)
+ goto err_free_mac_table;
+
ngbe_init_service(wx);
err = wx_init_interrupt_scheme(wx);
@@ -857,6 +867,8 @@ static int ngbe_probe(struct pci_dev *pdev,
err_cancel_service:
timer_delete_sync(&wx->service_timer);
cancel_work_sync(&wx->service_task);
+ cancel_work_sync(&wx->reset_task);
+ destroy_workqueue(wx->reset_wq);
err_free_mac_table:
kfree(wx->rss_key);
kfree(wx->mac_table);
@@ -888,6 +900,8 @@ static void ngbe_remove(struct pci_dev *pdev)
timer_shutdown_sync(&wx->service_timer);
cancel_work_sync(&wx->service_task);
+ cancel_work_sync(&wx->reset_task);
+ destroy_workqueue(wx->reset_wq);
phylink_destroy(wx->phylink);
pci_release_selected_regions(pdev,
diff --git a/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c b/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c
index ce82e13aa8ae..689679b315ae 100644
--- a/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c
+++ b/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c
@@ -14,6 +14,7 @@
#include "../libwx/wx_type.h"
#include "../libwx/wx_lib.h"
+#include "../libwx/wx_err.h"
#include "../libwx/wx_ptp.h"
#include "../libwx/wx_hw.h"
#include "../libwx/wx_mbx.h"
@@ -123,6 +124,8 @@ static void txgbe_service_task(struct work_struct *work)
txgbe_module_detection_subtask(wx);
txgbe_link_config_subtask(wx);
wx_update_stats(wx);
+ wx_check_hang_subtask(wx);
+ wx_check_err_subtask(wx);
wx_service_event_complete(wx);
}
@@ -224,6 +227,7 @@ static void txgbe_disable_device(struct wx *wx)
wx_irq_disable(wx);
wx_napi_disable_all(wx);
+ clear_bit(WX_FLAG_NEED_PF_RESET, wx->flags);
timer_delete_sync(&wx->service_timer);
cancel_work_sync(&wx->service_task);
@@ -654,6 +658,7 @@ static const struct net_device_ops txgbe_netdev_ops = {
.ndo_stop = txgbe_close,
.ndo_change_mtu = wx_change_mtu,
.ndo_start_xmit = wx_xmit_frame,
+ .ndo_tx_timeout = wx_tx_timeout,
.ndo_set_rx_mode = wx_set_rx_mode,
.ndo_set_features = wx_set_features,
.ndo_fix_features = wx_fix_features,
@@ -745,6 +750,7 @@ static int txgbe_probe(struct pci_dev *pdev,
wx->driver_name = txgbe_driver_name;
txgbe_set_ethtool_ops(netdev);
netdev->netdev_ops = &txgbe_netdev_ops;
+ netdev->watchdog_timeo = 5 * HZ;
netdev->udp_tunnel_nic_info = &txgbe_udp_tunnels;
/* setup the private structure */
@@ -815,6 +821,10 @@ static int txgbe_probe(struct pci_dev *pdev,
eth_hw_addr_set(netdev, wx->mac.perm_addr);
wx_mac_set_default_filter(wx, wx->mac.perm_addr);
+ err = wx_init_err_task(wx);
+ if (err)
+ goto err_free_mac_table;
+
txgbe_init_service(wx);
err = wx_init_interrupt_scheme(wx);
@@ -917,6 +927,8 @@ static int txgbe_probe(struct pci_dev *pdev,
err_cancel_service:
timer_delete_sync(&wx->service_timer);
cancel_work_sync(&wx->service_task);
+ cancel_work_sync(&wx->reset_task);
+ destroy_workqueue(wx->reset_wq);
err_free_mac_table:
kfree(wx->rss_key);
kfree(wx->mac_table);
@@ -949,6 +961,8 @@ static void txgbe_remove(struct pci_dev *pdev)
timer_shutdown_sync(&wx->service_timer);
cancel_work_sync(&wx->service_task);
+ cancel_work_sync(&wx->reset_task);
+ destroy_workqueue(wx->reset_wq);
txgbe_remove_phy(txgbe);
wx_free_isb_resources(wx);
--
2.51.0
* [PATCH net-next v5 3/5] net: wangxun: add reinit parameter to wx->do_reset callback
2026-06-04 8:56 [PATCH net-next v5 0/5] net: wangxun: timeout and error Jiawen Wu
2026-06-04 8:56 ` [PATCH net-next v5 1/5] net: ngbe: implement libwx reset ops Jiawen Wu
2026-06-04 8:56 ` [PATCH net-next v5 2/5] net: wangxun: add Tx timeout process Jiawen Wu
@ 2026-06-04 8:56 ` Jiawen Wu
2026-06-04 8:56 ` [PATCH net-next v5 4/5] net: wangxun: implement soft quiesce for PCIe error recovery Jiawen Wu
2026-06-04 8:56 ` [PATCH net-next v5 5/5] net: wangxun: add pcie error handler Jiawen Wu
4 siblings, 0 replies; 9+ messages in thread
From: Jiawen Wu @ 2026-06-04 8:56 UTC
To: netdev
To allow a simple hardware reset without tearing down the network
interface state, introduce a boolean 'reinit' parameter to the
wx->do_reset callback.
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
---
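The effect of the new parameter on the callback can be sketched in user space as below. This is a simplified model, not driver code: struct dev_model, enum reset_path and request_reset() are hypothetical, and the model callback returns the path it chose purely so the behaviour can be checked, whereas the real callback returns void.

```c
#include <stdbool.h>
#include <stddef.h>

/* Which path the model callback took. */
enum reset_path { RESET_PATH_REINIT, RESET_PATH_HW_ONLY };

struct dev_model;
typedef enum reset_path (*do_reset_fn)(struct dev_model *dev, bool reinit);

/* Hypothetical stand-in for the net_device / struct wx pair. */
struct dev_model {
	bool running;		/* models netif_running() */
	do_reset_fn do_reset;	/* models wx->do_reset; may be NULL */
};

/* Mirrors the branch in ngbe_do_reset() and txgbe_do_reset(). */
static enum reset_path model_do_reset(struct dev_model *dev, bool reinit)
{
	if (dev->running && reinit)
		return RESET_PATH_REINIT;	/* down + up under the reset lock */
	return RESET_PATH_HW_ONLY;		/* plain hardware reset */
}

/* Caller side: the libwx call sites skip the reset if no callback is set. */
static bool request_reset(struct dev_model *dev, bool reinit,
			  enum reset_path *taken)
{
	if (!dev->do_reset)
		return false;
	*taken = dev->do_reset(dev, reinit);
	return true;
}
```

All call sites converted in this patch pass true, so behaviour is unchanged for them. Passing false selects the hardware-only path even while the interface is running.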
drivers/net/ethernet/wangxun/libwx/wx_err.c | 2 +-
drivers/net/ethernet/wangxun/libwx/wx_ethtool.c | 2 +-
drivers/net/ethernet/wangxun/libwx/wx_lib.c | 4 ++--
drivers/net/ethernet/wangxun/libwx/wx_type.h | 2 +-
drivers/net/ethernet/wangxun/ngbe/ngbe_main.c | 4 ++--
drivers/net/ethernet/wangxun/ngbe/ngbe_type.h | 2 +-
drivers/net/ethernet/wangxun/txgbe/txgbe_main.c | 4 ++--
drivers/net/ethernet/wangxun/txgbe/txgbe_type.h | 2 +-
8 files changed, 11 insertions(+), 11 deletions(-)
diff --git a/drivers/net/ethernet/wangxun/libwx/wx_err.c b/drivers/net/ethernet/wangxun/libwx/wx_err.c
index b6e2d16d4a16..ee27f96735dc 100644
--- a/drivers/net/ethernet/wangxun/libwx/wx_err.c
+++ b/drivers/net/ethernet/wangxun/libwx/wx_err.c
@@ -16,7 +16,7 @@ static void wx_pf_reset_subtask(struct wx *wx)
wx_warn(wx, "Reset adapter.\n");
if (wx->do_reset)
- wx->do_reset(wx->netdev);
+ wx->do_reset(wx->netdev, true);
}
static void wx_reset_task(struct work_struct *work)
diff --git a/drivers/net/ethernet/wangxun/libwx/wx_ethtool.c b/drivers/net/ethernet/wangxun/libwx/wx_ethtool.c
index 5df971aca9e3..d1356ff5d69b 100644
--- a/drivers/net/ethernet/wangxun/libwx/wx_ethtool.c
+++ b/drivers/net/ethernet/wangxun/libwx/wx_ethtool.c
@@ -395,7 +395,7 @@ static void wx_update_rsc(struct wx *wx)
/* reset the device to apply the new RSC setting */
if (need_reset && wx->do_reset)
- wx->do_reset(netdev);
+ wx->do_reset(netdev, true);
}
int wx_set_coalesce(struct net_device *netdev,
diff --git a/drivers/net/ethernet/wangxun/libwx/wx_lib.c b/drivers/net/ethernet/wangxun/libwx/wx_lib.c
index da4d9e229c9e..e5a45356ba00 100644
--- a/drivers/net/ethernet/wangxun/libwx/wx_lib.c
+++ b/drivers/net/ethernet/wangxun/libwx/wx_lib.c
@@ -3148,7 +3148,7 @@ int wx_set_features(struct net_device *netdev, netdev_features_t features)
netdev->features = features;
if (changed & NETIF_F_HW_VLAN_CTAG_RX && wx->do_reset)
- wx->do_reset(netdev);
+ wx->do_reset(netdev, true);
else if (changed & (NETIF_F_HW_VLAN_CTAG_RX | NETIF_F_HW_VLAN_CTAG_FILTER))
wx_set_rx_mode(netdev);
@@ -3198,7 +3198,7 @@ int wx_set_features(struct net_device *netdev, netdev_features_t features)
out:
if (need_reset && wx->do_reset)
- wx->do_reset(netdev);
+ wx->do_reset(netdev, true);
return 0;
}
diff --git a/drivers/net/ethernet/wangxun/libwx/wx_type.h b/drivers/net/ethernet/wangxun/libwx/wx_type.h
index 75d74ca2e259..a8b4e84787f4 100644
--- a/drivers/net/ethernet/wangxun/libwx/wx_type.h
+++ b/drivers/net/ethernet/wangxun/libwx/wx_type.h
@@ -1408,7 +1408,7 @@ struct wx {
void (*atr)(struct wx_ring *ring, struct wx_tx_buffer *first, u8 ptype);
void (*configure_fdir)(struct wx *wx);
int (*setup_tc)(struct net_device *netdev, u8 tc);
- void (*do_reset)(struct net_device *netdev);
+ void (*do_reset)(struct net_device *netdev, bool reinit);
int (*ptp_setup_sdp)(struct wx *wx);
void (*set_num_queues)(struct wx *wx);
diff --git a/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c b/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c
index 4bcef967e992..7dd3e12d48aa 100644
--- a/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c
+++ b/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c
@@ -633,11 +633,11 @@ static void ngbe_reinit_locked(struct wx *wx)
mutex_unlock(&wx->reset_lock);
}
-void ngbe_do_reset(struct net_device *netdev)
+void ngbe_do_reset(struct net_device *netdev, bool reinit)
{
struct wx *wx = netdev_priv(netdev);
- if (netif_running(netdev))
+ if (netif_running(netdev) && reinit)
ngbe_reinit_locked(wx);
else
ngbe_reset(wx);
diff --git a/drivers/net/ethernet/wangxun/ngbe/ngbe_type.h b/drivers/net/ethernet/wangxun/ngbe/ngbe_type.h
index 4f648f272c08..c9233dc7ae50 100644
--- a/drivers/net/ethernet/wangxun/ngbe/ngbe_type.h
+++ b/drivers/net/ethernet/wangxun/ngbe/ngbe_type.h
@@ -125,6 +125,6 @@ extern char ngbe_driver_name[];
void ngbe_down(struct wx *wx);
void ngbe_up(struct wx *wx);
int ngbe_setup_tc(struct net_device *dev, u8 tc);
-void ngbe_do_reset(struct net_device *netdev);
+void ngbe_do_reset(struct net_device *netdev, bool reinit);
#endif /* _NGBE_TYPE_H_ */
diff --git a/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c b/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c
index 689679b315ae..9251e7a1d416 100644
--- a/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c
+++ b/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c
@@ -610,11 +610,11 @@ static void txgbe_reinit_locked(struct wx *wx)
mutex_unlock(&wx->reset_lock);
}
-void txgbe_do_reset(struct net_device *netdev)
+void txgbe_do_reset(struct net_device *netdev, bool reinit)
{
struct wx *wx = netdev_priv(netdev);
- if (netif_running(netdev))
+ if (netif_running(netdev) && reinit)
txgbe_reinit_locked(wx);
else
txgbe_reset(wx);
diff --git a/drivers/net/ethernet/wangxun/txgbe/txgbe_type.h b/drivers/net/ethernet/wangxun/txgbe/txgbe_type.h
index 6b05f32b4a01..1e373f7fd9b5 100644
--- a/drivers/net/ethernet/wangxun/txgbe/txgbe_type.h
+++ b/drivers/net/ethernet/wangxun/txgbe/txgbe_type.h
@@ -313,7 +313,7 @@ extern char txgbe_driver_name[];
void txgbe_down(struct wx *wx);
void txgbe_up(struct wx *wx);
int txgbe_setup_tc(struct net_device *dev, u8 tc);
-void txgbe_do_reset(struct net_device *netdev);
+void txgbe_do_reset(struct net_device *netdev, bool reinit);
#define TXGBE_LINK_SPEED_UNKNOWN 0
#define TXGBE_LINK_SPEED_10GB_FULL 4
--
2.51.0
* [PATCH net-next v5 4/5] net: wangxun: implement soft quiesce for PCIe error recovery
2026-06-04 8:56 [PATCH net-next v5 0/5] net: wangxun: timeout and error Jiawen Wu
` (2 preceding siblings ...)
2026-06-04 8:56 ` [PATCH net-next v5 3/5] net: wangxun: add reinit parameter to wx->do_reset callback Jiawen Wu
@ 2026-06-04 8:56 ` Jiawen Wu
2026-06-08 8:47 ` Loktionov, Aleksandr
2026-06-04 8:56 ` [PATCH net-next v5 5/5] net: wangxun: add pcie error handler Jiawen Wu
4 siblings, 1 reply; 9+ messages in thread
From: Jiawen Wu @ 2026-06-04 8:56 UTC (permalink / raw)
To: netdev
Cc: Mengyuan Lou, Andrew Lunn, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Richard Cochran, Russell King,
Jacob Keller, Michal Swiatkowski, Simon Horman, Kees Cook,
Larysa Zaremba, Joe Damato, Breno Leitao, Aleksandr Loktionov,
Uwe Kleine-König (The Capable Hub), Fabio Baltieri,
Thomas Gleixner, Greg Kroah-Hartman, Jiawen Wu
wx_soft_quiesce() provides a lightweight shutdown path for use during
PCIe error recovery. It avoids MMIO-dependent operations while the
device is in a PCIe error state.
Waiting for the service task to complete may unnecessarily delay PCIe
error recovery, especially if the work item is already blocked by the
hardware failure that triggered AER, so the service task is not
explicitly cancelled in the quiesce path. Instead, to keep the service
task from running, WX_STATE_DOWN and WX_STATE_RESETTING are checked at
the entry of every work item.
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
---
drivers/net/ethernet/wangxun/libwx/wx_lib.c | 14 +++++++++++++
drivers/net/ethernet/wangxun/libwx/wx_lib.h | 1 +
drivers/net/ethernet/wangxun/libwx/wx_ptp.c | 21 +++++++++++++++++++
drivers/net/ethernet/wangxun/libwx/wx_ptp.h | 1 +
.../net/ethernet/wangxun/txgbe/txgbe_main.c | 8 +++++++
5 files changed, 45 insertions(+)
diff --git a/drivers/net/ethernet/wangxun/libwx/wx_lib.c b/drivers/net/ethernet/wangxun/libwx/wx_lib.c
index e5a45356ba00..0667eb1fe5fe 100644
--- a/drivers/net/ethernet/wangxun/libwx/wx_lib.c
+++ b/drivers/net/ethernet/wangxun/libwx/wx_lib.c
@@ -3382,5 +3382,19 @@ void wx_service_timer(struct timer_list *t)
}
EXPORT_SYMBOL(wx_service_timer);
+void wx_soft_quiesce(struct wx *wx)
+{
+ wx_ptp_quiesce(wx);
+ pci_clear_master(wx->pdev);
+ netif_tx_stop_all_queues(wx->netdev);
+ netif_carrier_off(wx->netdev);
+ netif_tx_disable(wx->netdev);
+ wx_napi_disable_all(wx);
+
+ clear_bit(WX_FLAG_NEED_PF_RESET, wx->flags);
+ timer_delete_sync(&wx->service_timer);
+}
+EXPORT_SYMBOL(wx_soft_quiesce);
+
MODULE_DESCRIPTION("Common library for Wangxun(R) Ethernet drivers.");
MODULE_LICENSE("GPL");
diff --git a/drivers/net/ethernet/wangxun/libwx/wx_lib.h b/drivers/net/ethernet/wangxun/libwx/wx_lib.h
index aed6ea8cf0d6..11bd79985e17 100644
--- a/drivers/net/ethernet/wangxun/libwx/wx_lib.h
+++ b/drivers/net/ethernet/wangxun/libwx/wx_lib.h
@@ -41,5 +41,6 @@ void wx_set_ring(struct wx *wx, u32 new_tx_count,
void wx_service_event_schedule(struct wx *wx);
void wx_service_event_complete(struct wx *wx);
void wx_service_timer(struct timer_list *t);
+void wx_soft_quiesce(struct wx *wx);
#endif /* _WX_LIB_H_ */
diff --git a/drivers/net/ethernet/wangxun/libwx/wx_ptp.c b/drivers/net/ethernet/wangxun/libwx/wx_ptp.c
index 44f3e6505246..dcc8b3ae1445 100644
--- a/drivers/net/ethernet/wangxun/libwx/wx_ptp.c
+++ b/drivers/net/ethernet/wangxun/libwx/wx_ptp.c
@@ -842,6 +842,27 @@ void wx_ptp_stop(struct wx *wx)
}
EXPORT_SYMBOL(wx_ptp_stop);
+void wx_ptp_quiesce(struct wx *wx)
+{
+ if (!test_and_clear_bit(WX_STATE_PTP_RUNNING, wx->state))
+ return;
+
+ clear_bit(WX_FLAG_PTP_PPS_ENABLED, wx->flags);
+
+ if (wx->ptp_tx_skb) {
+ dev_kfree_skb_any(wx->ptp_tx_skb);
+ wx->ptp_tx_skb = NULL;
+ }
+ clear_bit_unlock(WX_STATE_PTP_TX_IN_PROGRESS, wx->state);
+
+ if (wx->ptp_clock) {
+ ptp_clock_unregister(wx->ptp_clock);
+ wx->ptp_clock = NULL;
+ dev_info(&wx->pdev->dev, "removed PHC on %s\n", wx->netdev->name);
+ }
+}
+EXPORT_SYMBOL(wx_ptp_quiesce);
+
/**
* wx_ptp_rx_hwtstamp - utility function which checks for RX time stamp
* @wx: pointer to wx struct
diff --git a/drivers/net/ethernet/wangxun/libwx/wx_ptp.h b/drivers/net/ethernet/wangxun/libwx/wx_ptp.h
index 50db90a6e3ee..ad2f824875d5 100644
--- a/drivers/net/ethernet/wangxun/libwx/wx_ptp.h
+++ b/drivers/net/ethernet/wangxun/libwx/wx_ptp.h
@@ -10,6 +10,7 @@ void wx_ptp_reset(struct wx *wx);
void wx_ptp_init(struct wx *wx);
void wx_ptp_suspend(struct wx *wx);
void wx_ptp_stop(struct wx *wx);
+void wx_ptp_quiesce(struct wx *wx);
void wx_ptp_rx_hwtstamp(struct wx *wx, struct sk_buff *skb);
int wx_hwtstamp_get(struct net_device *dev,
struct kernel_hwtstamp_config *cfg);
diff --git a/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c b/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c
index 9251e7a1d416..f6e596eb9217 100644
--- a/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c
+++ b/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c
@@ -94,6 +94,10 @@ static void txgbe_module_detection_subtask(struct wx *wx)
{
int err;
+ if (test_bit(WX_STATE_DOWN, wx->state) ||
+ test_bit(WX_STATE_RESETTING, wx->state))
+ return;
+
if (!test_and_clear_bit(WX_FLAG_NEED_MODULE_RESET, wx->flags))
return;
@@ -107,6 +111,10 @@ static void txgbe_module_detection_subtask(struct wx *wx)
static void txgbe_link_config_subtask(struct wx *wx)
{
+ if (test_bit(WX_STATE_DOWN, wx->state) ||
+ test_bit(WX_STATE_RESETTING, wx->state))
+ return;
+
if (!test_and_clear_bit(WX_FLAG_NEED_LINK_CONFIG, wx->flags))
return;
--
2.51.0
* [PATCH net-next v5 5/5] net: wangxun: add pcie error handler
2026-06-04 8:56 [PATCH net-next v5 0/5] net: wangxun: timeout and error Jiawen Wu
` (3 preceding siblings ...)
2026-06-04 8:56 ` [PATCH net-next v5 4/5] net: wangxun: implement soft quiesce for PCIe error recovery Jiawen Wu
@ 2026-06-04 8:56 ` Jiawen Wu
2026-06-08 15:05 ` Simon Horman
4 siblings, 1 reply; 9+ messages in thread
From: Jiawen Wu @ 2026-06-04 8:56 UTC (permalink / raw)
To: netdev
Cc: Mengyuan Lou, Andrew Lunn, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Richard Cochran, Russell King,
Jacob Keller, Michal Swiatkowski, Simon Horman, Kees Cook,
Larysa Zaremba, Joe Damato, Breno Leitao, Aleksandr Loktionov,
Uwe Kleine-König (The Capable Hub), Fabio Baltieri,
Thomas Gleixner, Greg Kroah-Hartman, Jiawen Wu
Add PCI error handlers so that the AER driver can handle PCIe errors.
When a PCIe error occurs, the netdev watchdog Tx timeout sometimes fires
before the AER error report, and MMIO during the resulting reset would
block the CPU. To prevent this, an error return is added to the reset
path. ngbe error recovery is not yet fully implemented; it will be
completed in the future.
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
---
drivers/net/ethernet/wangxun/libwx/wx_err.c | 114 ++++++++++++++++++
drivers/net/ethernet/wangxun/libwx/wx_err.h | 2 +
drivers/net/ethernet/wangxun/libwx/wx_type.h | 2 +
drivers/net/ethernet/wangxun/ngbe/ngbe_main.c | 23 +++-
.../ethernet/wangxun/txgbe/txgbe_ethtool.c | 6 +-
.../net/ethernet/wangxun/txgbe/txgbe_main.c | 53 ++++++--
.../net/ethernet/wangxun/txgbe/txgbe_type.h | 2 +-
7 files changed, 189 insertions(+), 13 deletions(-)
diff --git a/drivers/net/ethernet/wangxun/libwx/wx_err.c b/drivers/net/ethernet/wangxun/libwx/wx_err.c
index ee27f96735dc..9a4dbb3427d9 100644
--- a/drivers/net/ethernet/wangxun/libwx/wx_err.c
+++ b/drivers/net/ethernet/wangxun/libwx/wx_err.c
@@ -4,11 +4,125 @@
#include <linux/netdevice.h>
#include <linux/pci.h>
+#include <linux/aer.h>
#include "wx_type.h"
#include "wx_lib.h"
#include "wx_err.h"
+/**
+ * wx_io_error_detected - called when PCI error is detected
+ * @pdev: Pointer to PCI device
+ * @state: The current pci connection state
+ *
+ * Return: pci_ers_result_t.
+ *
+ * This function is called after a PCI bus error affecting
+ * this device has been detected.
+ */
+static pci_ers_result_t wx_io_error_detected(struct pci_dev *pdev,
+ pci_channel_state_t state)
+{
+ struct wx *wx = pci_get_drvdata(pdev);
+ struct net_device *netdev;
+
+ if (!wx)
+ return PCI_ERS_RESULT_DISCONNECT;
+
+ netdev = wx->netdev;
+ if (!netif_device_present(netdev))
+ return PCI_ERS_RESULT_DISCONNECT;
+
+ if (state == pci_channel_io_perm_failure)
+ return PCI_ERS_RESULT_DISCONNECT;
+
+ rtnl_lock();
+ netif_device_detach(netdev);
+
+ if (netif_running(netdev) &&
+ !test_and_set_bit(WX_STATE_DOWN, wx->state))
+ wx_soft_quiesce(wx);
+
+ if (!test_and_set_bit(WX_STATE_DISABLED, wx->state))
+ pci_disable_device(pdev);
+ rtnl_unlock();
+
+ /* Request a slot reset. */
+ return PCI_ERS_RESULT_NEED_RESET;
+}
+
+/**
+ * wx_io_slot_reset - called after the pci bus has been reset.
+ * @pdev: Pointer to PCI device
+ *
+ * Return: pci_ers_result_t.
+ *
+ * Restart the card from scratch, as if from a cold-boot.
+ */
+static pci_ers_result_t wx_io_slot_reset(struct pci_dev *pdev)
+{
+ struct wx *wx = pci_get_drvdata(pdev);
+ pci_ers_result_t result;
+
+ if (pci_enable_device_mem(pdev)) {
+ wx_err(wx, "Cannot re-enable PCI device after reset.\n");
+ result = PCI_ERS_RESULT_DISCONNECT;
+ } else {
+ /* make all memory operations done before clearing the flag */
+ smp_mb__before_atomic();
+ clear_bit(WX_STATE_DISABLED, wx->state);
+ pci_set_master(pdev);
+ pci_restore_state(pdev);
+ pci_wake_from_d3(pdev, false);
+
+ rtnl_lock();
+ if (netif_running(wx->netdev) && wx->down_suspend)
+ wx->down_suspend(wx);
+ if (wx->do_reset)
+ wx->do_reset(wx->netdev, false);
+ rtnl_unlock();
+ result = PCI_ERS_RESULT_RECOVERED;
+ }
+
+ pci_aer_clear_nonfatal_status(pdev);
+
+ return result;
+}
+
+/**
+ * wx_io_resume - called when traffic can start flowing again.
+ * @pdev: Pointer to PCI device
+ *
+ * This callback is called when the error recovery driver tells us that
+ * its OK to resume normal operation.
+ */
+static void wx_io_resume(struct pci_dev *pdev)
+{
+ struct wx *wx = pci_get_drvdata(pdev);
+ struct net_device *netdev;
+ int err;
+
+ netdev = wx->netdev;
+ rtnl_lock();
+ if (netif_running(netdev)) {
+ err = netdev->netdev_ops->ndo_open(netdev);
+ if (err) {
+ wx_err(wx, "Failed to open netdev after reset\n");
+ goto out;
+ }
+ }
+ netif_device_attach(netdev);
+out:
+ rtnl_unlock();
+}
+
+const struct pci_error_handlers wx_err_handler = {
+ .error_detected = wx_io_error_detected,
+ .slot_reset = wx_io_slot_reset,
+ .resume = wx_io_resume,
+};
+EXPORT_SYMBOL(wx_err_handler);
+
static void wx_pf_reset_subtask(struct wx *wx)
{
if (!test_and_clear_bit(WX_FLAG_NEED_PF_RESET, wx->flags))
diff --git a/drivers/net/ethernet/wangxun/libwx/wx_err.h b/drivers/net/ethernet/wangxun/libwx/wx_err.h
index 1eed13e48095..a6a82a263528 100644
--- a/drivers/net/ethernet/wangxun/libwx/wx_err.h
+++ b/drivers/net/ethernet/wangxun/libwx/wx_err.h
@@ -7,6 +7,8 @@
#ifndef _WX_ERR_H_
#define _WX_ERR_H_
+extern const struct pci_error_handlers wx_err_handler;
+
void wx_check_err_subtask(struct wx *wx);
int wx_init_err_task(struct wx *wx);
void wx_check_hang_subtask(struct wx *wx);
diff --git a/drivers/net/ethernet/wangxun/libwx/wx_type.h b/drivers/net/ethernet/wangxun/libwx/wx_type.h
index a8b4e84787f4..ec66a34b272d 100644
--- a/drivers/net/ethernet/wangxun/libwx/wx_type.h
+++ b/drivers/net/ethernet/wangxun/libwx/wx_type.h
@@ -1221,6 +1221,7 @@ enum wx_state {
WX_STATE_PTP_RUNNING,
WX_STATE_PTP_TX_IN_PROGRESS,
WX_STATE_SERVICE_SCHED,
+ WX_STATE_DISABLED,
WX_STATE_NBITS /* must be last */
};
@@ -1409,6 +1410,7 @@ struct wx {
void (*configure_fdir)(struct wx *wx);
int (*setup_tc)(struct net_device *netdev, u8 tc);
void (*do_reset)(struct net_device *netdev, bool reinit);
+ void (*down_suspend)(struct wx *wx);
int (*ptp_setup_sdp)(struct wx *wx);
void (*set_num_queues)(struct wx *wx);
diff --git a/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c b/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c
index 7dd3e12d48aa..effe9311a57a 100644
--- a/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c
+++ b/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c
@@ -47,6 +47,19 @@ static const struct pci_device_id ngbe_pci_tbl[] = {
{ }
};
+static void ngbe_down_suspend(struct wx *wx)
+{
+ phylink_stop(wx->phylink);
+ phylink_disconnect_phy(wx->phylink);
+
+ wx_clean_all_tx_rings(wx);
+ wx_clean_all_rx_rings(wx);
+
+ wx_free_irq(wx);
+ wx_free_isb_resources(wx);
+ wx_free_resources(wx);
+}
+
/**
* ngbe_init_type_code - Initialize the shared code
* @wx: pointer to hardware structure
@@ -135,6 +148,7 @@ static int ngbe_sw_init(struct wx *wx)
wx->mbx.size = WX_VXMAILBOX_SIZE;
wx->setup_tc = ngbe_setup_tc;
wx->do_reset = ngbe_do_reset;
+ wx->down_suspend = ngbe_down_suspend;
set_bit(0, &wx->fwd_bitmask);
return 0;
@@ -566,7 +580,8 @@ static void ngbe_dev_shutdown(struct pci_dev *pdev, bool *enable_wake)
*enable_wake = !!wufc;
wx_control_hw(wx, false);
- pci_disable_device(pdev);
+ if (!test_and_set_bit(WX_STATE_DISABLED, wx->state))
+ pci_disable_device(pdev);
}
static void ngbe_shutdown(struct pci_dev *pdev)
@@ -856,6 +871,7 @@ static int ngbe_probe(struct pci_dev *pdev,
goto err_register;
pci_set_drvdata(pdev, wx);
+ pci_save_state(pdev);
return 0;
@@ -911,7 +927,8 @@ static void ngbe_remove(struct pci_dev *pdev)
kfree(wx->mac_table);
wx_clear_interrupt_scheme(wx);
- pci_disable_device(pdev);
+ if (!test_and_set_bit(WX_STATE_DISABLED, wx->state))
+ pci_disable_device(pdev);
}
static int ngbe_suspend(struct pci_dev *pdev, pm_message_t state)
@@ -938,6 +955,7 @@ static int ngbe_resume(struct pci_dev *pdev)
wx_err(wx, "Cannot enable PCI device from suspend\n");
return err;
}
+ clear_bit(WX_STATE_DISABLED, wx->state);
pci_set_master(pdev);
device_wakeup_disable(&pdev->dev);
@@ -962,6 +980,7 @@ static struct pci_driver ngbe_driver = {
.resume = ngbe_resume,
.shutdown = ngbe_shutdown,
.sriov_configure = wx_pci_sriov_configure,
+ .err_handler = &wx_err_handler,
};
module_pci_driver(ngbe_driver);
diff --git a/drivers/net/ethernet/wangxun/txgbe/txgbe_ethtool.c b/drivers/net/ethernet/wangxun/txgbe/txgbe_ethtool.c
index 3e32aca72806..80811947d5ac 100644
--- a/drivers/net/ethernet/wangxun/txgbe/txgbe_ethtool.c
+++ b/drivers/net/ethernet/wangxun/txgbe/txgbe_ethtool.c
@@ -78,13 +78,17 @@ static int txgbe_set_ringparam(struct net_device *netdev,
goto clear_reset;
}
- txgbe_down(wx);
+ err = txgbe_down(wx);
+ if (err)
+ goto free_temp;
wx_set_ring(wx, new_tx_count, new_rx_count, temp_ring);
kvfree(temp_ring);
txgbe_up(wx);
+free_temp:
+ kvfree(temp_ring);
clear_reset:
clear_bit(WX_STATE_RESETTING, wx->state);
mutex_unlock(&wx->reset_lock);
diff --git a/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c b/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c
index f6e596eb9217..98786efbe871 100644
--- a/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c
+++ b/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c
@@ -212,15 +212,21 @@ static void txgbe_reset(struct wx *wx)
wx_ptp_reset(wx);
}
-static void txgbe_disable_device(struct wx *wx)
+static int txgbe_disable_device(struct wx *wx)
{
struct net_device *netdev = wx->netdev;
+ int ret = 0;
u32 i;
if (test_and_set_bit(WX_STATE_DOWN, wx->state))
- return;
+ return 0;
+
+ ret = wx_disable_pcie_master(wx);
+ if (ret) {
+ wx_soft_quiesce(wx);
+ return ret;
+ }
- wx_disable_pcie_master(wx);
/* disable receives */
wx_disable_rx(wx);
@@ -270,11 +276,18 @@ static void txgbe_disable_device(struct wx *wx)
/* Disable the Tx DMA engine */
wr32m(wx, WX_TDM_CTL, WX_TDM_CTL_TE, 0);
+
+ return 0;
}
-void txgbe_down(struct wx *wx)
+int txgbe_down(struct wx *wx)
{
- txgbe_disable_device(wx);
+ int ret = 0;
+
+ ret = txgbe_disable_device(wx);
+ if (ret)
+ return ret;
+
txgbe_reset(wx);
switch (wx->mac.type) {
@@ -295,6 +308,8 @@ void txgbe_down(struct wx *wx)
wx_clean_all_tx_rings(wx);
wx_clean_all_rx_rings(wx);
+
+ return 0;
}
void txgbe_up(struct wx *wx)
@@ -304,6 +319,18 @@ void txgbe_up(struct wx *wx)
txgbe_up_complete(wx);
}
+static void txgbe_down_suspend(struct wx *wx)
+{
+ phylink_stop(wx->phylink);
+
+ wx_clean_all_tx_rings(wx);
+ wx_clean_all_rx_rings(wx);
+
+ wx_free_irq(wx);
+ txgbe_free_misc_irq(wx->priv);
+ wx_free_resources(wx);
+}
+
/**
* txgbe_init_type_code - Initialize the shared code
* @wx: pointer to hardware structure
@@ -420,6 +447,7 @@ static int txgbe_sw_init(struct wx *wx)
wx->setup_tc = txgbe_setup_tc;
wx->do_reset = txgbe_do_reset;
+ wx->down_suspend = txgbe_down_suspend;
set_bit(0, &wx->fwd_bitmask);
switch (wx->mac.type) {
@@ -556,7 +584,8 @@ static void txgbe_dev_shutdown(struct pci_dev *pdev)
wx_control_hw(wx, false);
- pci_disable_device(pdev);
+ if (!test_and_set_bit(WX_STATE_DISABLED, wx->state))
+ pci_disable_device(pdev);
}
static void txgbe_shutdown(struct pci_dev *pdev)
@@ -606,13 +635,16 @@ int txgbe_setup_tc(struct net_device *dev, u8 tc)
static void txgbe_reinit_locked(struct wx *wx)
{
+ int ret;
+
netif_trans_update(wx->netdev);
mutex_lock(&wx->reset_lock);
set_bit(WX_STATE_RESETTING, wx->state);
- txgbe_down(wx);
- txgbe_up(wx);
+ ret = txgbe_down(wx);
+ if (!ret)
+ txgbe_up(wx);
clear_bit(WX_STATE_RESETTING, wx->state);
mutex_unlock(&wx->reset_lock);
@@ -908,6 +940,7 @@ static int txgbe_probe(struct pci_dev *pdev,
goto err_remove_phy;
pci_set_drvdata(pdev, wx);
+ pci_save_state(pdev);
netif_tx_stop_all_queues(netdev);
@@ -982,7 +1015,8 @@ static void txgbe_remove(struct pci_dev *pdev)
kfree(wx->mac_table);
wx_clear_interrupt_scheme(wx);
- pci_disable_device(pdev);
+ if (!test_and_set_bit(WX_STATE_DISABLED, wx->state))
+ pci_disable_device(pdev);
}
static struct pci_driver txgbe_driver = {
@@ -992,6 +1026,7 @@ static struct pci_driver txgbe_driver = {
.remove = txgbe_remove,
.shutdown = txgbe_shutdown,
.sriov_configure = wx_pci_sriov_configure,
+ .err_handler = &wx_err_handler,
};
module_pci_driver(txgbe_driver);
diff --git a/drivers/net/ethernet/wangxun/txgbe/txgbe_type.h b/drivers/net/ethernet/wangxun/txgbe/txgbe_type.h
index 1e373f7fd9b5..daef87274678 100644
--- a/drivers/net/ethernet/wangxun/txgbe/txgbe_type.h
+++ b/drivers/net/ethernet/wangxun/txgbe/txgbe_type.h
@@ -310,7 +310,7 @@ struct txgbe_fdir_filter {
extern char txgbe_driver_name[];
-void txgbe_down(struct wx *wx);
+int txgbe_down(struct wx *wx);
void txgbe_up(struct wx *wx);
int txgbe_setup_tc(struct net_device *dev, u8 tc);
void txgbe_do_reset(struct net_device *netdev, bool reinit);
--
2.51.0
* RE: [PATCH net-next v5 1/5] net: ngbe: implement libwx reset ops
2026-06-04 8:56 ` [PATCH net-next v5 1/5] net: ngbe: implement libwx reset ops Jiawen Wu
@ 2026-06-08 8:47 ` Loktionov, Aleksandr
0 siblings, 0 replies; 9+ messages in thread
From: Loktionov, Aleksandr @ 2026-06-08 8:47 UTC (permalink / raw)
To: Jiawen Wu, netdev@vger.kernel.org
Cc: Mengyuan Lou, Andrew Lunn, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Richard Cochran, Russell King,
Keller, Jacob E, Michal Swiatkowski, Simon Horman, Kees Cook,
Zaremba, Larysa, Joe Damato, Breno Leitao,
Uwe Kleine-König (The Capable Hub), Fabio Baltieri,
Thomas Gleixner, Greg Kroah-Hartman
> -----Original Message-----
> From: Jiawen Wu <jiawenwu@trustnetic.com>
> Sent: Thursday, June 4, 2026 10:56 AM
> To: netdev@vger.kernel.org
> Cc: Mengyuan Lou <mengyuanlou@net-swift.com>; Andrew Lunn
> <andrew+netdev@lunn.ch>; David S. Miller <davem@davemloft.net>; Eric
> Dumazet <edumazet@google.com>; Jakub Kicinski <kuba@kernel.org>; Paolo
> Abeni <pabeni@redhat.com>; Richard Cochran <richardcochran@gmail.com>;
> Russell King <linux@armlinux.org.uk>; Keller, Jacob E
> <jacob.e.keller@intel.com>; Michal Swiatkowski
> <michal.swiatkowski@linux.intel.com>; Simon Horman <horms@kernel.org>;
> Kees Cook <kees@kernel.org>; Zaremba, Larysa
> <larysa.zaremba@intel.com>; Joe Damato <joe@dama.to>; Breno Leitao
> <leitao@debian.org>; Loktionov, Aleksandr
> <aleksandr.loktionov@intel.com>; Uwe Kleine-König (The Capable Hub)
> <u.kleine-koenig@baylibre.com>; Fabio Baltieri
> <fabio.baltieri@gmail.com>; Thomas Gleixner <tglx@kernel.org>; Greg
> Kroah-Hartman <gregkh@linuxfoundation.org>; Jiawen Wu
> <jiawenwu@trustnetic.com>
> Subject: [PATCH net-next v5 1/5] net: ngbe: implement libwx reset ops
>
> Implement wx->do_reset() for library module calling.
>
> Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
> Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com>
> ---
> .../net/ethernet/wangxun/ngbe/ngbe_ethtool.c | 1 -
> drivers/net/ethernet/wangxun/ngbe/ngbe_main.c | 37 ++++++++++++++++++-
> drivers/net/ethernet/wangxun/ngbe/ngbe_type.h | 1 +
> 3 files changed, 36 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/ethernet/wangxun/ngbe/ngbe_ethtool.c
> b/drivers/net/ethernet/wangxun/ngbe/ngbe_ethtool.c
> index b2e191982803..1960f7154151 100644
> --- a/drivers/net/ethernet/wangxun/ngbe/ngbe_ethtool.c
> +++ b/drivers/net/ethernet/wangxun/ngbe/ngbe_ethtool.c
> @@ -59,7 +59,6 @@ static int ngbe_set_ringparam(struct net_device
> *netdev,
> wx_set_ring(wx, new_tx_count, new_rx_count, temp_ring);
> kvfree(temp_ring);
>
> - wx_configure(wx);
> ngbe_up(wx);
>
> clear_reset:
> diff --git a/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c
> b/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c
> index 8678c49b892a..dea6dfb043f3 100644
> --- a/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c
> +++ b/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c
> @@ -133,6 +133,7 @@ static int ngbe_sw_init(struct wx *wx)
>
> wx->mbx.size = WX_VXMAILBOX_SIZE;
> wx->setup_tc = ngbe_setup_tc;
> + wx->do_reset = ngbe_do_reset;
> set_bit(0, &wx->fwd_bitmask);
>
> return 0;
> @@ -423,7 +424,7 @@ void ngbe_down(struct wx *wx)
> wx_clean_all_rx_rings(wx);
> }
>
> -void ngbe_up(struct wx *wx)
> +static void ngbe_up_complete(struct wx *wx)
> {
> wx_configure_vectors(wx);
>
> @@ -490,7 +491,7 @@ static int ngbe_open(struct net_device *netdev)
>
> wx_ptp_init(wx);
>
> - ngbe_up(wx);
> + ngbe_up_complete(wx);
>
> return 0;
> err_dis_phy:
> @@ -503,6 +504,12 @@ static int ngbe_open(struct net_device *netdev)
> return err;
> }
>
> +void ngbe_up(struct wx *wx)
> +{
> + wx_configure(wx);
> + ngbe_up_complete(wx);
> +}
> +
> /**
> * ngbe_close - Disables a network interface
> * @netdev: network interface device structure @@ -590,6 +597,8 @@
> int ngbe_setup_tc(struct net_device *dev, u8 tc)
> */
> if (netif_running(dev))
> ngbe_close(dev);
> + else
> + ngbe_reset(wx);
>
> wx_clear_interrupt_scheme(wx);
>
> @@ -606,6 +615,30 @@ int ngbe_setup_tc(struct net_device *dev, u8 tc)
> return 0;
> }
>
> +static void ngbe_reinit_locked(struct wx *wx) {
> + netif_trans_update(wx->netdev);
> +
> + mutex_lock(&wx->reset_lock);
> + set_bit(WX_STATE_RESETTING, wx->state);
> +
> + ngbe_down(wx);
> + ngbe_up(wx);
> +
> + clear_bit(WX_STATE_RESETTING, wx->state);
> + mutex_unlock(&wx->reset_lock);
> +}
> +
> +void ngbe_do_reset(struct net_device *netdev) {
> + struct wx *wx = netdev_priv(netdev);
> +
> + if (netif_running(netdev))
> + ngbe_reinit_locked(wx);
> + else
> + ngbe_reset(wx);
> +}
> +
> static const struct net_device_ops ngbe_netdev_ops = {
> .ndo_open = ngbe_open,
> .ndo_stop = ngbe_close,
> diff --git a/drivers/net/ethernet/wangxun/ngbe/ngbe_type.h
> b/drivers/net/ethernet/wangxun/ngbe/ngbe_type.h
> index 7077a0da4c98..4f648f272c08 100644
> --- a/drivers/net/ethernet/wangxun/ngbe/ngbe_type.h
> +++ b/drivers/net/ethernet/wangxun/ngbe/ngbe_type.h
> @@ -125,5 +125,6 @@ extern char ngbe_driver_name[]; void
> ngbe_down(struct wx *wx); void ngbe_up(struct wx *wx); int
> ngbe_setup_tc(struct net_device *dev, u8 tc);
> +void ngbe_do_reset(struct net_device *netdev);
>
> #endif /* _NGBE_TYPE_H_ */
> --
> 2.51.0
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
* RE: [PATCH net-next v5 4/5] net: wangxun: implement soft quiesce for PCIe error recovery
2026-06-04 8:56 ` [PATCH net-next v5 4/5] net: wangxun: implement soft quiesce for PCIe error recovery Jiawen Wu
@ 2026-06-08 8:47 ` Loktionov, Aleksandr
0 siblings, 0 replies; 9+ messages in thread
From: Loktionov, Aleksandr @ 2026-06-08 8:47 UTC (permalink / raw)
To: Jiawen Wu, netdev@vger.kernel.org
Cc: Mengyuan Lou, Andrew Lunn, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Richard Cochran, Russell King,
Keller, Jacob E, Michal Swiatkowski, Simon Horman, Kees Cook,
Zaremba, Larysa, Joe Damato, Breno Leitao,
Uwe Kleine-König (The Capable Hub), Fabio Baltieri,
Thomas Gleixner, Greg Kroah-Hartman
> -----Original Message-----
> From: Jiawen Wu <jiawenwu@trustnetic.com>
> Sent: Thursday, June 4, 2026 10:57 AM
> To: netdev@vger.kernel.org
> Cc: Mengyuan Lou <mengyuanlou@net-swift.com>; Andrew Lunn
> <andrew+netdev@lunn.ch>; David S. Miller <davem@davemloft.net>; Eric
> Dumazet <edumazet@google.com>; Jakub Kicinski <kuba@kernel.org>; Paolo
> Abeni <pabeni@redhat.com>; Richard Cochran <richardcochran@gmail.com>;
> Russell King <linux@armlinux.org.uk>; Keller, Jacob E
> <jacob.e.keller@intel.com>; Michal Swiatkowski
> <michal.swiatkowski@linux.intel.com>; Simon Horman <horms@kernel.org>;
> Kees Cook <kees@kernel.org>; Zaremba, Larysa
> <larysa.zaremba@intel.com>; Joe Damato <joe@dama.to>; Breno Leitao
> <leitao@debian.org>; Loktionov, Aleksandr
> <aleksandr.loktionov@intel.com>; Uwe Kleine-König (The Capable Hub)
> <u.kleine-koenig@baylibre.com>; Fabio Baltieri
> <fabio.baltieri@gmail.com>; Thomas Gleixner <tglx@kernel.org>; Greg
> Kroah-Hartman <gregkh@linuxfoundation.org>; Jiawen Wu
> <jiawenwu@trustnetic.com>
> Subject: [PATCH net-next v5 4/5] net: wangxun: implement soft quiesce
> for PCIe error recovery
>
> Function wx_soft_quiesce() provide a lightweight shutdown path during
> PCIe error recovery. It avoids MMIO-dependent operations in PCIe error
> status.
>
> Waiting for the service task to complete may unnecessarily delay PCIe
> error recovery, especially if the work item is already blocked by the
> hardware failure that triggered AER. So the service task is not
> explicitly cancelled in quiesce path. As a measure to block the
> service task, the checking of WX_STATE_DOWN and WX_STATE_RESETTING is
> added at the entry of every work item.
>
> Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
> ---
> drivers/net/ethernet/wangxun/libwx/wx_lib.c | 14 +++++++++++++
> drivers/net/ethernet/wangxun/libwx/wx_lib.h | 1 +
> drivers/net/ethernet/wangxun/libwx/wx_ptp.c | 21
> +++++++++++++++++++
> drivers/net/ethernet/wangxun/libwx/wx_ptp.h | 1 +
> .../net/ethernet/wangxun/txgbe/txgbe_main.c | 8 +++++++
> 5 files changed, 45 insertions(+)
>
> diff --git a/drivers/net/ethernet/wangxun/libwx/wx_lib.c
> b/drivers/net/ethernet/wangxun/libwx/wx_lib.c
> index e5a45356ba00..0667eb1fe5fe 100644
> --- a/drivers/net/ethernet/wangxun/libwx/wx_lib.c
> +++ b/drivers/net/ethernet/wangxun/libwx/wx_lib.c
> @@ -3382,5 +3382,19 @@ void wx_service_timer(struct timer_list *t) }
> EXPORT_SYMBOL(wx_service_timer);
>
> +void wx_soft_quiesce(struct wx *wx)
> +{
> + wx_ptp_quiesce(wx);
> + pci_clear_master(wx->pdev);
> + netif_tx_stop_all_queues(wx->netdev);
> + netif_carrier_off(wx->netdev);
> + netif_tx_disable(wx->netdev);
> + wx_napi_disable_all(wx);
> +
> + clear_bit(WX_FLAG_NEED_PF_RESET, wx->flags);
> + timer_delete_sync(&wx->service_timer);
> +}
> +EXPORT_SYMBOL(wx_soft_quiesce);
> +
> MODULE_DESCRIPTION("Common library for Wangxun(R) Ethernet
> drivers."); MODULE_LICENSE("GPL"); diff --git
> a/drivers/net/ethernet/wangxun/libwx/wx_lib.h
> b/drivers/net/ethernet/wangxun/libwx/wx_lib.h
> index aed6ea8cf0d6..11bd79985e17 100644
> --- a/drivers/net/ethernet/wangxun/libwx/wx_lib.h
> +++ b/drivers/net/ethernet/wangxun/libwx/wx_lib.h
> @@ -41,5 +41,6 @@ void wx_set_ring(struct wx *wx, u32 new_tx_count,
> void wx_service_event_schedule(struct wx *wx); void
> wx_service_event_complete(struct wx *wx); void
> wx_service_timer(struct timer_list *t);
> +void wx_soft_quiesce(struct wx *wx);
>
> #endif /* _WX_LIB_H_ */
> diff --git a/drivers/net/ethernet/wangxun/libwx/wx_ptp.c
> b/drivers/net/ethernet/wangxun/libwx/wx_ptp.c
> index 44f3e6505246..dcc8b3ae1445 100644
> --- a/drivers/net/ethernet/wangxun/libwx/wx_ptp.c
> +++ b/drivers/net/ethernet/wangxun/libwx/wx_ptp.c
> @@ -842,6 +842,27 @@ void wx_ptp_stop(struct wx *wx) }
> EXPORT_SYMBOL(wx_ptp_stop);
>
> +void wx_ptp_quiesce(struct wx *wx)
> +{
> + if (!test_and_clear_bit(WX_STATE_PTP_RUNNING, wx->state))
> + return;
> +
> + clear_bit(WX_FLAG_PTP_PPS_ENABLED, wx->flags);
> +
> + if (wx->ptp_tx_skb) {
> + dev_kfree_skb_any(wx->ptp_tx_skb);
> + wx->ptp_tx_skb = NULL;
> + }
> + clear_bit_unlock(WX_STATE_PTP_TX_IN_PROGRESS, wx->state);
> +
> + if (wx->ptp_clock) {
> + ptp_clock_unregister(wx->ptp_clock);
> + wx->ptp_clock = NULL;
> + dev_info(&wx->pdev->dev, "removed PHC on %s\n", wx-
> >netdev->name);
> + }
> +}
> +EXPORT_SYMBOL(wx_ptp_quiesce);
> +
> /**
> * wx_ptp_rx_hwtstamp - utility function which checks for RX time
> stamp
> * @wx: pointer to wx struct
> diff --git a/drivers/net/ethernet/wangxun/libwx/wx_ptp.h
> b/drivers/net/ethernet/wangxun/libwx/wx_ptp.h
> index 50db90a6e3ee..ad2f824875d5 100644
> --- a/drivers/net/ethernet/wangxun/libwx/wx_ptp.h
> +++ b/drivers/net/ethernet/wangxun/libwx/wx_ptp.h
> @@ -10,6 +10,7 @@ void wx_ptp_reset(struct wx *wx); void
> wx_ptp_init(struct wx *wx); void wx_ptp_suspend(struct wx *wx); void
> wx_ptp_stop(struct wx *wx);
> +void wx_ptp_quiesce(struct wx *wx);
> void wx_ptp_rx_hwtstamp(struct wx *wx, struct sk_buff *skb); int
> wx_hwtstamp_get(struct net_device *dev,
> struct kernel_hwtstamp_config *cfg); diff --git
> a/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c
> b/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c
> index 9251e7a1d416..f6e596eb9217 100644
> --- a/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c
> +++ b/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c
> @@ -94,6 +94,10 @@ static void txgbe_module_detection_subtask(struct
> wx *wx) {
> int err;
>
> + if (test_bit(WX_STATE_DOWN, wx->state) ||
> + test_bit(WX_STATE_RESETTING, wx->state))
> + return;
> +
> if (!test_and_clear_bit(WX_FLAG_NEED_MODULE_RESET, wx->flags))
> return;
>
> @@ -107,6 +111,10 @@ static void txgbe_module_detection_subtask(struct
> wx *wx)
>
> static void txgbe_link_config_subtask(struct wx *wx) {
> + if (test_bit(WX_STATE_DOWN, wx->state) ||
> + test_bit(WX_STATE_RESETTING, wx->state))
> + return;
> +
> if (!test_and_clear_bit(WX_FLAG_NEED_LINK_CONFIG, wx->flags))
> return;
>
> --
> 2.51.0
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
* Re: [PATCH net-next v5 5/5] net: wangxun: add pcie error handler
2026-06-04 8:56 ` [PATCH net-next v5 5/5] net: wangxun: add pcie error handler Jiawen Wu
@ 2026-06-08 15:05 ` Simon Horman
0 siblings, 0 replies; 9+ messages in thread
From: Simon Horman @ 2026-06-08 15:05 UTC (permalink / raw)
To: jiawenwu
Cc: 'Simon Horman', netdev, mengyuanlou, andrew+netdev, davem,
edumazet, kuba, pabeni, richardcochran, linux, jacob.e.keller,
michal.swiatkowski, kees, larysa.zaremba, joe, leitao,
aleksandr.loktionov, u.kleine-koenig, fabio.baltieri, tglx,
gregkh
From: 'Simon Horman' <horms@kernel.org>
This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://sashiko.dev
Simon says: AI-generated review is also available for the other patches
in this patch-set, but I believe the issues flagged below are the ones
that warrant investigation in the context of this patchset.
---
net: wangxun: add pcie error handler
Support the AER driver in handling PCIe errors. When a PCIe error occurs,
the netdev watchdog Tx timeout sometimes fires before the AER error report,
and MMIO during the reset process would then block the CPU. To prevent
this, an error return is added to the reset path. The ngbe implementation
is not yet complete; it will be finished in the future.
> diff --git a/drivers/net/ethernet/wangxun/libwx/wx_err.c b/drivers/net/ethernet/wangxun/libwx/wx_err.c
> index ee27f96735dc7..9a4dbb3427d93 100644
> --- a/drivers/net/ethernet/wangxun/libwx/wx_err.c
> +++ b/drivers/net/ethernet/wangxun/libwx/wx_err.c
[ ... ]
> @@ -4,11 +4,125 @@
> static pci_ers_result_t wx_io_slot_reset(struct pci_dev *pdev)
> {
> struct wx *wx = pci_get_drvdata(pdev);
> pci_ers_result_t result;
>
> if (pci_enable_device_mem(pdev)) {
> wx_err(wx, "Cannot re-enable PCI device after reset.\n");
> result = PCI_ERS_RESULT_DISCONNECT;
> } else {
> /* make all memory operations done before clearing the flag */
> smp_mb__before_atomic();
> clear_bit(WX_STATE_DISABLED, wx->state);
> pci_set_master(pdev);
> pci_restore_state(pdev);
> pci_wake_from_d3(pdev, false);
>
> + rtnl_lock();
> + if (netif_running(wx->netdev) && wx->down_suspend)
> + wx->down_suspend(wx);
> + if (wx->do_reset)
> + wx->do_reset(wx->netdev, false);
> + rtnl_unlock();
[Severity: Critical]
Does this sequence leave the netdev state out of sync if wx_io_resume()
later fails?
If wx_io_resume() fails its ndo_open() call, it returns with netif_running()
still true. If an administrator later sets the interface down, the kernel
will call ndo_stop (txgbe_close).
txgbe_close() unconditionally calls wx_free_irq(), txgbe_free_misc_irq(), and
wx_free_resources(). Because txgbe_free_misc_irq() does not set
txgbe->misc.domain to NULL, calling it a second time causes a double free.
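The double-free concern above hinges on the free routine not being
idempotent. A minimal userspace sketch of the usual guard, clearing the
pointer after release so that a second call becomes a no-op, might look
like the following. The names misc_irq_model, misc_irq_setup() and
misc_irq_free() are hypothetical stand-ins, not the driver's code, and
malloc()/free() merely stand in for whatever txgbe_free_misc_irq()
actually releases.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the driver's misc IRQ bookkeeping. */
struct misc_irq_model {
	int *domain;     /* plays the role of txgbe->misc.domain */
	int free_calls;  /* counts how often the resource was really released */
};

/* Allocate the modelled resource; returns 0 on success, -1 on failure. */
static int misc_irq_setup(struct misc_irq_model *m)
{
	assert(m->domain == NULL);
	m->domain = malloc(sizeof(*m->domain));
	return m->domain ? 0 : -1;
}

/* Safe to call more than once: the pointer is cleared after the release. */
static void misc_irq_free(struct misc_irq_model *m)
{
	if (!m->domain)
		return;
	free(m->domain);
	m->domain = NULL;
	m->free_calls++;
}
```

With this shape, a second teardown after a failed resume finds a NULL
pointer and returns early. Whether a NULL guard is the right fix for the
real driver, rather than keeping netif_running() consistent with the
device state, is a separate question this sketch does not answer.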
> diff --git a/drivers/net/ethernet/wangxun/txgbe/txgbe_ethtool.c b/drivers/net/ethernet/wangxun/txgbe/txgbe_ethtool.c
> index 3e32aca728065..80811947d5acf 100644
> --- a/drivers/net/ethernet/wangxun/txgbe/txgbe_ethtool.c
> +++ b/drivers/net/ethernet/wangxun/txgbe/txgbe_ethtool.c
[ ... ]
> @@ -78,13 +78,17 @@ static int txgbe_set_ringparam(struct net_device *netdev,
> goto clear_reset;
> }
>
> - txgbe_down(wx);
> + err = txgbe_down(wx);
> + if (err)
> + goto free_temp;
>
> wx_set_ring(wx, new_tx_count, new_rx_count, temp_ring);
> kvfree(temp_ring);
>
> txgbe_up(wx);
>
> +free_temp:
> + kvfree(temp_ring);
> clear_reset:
> clear_bit(WX_STATE_RESETTING, wx->state);
[Severity: High]
If txgbe_down() returns an error here, the function jumps to free_temp and
skips calling txgbe_up().
Because txgbe_disable_device() has already called wx_soft_quiesce() to stop
TX queues and disable NAPI, does this leave the interface logically stuck
down while the OS still considers it up?
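Separately from the question above, the hunk as quoted appears to reach
kvfree(temp_ring) twice on the success path: once right after
wx_set_ring() and again when execution falls through to free_temp. A
minimal userspace sketch of a single release point shared by both paths
follows. All names (ring_dev_model, model_down(), model_up(),
model_set_ringparam()) are hypothetical, calloc()/free() stand in for
the kernel allocator, and the sketch does not attempt to answer whether
the interface should be brought back up after a failed down.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical model of the device state touched by a ring resize. */
struct ring_dev_model {
	int down_err;    /* value the modelled "down" step returns */
	int up_calls;    /* how many times the modelled "up" step ran */
	int ring_count;  /* currently configured ring size */
};

static int model_down(struct ring_dev_model *d)
{
	return d->down_err;
}

static void model_up(struct ring_dev_model *d)
{
	d->up_calls++;
}

/* Resize flow with one exit label that frees the temporary buffer once. */
static int model_set_ringparam(struct ring_dev_model *d, int new_count)
{
	int *temp_ring;
	int err;

	assert(new_count > 0);
	temp_ring = calloc((size_t)new_count, sizeof(*temp_ring));
	if (!temp_ring)
		return -1;

	err = model_down(d);
	if (err)
		goto free_temp;

	d->ring_count = new_count;
	model_up(d);

free_temp:
	free(temp_ring);  /* single release point for success and failure */
	return err;
}
```

On failure the old ring size is kept and the error is propagated; on
success the buffer is released exactly once after the device is back up.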
> diff --git a/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c b/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c
> index f6e596eb92178..98786efbe871d 100644
> --- a/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c
> +++ b/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c
[ ... ]
> @@ -212,15 +212,21 @@ static void txgbe_reset(struct wx *wx)
> wx_ptp_reset(wx);
> }
>
> -static void txgbe_disable_device(struct wx *wx)
> +static int txgbe_disable_device(struct wx *wx)
> {
> struct net_device *netdev = wx->netdev;
> + int ret = 0;
> u32 i;
>
> if (test_and_set_bit(WX_STATE_DOWN, wx->state))
> - return;
> + return 0;
> +
> + ret = wx_disable_pcie_master(wx);
> + if (ret) {
> + wx_soft_quiesce(wx);
> + return ret;
> + }
[Severity: High]
If wx_disable_pcie_master() times out and returns an error, the function
exits without clearing the WX_STATE_DOWN bit. If a subsequent teardown
attempt occurs, will the WX_STATE_DOWN check falsely assume the device is
already disabled and return 0 (success)?
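The sticky-bit scenario described above can be reproduced in a small
userspace model. This is a sketch under stated assumptions: C11 atomics
stand in for the kernel's test_and_set_bit()/clear_bit(), and
dev_model, MODEL_STATE_DOWN, MODEL_ETIMEDOUT and both
disable_device_*() functions are hypothetical names, not driver code.
The second variant shows one possible way to make a retry meaningful;
whether clearing the bit is appropriate in the real driver depends on
what wx_soft_quiesce() has already torn down, which this thread does not
show.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MODEL_STATE_DOWN 0   /* hypothetical bit index, stands in for WX_STATE_DOWN */
#define MODEL_ETIMEDOUT  110 /* hypothetical error value for the modelled timeout */

struct dev_model {
	atomic_ulong state;
	bool pcie_master_stuck;  /* when true, disabling bus mastering "times out" */
	bool hw_stopped;         /* set once the full teardown has actually run */
};

static void dev_model_init(struct dev_model *d, bool stuck)
{
	atomic_init(&d->state, 0);
	d->pcie_master_stuck = stuck;
	d->hw_stopped = false;
}

/* Userspace analogue of test_and_set_bit(): returns the previous bit value. */
static bool model_test_and_set_bit(int nr, atomic_ulong *addr)
{
	unsigned long mask = 1UL << nr;

	return (atomic_fetch_or(addr, mask) & mask) != 0;
}

static void model_clear_bit(int nr, atomic_ulong *addr)
{
	atomic_fetch_and(addr, ~(1UL << nr));
}

/* Shape of the quoted hunk: the DOWN bit stays set on the early error return. */
static int disable_device_as_quoted(struct dev_model *d)
{
	if (model_test_and_set_bit(MODEL_STATE_DOWN, &d->state))
		return 0;
	if (d->pcie_master_stuck)
		return -MODEL_ETIMEDOUT;
	d->hw_stopped = true;
	return 0;
}

/* One possible alternative: undo the bit so a later retry runs the teardown. */
static int disable_device_retryable(struct dev_model *d)
{
	if (model_test_and_set_bit(MODEL_STATE_DOWN, &d->state))
		return 0;
	if (d->pcie_master_stuck) {
		model_clear_bit(MODEL_STATE_DOWN, &d->state);
		return -MODEL_ETIMEDOUT;
	}
	d->hw_stopped = true;
	return 0;
}
```

In the first variant a second call reports success although the modelled
hardware was never stopped; in the second, a retry after the bus becomes
responsive completes the teardown.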
[Severity: Critical]
Returning early here skips critical hardware teardown like wx_disable_rx()
and the MAC shutdown.
Because callers like txgbe_close() do not check the return value of
txgbe_down(), they proceed to unconditionally free all DMA rings via
wx_free_resources(). Can this allow the hardware to DMA incoming packets into
freed memory pages?
[ ... ]
> @@ -606,13 +635,16 @@ int txgbe_setup_tc(struct net_device *dev, u8 tc)
>
> static void txgbe_reinit_locked(struct wx *wx)
> {
> + int ret;
> +
> netif_trans_update(wx->netdev);
>
> mutex_lock(&wx->reset_lock);
> set_bit(WX_STATE_RESETTING, wx->state);
>
> - txgbe_down(wx);
> - txgbe_up(wx);
> + ret = txgbe_down(wx);
> + if (!ret)
> + txgbe_up(wx);
>
> clear_bit(WX_STATE_RESETTING, wx->state);
[Severity: High]
Similar to the ethtool path, if txgbe_down() fails here, txgbe_up() is
skipped. Does this leave the interface in a broken state until the next
administrative down/up cycle?