public inbox for netdev@vger.kernel.org
* [PATCH net-next v2 0/6] net: wangxun: timeout and error
@ 2026-04-30  8:25 Jiawen Wu
  2026-04-30  8:25 ` [PATCH net-next v2 1/6] net: ngbe: implement libwx reset ops Jiawen Wu
                   ` (5 more replies)
  0 siblings, 6 replies; 12+ messages in thread
From: Jiawen Wu @ 2026-04-30  8:25 UTC
  To: netdev
  Cc: Mengyuan Lou, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Richard Cochran, Russell King,
	Simon Horman, Kees Cook, Larysa Zaremba, Breno Leitao, Joe Damato,
	Jacob Keller, Fabio Baltieri, Jiawen Wu

This series is a split of the previous series:
https://lore.kernel.org/all/20260326021406.30444-1-jiawenwu@trustnetic.com

It adds Tx timeout handling and pci_error_handlers.
Changes since the last full patch set (v6):
- Add 'else' handling in ngbe_do_reset().
- Acquire rtnl_lock() before checking netif_running() in
  wx_reset_subtask().
- Use test_and_clear_bit() instead of test_bit()…clear_bit() to avoid
  losing another reset request.
- Change 'u64 tx_done_old' to 'u32' to avoid a data race between
  dev_watchdog and NAPI polling.
- Check the return value of ndo_open() in wx_io_resume().
- Drop pci_save_state().
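
The test_and_clear_bit() item above can be illustrated with a small
userspace sketch. It is not driver code: it uses C11 atomics in place of
the kernel bitops, and the names flags, NEED_RESET_BIT, request_reset()
and consume_reset_request() are hypothetical. The point it shows is that
reading and clearing the bit in one atomic step leaves no window in which
a second request could be set and then wiped by a separate clear.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag word standing in for wx->flags. */
#define NEED_RESET_BIT 0x1u

static atomic_uint flags;

/* Producer side: record that a reset is wanted. */
void request_reset(void)
{
	atomic_fetch_or(&flags, NEED_RESET_BIT);
}

/*
 * Consumer side: fetch the old value and clear the bit in a single
 * atomic operation. A request that arrives after this call stays set
 * and is seen by the next call, instead of being lost to a later,
 * separate clear. Returns true if a request was pending.
 */
bool consume_reset_request(void)
{
	unsigned int old = atomic_fetch_and(&flags, ~NEED_RESET_BIT);

	return (old & NEED_RESET_BIT) != 0;
}
```

Several requests made before one consume collapse into a single pending
reset, which matches the behaviour of a single flag bit.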

Changelog:
v2:
- Add the missing rtnl_unlock() at early return in wx_reset_subtask().
- Replace ngbe_close() with ngbe_close_suspend() in ngbe_dev_shutdown().
- Add a patch to clear stored DMA addresses.
 
v1: https://lore.kernel.org/r/20260428021156.13564-1-jiawenwu@trustnetic.com

Jiawen Wu (6):
  net: ngbe: implement libwx reset ops
  net: wangxun: add Tx timeout process
  net: wangxun: add reinit parameter to wx->do_reset callback
  net: wangxun: extract the close_suspend sequence
  net: wangxun: clear stored DMA addresses after dma_free_coherent()
  net: wangxun: implement pci_error_handlers ops

 drivers/net/ethernet/wangxun/libwx/Makefile   |   2 +-
 drivers/net/ethernet/wangxun/libwx/wx_err.c   | 233 ++++++++++++++++++
 drivers/net/ethernet/wangxun/libwx/wx_err.h   |  16 ++
 .../net/ethernet/wangxun/libwx/wx_ethtool.c   |   2 +-
 drivers/net/ethernet/wangxun/libwx/wx_hw.c    |  17 +-
 drivers/net/ethernet/wangxun/libwx/wx_lib.c   |  46 +++-
 drivers/net/ethernet/wangxun/libwx/wx_lib.h   |   1 +
 drivers/net/ethernet/wangxun/libwx/wx_type.h  |  16 +-
 .../net/ethernet/wangxun/ngbe/ngbe_ethtool.c  |   1 -
 drivers/net/ethernet/wangxun/ngbe/ngbe_main.c |  70 +++++-
 drivers/net/ethernet/wangxun/ngbe/ngbe_type.h |   2 +
 .../net/ethernet/wangxun/txgbe/txgbe_main.c   |  29 ++-
 .../net/ethernet/wangxun/txgbe/txgbe_type.h   |   3 +-
 13 files changed, 407 insertions(+), 31 deletions(-)
 create mode 100644 drivers/net/ethernet/wangxun/libwx/wx_err.c
 create mode 100644 drivers/net/ethernet/wangxun/libwx/wx_err.h

-- 
2.51.0



* [PATCH net-next v2 1/6] net: ngbe: implement libwx reset ops
  2026-04-30  8:25 [PATCH net-next v2 0/6] net: wangxun: timeout and error Jiawen Wu
@ 2026-04-30  8:25 ` Jiawen Wu
  2026-05-03  2:15   ` Jakub Kicinski
  2026-04-30  8:25 ` [PATCH net-next v2 2/6] net: wangxun: add Tx timeout process Jiawen Wu
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 12+ messages in thread
From: Jiawen Wu @ 2026-04-30  8:25 UTC
  To: netdev
  Cc: Mengyuan Lou, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Richard Cochran, Russell King,
	Simon Horman, Kees Cook, Larysa Zaremba, Breno Leitao, Joe Damato,
	Jacob Keller, Fabio Baltieri, Jiawen Wu

Implement the wx->do_reset() callback for ngbe so that the libwx library
module can trigger a device reset.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
---
 .../net/ethernet/wangxun/ngbe/ngbe_ethtool.c  |  1 -
 drivers/net/ethernet/wangxun/ngbe/ngbe_main.c | 37 ++++++++++++++++++-
 drivers/net/ethernet/wangxun/ngbe/ngbe_type.h |  1 +
 3 files changed, 36 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/wangxun/ngbe/ngbe_ethtool.c b/drivers/net/ethernet/wangxun/ngbe/ngbe_ethtool.c
index b2e191982803..1960f7154151 100644
--- a/drivers/net/ethernet/wangxun/ngbe/ngbe_ethtool.c
+++ b/drivers/net/ethernet/wangxun/ngbe/ngbe_ethtool.c
@@ -59,7 +59,6 @@ static int ngbe_set_ringparam(struct net_device *netdev,
 	wx_set_ring(wx, new_tx_count, new_rx_count, temp_ring);
 	kvfree(temp_ring);
 
-	wx_configure(wx);
 	ngbe_up(wx);
 
 clear_reset:
diff --git a/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c b/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c
index d8e3827a8b1f..bd905e267575 100644
--- a/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c
+++ b/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c
@@ -133,6 +133,7 @@ static int ngbe_sw_init(struct wx *wx)
 
 	wx->mbx.size = WX_VXMAILBOX_SIZE;
 	wx->setup_tc = ngbe_setup_tc;
+	wx->do_reset = ngbe_do_reset;
 	set_bit(0, &wx->fwd_bitmask);
 
 	return 0;
@@ -422,7 +423,7 @@ void ngbe_down(struct wx *wx)
 	wx_clean_all_rx_rings(wx);
 }
 
-void ngbe_up(struct wx *wx)
+static void ngbe_up_complete(struct wx *wx)
 {
 	wx_configure_vectors(wx);
 
@@ -488,7 +489,7 @@ static int ngbe_open(struct net_device *netdev)
 
 	wx_ptp_init(wx);
 
-	ngbe_up(wx);
+	ngbe_up_complete(wx);
 
 	return 0;
 err_dis_phy:
@@ -501,6 +502,12 @@ static int ngbe_open(struct net_device *netdev)
 	return err;
 }
 
+void ngbe_up(struct wx *wx)
+{
+	wx_configure(wx);
+	ngbe_up_complete(wx);
+}
+
 /**
  * ngbe_close - Disables a network interface
  * @netdev: network interface device structure
@@ -588,6 +595,8 @@ int ngbe_setup_tc(struct net_device *dev, u8 tc)
 	 */
 	if (netif_running(dev))
 		ngbe_close(dev);
+	else
+		ngbe_reset(wx);
 
 	wx_clear_interrupt_scheme(wx);
 
@@ -604,6 +613,30 @@ int ngbe_setup_tc(struct net_device *dev, u8 tc)
 	return 0;
 }
 
+static void ngbe_reinit_locked(struct wx *wx)
+{
+	netif_trans_update(wx->netdev);
+
+	mutex_lock(&wx->reset_lock);
+	set_bit(WX_STATE_RESETTING, wx->state);
+
+	ngbe_down(wx);
+	ngbe_up(wx);
+
+	clear_bit(WX_STATE_RESETTING, wx->state);
+	mutex_unlock(&wx->reset_lock);
+}
+
+void ngbe_do_reset(struct net_device *netdev)
+{
+	struct wx *wx = netdev_priv(netdev);
+
+	if (netif_running(netdev))
+		ngbe_reinit_locked(wx);
+	else
+		ngbe_reset(wx);
+}
+
 static const struct net_device_ops ngbe_netdev_ops = {
 	.ndo_open               = ngbe_open,
 	.ndo_stop               = ngbe_close,
diff --git a/drivers/net/ethernet/wangxun/ngbe/ngbe_type.h b/drivers/net/ethernet/wangxun/ngbe/ngbe_type.h
index 7077a0da4c98..4f648f272c08 100644
--- a/drivers/net/ethernet/wangxun/ngbe/ngbe_type.h
+++ b/drivers/net/ethernet/wangxun/ngbe/ngbe_type.h
@@ -125,5 +125,6 @@ extern char ngbe_driver_name[];
 void ngbe_down(struct wx *wx);
 void ngbe_up(struct wx *wx);
 int ngbe_setup_tc(struct net_device *dev, u8 tc);
+void ngbe_do_reset(struct net_device *netdev);
 
 #endif /* _NGBE_TYPE_H_ */
-- 
2.51.0



* [PATCH net-next v2 2/6] net: wangxun: add Tx timeout process
  2026-04-30  8:25 [PATCH net-next v2 0/6] net: wangxun: timeout and error Jiawen Wu
  2026-04-30  8:25 ` [PATCH net-next v2 1/6] net: ngbe: implement libwx reset ops Jiawen Wu
@ 2026-04-30  8:25 ` Jiawen Wu
  2026-05-03  2:15   ` Jakub Kicinski
  2026-04-30  8:25 ` [PATCH net-next v2 3/6] net: wangxun: add reinit parameter to wx->do_reset callback Jiawen Wu
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 12+ messages in thread
From: Jiawen Wu @ 2026-04-30  8:25 UTC
  To: netdev
  Cc: Mengyuan Lou, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Richard Cochran, Russell King,
	Simon Horman, Kees Cook, Larysa Zaremba, Breno Leitao, Joe Damato,
	Jacob Keller, Fabio Baltieri, Jiawen Wu

Implement .ndo_tx_timeout to handle Tx timeout events. When a Tx
timeout occurs, it triggers the driver's reset process.

The WX_HANG_CHECK_ARMED bit is set to indicate a potential hang. It is
cleared when a pause frame is received, to avoid false hang detection
caused by 802.3x flow control.
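
The armed-bit scheme described above can be sketched in plain userspace
C. This is a simplified model, not the driver code: struct ring_model and
the ring_*() helpers are hypothetical names, and a plain bool stands in
for the WX_HANG_CHECK_ARMED bit. A hang is reported only when two checks
in a row see pending descriptors with no completed packets in between,
and a received pause frame disarms the check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified, hypothetical stand-in for the driver's Tx ring. */
struct ring_model {
	uint32_t count;         /* number of descriptors in the ring */
	uint32_t next_to_use;   /* producer index */
	uint32_t next_to_clean; /* consumer index */
	uint32_t packets_done;  /* packets completed so far */
	uint32_t done_old;      /* packets_done seen at the previous check */
	bool armed;             /* models WX_HANG_CHECK_ARMED */
};

/* Descriptors queued but not yet cleaned, allowing for index wrap. */
uint32_t ring_pending(const struct ring_model *r)
{
	uint32_t head = r->next_to_clean;
	uint32_t tail = r->next_to_use;

	return (head <= tail ? tail : tail + r->count) - head;
}

/* Returns true only on the second consecutive check without progress. */
bool ring_check_hang(struct ring_model *r)
{
	if (r->done_old == r->packets_done && ring_pending(r)) {
		bool was_armed = r->armed;

		r->armed = true;
		return was_armed;
	}

	/* Progress was made (or nothing is pending): restart the countdown. */
	r->done_old = r->packets_done;
	r->armed = false;
	return false;
}

/* A pause frame explains a stalled queue, so disarm the check. */
void ring_pause_received(struct ring_model *r)
{
	r->armed = false;
}
```

With this shape, a single stalled interval never triggers a reset on its
own; it only arms the check for the next pass.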

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
---
 drivers/net/ethernet/wangxun/libwx/Makefile   |   2 +-
 drivers/net/ethernet/wangxun/libwx/wx_err.c   | 126 ++++++++++++++++++
 drivers/net/ethernet/wangxun/libwx/wx_err.h   |  14 ++
 drivers/net/ethernet/wangxun/libwx/wx_hw.c    |  17 ++-
 drivers/net/ethernet/wangxun/libwx/wx_lib.c   |  37 +++++
 drivers/net/ethernet/wangxun/libwx/wx_lib.h   |   1 +
 drivers/net/ethernet/wangxun/libwx/wx_type.h  |  12 +-
 drivers/net/ethernet/wangxun/ngbe/ngbe_main.c |   4 +
 .../net/ethernet/wangxun/txgbe/txgbe_main.c   |   4 +
 9 files changed, 212 insertions(+), 5 deletions(-)
 create mode 100644 drivers/net/ethernet/wangxun/libwx/wx_err.c
 create mode 100644 drivers/net/ethernet/wangxun/libwx/wx_err.h

diff --git a/drivers/net/ethernet/wangxun/libwx/Makefile b/drivers/net/ethernet/wangxun/libwx/Makefile
index a71b0ad77de3..c8724bb129aa 100644
--- a/drivers/net/ethernet/wangxun/libwx/Makefile
+++ b/drivers/net/ethernet/wangxun/libwx/Makefile
@@ -4,5 +4,5 @@
 
 obj-$(CONFIG_LIBWX) += libwx.o
 
-libwx-objs := wx_hw.o wx_lib.o wx_ethtool.o wx_ptp.o wx_mbx.o wx_sriov.o
+libwx-objs := wx_hw.o wx_lib.o wx_ethtool.o wx_ptp.o wx_mbx.o wx_sriov.o wx_err.o
 libwx-objs += wx_vf.o wx_vf_lib.o wx_vf_common.o
diff --git a/drivers/net/ethernet/wangxun/libwx/wx_err.c b/drivers/net/ethernet/wangxun/libwx/wx_err.c
new file mode 100644
index 000000000000..ba5f23cefc0f
--- /dev/null
+++ b/drivers/net/ethernet/wangxun/libwx/wx_err.c
@@ -0,0 +1,126 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2015 - 2026 Beijing WangXun Technology Co., Ltd. */
+
+#include <linux/netdevice.h>
+#include <linux/pci.h>
+
+#include "wx_type.h"
+#include "wx_lib.h"
+#include "wx_err.h"
+
+static void wx_reset_subtask(struct wx *wx)
+{
+	if (!test_bit(WX_FLAG_NEED_PF_RESET, wx->flags))
+		return;
+
+	rtnl_lock();
+
+	if (!netif_running(wx->netdev) ||
+	    test_bit(WX_STATE_RESETTING, wx->state))
+		goto out;
+
+	wx_warn(wx, "Reset adapter.\n");
+
+	if (test_and_clear_bit(WX_FLAG_NEED_PF_RESET, wx->flags)) {
+		if (wx->do_reset)
+			wx->do_reset(wx->netdev);
+	}
+
+out:
+	rtnl_unlock();
+}
+
+/**
+ * wx_check_tx_hang_subtask - check for hung queues and dropped interrupts
+ * @wx: pointer to the device wx structure
+ *
+ * This function serves two purposes.  First it strobes the interrupt lines
+ * in order to make certain interrupts are occurring.  Secondly it sets the
+ * bits needed to check for TX hangs.  As a result we should immediately
+ * determine if a hang has occurred.
+ */
+static void wx_check_tx_hang_subtask(struct wx *wx)
+{
+	int i;
+
+	/* If we're down or resetting, just bail */
+	if (!netif_running(wx->netdev) ||
+	    test_bit(WX_STATE_RESETTING, wx->state))
+		return;
+
+	/* Force detection of hung controller */
+	if (netif_carrier_ok(wx->netdev)) {
+		for (i = 0; i < wx->num_tx_queues; i++)
+			set_bit(WX_TX_DETECT_HANG, wx->tx_ring[i]->state);
+	}
+}
+
+void wx_handle_errors_subtask(struct wx *wx)
+{
+	wx_reset_subtask(wx);
+	wx_check_tx_hang_subtask(wx);
+}
+EXPORT_SYMBOL(wx_handle_errors_subtask);
+
+static void wx_tx_timeout_reset(struct wx *wx)
+{
+	if (!netif_running(wx->netdev))
+		return;
+
+	set_bit(WX_FLAG_NEED_PF_RESET, wx->flags);
+	wx_warn(wx, "initiating reset due to tx timeout\n");
+	wx_service_event_schedule(wx);
+}
+
+void wx_tx_timeout(struct net_device *netdev, unsigned int txqueue)
+{
+	struct wx *wx = netdev_priv(netdev);
+	u32 head, tail;
+	int i;
+
+	for (i = 0; i < wx->num_tx_queues; i++) {
+		struct wx_ring *tx_ring = wx->tx_ring[i];
+
+		if (test_bit(WX_TX_DETECT_HANG, tx_ring->state) &&
+		    wx_check_tx_hang(tx_ring))
+			wx_warn(wx, "Real tx hang detected on queue %d\n", i);
+
+		head = rd32(wx, WX_PX_TR_RP(tx_ring->reg_idx));
+		tail = rd32(wx, WX_PX_TR_WP(tx_ring->reg_idx));
+		wx_warn(wx,
+			"tx ring %d next_to_use is %d, next_to_clean is %d\n",
+			i, tx_ring->next_to_use,
+			tx_ring->next_to_clean);
+		wx_warn(wx, "tx ring %d hw rp is 0x%x, wp is 0x%x\n",
+			i, head, tail);
+	}
+
+	wx_tx_timeout_reset(wx);
+}
+EXPORT_SYMBOL(wx_tx_timeout);
+
+void wx_handle_tx_hang(struct wx_ring *tx_ring, unsigned int next)
+{
+	struct wx *wx = netdev_priv(tx_ring->netdev);
+
+	wx_warn(wx, "Detected Tx Unit Hang\n"
+		"  Tx Queue             <%d>\n"
+		"  TDH, TDT             <%x>, <%x>\n"
+		"  next_to_use          <%x>\n"
+		"  next_to_clean        <%x>\n"
+		"tx_buffer_info[next_to_clean]\n"
+		"  time_stamp           <%lx>\n"
+		"  jiffies              <%lx>\n",
+		tx_ring->queue_index,
+		rd32(wx, WX_PX_TR_RP(tx_ring->reg_idx)),
+		rd32(wx, WX_PX_TR_WP(tx_ring->reg_idx)),
+		tx_ring->next_to_use, next,
+		tx_ring->tx_buffer_info[next].time_stamp, jiffies);
+
+	netif_stop_subqueue(tx_ring->netdev, tx_ring->queue_index);
+
+	wx_warn(wx, "tx hang detected on queue %d, resetting adapter\n",
+		tx_ring->queue_index);
+
+	wx_tx_timeout_reset(wx);
+}
diff --git a/drivers/net/ethernet/wangxun/libwx/wx_err.h b/drivers/net/ethernet/wangxun/libwx/wx_err.h
new file mode 100644
index 000000000000..e317e6c8d928
--- /dev/null
+++ b/drivers/net/ethernet/wangxun/libwx/wx_err.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * WangXun Gigabit PCI Express Linux driver
+ * Copyright (c) 2015 - 2026 Beijing WangXun Technology Co., Ltd.
+ */
+
+#ifndef _WX_ERR_H_
+#define _WX_ERR_H_
+
+void wx_handle_errors_subtask(struct wx *wx);
+void wx_tx_timeout(struct net_device *netdev, unsigned int txqueue);
+void wx_handle_tx_hang(struct wx_ring *tx_ring, unsigned int next);
+
+#endif /* _WX_ERR_H_ */
diff --git a/drivers/net/ethernet/wangxun/libwx/wx_hw.c b/drivers/net/ethernet/wangxun/libwx/wx_hw.c
index d3772d01e00b..401dc7eb1137 100644
--- a/drivers/net/ethernet/wangxun/libwx/wx_hw.c
+++ b/drivers/net/ethernet/wangxun/libwx/wx_hw.c
@@ -1932,6 +1932,7 @@ static void wx_configure_tx_ring(struct wx *wx,
 	else
 		ring->atr_sample_rate = 0;
 
+	bitmap_zero(ring->state, WX_RING_STATE_NBITS);
 	/* reinitialize tx_buffer_info */
 	memset(ring->tx_buffer_info, 0,
 	       sizeof(struct wx_tx_buffer) * ring->count);
@@ -2847,16 +2848,26 @@ EXPORT_SYMBOL(wx_fc_enable);
 static void wx_update_xoff_rx_lfc(struct wx *wx)
 {
 	struct wx_hw_stats *hwstats = &wx->stats;
+	u64 data;
+	int i;
 
 	if (wx->fc.mode != wx_fc_full &&
 	    wx->fc.mode != wx_fc_rx_pause)
 		return;
 
 	if (wx->mac.type >= wx_mac_aml)
-		hwstats->lxoffrxc += rd32_wrap(wx, WX_MAC_LXOFFRXC_AML,
-					       &wx->last_stats.lxoffrxc);
+		data = rd32_wrap(wx, WX_MAC_LXOFFRXC_AML,
+				 &wx->last_stats.lxoffrxc);
 	else
-		hwstats->lxoffrxc += rd64(wx, WX_MAC_LXOFFRXC);
+		data = rd64(wx, WX_MAC_LXOFFRXC);
+	hwstats->lxoffrxc += data;
+
+	/* refill credits (no tx hang) if we received xoff */
+	if (!data)
+		return;
+
+	for (i = 0; i < wx->num_tx_queues; i++)
+		clear_bit(WX_HANG_CHECK_ARMED, wx->tx_ring[i]->state);
 }
 
 /**
diff --git a/drivers/net/ethernet/wangxun/libwx/wx_lib.c b/drivers/net/ethernet/wangxun/libwx/wx_lib.c
index 746623fa59b4..9e6167b43f75 100644
--- a/drivers/net/ethernet/wangxun/libwx/wx_lib.c
+++ b/drivers/net/ethernet/wangxun/libwx/wx_lib.c
@@ -14,6 +14,7 @@
 
 #include "wx_type.h"
 #include "wx_lib.h"
+#include "wx_err.h"
 #include "wx_ptp.h"
 #include "wx_hw.h"
 #include "wx_vf_lib.h"
@@ -742,6 +743,36 @@ static struct netdev_queue *wx_txring_txq(const struct wx_ring *ring)
 	return netdev_get_tx_queue(ring->netdev, ring->queue_index);
 }
 
+static u32 wx_get_tx_pending(struct wx_ring *ring)
+{
+	unsigned int head, tail;
+
+	head = ring->next_to_clean;
+	tail = ring->next_to_use;
+
+	return ((head <= tail) ? tail : tail + ring->count) - head;
+}
+
+bool wx_check_tx_hang(struct wx_ring *ring)
+{
+	u32 tx_done_old = ring->tx_stats.tx_done_old;
+	u32 tx_pending = wx_get_tx_pending(ring);
+	u32 tx_done = ring->stats.packets;
+
+	clear_bit(WX_TX_DETECT_HANG, ring->state);
+
+	if (tx_done_old == tx_done && tx_pending)
+		/* make sure it is true for two checks in a row */
+		return test_and_set_bit(WX_HANG_CHECK_ARMED, ring->state);
+
+	/* update completed stats and continue */
+	ring->tx_stats.tx_done_old = tx_done;
+	/* reset the countdown */
+	clear_bit(WX_HANG_CHECK_ARMED, ring->state);
+
+	return false;
+}
+
 /**
  * wx_clean_tx_irq - Reclaim resources after transmit completes
  * @q_vector: structure containing interrupt and ring information
@@ -866,6 +897,12 @@ static bool wx_clean_tx_irq(struct wx_q_vector *q_vector,
 	netdev_tx_completed_queue(wx_txring_txq(tx_ring),
 				  total_packets, total_bytes);
 
+	if (test_bit(WX_TX_DETECT_HANG, tx_ring->state) &&
+	    wx_check_tx_hang(tx_ring)) {
+		wx_handle_tx_hang(tx_ring, i);
+		return true;
+	}
+
 #define TX_WAKE_THRESHOLD (DESC_NEEDED * 2)
 	if (unlikely(total_packets && netif_carrier_ok(tx_ring->netdev) &&
 		     (wx_desc_unused(tx_ring) >= TX_WAKE_THRESHOLD))) {
diff --git a/drivers/net/ethernet/wangxun/libwx/wx_lib.h b/drivers/net/ethernet/wangxun/libwx/wx_lib.h
index aed6ea8cf0d6..e373cd7f05d3 100644
--- a/drivers/net/ethernet/wangxun/libwx/wx_lib.h
+++ b/drivers/net/ethernet/wangxun/libwx/wx_lib.h
@@ -10,6 +10,7 @@
 struct wx_dec_ptype wx_decode_ptype(const u8 ptype);
 void wx_alloc_rx_buffers(struct wx_ring *rx_ring, u16 cleaned_count);
 u16 wx_desc_unused(struct wx_ring *ring);
+bool wx_check_tx_hang(struct wx_ring *ring);
 netdev_tx_t wx_xmit_frame(struct sk_buff *skb,
 			  struct net_device *netdev);
 void wx_napi_enable_all(struct wx *wx);
diff --git a/drivers/net/ethernet/wangxun/libwx/wx_type.h b/drivers/net/ethernet/wangxun/libwx/wx_type.h
index 0da5565ee4ff..f65c2d7bae39 100644
--- a/drivers/net/ethernet/wangxun/libwx/wx_type.h
+++ b/drivers/net/ethernet/wangxun/libwx/wx_type.h
@@ -1039,6 +1039,7 @@ struct wx_queue_stats {
 struct wx_tx_queue_stats {
 	u64 restart_queue;
 	u64 tx_busy;
+	u32 tx_done_old;
 };
 
 struct wx_rx_queue_stats {
@@ -1054,6 +1055,12 @@ struct wx_rx_queue_stats {
 #define wx_for_each_ring(posm, headm) \
 	for (posm = (headm).ring; posm; posm = posm->next)
 
+enum wx_ring_state {
+	WX_TX_DETECT_HANG,
+	WX_HANG_CHECK_ARMED,
+	WX_RING_STATE_NBITS
+};
+
 struct wx_ring_container {
 	struct wx_ring *ring;           /* pointer to linked list of rings */
 	unsigned int total_bytes;       /* total bytes processed this int */
@@ -1073,6 +1080,7 @@ struct wx_ring {
 		struct wx_tx_buffer *tx_buffer_info;
 		struct wx_rx_buffer *rx_buffer_info;
 	};
+	DECLARE_BITMAP(state, WX_RING_STATE_NBITS);
 	u8 __iomem *tail;
 	dma_addr_t dma;                 /* phys. address of descriptor ring */
 	dma_addr_t headwb_dma;
@@ -1273,6 +1281,7 @@ enum wx_pf_flags {
 	WX_FLAG_NEED_DO_RESET,
 	WX_FLAG_RX_MERGE_ENABLED,
 	WX_FLAG_TXHEAD_WB_ENABLED,
+	WX_FLAG_NEED_PF_RESET,
 	WX_PF_FLAGS_NBITS               /* must be last */
 };
 
@@ -1503,7 +1512,8 @@ rd32_wrap(struct wx *wx, u32 reg, u32 *last)
 
 #define wx_err(wx, fmt, arg...) \
 	dev_err(&(wx)->pdev->dev, fmt, ##arg)
-
+#define wx_warn(wx, fmt, arg...) \
+	dev_warn(&(wx)->pdev->dev, fmt, ##arg)
 #define wx_dbg(wx, fmt, arg...) \
 	dev_dbg(&(wx)->pdev->dev, fmt, ##arg)
 
diff --git a/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c b/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c
index bd905e267575..e9561996b970 100644
--- a/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c
+++ b/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c
@@ -14,6 +14,7 @@
 #include "../libwx/wx_type.h"
 #include "../libwx/wx_hw.h"
 #include "../libwx/wx_lib.h"
+#include "../libwx/wx_err.h"
 #include "../libwx/wx_ptp.h"
 #include "../libwx/wx_mbx.h"
 #include "../libwx/wx_sriov.h"
@@ -147,6 +148,7 @@ static void ngbe_service_task(struct work_struct *work)
 {
 	struct wx *wx = container_of(work, struct wx, service_task);
 
+	wx_handle_errors_subtask(wx);
 	wx_update_stats(wx);
 
 	wx_service_event_complete(wx);
@@ -642,6 +644,7 @@ static const struct net_device_ops ngbe_netdev_ops = {
 	.ndo_stop               = ngbe_close,
 	.ndo_change_mtu         = wx_change_mtu,
 	.ndo_start_xmit         = wx_xmit_frame,
+	.ndo_tx_timeout         = wx_tx_timeout,
 	.ndo_set_rx_mode        = wx_set_rx_mode,
 	.ndo_set_features       = wx_set_features,
 	.ndo_fix_features       = wx_fix_features,
@@ -731,6 +734,7 @@ static int ngbe_probe(struct pci_dev *pdev,
 	wx->driver_name = ngbe_driver_name;
 	ngbe_set_ethtool_ops(netdev);
 	netdev->netdev_ops = &ngbe_netdev_ops;
+	netdev->watchdog_timeo = 5 * HZ;
 
 	netdev->features = NETIF_F_SG | NETIF_F_IP_CSUM |
 			   NETIF_F_TSO | NETIF_F_TSO6 |
diff --git a/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c b/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c
index 8b7c3753bb6a..5793da5b7bab 100644
--- a/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c
+++ b/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c
@@ -14,6 +14,7 @@
 
 #include "../libwx/wx_type.h"
 #include "../libwx/wx_lib.h"
+#include "../libwx/wx_err.h"
 #include "../libwx/wx_ptp.h"
 #include "../libwx/wx_hw.h"
 #include "../libwx/wx_mbx.h"
@@ -128,6 +129,7 @@ static void txgbe_service_task(struct work_struct *work)
 {
 	struct wx *wx = container_of(work, struct wx, service_task);
 
+	wx_handle_errors_subtask(wx);
 	txgbe_module_detection_subtask(wx);
 	txgbe_link_config_subtask(wx);
 	wx_update_stats(wx);
@@ -659,6 +661,7 @@ static const struct net_device_ops txgbe_netdev_ops = {
 	.ndo_stop               = txgbe_close,
 	.ndo_change_mtu         = wx_change_mtu,
 	.ndo_start_xmit         = wx_xmit_frame,
+	.ndo_tx_timeout         = wx_tx_timeout,
 	.ndo_set_rx_mode        = wx_set_rx_mode,
 	.ndo_set_features       = wx_set_features,
 	.ndo_fix_features       = wx_fix_features,
@@ -750,6 +753,7 @@ static int txgbe_probe(struct pci_dev *pdev,
 	wx->driver_name = txgbe_driver_name;
 	txgbe_set_ethtool_ops(netdev);
 	netdev->netdev_ops = &txgbe_netdev_ops;
+	netdev->watchdog_timeo = 5 * HZ;
 	netdev->udp_tunnel_nic_info = &txgbe_udp_tunnels;
 
 	/* setup the private structure */
-- 
2.51.0



* [PATCH net-next v2 3/6] net: wangxun: add reinit parameter to wx->do_reset callback
  2026-04-30  8:25 [PATCH net-next v2 0/6] net: wangxun: timeout and error Jiawen Wu
  2026-04-30  8:25 ` [PATCH net-next v2 1/6] net: ngbe: implement libwx reset ops Jiawen Wu
  2026-04-30  8:25 ` [PATCH net-next v2 2/6] net: wangxun: add Tx timeout process Jiawen Wu
@ 2026-04-30  8:25 ` Jiawen Wu
  2026-04-30  8:25 ` [PATCH net-next v2 4/6] net: wangxun: extract the close_suspend sequence Jiawen Wu
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 12+ messages in thread
From: Jiawen Wu @ 2026-04-30  8:25 UTC
  To: netdev
  Cc: Mengyuan Lou, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Richard Cochran, Russell King,
	Simon Horman, Kees Cook, Larysa Zaremba, Breno Leitao, Joe Damato,
	Jacob Keller, Fabio Baltieri, Jiawen Wu

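To implement a simple hardware reset without tearing down the network
interface state, introduce a boolean 'reinit' parameter to the
wx->do_reset callback. When 'reinit' is false, only the hardware reset
is performed, even if the interface is running.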
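
The effect of the 'reinit' parameter can be sketched with a small
userspace model. All names here (struct dev_model, model_do_reset(),
lib_request_reset(), enum reset_path) are hypothetical; the sketch only
records which path would be taken rather than touching hardware. It
assumes, as in the diff, that the full down/up path is used only when the
interface is running and 'reinit' is true.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Which reset path the model took last; purely illustrative. */
enum reset_path { RESET_NONE, RESET_REINIT, RESET_HW_ONLY };

/* Hypothetical stand-in for the device private structure. */
struct dev_model {
	bool running;               /* models netif_running() */
	enum reset_path last_path;  /* recorded for inspection */
	void (*do_reset)(struct dev_model *dev, bool reinit);
};

/* Driver-side callback: choose between full reinit and plain reset. */
void model_do_reset(struct dev_model *dev, bool reinit)
{
	if (dev->running && reinit)
		dev->last_path = RESET_REINIT;  /* down + up under a lock */
	else
		dev->last_path = RESET_HW_ONLY; /* hardware reset only */
}

/* Library-side caller: the callback is optional, so check for NULL. */
void lib_request_reset(struct dev_model *dev, bool reinit)
{
	if (dev->do_reset)
		dev->do_reset(dev, reinit);
}
```

Keeping the decision inside the callback lets library callers pass their
intent without knowing the device-specific reset sequence.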

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
---
 drivers/net/ethernet/wangxun/libwx/wx_err.c     | 2 +-
 drivers/net/ethernet/wangxun/libwx/wx_ethtool.c | 2 +-
 drivers/net/ethernet/wangxun/libwx/wx_lib.c     | 4 ++--
 drivers/net/ethernet/wangxun/libwx/wx_type.h    | 2 +-
 drivers/net/ethernet/wangxun/ngbe/ngbe_main.c   | 4 ++--
 drivers/net/ethernet/wangxun/ngbe/ngbe_type.h   | 2 +-
 drivers/net/ethernet/wangxun/txgbe/txgbe_main.c | 4 ++--
 drivers/net/ethernet/wangxun/txgbe/txgbe_type.h | 2 +-
 8 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/wangxun/libwx/wx_err.c b/drivers/net/ethernet/wangxun/libwx/wx_err.c
index ba5f23cefc0f..124011c3d5b1 100644
--- a/drivers/net/ethernet/wangxun/libwx/wx_err.c
+++ b/drivers/net/ethernet/wangxun/libwx/wx_err.c
@@ -23,7 +23,7 @@ static void wx_reset_subtask(struct wx *wx)
 
 	if (test_and_clear_bit(WX_FLAG_NEED_PF_RESET, wx->flags)) {
 		if (wx->do_reset)
-			wx->do_reset(wx->netdev);
+			wx->do_reset(wx->netdev, true);
 	}
 
 out:
diff --git a/drivers/net/ethernet/wangxun/libwx/wx_ethtool.c b/drivers/net/ethernet/wangxun/libwx/wx_ethtool.c
index 5df971aca9e3..d1356ff5d69b 100644
--- a/drivers/net/ethernet/wangxun/libwx/wx_ethtool.c
+++ b/drivers/net/ethernet/wangxun/libwx/wx_ethtool.c
@@ -395,7 +395,7 @@ static void wx_update_rsc(struct wx *wx)
 
 	/* reset the device to apply the new RSC setting */
 	if (need_reset && wx->do_reset)
-		wx->do_reset(netdev);
+		wx->do_reset(netdev, true);
 }
 
 int wx_set_coalesce(struct net_device *netdev,
diff --git a/drivers/net/ethernet/wangxun/libwx/wx_lib.c b/drivers/net/ethernet/wangxun/libwx/wx_lib.c
index 9e6167b43f75..3216dee778be 100644
--- a/drivers/net/ethernet/wangxun/libwx/wx_lib.c
+++ b/drivers/net/ethernet/wangxun/libwx/wx_lib.c
@@ -3146,7 +3146,7 @@ int wx_set_features(struct net_device *netdev, netdev_features_t features)
 	netdev->features = features;
 
 	if (changed & NETIF_F_HW_VLAN_CTAG_RX && wx->do_reset)
-		wx->do_reset(netdev);
+		wx->do_reset(netdev, true);
 	else if (changed & (NETIF_F_HW_VLAN_CTAG_RX | NETIF_F_HW_VLAN_CTAG_FILTER))
 		wx_set_rx_mode(netdev);
 
@@ -3196,7 +3196,7 @@ int wx_set_features(struct net_device *netdev, netdev_features_t features)
 
 out:
 	if (need_reset && wx->do_reset)
-		wx->do_reset(netdev);
+		wx->do_reset(netdev, true);
 
 	return 0;
 }
diff --git a/drivers/net/ethernet/wangxun/libwx/wx_type.h b/drivers/net/ethernet/wangxun/libwx/wx_type.h
index f65c2d7bae39..671ac0a19dee 100644
--- a/drivers/net/ethernet/wangxun/libwx/wx_type.h
+++ b/drivers/net/ethernet/wangxun/libwx/wx_type.h
@@ -1402,7 +1402,7 @@ struct wx {
 	void (*atr)(struct wx_ring *ring, struct wx_tx_buffer *first, u8 ptype);
 	void (*configure_fdir)(struct wx *wx);
 	int (*setup_tc)(struct net_device *netdev, u8 tc);
-	void (*do_reset)(struct net_device *netdev);
+	void (*do_reset)(struct net_device *netdev, bool reinit);
 	int (*ptp_setup_sdp)(struct wx *wx);
 	void (*set_num_queues)(struct wx *wx);
 
diff --git a/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c b/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c
index e9561996b970..ec14dd47cd42 100644
--- a/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c
+++ b/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c
@@ -629,11 +629,11 @@ static void ngbe_reinit_locked(struct wx *wx)
 	mutex_unlock(&wx->reset_lock);
 }
 
-void ngbe_do_reset(struct net_device *netdev)
+void ngbe_do_reset(struct net_device *netdev, bool reinit)
 {
 	struct wx *wx = netdev_priv(netdev);
 
-	if (netif_running(netdev))
+	if (netif_running(netdev) && reinit)
 		ngbe_reinit_locked(wx);
 	else
 		ngbe_reset(wx);
diff --git a/drivers/net/ethernet/wangxun/ngbe/ngbe_type.h b/drivers/net/ethernet/wangxun/ngbe/ngbe_type.h
index 4f648f272c08..c9233dc7ae50 100644
--- a/drivers/net/ethernet/wangxun/ngbe/ngbe_type.h
+++ b/drivers/net/ethernet/wangxun/ngbe/ngbe_type.h
@@ -125,6 +125,6 @@ extern char ngbe_driver_name[];
 void ngbe_down(struct wx *wx);
 void ngbe_up(struct wx *wx);
 int ngbe_setup_tc(struct net_device *dev, u8 tc);
-void ngbe_do_reset(struct net_device *netdev);
+void ngbe_do_reset(struct net_device *netdev, bool reinit);
 
 #endif /* _NGBE_TYPE_H_ */
diff --git a/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c b/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c
index 5793da5b7bab..9887638203cb 100644
--- a/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c
+++ b/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c
@@ -613,11 +613,11 @@ static void txgbe_reinit_locked(struct wx *wx)
 	mutex_unlock(&wx->reset_lock);
 }
 
-void txgbe_do_reset(struct net_device *netdev)
+void txgbe_do_reset(struct net_device *netdev, bool reinit)
 {
 	struct wx *wx = netdev_priv(netdev);
 
-	if (netif_running(netdev))
+	if (netif_running(netdev) && reinit)
 		txgbe_reinit_locked(wx);
 	else
 		txgbe_reset(wx);
diff --git a/drivers/net/ethernet/wangxun/txgbe/txgbe_type.h b/drivers/net/ethernet/wangxun/txgbe/txgbe_type.h
index 6b05f32b4a01..1e373f7fd9b5 100644
--- a/drivers/net/ethernet/wangxun/txgbe/txgbe_type.h
+++ b/drivers/net/ethernet/wangxun/txgbe/txgbe_type.h
@@ -313,7 +313,7 @@ extern char txgbe_driver_name[];
 void txgbe_down(struct wx *wx);
 void txgbe_up(struct wx *wx);
 int txgbe_setup_tc(struct net_device *dev, u8 tc);
-void txgbe_do_reset(struct net_device *netdev);
+void txgbe_do_reset(struct net_device *netdev, bool reinit);
 
 #define TXGBE_LINK_SPEED_UNKNOWN        0
 #define TXGBE_LINK_SPEED_10GB_FULL      4
-- 
2.51.0



* [PATCH net-next v2 4/6] net: wangxun: extract the close_suspend sequence
  2026-04-30  8:25 [PATCH net-next v2 0/6] net: wangxun: timeout and error Jiawen Wu
                   ` (2 preceding siblings ...)
  2026-04-30  8:25 ` [PATCH net-next v2 3/6] net: wangxun: add reinit parameter to wx->do_reset callback Jiawen Wu
@ 2026-04-30  8:25 ` Jiawen Wu
  2026-05-03  2:15   ` Jakub Kicinski
  2026-04-30  8:25 ` [PATCH net-next v2 5/6] net: wangxun: clear stored DMA addresses after dma_free_coherent() Jiawen Wu
  2026-04-30  8:25 ` [PATCH net-next v2 6/6] net: wangxun: implement pci_error_handlers ops Jiawen Wu
  5 siblings, 1 reply; 12+ messages in thread
From: Jiawen Wu @ 2026-04-30  8:25 UTC
  To: netdev
  Cc: Mengyuan Lou, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Richard Cochran, Russell King,
	Simon Horman, Kees Cook, Larysa Zaremba, Breno Leitao, Joe Damato,
	Jacob Keller, Fabio Baltieri, Jiawen Wu

Refactor the .ndo_stop implementation by extracting the necessary
hardware shutdown sequence into a dedicated close_suspend function.

This prepares for the PCIe error callbacks that a later patch implements
in libwx.
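
As a rough illustration of why the shared sequence is guarded by a
device-present check, here is a userspace sketch. It is an assumption
about the intent, not driver code: struct nic_model and the model_*()
functions are hypothetical, and each step is only appended to a log
string. It shows that once a detach path has already run the shutdown
sequence, a later close does not free the same resources again.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the device; log records steps in order. */
struct nic_model {
	bool present;   /* models netif_device_present() */
	char log[64];   /* comma-separated record of the steps taken */
};

/* Append one step name to the log, never overflowing the buffer. */
static void step(struct nic_model *nic, const char *name)
{
	strncat(nic->log, name, sizeof(nic->log) - strlen(nic->log) - 1);
}

/* Shutdown work common to closing and suspending the device. */
void model_close_suspend(struct nic_model *nic)
{
	step(nic, "down,");
	step(nic, "free_irq,");
	step(nic, "free_res,");
}

/* Normal close: skip the shared sequence if the device is detached. */
void model_close(struct nic_model *nic)
{
	step(nic, "ptp_stop,");
	if (nic->present)
		model_close_suspend(nic);
	step(nic, "release_hw,");
}

/* Shutdown or error path: detach first, then run the shared sequence. */
void model_detach_and_suspend(struct nic_model *nic)
{
	nic->present = false;
	model_close_suspend(nic);
}
```

In this model the resources are released exactly once on either path,
which is the property the extracted helper is meant to make easy to keep.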

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
---
 drivers/net/ethernet/wangxun/libwx/wx_type.h  |  1 +
 drivers/net/ethernet/wangxun/ngbe/ngbe_main.c | 20 +++++++++++++------
 drivers/net/ethernet/wangxun/ngbe/ngbe_type.h |  1 +
 .../net/ethernet/wangxun/txgbe/txgbe_main.c   | 13 ++++++------
 .../net/ethernet/wangxun/txgbe/txgbe_type.h   |  1 +
 5 files changed, 24 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/wangxun/libwx/wx_type.h b/drivers/net/ethernet/wangxun/libwx/wx_type.h
index 671ac0a19dee..4b72835ddec1 100644
--- a/drivers/net/ethernet/wangxun/libwx/wx_type.h
+++ b/drivers/net/ethernet/wangxun/libwx/wx_type.h
@@ -1403,6 +1403,7 @@ struct wx {
 	void (*configure_fdir)(struct wx *wx);
 	int (*setup_tc)(struct net_device *netdev, u8 tc);
 	void (*do_reset)(struct net_device *netdev, bool reinit);
+	void (*close_suspend)(struct wx *wx);
 	int (*ptp_setup_sdp)(struct wx *wx);
 	void (*set_num_queues)(struct wx *wx);
 
diff --git a/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c b/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c
index ec14dd47cd42..2bd00eade11d 100644
--- a/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c
+++ b/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c
@@ -135,6 +135,7 @@ static int ngbe_sw_init(struct wx *wx)
 	wx->mbx.size = WX_VXMAILBOX_SIZE;
 	wx->setup_tc = ngbe_setup_tc;
 	wx->do_reset = ngbe_do_reset;
+	wx->close_suspend = ngbe_close_suspend;
 	set_bit(0, &wx->fwd_bitmask);
 
 	return 0;
@@ -510,6 +511,16 @@ void ngbe_up(struct wx *wx)
 	ngbe_up_complete(wx);
 }
 
+void ngbe_close_suspend(struct wx *wx)
+{
+	wx_ptp_suspend(wx);
+	ngbe_down(wx);
+	wx_free_irq(wx);
+	wx_free_isb_resources(wx);
+	wx_free_resources(wx);
+	phylink_disconnect_phy(wx->phylink);
+}
+
 /**
  * ngbe_close - Disables a network interface
  * @netdev: network interface device structure
@@ -526,11 +537,8 @@ static int ngbe_close(struct net_device *netdev)
 	struct wx *wx = netdev_priv(netdev);
 
 	wx_ptp_stop(wx);
-	ngbe_down(wx);
-	wx_free_irq(wx);
-	wx_free_isb_resources(wx);
-	wx_free_resources(wx);
-	phylink_disconnect_phy(wx->phylink);
+	if (netif_device_present(netdev))
+		ngbe_close_suspend(wx);
 	wx_control_hw(wx, false);
 
 	return 0;
@@ -547,7 +555,7 @@ static void ngbe_dev_shutdown(struct pci_dev *pdev, bool *enable_wake)
 	netif_device_detach(netdev);
 
 	if (netif_running(netdev))
-		ngbe_close(netdev);
+		ngbe_close_suspend(wx);
 	wx_clear_interrupt_scheme(wx);
 	rtnl_unlock();
 
diff --git a/drivers/net/ethernet/wangxun/ngbe/ngbe_type.h b/drivers/net/ethernet/wangxun/ngbe/ngbe_type.h
index c9233dc7ae50..eb5c92edae06 100644
--- a/drivers/net/ethernet/wangxun/ngbe/ngbe_type.h
+++ b/drivers/net/ethernet/wangxun/ngbe/ngbe_type.h
@@ -126,5 +126,6 @@ void ngbe_down(struct wx *wx);
 void ngbe_up(struct wx *wx);
 int ngbe_setup_tc(struct net_device *dev, u8 tc);
 void ngbe_do_reset(struct net_device *netdev, bool reinit);
+void ngbe_close_suspend(struct wx *wx);
 
 #endif /* _NGBE_TYPE_H_ */
diff --git a/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c b/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c
index 9887638203cb..3bfb3328b8f3 100644
--- a/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c
+++ b/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c
@@ -415,6 +415,7 @@ static int txgbe_sw_init(struct wx *wx)
 
 	wx->setup_tc = txgbe_setup_tc;
 	wx->do_reset = txgbe_do_reset;
+	wx->close_suspend = txgbe_close_suspend;
 	set_bit(0, &wx->fwd_bitmask);
 
 	switch (wx->mac.type) {
@@ -503,10 +504,12 @@ static int txgbe_open(struct net_device *netdev)
  * This function should contain the necessary work common to both suspending
  * and closing of the device.
  */
-static void txgbe_close_suspend(struct wx *wx)
+void txgbe_close_suspend(struct wx *wx)
 {
 	wx_ptp_suspend(wx);
-	txgbe_disable_device(wx);
+	txgbe_down(wx);
+	wx_free_irq(wx);
+	txgbe_free_misc_irq(wx->priv);
 	wx_free_resources(wx);
 }
 
@@ -526,10 +529,8 @@ static int txgbe_close(struct net_device *netdev)
 	struct wx *wx = netdev_priv(netdev);
 
 	wx_ptp_stop(wx);
-	txgbe_down(wx);
-	wx_free_irq(wx);
-	txgbe_free_misc_irq(wx->priv);
-	wx_free_resources(wx);
+	if (netif_device_present(netdev))
+		txgbe_close_suspend(wx);
 	txgbe_fdir_filter_exit(wx);
 	wx_control_hw(wx, false);
 
diff --git a/drivers/net/ethernet/wangxun/txgbe/txgbe_type.h b/drivers/net/ethernet/wangxun/txgbe/txgbe_type.h
index 1e373f7fd9b5..cd50ff1ef2ed 100644
--- a/drivers/net/ethernet/wangxun/txgbe/txgbe_type.h
+++ b/drivers/net/ethernet/wangxun/txgbe/txgbe_type.h
@@ -314,6 +314,7 @@ void txgbe_down(struct wx *wx);
 void txgbe_up(struct wx *wx);
 int txgbe_setup_tc(struct net_device *dev, u8 tc);
 void txgbe_do_reset(struct net_device *netdev, bool reinit);
+void txgbe_close_suspend(struct wx *wx);
 
 #define TXGBE_LINK_SPEED_UNKNOWN        0
 #define TXGBE_LINK_SPEED_10GB_FULL      4
-- 
2.51.0



* [PATCH net-next v2 5/6] net: wangxun: clear stored DMA addresses after dma_free_coherent()
  2026-04-30  8:25 [PATCH net-next v2 0/6] net: wangxun: timeout and error Jiawen Wu
                   ` (3 preceding siblings ...)
  2026-04-30  8:25 ` [PATCH net-next v2 4/6] net: wangxun: extract the close_suspend sequence Jiawen Wu
@ 2026-04-30  8:25 ` Jiawen Wu
  2026-05-03  2:15   ` Jakub Kicinski
  2026-04-30  8:25 ` [PATCH net-next v2 6/6] net: wangxun: implement pci_error_handlers ops Jiawen Wu
  5 siblings, 1 reply; 12+ messages in thread
From: Jiawen Wu @ 2026-04-30  8:25 UTC (permalink / raw)
  To: netdev
  Cc: Mengyuan Lou, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Richard Cochran, Russell King,
	Simon Horman, Kees Cook, Larysa Zaremba, Breno Leitao, Joe Damato,
	Jacob Keller, Fabio Baltieri, Jiawen Wu

Rx and Tx descriptor rings are freed via dma_free_coherent() in
wx_free_resources(), but ring->dma is not cleared on free. This
results in a use-after-free of the DMA rings in ngbe_dev_shutdown() if
WOL is enabled.

As a precaution, clear all stored DMA addresses after
dma_free_coherent().

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
---
 drivers/net/ethernet/wangxun/libwx/wx_lib.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/net/ethernet/wangxun/libwx/wx_lib.c b/drivers/net/ethernet/wangxun/libwx/wx_lib.c
index 3216dee778be..51599f6b878e 100644
--- a/drivers/net/ethernet/wangxun/libwx/wx_lib.c
+++ b/drivers/net/ethernet/wangxun/libwx/wx_lib.c
@@ -2462,6 +2462,7 @@ void wx_free_isb_resources(struct wx *wx)
 	dma_free_coherent(&pdev->dev, sizeof(u32) * 4,
 			  wx->isb_mem, wx->isb_dma);
 	wx->isb_mem = NULL;
+	wx->isb_dma = 0;
 }
 EXPORT_SYMBOL(wx_free_isb_resources);
 
@@ -2678,6 +2679,7 @@ static void wx_free_rx_resources(struct wx_ring *rx_ring)
 			  rx_ring->desc, rx_ring->dma);
 
 	rx_ring->desc = NULL;
+	rx_ring->dma = 0;
 
 	if (rx_ring->page_pool) {
 		page_pool_destroy(rx_ring->page_pool);
@@ -2782,6 +2784,7 @@ static void wx_free_headwb_resources(struct wx_ring *tx_ring)
 	dma_free_coherent(tx_ring->dev, sizeof(u32),
 			  tx_ring->headwb_mem, tx_ring->headwb_dma);
 	tx_ring->headwb_mem = NULL;
+	tx_ring->headwb_dma = 0;
 }
 
 /**
@@ -2803,6 +2806,7 @@ static void wx_free_tx_resources(struct wx_ring *tx_ring)
 	dma_free_coherent(tx_ring->dev, tx_ring->size,
 			  tx_ring->desc, tx_ring->dma);
 	tx_ring->desc = NULL;
+	tx_ring->dma = 0;
 
 	wx_free_headwb_resources(tx_ring);
 }
@@ -2906,6 +2910,7 @@ static int wx_setup_rx_resources(struct wx_ring *rx_ring)
 
 err_desc:
 	dma_free_coherent(dev, rx_ring->size, rx_ring->desc, rx_ring->dma);
+	rx_ring->dma = 0;
 err:
 	kvfree(rx_ring->rx_buffer_info);
 	rx_ring->rx_buffer_info = NULL;
-- 
2.51.0



* [PATCH net-next v2 6/6] net: wangxun: implement pci_error_handlers ops
  2026-04-30  8:25 [PATCH net-next v2 0/6] net: wangxun: timeout and error Jiawen Wu
                   ` (4 preceding siblings ...)
  2026-04-30  8:25 ` [PATCH net-next v2 5/6] net: wangxun: clear stored DMA addresses after dma_free_coherent() Jiawen Wu
@ 2026-04-30  8:25 ` Jiawen Wu
  2026-05-03  2:15   ` Jakub Kicinski
  5 siblings, 1 reply; 12+ messages in thread
From: Jiawen Wu @ 2026-04-30  8:25 UTC (permalink / raw)
  To: netdev
  Cc: Mengyuan Lou, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Richard Cochran, Russell King,
	Simon Horman, Kees Cook, Larysa Zaremba, Breno Leitao, Joe Damato,
	Jacob Keller, Fabio Baltieri, Jiawen Wu

Add pci_error_handlers callbacks so that the AER driver can handle
PCIe errors.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
---
 drivers/net/ethernet/wangxun/libwx/wx_err.c   | 107 ++++++++++++++++++
 drivers/net/ethernet/wangxun/libwx/wx_err.h   |   2 +
 drivers/net/ethernet/wangxun/libwx/wx_type.h  |   1 +
 drivers/net/ethernet/wangxun/ngbe/ngbe_main.c |   9 +-
 .../net/ethernet/wangxun/txgbe/txgbe_main.c   |   8 +-
 5 files changed, 123 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/wangxun/libwx/wx_err.c b/drivers/net/ethernet/wangxun/libwx/wx_err.c
index 124011c3d5b1..32d063ee52f4 100644
--- a/drivers/net/ethernet/wangxun/libwx/wx_err.c
+++ b/drivers/net/ethernet/wangxun/libwx/wx_err.c
@@ -3,11 +3,118 @@
 
 #include <linux/netdevice.h>
 #include <linux/pci.h>
+#include <linux/aer.h>
 
 #include "wx_type.h"
 #include "wx_lib.h"
 #include "wx_err.h"
 
+/**
+ * wx_io_error_detected - called when PCI error is detected
+ * @pdev: Pointer to PCI device
+ * @state: The current pci connection state
+ *
+ * Return: pci_ers_result_t.
+ *
+ * This function is called after a PCI bus error affecting
+ * this device has been detected.
+ */
+static pci_ers_result_t wx_io_error_detected(struct pci_dev *pdev,
+					     pci_channel_state_t state)
+{
+	struct wx *wx = pci_get_drvdata(pdev);
+	struct net_device *netdev;
+
+	netdev = wx->netdev;
+	if (!netif_device_present(netdev))
+		return PCI_ERS_RESULT_DISCONNECT;
+
+	rtnl_lock();
+	netif_device_detach(netdev);
+
+	if (netif_running(netdev))
+		wx->close_suspend(wx);
+
+	if (state == pci_channel_io_perm_failure) {
+		rtnl_unlock();
+		return PCI_ERS_RESULT_DISCONNECT;
+	}
+
+	if (!test_and_set_bit(WX_STATE_DISABLED, wx->state))
+		pci_disable_device(pdev);
+	rtnl_unlock();
+
+	/* Request a slot reset. */
+	return PCI_ERS_RESULT_NEED_RESET;
+}
+
+/**
+ * wx_io_slot_reset - called after the pci bus has been reset.
+ * @pdev: Pointer to PCI device
+ *
+ * Return: pci_ers_result_t.
+ *
+ * Restart the card from scratch, as if from a cold-boot.
+ */
+static pci_ers_result_t wx_io_slot_reset(struct pci_dev *pdev)
+{
+	struct wx *wx = pci_get_drvdata(pdev);
+	pci_ers_result_t result;
+
+	if (pci_enable_device_mem(pdev)) {
+		wx_err(wx, "Cannot re-enable PCI device after reset.\n");
+		result = PCI_ERS_RESULT_DISCONNECT;
+	} else {
+		/* make all bar access done before reset. */
+		smp_mb__before_atomic();
+		clear_bit(WX_STATE_DISABLED, wx->state);
+		pci_set_master(pdev);
+		pci_restore_state(pdev);
+		pci_wake_from_d3(pdev, false);
+
+		wx->do_reset(wx->netdev, false);
+		result = PCI_ERS_RESULT_RECOVERED;
+	}
+
+	pci_aer_clear_nonfatal_status(pdev);
+
+	return result;
+}
+
+/**
+ * wx_io_resume - called when traffic can start flowing again.
+ * @pdev: Pointer to PCI device
+ *
+ * This callback is called when the error recovery driver tells us that
+ * it's OK to resume normal operation.
+ */
+static void wx_io_resume(struct pci_dev *pdev)
+{
+	struct wx *wx = pci_get_drvdata(pdev);
+	struct net_device *netdev;
+	int err;
+
+	netdev = wx->netdev;
+	rtnl_lock();
+	if (netif_running(netdev)) {
+		err = netdev->netdev_ops->ndo_open(netdev);
+		if (err) {
+			wx_err(wx, "Failed to open netdev after reset\n");
+			goto out;
+		}
+	}
+	netif_device_attach(netdev);
+out:
+	rtnl_unlock();
+}
+
+const struct pci_error_handlers wx_err_handler = {
+	.error_detected = wx_io_error_detected,
+	.slot_reset = wx_io_slot_reset,
+	.resume = wx_io_resume,
+};
+EXPORT_SYMBOL(wx_err_handler);
+
 static void wx_reset_subtask(struct wx *wx)
 {
 	if (!test_bit(WX_FLAG_NEED_PF_RESET, wx->flags))
diff --git a/drivers/net/ethernet/wangxun/libwx/wx_err.h b/drivers/net/ethernet/wangxun/libwx/wx_err.h
index e317e6c8d928..8b1a7863b5b1 100644
--- a/drivers/net/ethernet/wangxun/libwx/wx_err.h
+++ b/drivers/net/ethernet/wangxun/libwx/wx_err.h
@@ -7,6 +7,8 @@
 #ifndef _WX_ERR_H_
 #define _WX_ERR_H_
 
+extern const struct pci_error_handlers wx_err_handler;
+
 void wx_handle_errors_subtask(struct wx *wx);
 void wx_tx_timeout(struct net_device *netdev, unsigned int txqueue);
 void wx_handle_tx_hang(struct wx_ring *tx_ring, unsigned int next);
diff --git a/drivers/net/ethernet/wangxun/libwx/wx_type.h b/drivers/net/ethernet/wangxun/libwx/wx_type.h
index 4b72835ddec1..81e12609d3fa 100644
--- a/drivers/net/ethernet/wangxun/libwx/wx_type.h
+++ b/drivers/net/ethernet/wangxun/libwx/wx_type.h
@@ -1215,6 +1215,7 @@ enum wx_state {
 	WX_STATE_PTP_RUNNING,
 	WX_STATE_PTP_TX_IN_PROGRESS,
 	WX_STATE_SERVICE_SCHED,
+	WX_STATE_DISABLED,
 	WX_STATE_NBITS		/* must be last */
 };
 
diff --git a/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c b/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c
index 2bd00eade11d..244123f203de 100644
--- a/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c
+++ b/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c
@@ -570,7 +570,8 @@ static void ngbe_dev_shutdown(struct pci_dev *pdev, bool *enable_wake)
 	*enable_wake = !!wufc;
 	wx_control_hw(wx, false);
 
-	pci_disable_device(pdev);
+	if (!test_and_set_bit(WX_STATE_DISABLED, wx->state))
+		pci_disable_device(pdev);
 }
 
 static void ngbe_shutdown(struct pci_dev *pdev)
@@ -856,6 +857,7 @@ static int ngbe_probe(struct pci_dev *pdev,
 		goto err_register;
 
 	pci_set_drvdata(pdev, wx);
+	pci_save_state(pdev);
 
 	return 0;
 
@@ -907,7 +909,8 @@ static void ngbe_remove(struct pci_dev *pdev)
 	kfree(wx->mac_table);
 	wx_clear_interrupt_scheme(wx);
 
-	pci_disable_device(pdev);
+	if (!test_and_set_bit(WX_STATE_DISABLED, wx->state))
+		pci_disable_device(pdev);
 }
 
 static int ngbe_suspend(struct pci_dev *pdev, pm_message_t state)
@@ -934,6 +937,7 @@ static int ngbe_resume(struct pci_dev *pdev)
 		wx_err(wx, "Cannot enable PCI device from suspend\n");
 		return err;
 	}
+	clear_bit(WX_STATE_DISABLED, wx->state);
 	pci_set_master(pdev);
 	device_wakeup_disable(&pdev->dev);
 
@@ -958,6 +962,7 @@ static struct pci_driver ngbe_driver = {
 	.resume   = ngbe_resume,
 	.shutdown = ngbe_shutdown,
 	.sriov_configure = wx_pci_sriov_configure,
+	.err_handler = &wx_err_handler,
 };
 
 module_pci_driver(ngbe_driver);
diff --git a/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c b/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c
index 3bfb3328b8f3..a89b0a8643b3 100644
--- a/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c
+++ b/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c
@@ -552,7 +552,8 @@ static void txgbe_dev_shutdown(struct pci_dev *pdev)
 
 	wx_control_hw(wx, false);
 
-	pci_disable_device(pdev);
+	if (!test_and_set_bit(WX_STATE_DISABLED, wx->state))
+		pci_disable_device(pdev);
 }
 
 static void txgbe_shutdown(struct pci_dev *pdev)
@@ -900,6 +901,7 @@ static int txgbe_probe(struct pci_dev *pdev,
 		goto err_remove_phy;
 
 	pci_set_drvdata(pdev, wx);
+	pci_save_state(pdev);
 
 	netif_tx_stop_all_queues(netdev);
 
@@ -970,7 +972,8 @@ static void txgbe_remove(struct pci_dev *pdev)
 	kfree(wx->mac_table);
 	wx_clear_interrupt_scheme(wx);
 
-	pci_disable_device(pdev);
+	if (!test_and_set_bit(WX_STATE_DISABLED, wx->state))
+		pci_disable_device(pdev);
 }
 
 static struct pci_driver txgbe_driver = {
@@ -980,6 +983,7 @@ static struct pci_driver txgbe_driver = {
 	.remove   = txgbe_remove,
 	.shutdown = txgbe_shutdown,
 	.sriov_configure = wx_pci_sriov_configure,
+	.err_handler = &wx_err_handler,
 };
 
 module_pci_driver(txgbe_driver);
-- 
2.51.0



* Re: [PATCH net-next v2 1/6] net: ngbe: implement libwx reset ops
  2026-04-30  8:25 ` [PATCH net-next v2 1/6] net: ngbe: implement libwx reset ops Jiawen Wu
@ 2026-05-03  2:15   ` Jakub Kicinski
  0 siblings, 0 replies; 12+ messages in thread
From: Jakub Kicinski @ 2026-05-03  2:15 UTC (permalink / raw)
  To: jiawenwu
  Cc: Jakub Kicinski, netdev, mengyuanlou, andrew+netdev, davem,
	edumazet, pabeni, richardcochran, linux, horms, kees,
	larysa.zaremba, leitao, joe, jacob.e.keller, fabio.baltieri

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
---
net: ngbe: implement libwx reset ops

This commit implements the wx->do_reset() operation for the ngbe driver,
allowing the library module to trigger a reset by bringing the interface
down and back up.

> diff --git a/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c b/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c
> index d8e3827a8b1f2..bd905e2675758 100644
> --- a/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c
> +++ b/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c

[ ... ]

> +static void ngbe_reinit_locked(struct wx *wx)
> +{
> +	netif_trans_update(wx->netdev);
> +
> +	mutex_lock(&wx->reset_lock);
> +	set_bit(WX_STATE_RESETTING, wx->state);
> +
> +	ngbe_down(wx);
> +	ngbe_up(wx);
> +
> +	clear_bit(WX_STATE_RESETTING, wx->state);

Does this introduce a regression by using WX_STATE_RESETTING as an ad-hoc
lock here?

While this pattern appears to exist elsewhere in the driver, adding it
to ngbe_reinit_locked() relies on a design where background readers might
check test_bit(WX_STATE_RESETTING, wx->state) without holding reset_lock.

Since the flag check and subsequent accesses are not atomic, a reader could
test the flag just before it is set and then proceed to access shared state
concurrently with the ngbe_down() and ngbe_up() teardown path.

Could this result in torn or corrupted reads, and would it be safer to use a
proper synchronization mechanism like RCU or a shared lock for readers?

> +	mutex_unlock(&wx->reset_lock);
> +}
> +
> +void ngbe_do_reset(struct net_device *netdev)

[ ... ]


* Re: [PATCH net-next v2 2/6] net: wangxun: add Tx timeout process
  2026-04-30  8:25 ` [PATCH net-next v2 2/6] net: wangxun: add Tx timeout process Jiawen Wu
@ 2026-05-03  2:15   ` Jakub Kicinski
  0 siblings, 0 replies; 12+ messages in thread
From: Jakub Kicinski @ 2026-05-03  2:15 UTC (permalink / raw)
  To: jiawenwu
  Cc: Jakub Kicinski, netdev, mengyuanlou, andrew+netdev, davem,
	edumazet, pabeni, richardcochran, linux, horms, kees,
	larysa.zaremba, leitao, joe, jacob.e.keller, fabio.baltieri

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
---
net: wangxun: add Tx timeout process

Implement .ndo_tx_timeout to handle Tx side timeout events, triggering
a driver reset process. It also adds logic to track potential hangs and
clear false detections during 802.3 pause frames.

> diff --git a/drivers/net/ethernet/wangxun/libwx/wx_err.c b/drivers/net/ethernet/wangxun/libwx/wx_err.c
> new file mode 100644
> index 0000000000000..ba5f23cefc0f9
> --- /dev/null
> +++ b/drivers/net/ethernet/wangxun/libwx/wx_err.c
> @@ -0,0 +1,126 @@

[ ... ]

> +static void wx_reset_subtask(struct wx *wx)
> +{
> +	if (!test_bit(WX_FLAG_NEED_PF_RESET, wx->flags))
> +		return;
> +
> +	rtnl_lock();
> +
> +	if (!netif_running(wx->netdev) ||
> +	    test_bit(WX_STATE_RESETTING, wx->state))
> +		goto out;

If the interface is being brought down or a reset is in progress, does
this leave WX_FLAG_NEED_PF_RESET set indefinitely? When the interface is
brought back up, could this stale flag trigger an immediate, redundant
hardware reset?

Also, could this rtnl_lock() call cause an AB-BA deadlock? If the network
interface is administratively brought down, the networking core acquires
rtnl_lock() and invokes the driver's .ndo_stop routine.

If the teardown path calls cancel_work_sync(&wx->service_task), the close
path holding rtnl_lock() will wait for service_task to finish, while this
service_task waits indefinitely to acquire rtnl_lock().
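
To illustrate the stale-flag question only (not the locking one), here
is a minimal userspace sketch, assuming the request can simply be
dropped when the interface is down or a reset is already in flight.
The struct and function names are hypothetical stand-ins, not libwx
code, and rtnl_lock() is left out of the model on purpose.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model, not driver code. */
struct wx_model {
	bool need_pf_reset;	/* stands in for WX_FLAG_NEED_PF_RESET */
	bool running;		/* stands in for netif_running() */
	bool resetting;		/* stands in for WX_STATE_RESETTING */
	int resets_done;	/* times the reset callback would have run */
};

/* Single-threaded equivalent of test_and_clear_bit(). */
static bool test_and_clear(bool *flag)
{
	bool old = *flag;

	*flag = false;
	return old;
}

static void reset_subtask_model(struct wx_model *wx)
{
	assert(wx != NULL);

	/* Consume the request first so no exit path can leave it stale. */
	if (!test_and_clear(&wx->need_pf_reset))
		return;

	/* Interface down or reset in flight: the request is dropped. */
	if (!wx->running || wx->resetting)
		return;

	wx->resets_done++;
}
```

With the request consumed up front, bringing the interface back up
does not find a leftover flag. Whether dropping the request is
acceptable while another reset is in flight is an assumption of this
sketch.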

> +
> +	wx_warn(wx, "Reset adapter.\n");
> +
> +	if (test_and_clear_bit(WX_FLAG_NEED_PF_RESET, wx->flags)) {
> +		if (wx->do_reset)
> +			wx->do_reset(wx->netdev);
> +	}
> +
> +out:
> +	rtnl_unlock();
> +}

[ ... ]

> +void wx_tx_timeout(struct net_device *netdev, unsigned int txqueue)
> +{
> +	struct wx *wx = netdev_priv(netdev);
> +	u32 head, tail;
> +	int i;
> +
> +	for (i = 0; i < wx->num_tx_queues; i++) {
> +		struct wx_ring *tx_ring = wx->tx_ring[i];
> +
> +		if (test_bit(WX_TX_DETECT_HANG, tx_ring->state) &&
> +		    wx_check_tx_hang(tx_ring))

Since wx_tx_timeout() can execute concurrently with NAPI's wx_clean_tx_irq()
on healthy queues, is it safe to call wx_check_tx_hang() here?

wx_check_tx_hang() mutates per-queue NAPI state like
ring->tx_stats.tx_done_old and ring->state without synchronization. Modifying
this state concurrently could corrupt the hang detection state machine.

> +			wx_warn(wx, "Real tx hang detected on queue %d\n", i);
> +
> +		head = rd32(wx, WX_PX_TR_RP(tx_ring->reg_idx));
> +		tail = rd32(wx, WX_PX_TR_WP(tx_ring->reg_idx));
> +		wx_warn(wx,
> +			"tx ring %d next_to_use is %d, next_to_clean is %d\n",
> +			i, tx_ring->next_to_use,
> +			tx_ring->next_to_clean);
> +		wx_warn(wx, "tx ring %d hw rp is 0x%x, wp is 0x%x\n",
> +			i, head, tail);
> +	}
> +
> +	wx_tx_timeout_reset(wx);
> +}

[ ... ]

> diff --git a/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c b/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c
> index bd905e2675758..e9561996b970f 100644
> --- a/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c
> +++ b/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c
> @@ -147,6 +148,7 @@ static void ngbe_service_task(struct work_struct *work)
>  {
>  	struct wx *wx = container_of(work, struct wx, service_task);
>  
> +	wx_handle_errors_subtask(wx);
>  	wx_update_stats(wx);

Is there a race condition here that could cause spurious Tx hang resets?

wx_handle_errors_subtask() sets the WX_TX_DETECT_HANG bit.
wx_update_stats() clears the WX_HANG_CHECK_ARMED bit when XOFF pause
frames are received.

If a hardware interrupt schedules NAPI between these two calls,
wx_clean_tx_irq() will execute wx_check_tx_hang() and might test the
WX_HANG_CHECK_ARMED bit from the previous interval (since wx_update_stats()
hasn't cleared it yet).

Would calling wx_update_stats() before wx_handle_errors_subtask() prevent
this false hang detection?
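
The interleaving can be modelled in userspace as a rough sketch. It
assumes wx_check_tx_hang() reports a hang only when the ring made no
progress and WX_HANG_CHECK_ARMED was already set by an earlier check;
that behaviour is not shown in this patch, so treat it as an
assumption. All names below are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one Tx ring's hang-detection state. */
struct ring_model {
	bool detect_hang;		/* stands in for WX_TX_DETECT_HANG */
	bool armed;			/* stands in for WX_HANG_CHECK_ARMED */
	unsigned int tx_done;		/* descriptors completed so far */
	unsigned int tx_done_old;	/* value seen by the previous check */
	unsigned int pending;		/* descriptors still outstanding */
};

/* Errors subtask: ask the next Tx clean to run a hang check. */
static void handle_errors_model(struct ring_model *r)
{
	r->detect_hang = true;
}

/* Stats update: a received XOFF frame disarms the hang check. */
static void update_stats_model(struct ring_model *r, bool xoff_seen)
{
	if (xoff_seen)
		r->armed = false;
}

/* NAPI Tx clean: returns true when a hang would be reported. */
static bool napi_check_hang_model(struct ring_model *r)
{
	bool was_armed;

	if (!r->detect_hang)
		return false;
	r->detect_hang = false;

	if (r->tx_done_old == r->tx_done && r->pending) {
		/* No progress: report only if already armed earlier. */
		was_armed = r->armed;
		r->armed = true;
		return was_armed;
	}

	r->tx_done_old = r->tx_done;
	r->armed = false;
	return false;
}

/* One service interval with NAPI running right after the errors subtask. */
static bool service_interval_model(struct ring_model *r, bool xoff_seen,
				   bool stats_first)
{
	bool hang;

	assert(r != NULL);
	if (stats_first)
		update_stats_model(r, xoff_seen);
	handle_errors_model(r);
	hang = napi_check_hang_model(r);
	if (!stats_first)
		update_stats_model(r, xoff_seen);
	return hang;
}
```

In this model, a ring that is stalled by pause frames and still armed
from the previous interval reports a hang with the current ordering,
and does not when the stats update runs first.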

>  
>  	wx_service_event_complete(wx);

[ ... ]

> diff --git a/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c b/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c
> index 8b7c3753bb6ac..5793da5b7bab4 100644
> --- a/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c
> +++ b/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c
> @@ -128,6 +129,7 @@ static void txgbe_service_task(struct work_struct *work)
>  {
>  	struct wx *wx = container_of(work, struct wx, service_task);
>  
> +	wx_handle_errors_subtask(wx);
>  	txgbe_module_detection_subtask(wx);
>  	txgbe_link_config_subtask(wx);
>  	wx_update_stats(wx);

Similar to ngbe_service_task(), could NAPI execution between
wx_handle_errors_subtask() and wx_update_stats() trigger a false Tx hang
here as well?


* Re: [PATCH net-next v2 4/6] net: wangxun: extract the close_suspend sequence
  2026-04-30  8:25 ` [PATCH net-next v2 4/6] net: wangxun: extract the close_suspend sequence Jiawen Wu
@ 2026-05-03  2:15   ` Jakub Kicinski
  0 siblings, 0 replies; 12+ messages in thread
From: Jakub Kicinski @ 2026-05-03  2:15 UTC (permalink / raw)
  To: jiawenwu
  Cc: Jakub Kicinski, netdev, mengyuanlou, andrew+netdev, davem,
	edumazet, pabeni, richardcochran, linux, horms, kees,
	larysa.zaremba, leitao, joe, jacob.e.keller, fabio.baltieri

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
---
net: wangxun: extract the close_suspend sequence

Refactor the .ndo_close implementation by extracting the necessary
hardware shutdown sequence into a dedicated close_suspend function.
This is for later implementation of PCIe error callback function in
libwx.

> diff --git a/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c b/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c
> index ec14dd47cd428..2bd00eade11d2 100644
> --- a/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c
> +++ b/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c
[ ... ]
> @@ -547,7 +555,7 @@ static void ngbe_dev_shutdown(struct pci_dev *pdev, bool *enable_wake)
>  	netif_device_detach(netdev);
>  
>  	if (netif_running(netdev))
> -		ngbe_close(netdev);
> +		ngbe_close_suspend(wx);

Is there a chance this change to ngbe_close_suspend could cause issues
with the PTP clock state during suspend and resume?

Previously, calling ngbe_close would invoke wx_ptp_stop, which unregisters
the PTP clock and stops its background worker. The new ngbe_close_suspend
only calls wx_ptp_suspend.

This leaves the PTP clock registered and its worker thread active since
wx_ptp_do_aux_work will continue to reschedule itself.

On system resume, if ngbe_resume calls ngbe_open, which then calls
wx_ptp_init, wx_ptp_init will unconditionally call
seqlock_init(&wx->hw_tc_lock).

Will this re-initialize an active lock, potentially leading to lockdep
splats or silent lock corruption if the worker thread or a userspace
process accesses the PTP chardev concurrently?

>  	wx_clear_interrupt_scheme(wx);
>  	rtnl_unlock();
>


* Re: [PATCH net-next v2 5/6] net: wangxun: clear stored DMA addresses after dma_free_coherent()
  2026-04-30  8:25 ` [PATCH net-next v2 5/6] net: wangxun: clear stored DMA addresses after dma_free_coherent() Jiawen Wu
@ 2026-05-03  2:15   ` Jakub Kicinski
  0 siblings, 0 replies; 12+ messages in thread
From: Jakub Kicinski @ 2026-05-03  2:15 UTC (permalink / raw)
  To: jiawenwu
  Cc: Jakub Kicinski, netdev, mengyuanlou, andrew+netdev, davem,
	edumazet, pabeni, richardcochran, linux, horms, kees,
	larysa.zaremba, leitao, joe, jacob.e.keller, fabio.baltieri

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
---
net: wangxun: clear stored DMA addresses after dma_free_coherent()

This commit clears the ring DMA addresses after freeing their resources via
dma_free_coherent() in the wangxun driver. It aims to prevent use-after-free
of the DMA rings at shutdown when WOL is enabled.

> diff --git a/drivers/net/ethernet/wangxun/libwx/wx_lib.c b/drivers/net/ethernet/wangxun/libwx/wx_lib.c
> index 3216dee778be7..51599f6b878e7 100644
> --- a/drivers/net/ethernet/wangxun/libwx/wx_lib.c
> +++ b/drivers/net/ethernet/wangxun/libwx/wx_lib.c
> @@ -2462,6 +2462,7 @@ void wx_free_isb_resources(struct wx *wx)
>  	dma_free_coherent(&pdev->dev, sizeof(u32) * 4,
>  			  wx->isb_mem, wx->isb_dma);
>  	wx->isb_mem = NULL;
> +	wx->isb_dma = 0;
>  }

This isn't a regression introduced by this patch, but can dma_free_coherent()
be called with a NULL wx->isb_mem virtual address here?

Since wx->isb_mem is only allocated when the interface is brought up, if the
module is removed while the interface is down, wx->isb_mem will be NULL. Does
passing a NULL virtual address to dma_free_coherent() cause a panic here, and
should it be guarded by a NULL check?

[ ... ]

> @@ -2678,6 +2679,7 @@ static void wx_free_rx_resources(struct wx_ring *rx_ring)
>  			  rx_ring->desc, rx_ring->dma);
>  
>  	rx_ring->desc = NULL;
> +	rx_ring->dma = 0;

If rx_ring->dma is set to 0 here, and Wake-on-LAN is enabled, does this result
in the hardware being programmed to DMA directly to physical address 0?

Looking at ngbe_dev_shutdown(), it frees these resources and then calls
wx_configure_rx(). Inside wx_configure_rx_ring(), the driver programs the
base address registers WX_PX_RR_BAL with ring->dma, which is now 0. When a
WOL packet arrives, could this cause silent memory corruption or fatal
IOMMU faults?

Additionally, this wasn't introduced by this patch, but does the shutdown
sequence trigger a NULL pointer dereference when WOL is active?

Earlier in wx_free_rx_resources(), rx_ring->rx_buffer_info is freed and set
to NULL, but ring->count retains its original value. During WOL shutdown,
ngbe_dev_shutdown() frees these resources and then reconfigures the RX rings
via wx_configure_rx_ring(). 

If wx_configure_rx_ring() executes:
memset(ring->rx_buffer_info, 0, sizeof(struct wx_rx_buffer) * ring->count);

with a NULL pointer and a count > 0, won't this cause a kernel panic?

[ ... ]

> @@ -2906,6 +2910,7 @@ static int wx_setup_rx_resources(struct wx_ring *rx_ring)
>  
>  err_desc:
>  	dma_free_coherent(dev, rx_ring->size, rx_ring->desc, rx_ring->dma);
> +	rx_ring->dma = 0;
>  err:
>  	kvfree(rx_ring->rx_buffer_info);
>  	rx_ring->rx_buffer_info = NULL;

Should rx_ring->desc also be set to NULL here?

If wx_alloc_page_pool() fails, the driver jumps to err_desc and frees the DMA
memory, but leaves rx_ring->desc pointing to the freed memory. Since wx_ring
structures can be reused across interface up/down cycles, could this dangling
pointer bypass the if (!rx_ring->desc) safety check in wx_free_rx_resources()
and cause a double-free later?
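
A small userspace sketch of the pattern being asked about: one release
helper clears both the pointer and the stored address, and the setup
error path goes through it. malloc-family calls stand in for the DMA
API, and every name here is hypothetical rather than taken from libwx.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the descriptor part of struct wx_ring. */
struct ring_model {
	void *desc;		/* stands in for ring->desc */
	unsigned long dma;	/* stands in for ring->dma */
	size_t size;
	int frees;		/* counts real frees, to expose double-frees */
};

/* Free once, then clear both fields so a repeat call is a no-op. */
static void ring_release_desc(struct ring_model *ring)
{
	assert(ring != NULL);

	if (!ring->desc)
		return;

	free(ring->desc);
	ring->desc = NULL;
	ring->dma = 0;
	ring->frees++;
}

/* Setup with an injectable failure after the descriptor allocation. */
static int ring_setup_model(struct ring_model *ring, int fail_page_pool)
{
	assert(ring != NULL);

	ring->desc = calloc(1, ring->size);
	if (!ring->desc)
		return -1;
	ring->dma = 0x1000;	/* placeholder handle, not a real address */

	if (fail_page_pool) {
		/* Error path reuses the helper, leaving no dangling pointer. */
		ring_release_desc(ring);
		return -1;
	}

	return 0;
}
```

After a failed setup, a later release call sees desc == NULL and
returns without freeing again.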


* Re: [PATCH net-next v2 6/6] net: wangxun: implement pci_error_handlers ops
  2026-04-30  8:25 ` [PATCH net-next v2 6/6] net: wangxun: implement pci_error_handlers ops Jiawen Wu
@ 2026-05-03  2:15   ` Jakub Kicinski
  0 siblings, 0 replies; 12+ messages in thread
From: Jakub Kicinski @ 2026-05-03  2:15 UTC (permalink / raw)
  To: jiawenwu
  Cc: Jakub Kicinski, netdev, mengyuanlou, andrew+netdev, davem,
	edumazet, pabeni, richardcochran, linux, horms, kees,
	larysa.zaremba, leitao, joe, jacob.e.keller, fabio.baltieri

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
---
net: wangxun: implement pci_error_handlers ops

Add pci_error_handlers callbacks so that the AER driver can handle
PCIe errors.

> diff --git a/drivers/net/ethernet/wangxun/libwx/wx_err.c b/drivers/net/ethernet/wangxun/libwx/wx_err.c
> --- a/drivers/net/ethernet/wangxun/libwx/wx_err.c
> +++ b/drivers/net/ethernet/wangxun/libwx/wx_err.c
[ ... ]
> +static pci_ers_result_t wx_io_error_detected(struct pci_dev *pdev,
> +					     pci_channel_state_t state)
> +{
> +	struct wx *wx = pci_get_drvdata(pdev);
> +	struct net_device *netdev;
> +
> +	netdev = wx->netdev;

If an AER event triggers during ngbe_probe() or txgbe_probe() before
pci_set_drvdata() is called, wx will be NULL here.

Could this lead to a NULL pointer dereference when accessing wx->netdev?

Should the handler validate the wx pointer, or should pci_set_drvdata()
be moved earlier in the probe sequence?

> +	if (!netif_device_present(netdev))
> +		return PCI_ERS_RESULT_DISCONNECT;
> +
> +	rtnl_lock();
> +	netif_device_detach(netdev);
> +
> +	if (netif_running(netdev))
> +		wx->close_suspend(wx);
> +
> +	if (state == pci_channel_io_perm_failure) {
> +		rtnl_unlock();
> +		return PCI_ERS_RESULT_DISCONNECT;
> +	}

Does calling wx->close_suspend(wx) before checking for
pci_channel_io_perm_failure cause illegal MMIO accesses?

If the device is in a permanent failure state, it is physically
disconnected, and MMIO accesses could result in Master Aborts or system
lockups.

Should the check for permanent failure precede the hardware teardown?
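
As a sketch of the alternative ordering, the userspace model below
returns on permanent failure before any step that would touch device
registers. The enums and struct are hypothetical stand-ins for the
kernel types. The model does not cover how software-only resources
(rings, IRQs) would be released on the permanent-failure path, which
would still need an answer in the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for pci_channel_state_t and pci_ers_result_t. */
enum chan_state { CHAN_IO_NORMAL, CHAN_IO_FROZEN, CHAN_IO_PERM_FAILURE };
enum ers_result { ERS_DISCONNECT, ERS_NEED_RESET };

struct dev_model {
	bool present;		/* stands in for netif_device_present() */
	bool running;		/* stands in for netif_running() */
	bool pci_disabled;	/* stands in for WX_STATE_DISABLED */
	int mmio_accesses;	/* counts teardown steps that touch registers */
};

/* Teardown that would read and write device registers. */
static void close_suspend_model(struct dev_model *d)
{
	d->mmio_accesses++;
}

static enum ers_result error_detected_model(struct dev_model *d,
					    enum chan_state state)
{
	assert(d != NULL);

	if (!d->present)
		return ERS_DISCONNECT;

	d->present = false;	/* models netif_device_detach() */

	/* Permanent failure: bail out before any register access. */
	if (state == CHAN_IO_PERM_FAILURE)
		return ERS_DISCONNECT;

	if (d->running)
		close_suspend_model(d);

	d->pci_disabled = true;
	return ERS_NEED_RESET;
}
```

In the model, a permanent failure leaves mmio_accesses at zero, while
a frozen channel still goes through the teardown and asks for a slot
reset.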

> +	if (!test_and_set_bit(WX_STATE_DISABLED, wx->state))
> +		pci_disable_device(pdev);
> +	rtnl_unlock();
> +
> +	/* Request a slot reset. */
> +	return PCI_ERS_RESULT_NEED_RESET;
> +}

Should the background service task (wx->service_task) be canceled here
using cancel_work_sync()?

Since WX_STATE_DISABLED is not checked within the service task's
execution path, it appears the task could wake up and attempt to interact
with the device while its MMIO space is disabled.

[ ... ]
> +static pci_ers_result_t wx_io_slot_reset(struct pci_dev *pdev)
> +{
> +	struct wx *wx = pci_get_drvdata(pdev);
> +	pci_ers_result_t result;
> +
> +	if (pci_enable_device_mem(pdev)) {
> +		wx_err(wx, "Cannot re-enable PCI device after reset.\n");
> +		result = PCI_ERS_RESULT_DISCONNECT;
> +	} else {
> +		/* make all bar access done before reset. */
> +		smp_mb__before_atomic();
> +		clear_bit(WX_STATE_DISABLED, wx->state);
> +		pci_set_master(pdev);
> +		pci_restore_state(pdev);
> +		pci_wake_from_d3(pdev, false);
> +
> +		wx->do_reset(wx->netdev, false);

Is it safe to call wx->do_reset() here without holding the rtnl_lock()?

Other paths that reset the hardware (such as wx_reset_subtask()) appear
to acquire rtnl_lock() to serialize against concurrent network
configurations.

Could failing to hold the lock here allow concurrent userspace requests
to access hardware registers while the device is in a reset state?

> +		result = PCI_ERS_RESULT_RECOVERED;
> +	}
> +
> +	pci_aer_clear_nonfatal_status(pdev);
> +
> +	return result;
> +}
[ ... ]
> +static void wx_io_resume(struct pci_dev *pdev)
> +{
> +	struct wx *wx = pci_get_drvdata(pdev);
> +	struct net_device *netdev;
> +	int err;
> +
> +	netdev = wx->netdev;
> +	rtnl_lock();
> +	if (netif_running(netdev)) {
> +		err = netdev->netdev_ops->ndo_open(netdev);

By calling ndo_open() directly instead of dev_open(), is there a risk of
bypassing the core network state machine?

If ndo_open() fails, it cleans up its allocations, but the core network
stack's __LINK_STATE_START bit remains set.

If a user subsequently administratively downs the interface, the network
core might unconditionally call ndo_stop(), potentially leading to a
double-free when it attempts to clean up the already-freed resources.

> +		if (err) {
> +			wx_err(wx, "Failed to open netdev after reset\n");
> +			goto out;
> +		}
> +	}
> +	netif_device_attach(netdev);
> +out:
> +	rtnl_unlock();
> +}
[ ... ]
> diff --git a/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c b/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c
> --- a/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c
> +++ b/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c
[ ... ]
> @@ -900,6 +901,7 @@ static int txgbe_probe(struct pci_dev *pdev,
>  		goto err_remove_phy;
>  
>  	pci_set_drvdata(pdev, wx);
> +	pci_save_state(pdev);
>  
>  	netif_tx_stop_all_queues(netdev);

This isn't a bug introduced by this patch, but does the ordering here
expose the interface to userspace before queue initialization?

Since register_netdev() is called prior to this code, the interface
becomes visible to userspace.

If a management daemon immediately brings the interface UP, invoking
ndo_open() and waking the TX queues, could this netif_tx_stop_all_queues()
erroneously stop the queues of the logically UP interface, resulting in a
silent TX hang?
