* [PATCH net 1/3] igb: check __IGB_DOWN in igb_clean_rx_irq_zc()
[not found] <20260306194226.995095-1-advoretsky@gmail.com>
@ 2026-03-06 19:42 ` Alex Dvoretsky
2026-03-06 19:42 ` [PATCH net 2/3] igb: skip reset in igb_tx_timeout() during XDP transition Alex Dvoretsky
2026-03-06 19:42 ` [PATCH net 3/3] igb: add XDP transition guards in igb_xdp_setup() Alex Dvoretsky
2 siblings, 0 replies; 4+ messages in thread
From: Alex Dvoretsky @ 2026-03-06 19:42 UTC
To: alex; +Cc: Alex Dvoretsky, stable
When an AF_XDP zero-copy application terminates abruptly (e.g.,
kill -9), the XSK buffer pool is destroyed but NAPI polling continues.
igb_clean_rx_irq_zc() keeps returning the full budget (no descriptors,
no buffers to allocate, xsk_buff_alloc() returns NULL), so the poll
never reaches napi_complete_done() and NAPI reschedules it indefinitely.
Meanwhile, igb_down() → napi_synchronize() waits for a NAPI poll cycle
that signals completion with done < budget — which never happens. This
blocks igb_down() forever, and the 5-second TX watchdog fires because
no TX completions are processed while NAPI is stuck. Since igb_down()
never finishes, igb_up() is never called, and the TX queue remains
permanently stalled.
Fix this by adding an __IGB_DOWN check at the top of
igb_clean_rx_irq_zc(), returning 0 immediately when the adapter is
going down. This allows napi_synchronize() in igb_down() to complete,
matching the pattern already used in igb_clean_tx_irq().
Fixes: 2c6196013f84 ("igb: Add AF_XDP zero-copy Rx support")
Cc: stable@vger.kernel.org
Signed-off-by: Alex Dvoretsky <advoretsky@gmail.com>
---
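Illustration only, not part of the patch: a minimal userspace sketch of the
poll loop described in the commit message above. It assumes a simplified NAPI
model in which a handler that reports the full budget is simply polled again.
All `model_*` names are hypothetical and do not exist in the igb driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for adapter state; not the real igb structures. */
struct model_adapter {
	bool down;       /* models test_bit(__IGB_DOWN, &adapter->state) */
	bool pool_alive; /* models a usable XSK buffer pool */
	bool guard;      /* true: apply the early return added by this patch */
};

/* Models igb_clean_rx_irq_zc(): returns the work reported to NAPI. */
static int model_clean_rx_zc(const struct model_adapter *a, int budget)
{
	assert(budget > 0);
	if (a->guard && a->down)
		return 0;      /* report no work so the poll can complete */
	if (!a->pool_alive)
		return budget; /* allocation failed: ask to be polled again */
	return 0;              /* idle ring with a healthy pool */
}

/*
 * Models the NAPI core: keep polling while the handler reports the full
 * budget. Returns the number of polls needed before completion, or -1 if
 * the poll never completes within max_polls (the napi_synchronize() hang).
 */
static int model_polls_until_complete(const struct model_adapter *a,
				      int budget, int max_polls)
{
	for (int i = 1; i <= max_polls; i++)
		if (model_clean_rx_zc(a, budget) < budget)
			return i;
	return -1;
}
```

With a destroyed pool and the adapter going down, the model never completes
without the guard and completes on the first poll with it.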
drivers/net/ethernet/intel/igb/igb_xsk.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/net/ethernet/intel/igb/igb_xsk.c b/drivers/net/ethernet/intel/igb/igb_xsk.c
index 30ce5fbb5b77..ca4aa4d935d5 100644
--- a/drivers/net/ethernet/intel/igb/igb_xsk.c
+++ b/drivers/net/ethernet/intel/igb/igb_xsk.c
@@ -351,6 +351,9 @@ int igb_clean_rx_irq_zc(struct igb_q_vector *q_vector,
 	u16 entries_to_alloc;
 	struct sk_buff *skb;
+	if (test_bit(__IGB_DOWN, &adapter->state))
+		return 0;
+
 	/* xdp_prog cannot be NULL in the ZC path */
 	xdp_prog = READ_ONCE(rx_ring->xdp_prog);
--
2.51.0
* [PATCH net 2/3] igb: skip reset in igb_tx_timeout() during XDP transition
From: Alex Dvoretsky @ 2026-03-06 19:42 UTC
To: alex; +Cc: Alex Dvoretsky, stable
When igb_xdp_setup() transitions between XDP and non-XDP mode on a
running device, it calls igb_close() followed by igb_open(). During
this window the adapter is down and trans_start is stale, so the TX
watchdog can fire a spurious timeout.
The resulting schedule_work(&adapter->reset_task) races with the
igb_open() path: the reset task may run while the device is being
brought back up, or immediately after, causing unexpected ring
reinitialisation and register writes.
Fix this by checking __IGB_DOWN at the top of igb_tx_timeout(). If the
adapter is down (either during normal close or during the XDP close/open
transition), there is nothing useful a reset can do — the subsequent
igb_open() will reinitialise everything.
Fixes: 9cbc948b5a20 ("igb: add XDP support")
Cc: stable@vger.kernel.org
Signed-off-by: Alex Dvoretsky <advoretsky@gmail.com>
---
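Illustration only, not part of the patch: a minimal userspace sketch of the
guard described above. The struct and the `model_*` names are hypothetical;
the counter `resets_queued` stands in for schedule_work(&adapter->reset_task).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the few adapter fields that matter here. */
struct model_adapter {
	bool down;                     /* models the __IGB_DOWN state bit */
	unsigned int tx_timeout_count; /* mirrors adapter->tx_timeout_count */
	unsigned int resets_queued;    /* counts modelled schedule_work() calls */
};

/* Models igb_tx_timeout() with the early return added by this patch. */
static void model_tx_timeout(struct model_adapter *a)
{
	assert(a != NULL);
	if (a->down)
		return;             /* adapter is down: nothing to reset */
	a->tx_timeout_count++;
	a->resets_queued++;         /* stands in for queueing reset_task */
}
```

A timeout that arrives while `down` is set leaves both counters untouched;
once the adapter is up again, the next timeout is counted and queues a reset.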
drivers/net/ethernet/intel/igb/igb_main.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 223a10cae4a9..ddb7ce9e97bf 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -6651,6 +6651,15 @@ static void igb_tx_timeout(struct net_device *netdev, unsigned int __always_unus
 	struct igb_adapter *adapter = netdev_priv(netdev);
 	struct e1000_hw *hw = &adapter->hw;
+	/* Do not schedule a reset if the adapter is already going down or
+	 * being reconfigured (e.g., XDP program transition via igb_close/
+	 * igb_open). The stale trans_start from before the close will
+	 * trigger a spurious timeout that resolves once igb_open()
+	 * completes.
+	 */
+	if (test_bit(__IGB_DOWN, &adapter->state))
+		return;
+
 	/* Do the reset outside of interrupt context */
 	adapter->tx_timeout_count++;
--
2.51.0
* [PATCH net 3/3] igb: add XDP transition guards in igb_xdp_setup()
From: Alex Dvoretsky @ 2026-03-06 19:42 UTC
To: alex; +Cc: Alex Dvoretsky, stable
igb_xdp_setup() calls igb_close() + igb_open() when transitioning
between XDP and non-XDP mode on a running device. This has two issues:
1. When removing an XDP program that has AF_XDP zero-copy sockets,
ndo_xsk_wakeup() may be executing concurrently under rcu_read_lock().
If igb_close() tears down the rings while ndo_xsk_wakeup() is still
accessing them, it races with the teardown. Add synchronize_rcu()
before igb_close() when removing an XDP program to ensure all
in-flight ndo_xsk_wakeup() calls complete first.
2. The igb_close()/igb_open() window leaves trans_start stale from
before the close: the TX watchdog can fire a spurious timeout and
queue a reset_task that races with igb_open(). Add
netif_trans_update() after igb_open() to refresh the timestamp, and
cancel_work() to cancel any reset_task still pending from the window.
Fixes: 9cbc948b5a20 ("igb: add XDP support")
Cc: stable@vger.kernel.org
Signed-off-by: Alex Dvoretsky <advoretsky@gmail.com>
---
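Illustration only, not part of the patch: a minimal userspace sketch of why a
stale trans_start produces a spurious timeout and why refreshing it helps. It
assumes the watchdog fires when a stopped queue's trans_start is older than
the timeout, and it reimplements a wrap-safe comparison in the spirit of the
kernel's time_after(). The `model_*` names and the tick values are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Wrap-safe "a is later than b" on a free-running tick counter. */
static bool model_time_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/*
 * Models the watchdog decision: a stopped queue whose last transmit
 * timestamp is more than timeo ticks old is reported as timed out.
 */
static bool model_watchdog_fires(unsigned long now, unsigned long trans_start,
				 unsigned long timeo, bool queue_stopped)
{
	assert(timeo != 0);
	return queue_stopped && model_time_after(now, trans_start + timeo);
}
```

Assuming 1000 ticks per second and a 5000-tick timeout: with trans_start left
at tick 1000 and the check running at tick 7000, the model reports a timeout;
after refreshing trans_start to 7000 it does not.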
drivers/net/ethernet/intel/igb/igb_main.c | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index ddb7ce9e97bf..9ba944bf67b4 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -2913,6 +2913,9 @@ static int igb_xdp_setup(struct net_device *dev, struct netdev_bpf *bpf)
 	/* device is up and bpf is added/removed, must setup the RX queues */
 	if (need_reset && running) {
+		if (!prog)
+			/* Wait until ndo_xsk_wakeup completes. */
+			synchronize_rcu();
 		igb_close(dev);
 	} else {
 		for (i = 0; i < adapter->num_rx_queues; i++)
@@ -2936,6 +2939,16 @@ static int igb_xdp_setup(struct net_device *dev, struct netdev_bpf *bpf)
 	if (running)
 		igb_open(dev);
+	/* Refresh trans_start to prevent the TX watchdog from firing on a
+	 * stale timestamp from before igb_close(). Cancel any reset_task
+	 * that igb_tx_timeout() may have queued between igb_close() setting
+	 * __IGB_DOWN and the actual napi_synchronize() completion.
+	 */
+	if (need_reset && running) {
+		netif_trans_update(dev);
+		cancel_work(&adapter->reset_task);
+	}
+
 	return 0;
 }
--
2.51.0
* [Intel-wired-lan] [PATCH net 0/3] igb: fix TX stall during XDP teardown with AF_XDP zero-copy
@ 2026-03-06 21:13 Alex Dvoretsky
2026-03-06 21:13 ` [PATCH net 2/3] igb: skip reset in igb_tx_timeout() during XDP transition Alex Dvoretsky
0 siblings, 1 reply; 4+ messages in thread
From: Alex Dvoretsky @ 2026-03-06 21:13 UTC
To: intel-wired-lan
Cc: netdev, anthony.l.nguyen, przemyslaw.kitszel, stable, kurt,
maciej.fijalkowski, Alex Dvoretsky
When an AF_XDP zero-copy application exits while an XDP program remains
attached, igb can permanently stall a TX queue associated with the
AF_XDP socket. The interface stops forwarding traffic and typically
requires a driver reload to recover.
Reproducer:
1. Attach an XDP program to igb
2. Run an AF_XDP zero-copy application
3. kill -9 the application
The TX watchdog eventually fires and the interface becomes
unresponsive. Reproduced on Intel I210 with Linux 6.17.
igb_clean_rx_irq_zc() lacks a __IGB_DOWN guard. When the AF_XDP process
exits the XSK pool is destroyed, but NAPI continues polling. The
function then repeatedly returns the full budget, so the poll never
completes through napi_complete_done() and NAPI keeps rescheduling it.
As a result igb_down() blocks in napi_synchronize() and TX completions
stop being processed, eventually triggering the TX watchdog.
Patch 1 adds a __IGB_DOWN guard to igb_clean_rx_irq_zc() to break the
infinite NAPI poll loop.
Patch 2 prevents igb_tx_timeout() from scheduling reset_task during XDP
transitions when the device is shutting down.
Patch 3 adds synchronization in igb_xdp_setup() to ensure that pending
ndo_xsk_wakeup() calls complete before the teardown continues, and
refreshes trans_start after igb_open() to prevent false TX timeouts.
igc handles a similar stale trans_start situation via
txq_trans_cond_update() (commit 86ea56c5b0c7). This series adds
equivalent protection for igb during XDP transitions.
Tested on Intel I210:
- AF_XDP ZC app exit with XDP attached
- XDP detach while AF_XDP running
- repeated XDP attach/detach cycles
Alex Dvoretsky (3):
igb: check __IGB_DOWN in igb_clean_rx_irq_zc()
igb: skip reset in igb_tx_timeout() during XDP transition
igb: add XDP transition guards in igb_xdp_setup()
drivers/net/ethernet/intel/igb/igb_main.c | 15 +++++++++++++++
drivers/net/ethernet/intel/igb/igb_xsk.c | 3 +++
2 files changed, 18 insertions(+)
--
2.51.0
* [PATCH net 2/3] igb: skip reset in igb_tx_timeout() during XDP transition
2026-03-06 21:13 [Intel-wired-lan] [PATCH net 0/3] igb: fix TX stall during XDP teardown with AF_XDP zero-copy Alex Dvoretsky
@ 2026-03-06 21:13 ` Alex Dvoretsky
0 siblings, 0 replies; 4+ messages in thread
From: Alex Dvoretsky @ 2026-03-06 21:13 UTC
To: intel-wired-lan
Cc: netdev, anthony.l.nguyen, przemyslaw.kitszel, stable, kurt,
maciej.fijalkowski, Alex Dvoretsky
When igb_xdp_setup() transitions between XDP and non-XDP mode on a
running device, it calls igb_close() followed by igb_open(). During
this window the adapter is down while trans_start still contains the
timestamp from before igb_close(), so the TX watchdog can fire a
spurious timeout.
The resulting schedule_work(&adapter->reset_task) races with the
igb_open() path: the reset task may run while the device is being
brought back up, or immediately after, causing unexpected ring
reinitialisation and register writes.
Fix this by checking __IGB_DOWN at the top of igb_tx_timeout(). A
reset is unnecessary because the device will be fully reinitialised
by the subsequent igb_open().
Fixes: 9cbc948b5a20 ("igb: add XDP support")
Cc: stable@vger.kernel.org
Signed-off-by: Alex Dvoretsky <advoretsky@gmail.com>
---
drivers/net/ethernet/intel/igb/igb_main.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 223a10cae4a9..ddb7ce9e97bf 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -6651,6 +6651,10 @@ static void igb_tx_timeout(struct net_device *netdev, unsigned int __always_unus
 	struct igb_adapter *adapter = netdev_priv(netdev);
 	struct e1000_hw *hw = &adapter->hw;
+	/* Ignore timeout if the adapter is going down. */
+	if (test_bit(__IGB_DOWN, &adapter->state))
+		return;
+
 	/* Do the reset outside of interrupt context */
 	adapter->tx_timeout_count++;
--
2.51.0