* [PATCH net 1/3] igb: check __IGB_DOWN in igb_clean_rx_irq_zc()
From: Alex Dvoretsky @ 2026-03-06 19:42 UTC
To: alex; +Cc: Alex Dvoretsky, stable
When an AF_XDP zero-copy application terminates abruptly (e.g.,
kill -9), the XSK buffer pool is destroyed but NAPI polling continues.
igb_clean_rx_irq_zc() keeps returning budget (no descriptors, no
buffers to allocate, xsk_buff_alloc() returns NULL), so igb_poll()
never reaches napi_complete_done() and the NAPI core keeps
rescheduling the poll indefinitely.
Meanwhile, igb_down() → napi_synchronize() waits for a NAPI poll cycle
that signals completion with done < budget — which never happens. This
blocks igb_down() forever, and the 5-second TX watchdog fires because
no TX completions are processed while NAPI is stuck. Since igb_down()
never finishes, igb_up() is never called, and the TX queue remains
permanently stalled.
Fix this by adding an __IGB_DOWN check at the top of
igb_clean_rx_irq_zc(), returning 0 immediately when the adapter is
going down. This allows napi_synchronize() in igb_down() to complete,
matching the pattern already used in igb_clean_tx_irq().
Fixes: 2c6196013f84 ("igb: Add AF_XDP zero-copy Rx support")
Cc: stable@vger.kernel.org
Signed-off-by: Alex Dvoretsky <advoretsky@gmail.com>
---
drivers/net/ethernet/intel/igb/igb_xsk.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/net/ethernet/intel/igb/igb_xsk.c b/drivers/net/ethernet/intel/igb/igb_xsk.c
index 30ce5fbb5b77..ca4aa4d935d5 100644
--- a/drivers/net/ethernet/intel/igb/igb_xsk.c
+++ b/drivers/net/ethernet/intel/igb/igb_xsk.c
@@ -351,6 +351,9 @@ int igb_clean_rx_irq_zc(struct igb_q_vector *q_vector,
 	u16 entries_to_alloc;
 	struct sk_buff *skb;

+	if (test_bit(__IGB_DOWN, &adapter->state))
+		return 0;
+
 	/* xdp_prog cannot be NULL in the ZC path */
 	xdp_prog = READ_ONCE(rx_ring->xdp_prog);
--
2.51.0
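
The stall described in patch 1/3 can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `struct model_adapter`, `model_clean_rx_zc()` and `model_polls_until_idle()` are hypothetical names that do not exist in the igb driver, and the "poll again while the return value equals budget" rule is a simplification of what the NAPI core does. It shows why a waiter that needs one poll cycle with done < budget never makes progress unless the poll routine checks the down flag first.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the adapter state; not the driver's struct. */
struct model_adapter {
	bool down;           /* models test_bit(__IGB_DOWN, &adapter->state) */
	bool pool_destroyed; /* models the XSK buffer pool having gone away */
};

/*
 * Models the zero-copy Rx poll routine. Returning 'budget' means
 * "poll me again"; anything smaller lets the poll cycle complete.
 */
static int model_clean_rx_zc(const struct model_adapter *a, int budget,
			     bool check_down)
{
	assert(budget > 0);

	if (check_down && a->down)
		return 0;       /* the early exit added by the patch */
	if (a->pool_destroyed)
		return budget;  /* buffer allocation failed: ask for another poll */
	return 0;               /* idle ring, nothing to do */
}

/*
 * Models a waiter in the style of napi_synchronize(): count poll cycles
 * until one reports less than 'budget'. Returns the number of polls
 * needed, or -1 if that never happens within 'max_polls' cycles.
 */
static int model_polls_until_idle(const struct model_adapter *a, int budget,
				  bool check_down, int max_polls)
{
	for (int i = 1; i <= max_polls; i++)
		if (model_clean_rx_zc(a, budget, check_down) < budget)
			return i;
	return -1;
}
```

With `down` and `pool_destroyed` both set, calling `model_polls_until_idle()` with `check_down` false never sees a completed cycle, while passing `check_down` true completes on the first poll, which is the behaviour the patch aims for.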
* [PATCH net 2/3] igb: skip reset in igb_tx_timeout() during XDP transition
From: Alex Dvoretsky @ 2026-03-06 19:42 UTC
To: alex; +Cc: Alex Dvoretsky, stable
When igb_xdp_setup() transitions between XDP and non-XDP mode on a
running device, it calls igb_close() followed by igb_open(). During
this window the adapter is down and trans_start is stale, so the TX
watchdog can fire a spurious timeout.
The resulting schedule_work(&adapter->reset_task) races with the
igb_open() path: the reset task may run while the device is being
brought back up, or immediately after, causing unexpected ring
reinitialisation and register writes.
Fix this by checking __IGB_DOWN at the top of igb_tx_timeout(). If the
adapter is down (either during normal close or during the XDP close/open
transition), there is nothing useful a reset can do — the subsequent
igb_open() will reinitialise everything.
Fixes: 9cbc948b5a20 ("igb: add XDP support")
Cc: stable@vger.kernel.org
Signed-off-by: Alex Dvoretsky <advoretsky@gmail.com>
---
drivers/net/ethernet/intel/igb/igb_main.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 223a10cae4a9..ddb7ce9e97bf 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -6651,6 +6651,15 @@ static void igb_tx_timeout(struct net_device *netdev, unsigned int __always_unus
 	struct igb_adapter *adapter = netdev_priv(netdev);
 	struct e1000_hw *hw = &adapter->hw;

+	/* Do not schedule a reset if the adapter is already going down or
+	 * being reconfigured (e.g., XDP program transition via igb_close/
+	 * igb_open). The stale trans_start from before the close will
+	 * trigger a spurious timeout that resolves once igb_open()
+	 * completes.
+	 */
+	if (test_bit(__IGB_DOWN, &adapter->state))
+		return;
+
 	/* Do the reset outside of interrupt context */
 	adapter->tx_timeout_count++;
--
2.51.0
* [PATCH net 3/3] igb: add XDP transition guards in igb_xdp_setup()
From: Alex Dvoretsky @ 2026-03-06 19:42 UTC
To: alex; +Cc: Alex Dvoretsky, stable
igb_xdp_setup() calls igb_close() + igb_open() when transitioning
between XDP and non-XDP mode on a running device. This has two issues:
1. When removing an XDP program that has AF_XDP zero-copy sockets,
ndo_xsk_wakeup() may be executing concurrently under rcu_read_lock().
If igb_close() tears down the rings while ndo_xsk_wakeup() is still
accessing them, it races with the teardown. Add synchronize_rcu()
before igb_close() when removing an XDP program to ensure all
in-flight ndo_xsk_wakeup() calls complete first.
2. The igb_close()/igb_open() window leaves trans_start stale from
before the close: the TX watchdog can fire a spurious timeout and
queue a reset_task that races with igb_open(). Add
netif_trans_update() after igb_open() to refresh the timestamp, and
cancel_work() to cancel any reset_task still pending from the window.
Fixes: 9cbc948b5a20 ("igb: add XDP support")
Cc: stable@vger.kernel.org
Signed-off-by: Alex Dvoretsky <advoretsky@gmail.com>
---
drivers/net/ethernet/intel/igb/igb_main.c | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index ddb7ce9e97bf..9ba944bf67b4 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -2913,6 +2913,9 @@ static int igb_xdp_setup(struct net_device *dev, struct netdev_bpf *bpf)
 	/* device is up and bpf is added/removed, must setup the RX queues */
 	if (need_reset && running) {
+		if (!prog)
+			/* Wait until ndo_xsk_wakeup completes. */
+			synchronize_rcu();
 		igb_close(dev);
 	} else {
 		for (i = 0; i < adapter->num_rx_queues; i++)
@@ -2936,6 +2939,16 @@ static int igb_xdp_setup(struct net_device *dev, struct netdev_bpf *bpf)
 	if (running)
 		igb_open(dev);

+	/* Refresh trans_start to prevent the TX watchdog from firing on a
+	 * stale timestamp from before igb_close(). Cancel any reset_task
+	 * that igb_tx_timeout() may have queued between igb_close() setting
+	 * __IGB_DOWN and the actual napi_synchronize() completion.
+	 */
+	if (need_reset && running) {
+		netif_trans_update(dev);
+		cancel_work(&adapter->reset_task);
+	}
+
 	return 0;
 }
--
2.51.0
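
The interaction between the stale trans_start, the __IGB_DOWN check from patch 2/3, and the timestamp refresh from patch 3/3 can be sketched as a userspace model. Everything here is an assumption made for illustration: `struct model_dev` and the `model_*()` helpers are hypothetical and are not kernel APIs, the expiry test only imitates the wraparound-safe comparison style of time_after(), and the real watchdog additionally looks only at stopped Tx queues.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the fields involved; not struct net_device. */
struct model_dev {
	unsigned long trans_start;    /* time of the last Tx, in ticks */
	unsigned long watchdog_timeo; /* timeout, in ticks */
	bool down;                    /* models __IGB_DOWN */
	bool reset_queued;            /* models a pending reset_task */
	unsigned int tx_timeout_count;
};

/* Wraparound-safe "now is after trans_start + timeout" comparison. */
static bool model_watchdog_expired(const struct model_dev *d, unsigned long now)
{
	return (long)(d->trans_start + d->watchdog_timeo - now) < 0;
}

/* Models the timeout handler with the guard from patch 2/3. */
static void model_tx_timeout(struct model_dev *d)
{
	if (d->down)
		return;               /* adapter is going down: no reset */
	d->tx_timeout_count++;
	d->reset_queued = true;       /* models schedule_work(&reset_task) */
}

/* Models one watchdog tick at time 'now'. */
static void model_watchdog_tick(struct model_dev *d, unsigned long now)
{
	assert(d->watchdog_timeo > 0);
	if (model_watchdog_expired(d, now))
		model_tx_timeout(d);
}

/* Models the steps patch 3/3 adds after the device is reopened. */
static void model_after_reopen(struct model_dev *d, unsigned long now)
{
	d->down = false;
	d->trans_start = now;         /* models netif_trans_update() */
	d->reset_queued = false;      /* models cancel_work(&reset_task) */
}
```

In this model a tick that lands while `down` is set leaves `reset_queued` false even though the timestamp is stale, and after `model_after_reopen()` the watchdog stays quiet until a full timeout has elapsed again.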
* [PATCH net 2/3] igb: skip reset in igb_tx_timeout() during XDP transition
2026-03-06 21:13 [Intel-wired-lan] [PATCH net 0/3] igb: fix TX stall during XDP teardown with AF_XDP zero-copy Alex Dvoretsky
From: Alex Dvoretsky @ 2026-03-06 21:13 UTC
To: intel-wired-lan
Cc: netdev, anthony.l.nguyen, przemyslaw.kitszel, stable, kurt,
maciej.fijalkowski, Alex Dvoretsky
When igb_xdp_setup() transitions between XDP and non-XDP mode on a
running device, it calls igb_close() followed by igb_open(). During
this window the adapter is down while trans_start still contains the
timestamp from before igb_close(), so the TX watchdog can fire a
spurious timeout.
The resulting schedule_work(&adapter->reset_task) races with the
igb_open() path: the reset task may run while the device is being
brought back up, or immediately after, causing unexpected ring
reinitialisation and register writes.
Fix this by checking __IGB_DOWN at the top of igb_tx_timeout(). A
reset is unnecessary because the device will be fully reinitialised
by the subsequent igb_open().
Fixes: 9cbc948b5a20 ("igb: add XDP support")
Cc: stable@vger.kernel.org
Signed-off-by: Alex Dvoretsky <advoretsky@gmail.com>
---
drivers/net/ethernet/intel/igb/igb_main.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 223a10cae4a9..ddb7ce9e97bf 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -6651,6 +6651,10 @@ static void igb_tx_timeout(struct net_device *netdev, unsigned int __always_unus
 	struct igb_adapter *adapter = netdev_priv(netdev);
 	struct e1000_hw *hw = &adapter->hw;

+	/* Ignore timeout if the adapter is going down. */
+	if (test_bit(__IGB_DOWN, &adapter->state))
+		return;
+
 	/* Do the reset outside of interrupt context */
 	adapter->tx_timeout_count++;
--
2.51.0