* [PATCH net 0/3] igb: fix TX stall during XDP teardown with AF_XDP zero-copy
@ 2026-03-06 21:13 Alex Dvoretsky
2026-03-06 21:13 ` [PATCH net 1/3] igb: check __IGB_DOWN in igb_clean_rx_irq_zc() Alex Dvoretsky
` (2 more replies)
0 siblings, 3 replies; 12+ messages in thread
From: Alex Dvoretsky @ 2026-03-06 21:13 UTC (permalink / raw)
To: intel-wired-lan
Cc: netdev, anthony.l.nguyen, przemyslaw.kitszel, stable, kurt,
maciej.fijalkowski, Alex Dvoretsky
When an AF_XDP zero-copy application exits while an XDP program remains
attached, igb can permanently stall a TX queue associated with the
AF_XDP socket. The interface stops forwarding traffic and typically
requires a driver reload to recover.
Reproducer:
1. Attach an XDP program to igb
2. Run an AF_XDP zero-copy application
3. kill -9 the application
The TX watchdog eventually fires and the interface becomes
unresponsive. Reproduced on Intel I210 with Linux 6.17.
igb_clean_rx_irq_zc() lacks a __IGB_DOWN guard. When the AF_XDP process
exits, the XSK pool is destroyed but NAPI continues polling. The
function then repeatedly returns the full budget, which prevents
napi_complete_done() from clearing NAPI_STATE_SCHED. As a result,
igb_down() blocks in napi_synchronize() and TX completions stop being
processed, eventually triggering the TX watchdog.
Patch 1 adds a __IGB_DOWN guard to igb_clean_rx_irq_zc() to break the
infinite NAPI poll loop.
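The stall can be modelled in user space. The sketch below is an
illustration under stated assumptions, not driver code: model_adapter,
model_clean_rx_zc() and model_poll_rounds() are hypothetical names, a
budget of 64 is an assumed NAPI weight, and max_rounds exists only so
that the unguarded case terminates in the model.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_BUDGET 64 /* assumed NAPI weight, for illustration only */

/* Hypothetical stand-in for the adapter state; not struct igb_adapter. */
struct model_adapter {
	bool down;       /* models test_bit(__IGB_DOWN, &adapter->state) */
	bool pool_empty; /* models xsk_buff_alloc() returning NULL */
};

/* Models the Rx zero-copy clean routine: returns the budget it claims. */
static int model_clean_rx_zc(const struct model_adapter *a, bool guarded)
{
	if (guarded && a->down)
		return 0;            /* the early return added by patch 1 */
	if (a->pool_empty)
		return MODEL_BUDGET; /* no buffers: ask to be polled again */
	return 0;                    /* nothing to do */
}

/* Models the repoll loop: polling repeats while the full budget is
 * reported. Returns the number of rounds that were run.
 */
static int model_poll_rounds(const struct model_adapter *a, bool guarded,
			     int max_rounds)
{
	int rounds = 0;

	assert(max_rounds > 0);
	while (rounds < max_rounds) {
		rounds++;
		if (model_clean_rx_zc(a, guarded) < MODEL_BUDGET)
			break; /* work_done < budget: the poll can complete */
	}
	return rounds;
}
```

With down and pool_empty both set, model_poll_rounds(&a, false, 1000)
uses all 1000 rounds, while model_poll_rounds(&a, true, 1000) stops
after the first.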
Patch 2 makes igb_tx_timeout() return early when __IGB_DOWN is set, so
no reset_task is scheduled while the adapter is down during an XDP
transition.
Patch 3 adds synchronization in igb_xdp_setup() to ensure that pending
ndo_xsk_wakeup() calls complete before the teardown continues, and
refreshes trans_start after igb_open() to prevent false TX timeouts.
igc handles a similar stale trans_start situation via
txq_trans_cond_update() (commit 86ea56c5b0c7). Patch 3 adds
equivalent protection for igb during XDP transitions.
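Why a stale trans_start produces a spurious timeout can be seen in a
small user-space model of the watchdog comparison. This is a sketch
only: the model_* names are hypothetical, the tick values in the note
below assume HZ=1000, and the check is loosely modelled on the kernel's
TX watchdog rather than copied from it.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned long model_jiffies_t;

/* Wrap-safe "a is after b", in the spirit of the kernel's time_after(). */
static bool model_time_after(model_jiffies_t a, model_jiffies_t b)
{
	return (long)(b - a) < 0;
}

/* Hypothetical stand-in for a TX queue; not struct netdev_queue. */
struct model_txq {
	model_jiffies_t trans_start; /* tick of the last recorded transmit */
	bool stopped;                /* queue is currently stopped */
};

/* The watchdog fires when a stopped queue has made no progress for
 * longer than the timeout.
 */
static bool model_watchdog_fires(const struct model_txq *q,
				 model_jiffies_t now, model_jiffies_t timeo)
{
	assert(timeo > 0);
	return q->stopped && model_time_after(now, q->trans_start + timeo);
}

/* Models refreshing the timestamp after the device is reopened. */
static void model_trans_update(struct model_txq *q, model_jiffies_t now)
{
	q->trans_start = now;
}
```

With trans_start left at tick 1000 and a 5000-tick timeout, the check
fires at tick 7000. After model_trans_update(&q, 7000) it stays quiet
until tick 12000 has passed.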
Tested on Intel I210:
- AF_XDP ZC app exit with XDP attached
- XDP detach while AF_XDP running
- repeated XDP attach/detach cycles
Alex Dvoretsky (3):
igb: check __IGB_DOWN in igb_clean_rx_irq_zc()
igb: skip reset in igb_tx_timeout() during XDP transition
igb: add XDP transition guards in igb_xdp_setup()
drivers/net/ethernet/intel/igb/igb_main.c | 15 +++++++++++++++
drivers/net/ethernet/intel/igb/igb_xsk.c | 3 +++
2 files changed, 18 insertions(+)
--
2.51.0
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH net 1/3] igb: check __IGB_DOWN in igb_clean_rx_irq_zc()
2026-03-06 21:13 [PATCH net 0/3] igb: fix TX stall during XDP teardown with AF_XDP zero-copy Alex Dvoretsky
@ 2026-03-06 21:13 ` Alex Dvoretsky
2026-03-10 7:46 ` [Intel-wired-lan] " Loktionov, Aleksandr
2026-03-11 8:52 ` Maciej Fijalkowski
2026-03-06 21:13 ` [PATCH net 2/3] igb: skip reset in igb_tx_timeout() during XDP transition Alex Dvoretsky
2026-03-06 21:13 ` [PATCH net 3/3] igb: add XDP transition guards in igb_xdp_setup() Alex Dvoretsky
2 siblings, 2 replies; 12+ messages in thread
From: Alex Dvoretsky @ 2026-03-06 21:13 UTC (permalink / raw)
To: intel-wired-lan
Cc: netdev, anthony.l.nguyen, przemyslaw.kitszel, stable, kurt,
maciej.fijalkowski, Alex Dvoretsky
When an AF_XDP zero-copy application terminates abruptly (e.g.,
kill -9), the XSK buffer pool is destroyed but NAPI polling continues.
igb_clean_rx_irq_zc() repeatedly returns the full budget (no
descriptors, no buffers to allocate, xsk_buff_alloc() returns NULL)
which makes napi_complete_done() re-arm the poll indefinitely.
Meanwhile igb_down() calls napi_synchronize(), which waits for a NAPI
poll cycle that completes with done < budget. This never happens, so
igb_down() blocks indefinitely. The 5-second TX watchdog fires because
no TX completions are processed while NAPI is stuck. Since igb_down()
never finishes, igb_up() is never called, and the TX queue remains
permanently stalled.
Fix this by adding an __IGB_DOWN check at the top of
igb_clean_rx_irq_zc(), returning 0 immediately when the adapter is
going down. This allows napi_synchronize() in igb_down() to complete,
matching the pattern already used in igb_clean_tx_irq().
Fixes: 2c6196013f84 ("igb: Add AF_XDP zero-copy Rx support")
Cc: stable@vger.kernel.org
Signed-off-by: Alex Dvoretsky <advoretsky@gmail.com>
---
drivers/net/ethernet/intel/igb/igb_xsk.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/net/ethernet/intel/igb/igb_xsk.c b/drivers/net/ethernet/intel/igb/igb_xsk.c
index 30ce5fbb5b77..ca4aa4d935d5 100644
--- a/drivers/net/ethernet/intel/igb/igb_xsk.c
+++ b/drivers/net/ethernet/intel/igb/igb_xsk.c
@@ -351,6 +351,9 @@ int igb_clean_rx_irq_zc(struct igb_q_vector *q_vector,
u16 entries_to_alloc;
struct sk_buff *skb;
+ if (test_bit(__IGB_DOWN, &adapter->state))
+ return 0;
+
/* xdp_prog cannot be NULL in the ZC path */
xdp_prog = READ_ONCE(rx_ring->xdp_prog);
--
2.51.0
* [PATCH net 2/3] igb: skip reset in igb_tx_timeout() during XDP transition
2026-03-06 21:13 [PATCH net 0/3] igb: fix TX stall during XDP teardown with AF_XDP zero-copy Alex Dvoretsky
2026-03-06 21:13 ` [PATCH net 1/3] igb: check __IGB_DOWN in igb_clean_rx_irq_zc() Alex Dvoretsky
@ 2026-03-06 21:13 ` Alex Dvoretsky
2026-03-10 7:46 ` [Intel-wired-lan] " Loktionov, Aleksandr
2026-03-06 21:13 ` [PATCH net 3/3] igb: add XDP transition guards in igb_xdp_setup() Alex Dvoretsky
2 siblings, 1 reply; 12+ messages in thread
From: Alex Dvoretsky @ 2026-03-06 21:13 UTC (permalink / raw)
To: intel-wired-lan
Cc: netdev, anthony.l.nguyen, przemyslaw.kitszel, stable, kurt,
maciej.fijalkowski, Alex Dvoretsky
When igb_xdp_setup() transitions between XDP and non-XDP mode on a
running device, it calls igb_close() followed by igb_open(). During
this window the adapter is down while trans_start still contains the
timestamp from before igb_close(), so the TX watchdog can fire a
spurious timeout.
The resulting schedule_work(&adapter->reset_task) races with the
igb_open() path: the reset task may run while the device is being
brought back up, or immediately after, causing unexpected ring
reinitialisation and register writes.
Fix this by checking __IGB_DOWN at the top of igb_tx_timeout(). A
reset is unnecessary because the device will be fully reinitialised
by the subsequent igb_open().
Fixes: 9cbc948b5a20 ("igb: add XDP support")
Cc: stable@vger.kernel.org
Signed-off-by: Alex Dvoretsky <advoretsky@gmail.com>
---
drivers/net/ethernet/intel/igb/igb_main.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 223a10cae4a9..ddb7ce9e97bf 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -6651,6 +6651,10 @@ static void igb_tx_timeout(struct net_device *netdev, unsigned int __always_unus
struct igb_adapter *adapter = netdev_priv(netdev);
struct e1000_hw *hw = &adapter->hw;
+ /* Ignore timeout if the adapter is going down. */
+ if (test_bit(__IGB_DOWN, &adapter->state))
+ return;
+
/* Do the reset outside of interrupt context */
adapter->tx_timeout_count++;
--
2.51.0
* [PATCH net 3/3] igb: add XDP transition guards in igb_xdp_setup()
2026-03-06 21:13 [PATCH net 0/3] igb: fix TX stall during XDP teardown with AF_XDP zero-copy Alex Dvoretsky
2026-03-06 21:13 ` [PATCH net 1/3] igb: check __IGB_DOWN in igb_clean_rx_irq_zc() Alex Dvoretsky
2026-03-06 21:13 ` [PATCH net 2/3] igb: skip reset in igb_tx_timeout() during XDP transition Alex Dvoretsky
@ 2026-03-06 21:13 ` Alex Dvoretsky
2026-03-10 7:47 ` [Intel-wired-lan] " Loktionov, Aleksandr
2 siblings, 1 reply; 12+ messages in thread
From: Alex Dvoretsky @ 2026-03-06 21:13 UTC (permalink / raw)
To: intel-wired-lan
Cc: netdev, anthony.l.nguyen, przemyslaw.kitszel, stable, kurt,
maciej.fijalkowski, Alex Dvoretsky
igb_xdp_setup() calls igb_close() + igb_open() when transitioning
between XDP and non-XDP mode on a running device. This has two issues:
1. ndo_xsk_wakeup() runs under rcu_read_lock() and may still access
the rings while igb_xdp_setup() removes the XDP program. Without
waiting for an RCU grace period, igb_close() can tear down the
rings while ndo_xsk_wakeup() is still executing. Add
synchronize_rcu() before igb_close() when removing an XDP program
to ensure all in-flight RCU readers complete first.
2. The igb_close()/igb_open() window leaves trans_start stale from
before the close: the TX watchdog can fire a spurious timeout and
queue a reset_task that races with igb_open(). Add
netif_trans_update() after igb_open() to refresh the timestamp, and
cancel_work() to cancel any reset_task that may have been queued
while the device was down.
Note: cancel_work_sync() cannot be used here because igb_reset_task()
takes rtnl_lock, which is already held by the ndo_bpf caller. Plain
cancel_work() is sufficient: if reset_task is already running, it blocks
on rtnl_lock and will check __IGB_DOWN when it acquires it.
Fixes: 9cbc948b5a20 ("igb: add XDP support")
Cc: stable@vger.kernel.org
Signed-off-by: Alex Dvoretsky <advoretsky@gmail.com>
---
drivers/net/ethernet/intel/igb/igb_main.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index ddb7ce9e97bf..9ba944bf67b4 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -2913,6 +2913,9 @@ static int igb_xdp_setup(struct net_device *dev, struct netdev_bpf *bpf)
/* device is up and bpf is added/removed, must setup the RX queues */
if (need_reset && running) {
+ if (!prog)
+ /* Wait for RCU readers (e.g. ndo_xsk_wakeup). */
+ synchronize_rcu();
igb_close(dev);
} else {
for (i = 0; i < adapter->num_rx_queues; i++)
@@ -2936,6 +2939,14 @@ static int igb_xdp_setup(struct net_device *dev, struct netdev_bpf *bpf)
if (running)
igb_open(dev);
+ /* Refresh watchdog timestamp after reopen and cancel any
+ * reset task queued while the device was down.
+ */
+ if (need_reset && running) {
+ netif_trans_update(dev);
+ cancel_work(&adapter->reset_task);
+ }
+
return 0;
}
--
2.51.0
* RE: [Intel-wired-lan] [PATCH net 1/3] igb: check __IGB_DOWN in igb_clean_rx_irq_zc()
2026-03-06 21:13 ` [PATCH net 1/3] igb: check __IGB_DOWN in igb_clean_rx_irq_zc() Alex Dvoretsky
@ 2026-03-10 7:46 ` Loktionov, Aleksandr
2026-03-11 8:52 ` Maciej Fijalkowski
1 sibling, 0 replies; 12+ messages in thread
From: Loktionov, Aleksandr @ 2026-03-10 7:46 UTC (permalink / raw)
To: Alex Dvoretsky, intel-wired-lan@lists.osuosl.org
Cc: netdev@vger.kernel.org, Nguyen, Anthony L, Kitszel, Przemyslaw,
stable@vger.kernel.org, kurt@linutronix.de, Fijalkowski, Maciej
> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf
> Of Alex Dvoretsky
> Sent: Friday, March 6, 2026 10:13 PM
> To: intel-wired-lan@lists.osuosl.org
> Cc: netdev@vger.kernel.org; Nguyen, Anthony L
> <anthony.l.nguyen@intel.com>; Kitszel, Przemyslaw
> <przemyslaw.kitszel@intel.com>; stable@vger.kernel.org;
> kurt@linutronix.de; Fijalkowski, Maciej
> <maciej.fijalkowski@intel.com>; Alex Dvoretsky <advoretsky@gmail.com>
> Subject: [Intel-wired-lan] [PATCH net 1/3] igb: check __IGB_DOWN in
> igb_clean_rx_irq_zc()
>
> When an AF_XDP zero-copy application terminates abruptly (e.g.,
> kill -9), the XSK buffer pool is destroyed but NAPI polling continues.
> igb_clean_rx_irq_zc() repeatedly returns the full budget (no
> descriptors, no buffers to allocate, xsk_buff_alloc() returns NULL)
> which makes napi_complete_done() re-arm the poll indefinitely.
>
> Meanwhile igb_down() calls napi_synchronize(), which waits for a NAPI
> poll cycle that completes with done < budget. This never happens, so
> igb_down() blocks indefinitely. The 5-second TX watchdog fires because
> no TX completions are processed while NAPI is stuck. Since igb_down()
> never finishes, igb_up() is never called, and the TX queue remains
> permanently stalled.
>
> Fix this by adding an __IGB_DOWN check at the top of
> igb_clean_rx_irq_zc(), returning 0 immediately when the adapter is
> going down. This allows napi_synchronize() in igb_down() to complete,
> matching the pattern already used in igb_clean_tx_irq().
>
> Fixes: 2c6196013f84 ("igb: Add AF_XDP zero-copy Rx support")
> Cc: stable@vger.kernel.org
> Signed-off-by: Alex Dvoretsky <advoretsky@gmail.com>
> ---
> drivers/net/ethernet/intel/igb/igb_xsk.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/drivers/net/ethernet/intel/igb/igb_xsk.c b/drivers/net/ethernet/intel/igb/igb_xsk.c
> index 30ce5fbb5b77..ca4aa4d935d5 100644
> --- a/drivers/net/ethernet/intel/igb/igb_xsk.c
> +++ b/drivers/net/ethernet/intel/igb/igb_xsk.c
> @@ -351,6 +351,9 @@ int igb_clean_rx_irq_zc(struct igb_q_vector *q_vector,
> u16 entries_to_alloc;
> struct sk_buff *skb;
>
> + if (test_bit(__IGB_DOWN, &adapter->state))
> + return 0;
> +
> /* xdp_prog cannot be NULL in the ZC path */
> xdp_prog = READ_ONCE(rx_ring->xdp_prog);
>
> --
> 2.51.0
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
* RE: [Intel-wired-lan] [PATCH net 2/3] igb: skip reset in igb_tx_timeout() during XDP transition
2026-03-06 21:13 ` [PATCH net 2/3] igb: skip reset in igb_tx_timeout() during XDP transition Alex Dvoretsky
@ 2026-03-10 7:46 ` Loktionov, Aleksandr
0 siblings, 0 replies; 12+ messages in thread
From: Loktionov, Aleksandr @ 2026-03-10 7:46 UTC (permalink / raw)
To: Alex Dvoretsky, intel-wired-lan@lists.osuosl.org
Cc: netdev@vger.kernel.org, Nguyen, Anthony L, Kitszel, Przemyslaw,
stable@vger.kernel.org, kurt@linutronix.de, Fijalkowski, Maciej
> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf
> Of Alex Dvoretsky
> Sent: Friday, March 6, 2026 10:13 PM
> To: intel-wired-lan@lists.osuosl.org
> Cc: netdev@vger.kernel.org; Nguyen, Anthony L
> <anthony.l.nguyen@intel.com>; Kitszel, Przemyslaw
> <przemyslaw.kitszel@intel.com>; stable@vger.kernel.org;
> kurt@linutronix.de; Fijalkowski, Maciej
> <maciej.fijalkowski@intel.com>; Alex Dvoretsky <advoretsky@gmail.com>
> Subject: [Intel-wired-lan] [PATCH net 2/3] igb: skip reset in
> igb_tx_timeout() during XDP transition
>
> When igb_xdp_setup() transitions between XDP and non-XDP mode on a
> running device, it calls igb_close() followed by igb_open(). During
> this window the adapter is down while trans_start still contains the
> timestamp from before igb_close(), so the TX watchdog can fire a
> spurious timeout.
>
> The resulting schedule_work(&adapter->reset_task) races with the
> igb_open() path: the reset task may run while the device is being
> brought back up, or immediately after, causing unexpected ring
> reinitialisation and register writes.
>
> Fix this by checking __IGB_DOWN at the top of igb_tx_timeout(). A
> reset is unnecessary because the device will be fully reinitialised by
> the subsequent igb_open().
>
> Fixes: 9cbc948b5a20 ("igb: add XDP support")
> Cc: stable@vger.kernel.org
> Signed-off-by: Alex Dvoretsky <advoretsky@gmail.com>
> ---
> drivers/net/ethernet/intel/igb/igb_main.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
> index 223a10cae4a9..ddb7ce9e97bf 100644
> --- a/drivers/net/ethernet/intel/igb/igb_main.c
> +++ b/drivers/net/ethernet/intel/igb/igb_main.c
> @@ -6651,6 +6651,10 @@ static void igb_tx_timeout(struct net_device *netdev, unsigned int __always_unus
> struct igb_adapter *adapter = netdev_priv(netdev);
> struct e1000_hw *hw = &adapter->hw;
>
> + /* Ignore timeout if the adapter is going down. */
> + if (test_bit(__IGB_DOWN, &adapter->state))
> + return;
> +
> /* Do the reset outside of interrupt context */
> adapter->tx_timeout_count++;
>
> --
> 2.51.0
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
* RE: [Intel-wired-lan] [PATCH net 3/3] igb: add XDP transition guards in igb_xdp_setup()
2026-03-06 21:13 ` [PATCH net 3/3] igb: add XDP transition guards in igb_xdp_setup() Alex Dvoretsky
@ 2026-03-10 7:47 ` Loktionov, Aleksandr
0 siblings, 0 replies; 12+ messages in thread
From: Loktionov, Aleksandr @ 2026-03-10 7:47 UTC (permalink / raw)
To: Alex Dvoretsky, intel-wired-lan@lists.osuosl.org
Cc: netdev@vger.kernel.org, Nguyen, Anthony L, Kitszel, Przemyslaw,
stable@vger.kernel.org, kurt@linutronix.de, Fijalkowski, Maciej
> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf
> Of Alex Dvoretsky
> Sent: Friday, March 6, 2026 10:13 PM
> To: intel-wired-lan@lists.osuosl.org
> Cc: netdev@vger.kernel.org; Nguyen, Anthony L
> <anthony.l.nguyen@intel.com>; Kitszel, Przemyslaw
> <przemyslaw.kitszel@intel.com>; stable@vger.kernel.org;
> kurt@linutronix.de; Fijalkowski, Maciej
> <maciej.fijalkowski@intel.com>; Alex Dvoretsky <advoretsky@gmail.com>
> Subject: [Intel-wired-lan] [PATCH net 3/3] igb: add XDP transition
> guards in igb_xdp_setup()
>
> igb_xdp_setup() calls igb_close() + igb_open() when transitioning
> between XDP and non-XDP mode on a running device. This has two issues:
>
> 1. ndo_xsk_wakeup() runs under rcu_read_lock() and may still access
> the rings while igb_xdp_setup() removes the XDP program. Without
> waiting for an RCU grace period, igb_close() can tear down the
> rings while ndo_xsk_wakeup() is still executing. Add
> synchronize_rcu() before igb_close() when removing an XDP program
> to ensure all in-flight RCU readers complete first.
>
> 2. The igb_close()/igb_open() window leaves trans_start stale from
> before the close: the TX watchdog can fire a spurious timeout and
> queue a reset_task that races with igb_open(). Add
> netif_trans_update() after igb_open() to refresh the timestamp, and
> cancel_work() to cancel any reset_task that may have been queued
> while the device was down.
>
> Note: cancel_work_sync() cannot be used here because igb_reset_task()
> takes rtnl_lock, which is already held by the ndo_bpf caller. Plain
> cancel_work() is sufficient: if reset_task is already running, it
> blocks on rtnl_lock and will check __IGB_DOWN when it acquires it.
>
> Fixes: 9cbc948b5a20 ("igb: add XDP support")
> Cc: stable@vger.kernel.org
> Signed-off-by: Alex Dvoretsky <advoretsky@gmail.com>
> ---
> drivers/net/ethernet/intel/igb/igb_main.c | 11 +++++++++++
> 1 file changed, 11 insertions(+)
>
> diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
> index ddb7ce9e97bf..9ba944bf67b4 100644
> --- a/drivers/net/ethernet/intel/igb/igb_main.c
> +++ b/drivers/net/ethernet/intel/igb/igb_main.c
> @@ -2913,6 +2913,9 @@ static int igb_xdp_setup(struct net_device *dev, struct netdev_bpf *bpf)
>
> /* device is up and bpf is added/removed, must setup the RX queues */
> if (need_reset && running) {
> + if (!prog)
> + /* Wait for RCU readers (e.g. ndo_xsk_wakeup). */
> + synchronize_rcu();
> igb_close(dev);
> } else {
> for (i = 0; i < adapter->num_rx_queues; i++)
> @@ -2936,6 +2939,14 @@ static int igb_xdp_setup(struct net_device *dev, struct netdev_bpf *bpf)
> if (running)
> igb_open(dev);
>
> + /* Refresh watchdog timestamp after reopen and cancel any
> + * reset task queued while the device was down.
> + */
> + if (need_reset && running) {
> + netif_trans_update(dev);
> + cancel_work(&adapter->reset_task);
> + }
> +
> return 0;
> }
>
> --
> 2.51.0
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
* Re: [PATCH net 1/3] igb: check __IGB_DOWN in igb_clean_rx_irq_zc()
2026-03-06 21:13 ` [PATCH net 1/3] igb: check __IGB_DOWN in igb_clean_rx_irq_zc() Alex Dvoretsky
2026-03-10 7:46 ` [Intel-wired-lan] " Loktionov, Aleksandr
@ 2026-03-11 8:52 ` Maciej Fijalkowski
2026-03-11 20:45 ` [PATCH net v2] igb: remove napi_synchronize() in igb_down() Alex Dvoretsky
1 sibling, 1 reply; 12+ messages in thread
From: Maciej Fijalkowski @ 2026-03-11 8:52 UTC (permalink / raw)
To: Alex Dvoretsky
Cc: intel-wired-lan, netdev, anthony.l.nguyen, przemyslaw.kitszel,
stable, kurt
On Fri, Mar 06, 2026 at 10:13:08PM +0100, Alex Dvoretsky wrote:
> When an AF_XDP zero-copy application terminates abruptly (e.g.,
> kill -9), the XSK buffer pool is destroyed but NAPI polling continues.
> igb_clean_rx_irq_zc() repeatedly returns the full budget (no
> descriptors, no buffers to allocate, xsk_buff_alloc() returns NULL)
> which makes napi_complete_done() re-arm the poll indefinitely.
>
> Meanwhile igb_down() calls napi_synchronize(), which waits for a NAPI
> poll cycle that completes with done < budget. This never happens, so
> igb_down() blocks indefinitely. The 5-second TX watchdog fires because
> no TX completions are processed while NAPI is stuck. Since igb_down()
> never finishes, igb_up() is never called, and the TX queue remains
> permanently stalled.
>
> Fix this by adding an __IGB_DOWN check at the top of
> igb_clean_rx_irq_zc(), returning 0 immediately when the adapter is
> going down. This allows napi_synchronize() in igb_down() to complete,
> matching the pattern already used in igb_clean_tx_irq().
How about getting rid of napi_synchronize() instead of hurting hot path?
napi_disable() sets NAPI_STATE_DISABLE which should prevent further polls
for you. Did you try that approach?
>
> Fixes: 2c6196013f84 ("igb: Add AF_XDP zero-copy Rx support")
> Cc: stable@vger.kernel.org
> Signed-off-by: Alex Dvoretsky <advoretsky@gmail.com>
> ---
> drivers/net/ethernet/intel/igb/igb_xsk.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/drivers/net/ethernet/intel/igb/igb_xsk.c b/drivers/net/ethernet/intel/igb/igb_xsk.c
> index 30ce5fbb5b77..ca4aa4d935d5 100644
> --- a/drivers/net/ethernet/intel/igb/igb_xsk.c
> +++ b/drivers/net/ethernet/intel/igb/igb_xsk.c
> @@ -351,6 +351,9 @@ int igb_clean_rx_irq_zc(struct igb_q_vector *q_vector,
> u16 entries_to_alloc;
> struct sk_buff *skb;
>
> + if (test_bit(__IGB_DOWN, &adapter->state))
> + return 0;
> +
> /* xdp_prog cannot be NULL in the ZC path */
> xdp_prog = READ_ONCE(rx_ring->xdp_prog);
>
> --
> 2.51.0
>
* [PATCH net v2] igb: remove napi_synchronize() in igb_down()
2026-03-11 8:52 ` Maciej Fijalkowski
@ 2026-03-11 20:45 ` Alex Dvoretsky
2026-03-12 8:53 ` Loktionov, Aleksandr
0 siblings, 1 reply; 12+ messages in thread
From: Alex Dvoretsky @ 2026-03-11 20:45 UTC (permalink / raw)
To: intel-wired-lan
Cc: netdev, maciej.fijalkowski, aleksandr.loktionov, anthony.l.nguyen,
przemyslaw.kitszel, kurt, stable, Alex Dvoretsky
When an AF_XDP zero-copy application terminates abruptly (e.g., kill -9),
the XSK buffer pool is destroyed but NAPI polling continues.
igb_clean_rx_irq_zc() repeatedly returns the full budget, preventing
napi_complete_done() from clearing NAPI_STATE_SCHED.
igb_down() calls napi_synchronize() before napi_disable() for each queue
vector. napi_synchronize() spins waiting for NAPI_STATE_SCHED to clear,
which never happens. igb_down() blocks indefinitely, the TX watchdog
fires, and the TX queue remains permanently stalled.
napi_disable() already handles this correctly: it sets NAPI_STATE_DISABLE.
After a full-budget poll, __napi_poll() checks napi_disable_pending(). If
set, it forces completion and clears NAPI_STATE_SCHED, breaking the loop
that napi_synchronize() cannot.
napi_synchronize() was added in commit 41f149a285da ("igb: Fix possible
panic caused by Rx traffic arrival while interface is down").
napi_disable() provides stronger guarantees: it prevents further
scheduling and waits for any active poll to exit.
Other Intel drivers (ixgbe, ice, i40e) use napi_disable() without a
preceding napi_synchronize() in their down paths.
Remove redundant napi_synchronize() call.
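
The difference between waiting for NAPI_STATE_SCHED to clear and
requesting a disable can be illustrated with a single-threaded model of
the two state bits. This is a sketch under stated assumptions, not the
kernel implementation: the model_* names and MODEL_BUDGET are
hypothetical, and the real code runs the poll and the disable request
concurrently.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_BUDGET 64 /* assumed NAPI weight, for illustration only */

enum {
	MODEL_SCHED   = 1u << 0, /* models NAPI_STATE_SCHED */
	MODEL_DISABLE = 1u << 1, /* models NAPI_STATE_DISABLE */
};

struct model_napi {
	unsigned int state;
};

/* A driver poll that always claims the full budget (the stuck case). */
static int model_stuck_poll(struct model_napi *n)
{
	(void)n;
	return MODEL_BUDGET;
}

/* One pass of the poll core. Returns true if the poll stays scheduled. */
static bool model_napi_poll_once(struct model_napi *n)
{
	int work = model_stuck_poll(n);

	if (work < MODEL_BUDGET) {
		n->state &= ~MODEL_SCHED; /* normal completion (simplified) */
		return false;
	}
	if (n->state & MODEL_DISABLE) {
		n->state &= ~MODEL_SCHED; /* disable pending: force completion */
		return false;
	}
	return true; /* full budget, no disable pending: poll again */
}

/* Returns the pass on which SCHED cleared, or -1 if it never did
 * within max_passes.
 */
static int model_wait_sched_clear(struct model_napi *n, bool request_disable,
				  int max_passes)
{
	assert(max_passes > 0);
	if (request_disable)
		n->state |= MODEL_DISABLE;
	for (int pass = 1; pass <= max_passes; pass++)
		if (!model_napi_poll_once(n))
			return pass;
	return -1;
}
```

Starting from state = MODEL_SCHED, waiting without a disable request
never sees the bit clear (the model returns -1), whereas requesting the
disable first clears it on the next pass.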
Fixes: 2c6196013f84 ("igb: Add AF_XDP zero-copy Rx support")
Cc: stable@vger.kernel.org
Signed-off-by: Alex Dvoretsky <advoretsky@gmail.com>
---
Thanks for the suggestion, Maciej. I tested removing napi_synchronize()
and it fixes the issue cleanly — napi_disable() handles the stuck poll
via NAPI_STATE_DISABLE without needing any hot-path changes.
v2:
- Replaced 3-patch series with single napi_synchronize() removal,
per Maciej Fijalkowski's suggestion. napi_disable() handles the
stuck NAPI poll via NAPI_STATE_DISABLE, making the __IGB_DOWN
checks in igb_clean_rx_irq_zc() and igb_tx_timeout(), and the
transition guards in igb_xdp_setup(), all unnecessary.
- Tested on Intel I210 (igb) with AF_XDP zero-copy: full E2E
traffic suite, graceful shutdown, and 5x kill-9 stress cycles.
Zero tx_timeout events.
drivers/net/ethernet/intel/igb/igb_main.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 12e8e30d8a2d..a1b3c5e4f7d2 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -2203,7 +2203,6 @@ void igb_down(struct igb_adapter *adapter)
for (i = 0; i < adapter->num_q_vectors; i++) {
if (adapter->q_vector[i]) {
- napi_synchronize(&adapter->q_vector[i]->napi);
igb_set_queue_napi(adapter, i, NULL);
napi_disable(&adapter->q_vector[i]->napi);
}
--
2.51.0
* RE: [PATCH net v2] igb: remove napi_synchronize() in igb_down()
2026-03-11 20:45 ` [PATCH net v2] igb: remove napi_synchronize() in igb_down() Alex Dvoretsky
@ 2026-03-12 8:53 ` Loktionov, Aleksandr
2026-03-12 13:52 ` [PATCH net v3] " Alex Dvoretsky
0 siblings, 1 reply; 12+ messages in thread
From: Loktionov, Aleksandr @ 2026-03-12 8:53 UTC (permalink / raw)
To: Alex Dvoretsky, intel-wired-lan@lists.osuosl.org
Cc: netdev@vger.kernel.org, Fijalkowski, Maciej, Nguyen, Anthony L,
Kitszel, Przemyslaw, kurt@linutronix.de, stable@vger.kernel.org
> -----Original Message-----
> From: Alex Dvoretsky <advoretsky@gmail.com>
> Sent: Wednesday, March 11, 2026 9:45 PM
> To: intel-wired-lan@lists.osuosl.org
> Cc: netdev@vger.kernel.org; Fijalkowski, Maciej
> <maciej.fijalkowski@intel.com>; Loktionov, Aleksandr
> <aleksandr.loktionov@intel.com>; Nguyen, Anthony L
> <anthony.l.nguyen@intel.com>; Kitszel, Przemyslaw
> <przemyslaw.kitszel@intel.com>; kurt@linutronix.de;
> stable@vger.kernel.org; Alex Dvoretsky <advoretsky@gmail.com>
> Subject: [PATCH net v2] igb: remove napi_synchronize() in igb_down()
>
> When an AF_XDP zero-copy application terminates abruptly (e.g.,
> kill -9), the XSK buffer pool is destroyed but NAPI polling continues.
> igb_clean_rx_irq_zc() repeatedly returns the full budget, preventing
> napi_complete_done() from clearing NAPI_STATE_SCHED.
>
> igb_down() calls napi_synchronize() before napi_disable() for each
> queue vector. napi_synchronize() spins waiting for NAPI_STATE_SCHED to
> clear, which never happens. igb_down() blocks indefinitely, the TX
> watchdog fires, and the TX queue remains permanently stalled.
>
> napi_disable() already handles this correctly: it sets
> NAPI_STATE_DISABLE.
> After a full-budget poll, __napi_poll() checks napi_disable_pending().
> If set, it forces completion and clears NAPI_STATE_SCHED, breaking the
> loop that napi_synchronize() cannot.
>
> napi_synchronize() was added in commit 41f149a285da ("igb: Fix
> possible panic caused by Rx traffic arrival while interface is down").
> napi_disable() provides stronger guarantees: it prevents further
> scheduling and waits for any active poll to exit.
> Other Intel drivers (ixgbe, ice, i40e) use napi_disable() without a
> preceding napi_synchronize() in their down paths.
>
> Remove redundant napi_synchronize() call.
>
> Fixes: 2c6196013f84 ("igb: Add AF_XDP zero-copy Rx support")
> Cc: stable@vger.kernel.org
> Signed-off-by: Alex Dvoretsky <advoretsky@gmail.com>
> ---
> Thanks for the suggestion, Maciej. I tested removing
> napi_synchronize() and it fixes the issue cleanly — napi_disable()
> handles the stuck poll via NAPI_STATE_DISABLE without needing any
> hot-path changes.
>
> v2:
> - Replaced 3-patch series with single napi_synchronize() removal,
> per Maciej Fijalkowski's suggestion. napi_disable() handles the
> stuck NAPI poll via NAPI_STATE_DISABLE, making the __IGB_DOWN
> checks in igb_clean_rx_irq_zc() and igb_tx_timeout(), and the
> transition guards in igb_xdp_setup(), all unnecessary.
> - Tested on Intel I210 (igb) with AF_XDP zero-copy: full E2E
> traffic suite, graceful shutdown, and 5x kill-9 stress cycles.
> Zero tx_timeout events.
>
> drivers/net/ethernet/intel/igb/igb_main.c | 1 -
> 1 file changed, 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
> index 12e8e30d8a2d..a1b3c5e4f7d2 100644
> --- a/drivers/net/ethernet/intel/igb/igb_main.c
> +++ b/drivers/net/ethernet/intel/igb/igb_main.c
> @@ -2203,7 +2203,6 @@ void igb_down(struct igb_adapter *adapter)
>
> for (i = 0; i < adapter->num_q_vectors; i++) {
> if (adapter->q_vector[i]) {
> - napi_synchronize(&adapter->q_vector[i]->napi);
> igb_set_queue_napi(adapter, i, NULL);
> napi_disable(&adapter->q_vector[i]->napi);
OK, but I'd swap the two remaining calls so that we don't modify any per-queue NAPI plumbing while the poll could still be running.
What do you think?
- igb_set_queue_napi(adapter, i, NULL);
- napi_disable(&adapter->q_vector[i]->napi);
+ napi_disable(&adapter->q_vector[i]->napi);
+ igb_set_queue_napi(adapter, i, NULL);
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
> }
> --
> 2.51.0
* [PATCH net v3] igb: remove napi_synchronize() in igb_down()
2026-03-12 8:53 ` Loktionov, Aleksandr
@ 2026-03-12 13:52 ` Alex Dvoretsky
2026-03-13 9:29 ` Maciej Fijalkowski
0 siblings, 1 reply; 12+ messages in thread
From: Alex Dvoretsky @ 2026-03-12 13:52 UTC (permalink / raw)
To: intel-wired-lan
Cc: netdev, maciej.fijalkowski, aleksandr.loktionov, anthony.l.nguyen,
przemyslaw.kitszel, kurt, stable, Alex Dvoretsky
When an AF_XDP zero-copy application terminates abruptly (e.g., kill -9),
the XSK buffer pool is destroyed but NAPI polling continues.
igb_clean_rx_irq_zc() repeatedly returns the full budget, preventing
napi_complete_done() from clearing NAPI_STATE_SCHED.
igb_down() calls napi_synchronize() before napi_disable() for each queue
vector. napi_synchronize() spins waiting for NAPI_STATE_SCHED to clear,
which never happens. igb_down() blocks indefinitely, the TX watchdog
fires, and the TX queue remains permanently stalled.
napi_disable() already handles this correctly: it sets NAPI_STATE_DISABLE.
After a full-budget poll, __napi_poll() checks napi_disable_pending(). If
set, it forces completion and clears NAPI_STATE_SCHED, breaking the loop
that napi_synchronize() cannot.
napi_synchronize() was added in commit 41f149a285da ("igb: Fix possible
panic caused by Rx traffic arrival while interface is down").
napi_disable() provides stronger guarantees: it prevents further
scheduling and waits for any active poll to exit.
Other Intel drivers (ixgbe, ice, i40e) use napi_disable() without a
preceding napi_synchronize() in their down paths.
Remove redundant napi_synchronize() call and reorder napi_disable()
before igb_set_queue_napi() so the queue-to-NAPI mapping is only
cleared after polling has fully stopped.
Fixes: 2c6196013f84 ("igb: Add AF_XDP zero-copy Rx support")
Cc: stable@vger.kernel.org
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Alex Dvoretsky <advoretsky@gmail.com>
---
Agreed, that looks cleaner — no reason to touch the NAPI plumbing while
the poll could still be running.
v3:
- Reorder napi_disable() before igb_set_queue_napi() per Aleksandr
Loktionov's suggestion.
v2:
- Replaced 3-patch series with single napi_synchronize() removal,
per Maciej Fijalkowski's suggestion. napi_disable() handles the
stuck NAPI poll via NAPI_STATE_DISABLE, making the __IGB_DOWN
checks in igb_clean_rx_irq_zc() and igb_tx_timeout(), and the
transition guards in igb_xdp_setup(), all unnecessary.
- Tested on Intel I210 (igb) with AF_XDP zero-copy: full E2E
traffic suite, graceful shutdown, and 5x kill-9 stress cycles.
Zero tx_timeout events.
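For readers who want to see the difference between the two waits in
isolation, here is a minimal userspace sketch of the behaviour described
in the commit message. It is not kernel code: model_napi, MODEL_SCHED,
MODEL_DISABLE and the model_*() helpers are hypothetical names made up for
illustration, and the real napi_disable() does more than this (it also
prevents the instance from being scheduled again). The poll routine always
reports the full budget, which is how igb_clean_rx_irq_zc() is described
as behaving once the XSK pool is gone.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for NAPI_STATE_SCHED and NAPI_STATE_DISABLE. */
enum { MODEL_SCHED = 1u << 0, MODEL_DISABLE = 1u << 1 };

struct model_napi {
	unsigned int state;
};

/* Poll routine that always claims the whole budget (the stuck case). */
static int model_poll(struct model_napi *n, int budget)
{
	(void)n;
	return budget;
}

/*
 * One pass of the core poll step. A full-budget return leaves SCHED set
 * so the instance is polled again, unless a disable is pending, in which
 * case the core completes the poll on the driver's behalf.
 */
static void model_napi_poll_once(struct model_napi *n, int budget)
{
	int work = model_poll(n, budget);

	if (work < budget) {
		n->state &= ~MODEL_SCHED;	/* normal completion */
		return;
	}
	if (n->state & MODEL_DISABLE)
		n->state &= ~MODEL_SCHED;	/* forced completion */
}

/*
 * Wait for SCHED to clear. With request_disable == false this models a
 * synchronize-style wait that only observes SCHED; with true it models a
 * disable-style wait that first marks the disable as pending.
 * Returns the number of poll passes needed, or -1 if SCHED is still set
 * after max_passes.
 */
static int model_wait_for_idle(struct model_napi *n, bool request_disable,
			       int max_passes)
{
	assert(max_passes > 0);

	if (request_disable)
		n->state |= MODEL_DISABLE;

	for (int pass = 1; pass <= max_passes; pass++) {
		model_napi_poll_once(n, 64);
		if (!(n->state & MODEL_SCHED))
			return pass;
	}
	return -1;
}
```

Starting from state == MODEL_SCHED, model_wait_for_idle(&n, false, 1000)
returns -1 because nothing ever clears the bit, while
model_wait_for_idle(&n, true, 1000) returns 1 because the first pass after
the disable request forces completion. This is the asymmetry the patch
relies on when it drops napi_synchronize() and lets napi_disable() do the
waiting.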
drivers/net/ethernet/intel/igb/igb_main.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 7c41e32256fa..0793842cb937 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -2203,9 +2203,8 @@ void igb_down(struct igb_adapter *adapter)
for (i = 0; i < adapter->num_q_vectors; i++) {
if (adapter->q_vector[i]) {
- napi_synchronize(&adapter->q_vector[i]->napi);
- igb_set_queue_napi(adapter, i, NULL);
napi_disable(&adapter->q_vector[i]->napi);
+ igb_set_queue_napi(adapter, i, NULL);
}
}
--
2.51.0
* Re: [PATCH net v3] igb: remove napi_synchronize() in igb_down()
2026-03-12 13:52 ` [PATCH net v3] " Alex Dvoretsky
@ 2026-03-13 9:29 ` Maciej Fijalkowski
0 siblings, 0 replies; 12+ messages in thread
From: Maciej Fijalkowski @ 2026-03-13 9:29 UTC (permalink / raw)
To: Alex Dvoretsky
Cc: intel-wired-lan, netdev, aleksandr.loktionov, anthony.l.nguyen,
przemyslaw.kitszel, kurt, stable
On Thu, Mar 12, 2026 at 02:52:55PM +0100, Alex Dvoretsky wrote:
> When an AF_XDP zero-copy application terminates abruptly (e.g., kill -9),
> the XSK buffer pool is destroyed but NAPI polling continues.
> igb_clean_rx_irq_zc() repeatedly returns the full budget, preventing
> napi_complete_done() from clearing NAPI_STATE_SCHED.
>
> igb_down() calls napi_synchronize() before napi_disable() for each queue
> vector. napi_synchronize() spins waiting for NAPI_STATE_SCHED to clear,
> which never happens. igb_down() blocks indefinitely, the TX watchdog
> fires, and the TX queue remains permanently stalled.
>
> napi_disable() already handles this correctly: it sets NAPI_STATE_DISABLE.
> After a full-budget poll, __napi_poll() checks napi_disable_pending(). If
> set, it forces completion and clears NAPI_STATE_SCHED, breaking the loop
> that napi_synchronize() cannot.
>
> napi_synchronize() was added in commit 41f149a285da ("igb: Fix possible
> panic caused by Rx traffic arrival while interface is down").
> napi_disable() provides stronger guarantees: it prevents further
> scheduling and waits for any active poll to exit.
> Other Intel drivers (ixgbe, ice, i40e) use napi_disable() without a
> preceding napi_synchronize() in their down paths.
>
> Remove redundant napi_synchronize() call and reorder napi_disable()
> before igb_set_queue_napi() so the queue-to-NAPI mapping is only
> cleared after polling has fully stopped.
>
> Fixes: 2c6196013f84 ("igb: Add AF_XDP zero-copy Rx support")
> Cc: stable@vger.kernel.org
> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
> Signed-off-by: Alex Dvoretsky <advoretsky@gmail.com>
Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> ---
> Agreed, that looks cleaner — no reason to touch the NAPI plumbing while
> the poll could still be running.
>
> v3:
> - Reorder napi_disable() before igb_set_queue_napi() per Aleksandr
> Loktionov's suggestion.
>
> v2:
> - Replaced 3-patch series with single napi_synchronize() removal,
> per Maciej Fijalkowski's suggestion. napi_disable() handles the
> stuck NAPI poll via NAPI_STATE_DISABLE, making the __IGB_DOWN
> checks in igb_clean_rx_irq_zc() and igb_tx_timeout(), and the
> transition guards in igb_xdp_setup(), all unnecessary.
> - Tested on Intel I210 (igb) with AF_XDP zero-copy: full E2E
> traffic suite, graceful shutdown, and 5x kill-9 stress cycles.
> Zero tx_timeout events.
>
> drivers/net/ethernet/intel/igb/igb_main.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
> index 7c41e32256fa..0793842cb937 100644
> --- a/drivers/net/ethernet/intel/igb/igb_main.c
> +++ b/drivers/net/ethernet/intel/igb/igb_main.c
> @@ -2203,9 +2203,8 @@ void igb_down(struct igb_adapter *adapter)
>
> for (i = 0; i < adapter->num_q_vectors; i++) {
> if (adapter->q_vector[i]) {
> - napi_synchronize(&adapter->q_vector[i]->napi);
> - igb_set_queue_napi(adapter, i, NULL);
> napi_disable(&adapter->q_vector[i]->napi);
> + igb_set_queue_napi(adapter, i, NULL);
> }
> }
>
> --
> 2.51.0
>
Thread overview: 12+ messages
2026-03-06 21:13 [PATCH net 0/3] igb: fix TX stall during XDP teardown with AF_XDP zero-copy Alex Dvoretsky
2026-03-06 21:13 ` [PATCH net 1/3] igb: check __IGB_DOWN in igb_clean_rx_irq_zc() Alex Dvoretsky
2026-03-10 7:46 ` [Intel-wired-lan] " Loktionov, Aleksandr
2026-03-11 8:52 ` Maciej Fijalkowski
2026-03-11 20:45 ` [PATCH net v2] igb: remove napi_synchronize() in igb_down() Alex Dvoretsky
2026-03-12 8:53 ` Loktionov, Aleksandr
2026-03-12 13:52 ` [PATCH net v3] " Alex Dvoretsky
2026-03-13 9:29 ` Maciej Fijalkowski
2026-03-06 21:13 ` [PATCH net 2/3] igb: skip reset in igb_tx_timeout() during XDP transition Alex Dvoretsky
2026-03-10 7:46 ` [Intel-wired-lan] " Loktionov, Aleksandr
2026-03-06 21:13 ` [PATCH net 3/3] igb: add XDP transition guards in igb_xdp_setup() Alex Dvoretsky
2026-03-10 7:47 ` [Intel-wired-lan] " Loktionov, Aleksandr