From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
To: Alex Dvoretsky <advoretsky@gmail.com>
Cc: <intel-wired-lan@lists.osuosl.org>, <netdev@vger.kernel.org>,
	<anthony.l.nguyen@intel.com>, <przemyslaw.kitszel@intel.com>,
	<stable@vger.kernel.org>, <kurt@linutronix.de>
Subject: Re: [Intel-wired-lan] [PATCH net 1/3] igb: check __IGB_DOWN in igb_clean_rx_irq_zc()
Date: Wed, 11 Mar 2026 09:52:19 +0100
Message-ID: <abEtQwISGizUXIwf@boxer>
In-Reply-To: <20260306211310.1213330-2-advoretsky@gmail.com>

On Fri, Mar 06, 2026 at 10:13:08PM +0100, Alex Dvoretsky wrote:
> When an AF_XDP zero-copy application terminates abruptly (e.g.,
> kill -9), the XSK buffer pool is destroyed but NAPI polling continues.
> igb_clean_rx_irq_zc() repeatedly returns the full budget (no
> descriptors, no buffers to allocate, xsk_buff_alloc() returns NULL)
> which makes napi_complete_done() re-arm the poll indefinitely.
> 
> Meanwhile igb_down() calls napi_synchronize(), which waits for a NAPI
> poll cycle that completes with done < budget. This never happens, so
> igb_down() blocks indefinitely. The 5-second TX watchdog fires because
> no TX completions are processed while NAPI is stuck. Since igb_down()
> never finishes, igb_up() is never called, and the TX queue remains
> permanently stalled.
> 
> Fix this by adding an __IGB_DOWN check at the top of
> igb_clean_rx_irq_zc(), returning 0 immediately when the adapter is
> going down. This allows napi_synchronize() in igb_down() to complete,
> matching the pattern already used in igb_clean_tx_irq().

How about getting rid of napi_synchronize() instead of hurting the hot path?

napi_disable() sets NAPI_STATE_DISABLE, which should prevent further polls
for you. Did you try that approach?
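
A toy userspace sketch of the difference behind this suggestion, not kernel
code. It assumes a waiter in the style of napi_synchronize() only watches
the SCHED bit, while a waiter in the style of napi_disable() first sets a
DISABLE bit that the poll loop honours even after a full-budget poll. All
toy_* names are hypothetical, and the model is single-threaded and ignores
the fact that the real napi_disable() also takes ownership of SCHED.

```c
#include <assert.h>

/* Toy stand-ins for NAPI_STATE_SCHED and NAPI_STATE_DISABLE. */
enum {
	TOY_STATE_SCHED   = 1u << 0,
	TOY_STATE_DISABLE = 1u << 1,
};

struct toy_napi {
	unsigned int state;
	int budget;
};

/* Models the stuck ZC Rx poll from the report: always claims the full budget. */
static int toy_poll_full_budget(const struct toy_napi *n)
{
	return n->budget;
}

/*
 * One poll pass. A poll that used its full budget stays scheduled,
 * unless a disable is pending, in which case it is completed anyway.
 */
static void toy_poll_once(struct toy_napi *n)
{
	int done;

	if (!(n->state & TOY_STATE_SCHED))
		return;

	done = toy_poll_full_budget(n);
	if (done < n->budget || (n->state & TOY_STATE_DISABLE))
		n->state &= ~TOY_STATE_SCHED;
}

/* Returns the pass on which SCHED cleared, or -1 if it never did. */
static int toy_passes_until_idle(struct toy_napi *n, int max_passes)
{
	assert(max_passes > 0);

	for (int pass = 1; pass <= max_passes; pass++) {
		toy_poll_once(n);
		if (!(n->state & TOY_STATE_SCHED))
			return pass;
	}
	return -1;
}

/* Waiter in the style of napi_synchronize(): only watches SCHED. */
static int toy_synchronize(struct toy_napi *n, int max_passes)
{
	return toy_passes_until_idle(n, max_passes);
}

/* Waiter in the style of napi_disable(): flags DISABLE first, then waits. */
static int toy_disable(struct toy_napi *n, int max_passes)
{
	n->state |= TOY_STATE_DISABLE;
	return toy_passes_until_idle(n, max_passes);
}
```

With a scheduled toy_napi whose poll always returns the full budget,
toy_synchronize() gives up after max_passes and returns -1, while
toy_disable() returns after the first pass.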

> 
> Fixes: 2c6196013f84 ("igb: Add AF_XDP zero-copy Rx support")
> Cc: stable@vger.kernel.org
> Signed-off-by: Alex Dvoretsky <advoretsky@gmail.com>
> ---
>  drivers/net/ethernet/intel/igb/igb_xsk.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/net/ethernet/intel/igb/igb_xsk.c b/drivers/net/ethernet/intel/igb/igb_xsk.c
> index 30ce5fbb5b77..ca4aa4d935d5 100644
> --- a/drivers/net/ethernet/intel/igb/igb_xsk.c
> +++ b/drivers/net/ethernet/intel/igb/igb_xsk.c
> @@ -351,6 +351,9 @@ int igb_clean_rx_irq_zc(struct igb_q_vector *q_vector,
>  	u16 entries_to_alloc;
>  	struct sk_buff *skb;
>  
> +	if (test_bit(__IGB_DOWN, &adapter->state))
> +		return 0;
> +
>  	/* xdp_prog cannot be NULL in the ZC path */
>  	xdp_prog = READ_ONCE(rx_ring->xdp_prog);
>  
> -- 
> 2.51.0
> 

Thread overview:
2026-03-06 21:13 [Intel-wired-lan] [PATCH net 0/3] igb: fix TX stall during XDP teardown with AF_XDP zero-copy Alex Dvoretsky
2026-03-06 21:13 ` [Intel-wired-lan] [PATCH net 1/3] igb: check __IGB_DOWN in igb_clean_rx_irq_zc() Alex Dvoretsky
2026-03-10  7:46   ` [Intel-wired-lan] " Loktionov, Aleksandr
2026-03-11  8:52   ` Maciej Fijalkowski [this message]
2026-03-11 20:45     ` [Intel-wired-lan] [PATCH net v2] igb: remove napi_synchronize() in igb_down() Alex Dvoretsky
2026-03-12  8:53       ` [Intel-wired-lan] " Loktionov, Aleksandr
2026-03-12 13:52         ` [Intel-wired-lan] [PATCH net v3] " Alex Dvoretsky
2026-03-13  9:29           ` [Intel-wired-lan] " Maciej Fijalkowski
2026-03-30 10:16             ` [Intel-wired-lan] " Holda, Patryk
2026-03-06 21:13 ` [Intel-wired-lan] [PATCH net 2/3] igb: skip reset in igb_tx_timeout() during XDP transition Alex Dvoretsky
2026-03-10  7:46   ` [Intel-wired-lan] " Loktionov, Aleksandr
2026-03-06 21:13 ` [Intel-wired-lan] [PATCH net 3/3] igb: add XDP transition guards in igb_xdp_setup() Alex Dvoretsky
2026-03-10  7:47   ` [Intel-wired-lan] " Loktionov, Aleksandr