All of lore.kernel.org
 help / color / mirror / Atom feed
From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
To: Alex Dvoretsky <advoretsky@gmail.com>
Cc: <intel-wired-lan@lists.osuosl.org>, <netdev@vger.kernel.org>,
	<aleksandr.loktionov@intel.com>, <anthony.l.nguyen@intel.com>,
	<przemyslaw.kitszel@intel.com>, <kurt@linutronix.de>,
	<stable@vger.kernel.org>
Subject: Re: [Intel-wired-lan] [PATCH net v3] igb: remove napi_synchronize() in igb_down()
Date: Fri, 13 Mar 2026 10:29:29 +0100	[thread overview]
Message-ID: <abPY+aT0SWuixsmN@boxer> (raw)
In-Reply-To: <20260312135257.71610-1-advoretsky@gmail.com>

On Thu, Mar 12, 2026 at 02:52:55PM +0100, Alex Dvoretsky wrote:
> When an AF_XDP zero-copy application terminates abruptly (e.g., kill -9),
> the XSK buffer pool is destroyed but NAPI polling continues.
> igb_clean_rx_irq_zc() repeatedly returns the full budget, preventing
> napi_complete_done() from clearing NAPI_STATE_SCHED.
> 
> igb_down() calls napi_synchronize() before napi_disable() for each queue
> vector. napi_synchronize() spins waiting for NAPI_STATE_SCHED to clear,
> which never happens. igb_down() blocks indefinitely, the TX watchdog
> fires, and the TX queue remains permanently stalled.
> 
> napi_disable() already handles this correctly: it sets NAPI_STATE_DISABLE.
> After a full-budget poll, __napi_poll() checks napi_disable_pending(). If
> set, it forces completion and clears NAPI_STATE_SCHED, breaking the loop
> that napi_synchronize() cannot.
> 
> napi_synchronize() was added in commit 41f149a285da ("igb: Fix possible
> panic caused by Rx traffic arrival while interface is down").
> napi_disable() provides stronger guarantees: it prevents further
> scheduling and waits for any active poll to exit.
> Other Intel drivers (ixgbe, ice, i40e) use napi_disable() without a
> preceding napi_synchronize() in their down paths.
> 
> Remove redundant napi_synchronize() call and reorder napi_disable()
> before igb_set_queue_napi() so the queue-to-NAPI mapping is only
> cleared after polling has fully stopped.
> 
> Fixes: 2c6196013f84 ("igb: Add AF_XDP zero-copy Rx support")
> Cc: stable@vger.kernel.org
> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
> Signed-off-by: Alex Dvoretsky <advoretsky@gmail.com>

Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>

> ---
> Agreed, that looks cleaner — no reason to touch the NAPI plumbing while
> the poll could still be running.
> 
> v3:
>   - Reorder napi_disable() before igb_set_queue_napi() per Aleksandr
>     Loktionov's suggestion.
> 
> v2:
>   - Replaced 3-patch series with single napi_synchronize() removal,
>     per Maciej Fijalkowski's suggestion. napi_disable() handles the
>     stuck NAPI poll via NAPI_STATE_DISABLE, making the __IGB_DOWN
>     checks in igb_clean_rx_irq_zc() and igb_tx_timeout(), and the
>     transition guards in igb_xdp_setup(), all unnecessary.
>   - Tested on Intel I210 (igb) with AF_XDP zero-copy: full E2E
>     traffic suite, graceful shutdown, and 5x kill-9 stress cycles.
>     Zero tx_timeout events.
> 
>  drivers/net/ethernet/intel/igb/igb_main.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
> index 7c41e32256fa..0793842cb937 100644
> --- a/drivers/net/ethernet/intel/igb/igb_main.c
> +++ b/drivers/net/ethernet/intel/igb/igb_main.c
> @@ -2203,9 +2203,8 @@ void igb_down(struct igb_adapter *adapter)
>  
>  	for (i = 0; i < adapter->num_q_vectors; i++) {
>  		if (adapter->q_vector[i]) {
> -			napi_synchronize(&adapter->q_vector[i]->napi);
> -			igb_set_queue_napi(adapter, i, NULL);
>  			napi_disable(&adapter->q_vector[i]->napi);
> +			igb_set_queue_napi(adapter, i, NULL);
>  		}
>  	}
>  
> -- 
> 2.51.0
> 

WARNING: multiple messages have this Message-ID (diff)
From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
To: Alex Dvoretsky <advoretsky@gmail.com>
Cc: <intel-wired-lan@lists.osuosl.org>, <netdev@vger.kernel.org>,
	<aleksandr.loktionov@intel.com>, <anthony.l.nguyen@intel.com>,
	<przemyslaw.kitszel@intel.com>, <kurt@linutronix.de>,
	<stable@vger.kernel.org>
Subject: Re: [PATCH net v3] igb: remove napi_synchronize() in igb_down()
Date: Fri, 13 Mar 2026 10:29:29 +0100	[thread overview]
Message-ID: <abPY+aT0SWuixsmN@boxer> (raw)
In-Reply-To: <20260312135257.71610-1-advoretsky@gmail.com>

On Thu, Mar 12, 2026 at 02:52:55PM +0100, Alex Dvoretsky wrote:
> When an AF_XDP zero-copy application terminates abruptly (e.g., kill -9),
> the XSK buffer pool is destroyed but NAPI polling continues.
> igb_clean_rx_irq_zc() repeatedly returns the full budget, preventing
> napi_complete_done() from clearing NAPI_STATE_SCHED.
> 
> igb_down() calls napi_synchronize() before napi_disable() for each queue
> vector. napi_synchronize() spins waiting for NAPI_STATE_SCHED to clear,
> which never happens. igb_down() blocks indefinitely, the TX watchdog
> fires, and the TX queue remains permanently stalled.
> 
> napi_disable() already handles this correctly: it sets NAPI_STATE_DISABLE.
> After a full-budget poll, __napi_poll() checks napi_disable_pending(). If
> set, it forces completion and clears NAPI_STATE_SCHED, breaking the loop
> that napi_synchronize() cannot.
> 
> napi_synchronize() was added in commit 41f149a285da ("igb: Fix possible
> panic caused by Rx traffic arrival while interface is down").
> napi_disable() provides stronger guarantees: it prevents further
> scheduling and waits for any active poll to exit.
> Other Intel drivers (ixgbe, ice, i40e) use napi_disable() without a
> preceding napi_synchronize() in their down paths.
> 
> Remove redundant napi_synchronize() call and reorder napi_disable()
> before igb_set_queue_napi() so the queue-to-NAPI mapping is only
> cleared after polling has fully stopped.
> 
> Fixes: 2c6196013f84 ("igb: Add AF_XDP zero-copy Rx support")
> Cc: stable@vger.kernel.org
> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
> Signed-off-by: Alex Dvoretsky <advoretsky@gmail.com>

Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>

> ---
> Agreed, that looks cleaner — no reason to touch the NAPI plumbing while
> the poll could still be running.
> 
> v3:
>   - Reorder napi_disable() before igb_set_queue_napi() per Aleksandr
>     Loktionov's suggestion.
> 
> v2:
>   - Replaced 3-patch series with single napi_synchronize() removal,
>     per Maciej Fijalkowski's suggestion. napi_disable() handles the
>     stuck NAPI poll via NAPI_STATE_DISABLE, making the __IGB_DOWN
>     checks in igb_clean_rx_irq_zc() and igb_tx_timeout(), and the
>     transition guards in igb_xdp_setup(), all unnecessary.
>   - Tested on Intel I210 (igb) with AF_XDP zero-copy: full E2E
>     traffic suite, graceful shutdown, and 5x kill-9 stress cycles.
>     Zero tx_timeout events.
> 
>  drivers/net/ethernet/intel/igb/igb_main.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
> index 7c41e32256fa..0793842cb937 100644
> --- a/drivers/net/ethernet/intel/igb/igb_main.c
> +++ b/drivers/net/ethernet/intel/igb/igb_main.c
> @@ -2203,9 +2203,8 @@ void igb_down(struct igb_adapter *adapter)
>  
>  	for (i = 0; i < adapter->num_q_vectors; i++) {
>  		if (adapter->q_vector[i]) {
> -			napi_synchronize(&adapter->q_vector[i]->napi);
> -			igb_set_queue_napi(adapter, i, NULL);
>  			napi_disable(&adapter->q_vector[i]->napi);
> +			igb_set_queue_napi(adapter, i, NULL);
>  		}
>  	}
>  
> -- 
> 2.51.0
> 

  reply	other threads:[~2026-03-13  9:29 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-06 21:13 [Intel-wired-lan] [PATCH net 0/3] igb: fix TX stall during XDP teardown with AF_XDP zero-copy Alex Dvoretsky
2026-03-06 21:13 ` Alex Dvoretsky
2026-03-06 21:13 ` [Intel-wired-lan] [PATCH net 1/3] igb: check __IGB_DOWN in igb_clean_rx_irq_zc() Alex Dvoretsky
2026-03-06 21:13   ` Alex Dvoretsky
2026-03-10  7:46   ` [Intel-wired-lan] " Loktionov, Aleksandr
2026-03-10  7:46     ` Loktionov, Aleksandr
2026-03-11  8:52   ` Maciej Fijalkowski
2026-03-11  8:52     ` Maciej Fijalkowski
2026-03-11 20:45     ` [Intel-wired-lan] [PATCH net v2] igb: remove napi_synchronize() in igb_down() Alex Dvoretsky
2026-03-11 20:45       ` Alex Dvoretsky
2026-03-12  8:53       ` [Intel-wired-lan] " Loktionov, Aleksandr
2026-03-12  8:53         ` Loktionov, Aleksandr
2026-03-12 13:52         ` [Intel-wired-lan] [PATCH net v3] " Alex Dvoretsky
2026-03-12 13:52           ` Alex Dvoretsky
2026-03-13  9:29           ` Maciej Fijalkowski [this message]
2026-03-13  9:29             ` Maciej Fijalkowski
2026-03-30 10:16             ` [Intel-wired-lan] " Holda, Patryk
2026-03-30 10:16               ` Holda, Patryk
2026-03-06 21:13 ` [Intel-wired-lan] [PATCH net 2/3] igb: skip reset in igb_tx_timeout() during XDP transition Alex Dvoretsky
2026-03-06 21:13   ` Alex Dvoretsky
2026-03-10  7:46   ` [Intel-wired-lan] " Loktionov, Aleksandr
2026-03-10  7:46     ` Loktionov, Aleksandr
2026-03-06 21:13 ` [Intel-wired-lan] [PATCH net 3/3] igb: add XDP transition guards in igb_xdp_setup() Alex Dvoretsky
2026-03-06 21:13   ` Alex Dvoretsky
2026-03-10  7:47   ` [Intel-wired-lan] " Loktionov, Aleksandr
2026-03-10  7:47     ` Loktionov, Aleksandr

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=abPY+aT0SWuixsmN@boxer \
    --to=maciej.fijalkowski@intel.com \
    --cc=advoretsky@gmail.com \
    --cc=aleksandr.loktionov@intel.com \
    --cc=anthony.l.nguyen@intel.com \
    --cc=intel-wired-lan@lists.osuosl.org \
    --cc=kurt@linutronix.de \
    --cc=netdev@vger.kernel.org \
    --cc=przemyslaw.kitszel@intel.com \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.