public inbox for linux-kernel@vger.kernel.org
* [PATCH net v2] net: napi: Avoid gro timer misfiring at end of busypoll
@ 2026-05-06  9:08 Dragos Tatulea
From: Dragos Tatulea @ 2026-05-06  9:08 UTC
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Björn Töpel, Daniel Borkmann
  Cc: dtatulea, Gal Pressman, Joe Damato, Frederik Deweerdt,
	Martin Karsten, Tariq Toukan, Cosmin Ratiu, netdev, linux-kernel

In IRQ deferral mode (defer-hard-irqs > 0), a short enough gro-flush
timeout can expire before NAPI_STATE_SCHED is cleared if the last poll
in busy_poll_stop() takes too long. The queue is then left stuck with
interrupts disabled and no timer armed, which results in a tx timeout
if there is no subsequent busypoll cycle.

To prevent this, defer arming the gro-flush timer until after the last
poll.

Fixes: 7fd3253a7de6 ("net: Introduce preferred busy-polling")
Co-developed-by: Martin Karsten <mkarsten@uwaterloo.ca>
Signed-off-by: Martin Karsten <mkarsten@uwaterloo.ca>
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
---
Changes since RFC [1]:
- Sending only the fix to net.
- Made the commit message clearer and more succinct.
- Fixed timer arming to happen after clearing the NAPI_STATE_SCHED bit
- Arm timer after clearing NAPI_STATE_SCHED and drop IRQ disable.

[1] https://lore.kernel.org/all/20260428175134.1197036-3-dtatulea@nvidia.com/
---
 net/core/dev.c | 21 ++++++++++++---------
 1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 06c195906231..3ebd69988d51 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6862,9 +6862,9 @@ static void skb_defer_free_flush(void)
 
 #if defined(CONFIG_NET_RX_BUSY_POLL)
 
-static void __busy_poll_stop(struct napi_struct *napi, bool skip_schedule)
+static void __busy_poll_stop(struct napi_struct *napi, unsigned long timeout)
 {
-	if (!skip_schedule) {
+	if (!timeout) {
 		gro_normal_list(&napi->gro);
 		__napi_schedule(napi);
 		return;
@@ -6874,6 +6874,8 @@ static void __busy_poll_stop(struct napi_struct *napi, bool skip_schedule)
 	gro_flush_normal(&napi->gro, HZ >= 1000);
 
 	clear_bit(NAPI_STATE_SCHED, &napi->state);
+	hrtimer_start(&napi->timer, ns_to_ktime(timeout),
+		      HRTIMER_MODE_REL_PINNED);
 }
 
 enum {
@@ -6885,8 +6887,7 @@ static void busy_poll_stop(struct napi_struct *napi, void *have_poll_lock,
 			   unsigned flags, u16 budget)
 {
 	struct bpf_net_context __bpf_net_ctx, *bpf_net_ctx;
-	bool skip_schedule = false;
-	unsigned long timeout;
+	unsigned long timeout = 0;
 	int rc;
 
 	/* Busy polling means there is a high chance device driver hard irq
@@ -6906,10 +6907,12 @@ static void busy_poll_stop(struct napi_struct *napi, void *have_poll_lock,
 
 	if (flags & NAPI_F_PREFER_BUSY_POLL) {
 		napi->defer_hard_irqs_count = napi_get_defer_hard_irqs(napi);
-		timeout = napi_get_gro_flush_timeout(napi);
-		if (napi->defer_hard_irqs_count && timeout) {
-			hrtimer_start(&napi->timer, ns_to_ktime(timeout), HRTIMER_MODE_REL_PINNED);
-			skip_schedule = true;
+		if (napi->defer_hard_irqs_count) {
+			/* A short enough gro flush timeout and long enough
+			 * poll can result in timer firing too early.
+			 * Timer will be armed later if necessary.
+			 */
+			timeout = napi_get_gro_flush_timeout(napi);
 		}
 	}
 
@@ -6924,7 +6927,7 @@ static void busy_poll_stop(struct napi_struct *napi, void *have_poll_lock,
 	trace_napi_poll(napi, rc, budget);
 	netpoll_poll_unlock(have_poll_lock);
 	if (rc == budget)
-		__busy_poll_stop(napi, skip_schedule);
+		__busy_poll_stop(napi, timeout);
 	bpf_net_ctx_clear(bpf_net_ctx);
 	local_bh_enable();
 }
-- 
2.43.0



* Re: [PATCH net v2] net: napi: Avoid gro timer misfiring at end of busypoll
@ 2026-05-06 20:49 ` Joe Damato
From: Joe Damato @ 2026-05-06 20:49 UTC
  To: Dragos Tatulea
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Björn Töpel, Daniel Borkmann,
	Gal Pressman, Frederik Deweerdt, Martin Karsten, Tariq Toukan,
	Cosmin Ratiu, netdev, linux-kernel

On Wed, May 06, 2026 at 09:08:08AM +0000, Dragos Tatulea wrote:
> In IRQ deferral mode (defer-hard-irqs > 0), a short enough gro-flush
> timeout can expire before NAPI_STATE_SCHED is cleared if the last poll
> in busy_poll_stop() takes too long. The queue is then left stuck with
> interrupts disabled and no timer armed, which results in a tx timeout
> if there is no subsequent busypoll cycle.
> 
> To prevent this, defer arming the gro-flush timer until after the last
> poll.
> 
> Fixes: 7fd3253a7de6 ("net: Introduce preferred busy-polling")
> Co-developed-by: Martin Karsten <mkarsten@uwaterloo.ca>
> Signed-off-by: Martin Karsten <mkarsten@uwaterloo.ca>
> Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
> Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
> ---
> Changes since RFC [1]:
> - Sending only the fix to net.
> - Made the commit message clearer and more succinct.
> - Fixed timer arming to happen after clearing the NAPI_STATE_SCHED bit
> - Arm timer after clearing NAPI_STATE_SCHED and drop IRQ disable.
> 
> [1] https://lore.kernel.org/all/20260428175134.1197036-3-dtatulea@nvidia.com/
> ---
>  net/core/dev.c | 21 ++++++++++++---------
>  1 file changed, 12 insertions(+), 9 deletions(-)

Good catch on this bug. This fix looks right to me.

Reviewed-by: Joe Damato <joe@dama.to>
