public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] net_sched: codel: fix stale state for empty flows in fq_codel
@ 2026-03-23 17:49 hawk
  2026-03-23 18:48 ` Eric Dumazet
  2026-03-25  4:10 ` patchwork-bot+netdevbpf
  0 siblings, 2 replies; 3+ messages in thread
From: hawk @ 2026-03-23 17:49 UTC (permalink / raw)
  To: netdev
  Cc: davem, edumazet, kuba, pabeni, horms, linux-kernel, kernel-team,
	Jonas Köppeler, Jesper Dangaard Brouer, Chris Arges,
	Toke Høiland-Jørgensen

From: Jonas Köppeler <j.koeppeler@tu-berlin.de>

When codel_dequeue() finds an empty queue, it resets vars->dropping
but does not reset vars->first_above_time.  The reference CoDel
algorithm (Nichols & Jacobson, ACM Queue 2012) resets both:

  dodeque_result codel_queue_t::dodeque(time_t now) {
      ...
      if (r.p == NULL) {
          first_above_time = 0;   // <-- Linux omits this
      }
      ...
  }

Note that codel_should_drop() does reset first_above_time when called
with a NULL skb, but codel_dequeue() returns early before ever calling
codel_should_drop() in the empty-queue case.  The post-drop code paths
do reach codel_should_drop(NULL) and correctly reset the timer, so a
dropped packet breaks the cycle -- but the next delivered packet
re-arms first_above_time and the cycle repeats.

For sparse flows such as ICMP ping (one packet every 200ms-1s), the
first packet arms first_above_time, the flow goes empty, and the
second packet arrives after the interval has elapsed and gets dropped.
The pattern repeats, producing sustained loss on flows that are not
actually congested.

Test: veth pair, fq_codel, BQL disabled, 30000 iptables rules in the
consumer namespace (NAPI-64 cycle ~14ms, well above fq_codel's 5ms
target), ping at 5 pps under UDP flood:

  Before fix:  26% ping packet loss
  After fix:    0% ping packet loss

Fix by resetting first_above_time to zero in the empty-queue path
of codel_dequeue(), matching the reference algorithm.

Fixes: 76e3cc126bb2 ("codel: Controlled Delay AQM")
Fixes: d068ca2ae2e6 ("codel: split into multiple files")
Co-developed-by: Jesper Dangaard Brouer <hawk@kernel.org>
Signed-off-by: Jesper Dangaard Brouer <hawk@kernel.org>
Signed-off-by: Jonas Köppeler <j.koeppeler@tu-berlin.de>
Reported-by: Chris Arges <carges@cloudflare.com>
Tested-by: Jonas Köppeler <j.koeppeler@tu-berlin.de>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/all/20260318134826.1281205-7-hawk@kernel.org/
---
 include/net/codel_impl.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/net/codel_impl.h b/include/net/codel_impl.h
index 78a27ac73070..b2c359c6dd1b 100644
--- a/include/net/codel_impl.h
+++ b/include/net/codel_impl.h
@@ -158,6 +158,7 @@ static struct sk_buff *codel_dequeue(void *ctx,
 	bool drop;
 
 	if (!skb) {
+		vars->first_above_time = 0;
 		vars->dropping = false;
 		return skb;
 	}
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] net_sched: codel: fix stale state for empty flows in fq_codel
  2026-03-23 17:49 [PATCH] net_sched: codel: fix stale state for empty flows in fq_codel hawk
@ 2026-03-23 18:48 ` Eric Dumazet
  2026-03-25  4:10 ` patchwork-bot+netdevbpf
  1 sibling, 0 replies; 3+ messages in thread
From: Eric Dumazet @ 2026-03-23 18:48 UTC (permalink / raw)
  To: hawk
  Cc: netdev, davem, kuba, pabeni, horms, linux-kernel, kernel-team,
	Jonas Köppeler, Chris Arges,
	Toke Høiland-Jørgensen

On Mon, Mar 23, 2026 at 10:49 AM <hawk@kernel.org> wrote:
>
> From: Jonas Köppeler <j.koeppeler@tu-berlin.de>
>
> When codel_dequeue() finds an empty queue, it resets vars->dropping
> but does not reset vars->first_above_time.  The reference CoDel
> algorithm (Nichols & Jacobson, ACM Queue 2012) resets both:
>
>   dodeque_result codel_queue_t::dodeque(time_t now) {
>       ...
>       if (r.p == NULL) {
>           first_above_time = 0;   // <-- Linux omits this
>       }
>       ...
>   }
>
> Note that codel_should_drop() does reset first_above_time when called
> with a NULL skb, but codel_dequeue() returns early before ever calling
> codel_should_drop() in the empty-queue case.  The post-drop code paths
> do reach codel_should_drop(NULL) and correctly reset the timer, so a
> dropped packet breaks the cycle -- but the next delivered packet
> re-arms first_above_time and the cycle repeats.
>
> For sparse flows such as ICMP ping (one packet every 200ms-1s), the
> first packet arms first_above_time, the flow goes empty, and the
> second packet arrives after the interval has elapsed and gets dropped.
> The pattern repeats, producing sustained loss on flows that are not
> actually congested.
>
> Test: veth pair, fq_codel, BQL disabled, 30000 iptables rules in the
> consumer namespace (NAPI-64 cycle ~14ms, well above fq_codel's 5ms
> target), ping at 5 pps under UDP flood:
>
>   Before fix:  26% ping packet loss
>   After fix:    0% ping packet loss
>
> Fix by resetting first_above_time to zero in the empty-queue path
> of codel_dequeue(), matching the reference algorithm.
>
> Fixes: 76e3cc126bb2 ("codel: Controlled Delay AQM")
> Fixes: d068ca2ae2e6 ("codel: split into multiple files")
> Co-developed-by: Jesper Dangaard Brouer <hawk@kernel.org>
> Signed-off-by: Jesper Dangaard Brouer <hawk@kernel.org>
> Signed-off-by: Jonas Köppeler <j.koeppeler@tu-berlin.de>
> Reported-by: Chris Arges <carges@cloudflare.com>
> Tested-by: Jonas Köppeler <j.koeppeler@tu-berlin.de>
> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
> Link: https://lore.kernel.org/all/20260318134826.1281205-7-hawk@kernel.org/
> ---
>  include/net/codel_impl.h | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/include/net/codel_impl.h b/include/net/codel_impl.h
> index 78a27ac73070..b2c359c6dd1b 100644
> --- a/include/net/codel_impl.h
> +++ b/include/net/codel_impl.h
> @@ -158,6 +158,7 @@ static struct sk_buff *codel_dequeue(void *ctx,
>         bool drop;
>
>         if (!skb) {
> +               vars->first_above_time = 0;
>                 vars->dropping = false;
>                 return skb;
>         }

Nice catch, thanks.

I guess I wanted to avoid the ktime_get() cost as much as I could....

Reviewed-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] net_sched: codel: fix stale state for empty flows in fq_codel
  2026-03-23 17:49 [PATCH] net_sched: codel: fix stale state for empty flows in fq_codel hawk
  2026-03-23 18:48 ` Eric Dumazet
@ 2026-03-25  4:10 ` patchwork-bot+netdevbpf
  1 sibling, 0 replies; 3+ messages in thread
From: patchwork-bot+netdevbpf @ 2026-03-25  4:10 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: netdev, davem, edumazet, kuba, pabeni, horms, linux-kernel,
	kernel-team, j.koeppeler, carges, toke

Hello:

This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Mon, 23 Mar 2026 18:49:20 +0100 you wrote:
> From: Jonas Köppeler <j.koeppeler@tu-berlin.de>
> 
> When codel_dequeue() finds an empty queue, it resets vars->dropping
> but does not reset vars->first_above_time.  The reference CoDel
> algorithm (Nichols & Jacobson, ACM Queue 2012) resets both:
> 
>   dodeque_result codel_queue_t::dodeque(time_t now) {
>       ...
>       if (r.p == NULL) {
>           first_above_time = 0;   // <-- Linux omits this
>       }
>       ...
>   }
> 
> [...]

Here is the summary with links:
  - net_sched: codel: fix stale state for empty flows in fq_codel
    https://git.kernel.org/netdev/net/c/815980fe6dbb

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2026-03-25  4:10 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-03-23 17:49 [PATCH] net_sched: codel: fix stale state for empty flows in fq_codel hawk
2026-03-23 18:48 ` Eric Dumazet
2026-03-25  4:10 ` patchwork-bot+netdevbpf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox