Re: [PATCH v2 net 1/1] net/sched: sch_dualpi2: fix limit/memlimit enforcement when dequeueing L-queue

public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed

From: Stephen Hemminger <stephen@networkplumber.org>
To: chia-yu.chang@nokia-bell-labs.com
Cc: victor@mojatatu.com, hxzene@gmail.com,
	linux-hardening@vger.kernel.org, kees@kernel.org,
	gustavoars@kernel.org, jhs@mojatatu.com, jiri@resnulli.us,
	davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
	pabeni@redhat.com, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org, horms@kernel.org, ij@kernel.org,
	ncardwell@google.com, koen.de_schepper@nokia-bell-labs.com,
	g.white@cablelabs.com, ingemar.s.johansson@ericsson.com,
	mirja.kuehlewind@ericsson.com, cheshire@apple.com,
	rs.ietf@gmx.at, Jason_Livingood@comcast.com,
	vidhi_goel@apple.com
Subject: Re: [PATCH v2 net 1/1] net/sched: sch_dualpi2: fix limit/memlimit enforcement when dequeueing L-queue
Date: Thu, 16 Apr 2026 10:55:05 -0700	[thread overview]
Message-ID: <20260416105505.22383f01@phoenix.local> (raw)
In-Reply-To: <20260416170906.66432-1-chia-yu.chang@nokia-bell-labs.com>

On Thu, 16 Apr 2026 19:09:06 +0200
chia-yu.chang@nokia-bell-labs.com wrote:

> From: Chia-Yu Chang <chia-yu.chang@nokia-bell-labs.com>
> 
> Fix dualpi2_change() to correctly enforce updated limit and memlimit values
> after a configuration change of the dualpi2 qdisc.
> 
> Before this patch, dualpi2_change() always attempted to dequeue packets via
> the root qdisc (C-queue) when reducing backlog or memory usage, and
> unconditionally assumed that a valid skb will be returned. When traffic
> classification results in packets being queued in the L-queue while the
> C-queue is empty, this leads to a NULL skb dereference during limit or
> memlimit enforcement.
> 
> This is fixed by first dequeuing from the C-queue path if it is non-empty.
> Once the C-queue is empty, packets are dequeued directly from the L-queue.
> Return values from qdisc_dequeue_internal() are checked for both queues. When
> dequeuing from the L-queue, the parent qdisc qlen and backlog counters are
> updated explicitly to keep overall qdisc statistics consistent.
> 
> Fixes: 320d031ad6e4 ("sched: Struct definition and parsing of dualpi2 qdisc")
> Reported-by: "Kito Xu (veritas501)" <hxzene@gmail.com>
> Signed-off-by: Chia-Yu Chang <chia-yu.chang@nokia-bell-labs.com>
> ---

I was a little concerned about the complexity of managing qlen here.
But could not find anything obvious.

Turned to AI review and it found some things:

Right fix direction and the reported crash is real. A few issues before this is ready:

1. The `c_len` construction is fragile. Declared `int`, initialized
from a `u32 - u32`. If the invariant `qdisc_qlen(sch) >=
qdisc_qlen(q->l_queue)` is ever violated, you get a large positive
value, the C-queue branch is taken on an empty C-queue,
`qdisc_dequeue_internal()` returns NULL, and the loop breaks out
without draining the L-queue -- leaving the qdisc over limit. Simpler
and more robust to just compare the two qlens directly and drop the
delta variable entirely.

2. Missing else/termination. If both branches' conditions are false
(neither `c_len` nor `qdisc_qlen(q->l_queue)`) but the outer `while`
still holds because `memory_used > memory_limit`, the loop spins
forever. An explicit `else break;` guards against an accounting desync
becoming a hang.

3. Whitespace: two lines in the L-queue branch use spaces instead of tabs --

+                        q->memory_used -= skb->truesize;
+                        rtnl_qdisc_drop(skb, q->l_queue);

checkpatch will flag this.

4. Comment style. The three-line comment at the end of the L-queue
branch doesn't follow the net subsystem multi-line comment style
(leading ' * ' on continuation lines, closing ' */' on its own line).
Once the code is cleaner, the comment could also just be dropped or
shortened to one line.

5. The accounting in the L-queue branch is correct, but only if you
trace the enqueue invariants carefully: L-queue packets are counted in
*both* `sch` and `q->l_queue` on enqueue (see dualpi2_enqueue_skb lines
413-423), `qdisc_dequeue_internal(q->l_queue, true)` adjusts l_queue's
side, and the explicit `--sch->q.qlen` + `qdisc_qstats_backlog_dec(sch,
skb)` adjusts sch's side. Separately, the C-queue branch now quietly
relies on the post-CVE-2025-39677 semantics of
`qdisc_dequeue_internal()` handling parent backlog -- which is why the
pre-patch `qdisc_qstats_backlog_dec(sch, skb)` could be removed.
Neither of these load-bearing invariants is documented in the code or
the commit message. Please add an inline comment in the L-queue branch
explaining the double-count-on-enqueue, and mention the
qdisc_dequeue_internal() dependency in the commit log.

6. Commit message / subject. Subject reads as if only the L-queue path
changed, but the whole drain loop was restructured. Something like
"sch_dualpi2: drain both C-queue and L-queue in dualpi2_change()" would
describe it better. Also, on NULL return from qdisc_dequeue_internal()
the loop silently breaks -- if that ever triggers it means qdisc_qlen()
> 0 but dequeue returned NULL, which is a real invariant violation.
> Worth a WARN_ON_ONCE().

Suggested shape:

	while (qdisc_qlen(sch) > sch->limit ||
	       q->memory_used > q->memory_limit) {
		struct sk_buff *skb;

		if (qdisc_qlen(sch) > qdisc_qlen(q->l_queue)) {
			skb = qdisc_dequeue_internal(sch, true);
			if (!skb)
				break;
			q->memory_used -= skb->truesize;
			rtnl_qdisc_drop(skb, sch);
		} else if (qdisc_qlen(q->l_queue)) {
			skb = qdisc_dequeue_internal(q->l_queue, true);
			if (!skb)
				break;
			/* L-queue packets are counted in both sch and
			 * l_queue on enqueue; qdisc_dequeue_internal()
			 * handled l_queue, account sch here.
			 */
			sch->q.qlen--;
			qdisc_qstats_backlog_dec(sch, skb);
			q->memory_used -= skb->truesize;
			rtnl_qdisc_drop(skb, q->l_queue);
			qdisc_qstats_drop(sch);
		} else {
			break;
		}
	}

As with any AI feedback, expect it to generate hints but also be wrong.

next prev parent reply	other threads:[~2026-04-16 17:55 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-16 17:09 [PATCH v2 net 1/1] net/sched: sch_dualpi2: fix limit/memlimit enforcement when dequeueing L-queue chia-yu.chang
2026-04-16 17:55 ` Stephen Hemminger [this message]
2026-04-16 18:30   ` Chia-Yu Chang (Nokia)
2026-04-16 19:35     ` Ilpo Järvinen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260416105505.22383f01@phoenix.local \
    --to=stephen@networkplumber.org \
    --cc=Jason_Livingood@comcast.com \
    --cc=cheshire@apple.com \
    --cc=chia-yu.chang@nokia-bell-labs.com \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=g.white@cablelabs.com \
    --cc=gustavoars@kernel.org \
    --cc=horms@kernel.org \
    --cc=hxzene@gmail.com \
    --cc=ij@kernel.org \
    --cc=ingemar.s.johansson@ericsson.com \
    --cc=jhs@mojatatu.com \
    --cc=jiri@resnulli.us \
    --cc=kees@kernel.org \
    --cc=koen.de_schepper@nokia-bell-labs.com \
    --cc=kuba@kernel.org \
    --cc=linux-hardening@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mirja.kuehlewind@ericsson.com \
    --cc=ncardwell@google.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=rs.ietf@gmx.at \
    --cc=victor@mojatatu.com \
    --cc=vidhi_goel@apple.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox