Netdev List
* [PATCH net 0/4] Avoid mistaken parent class deactivation during peek
From: Victor Nogueira @ 2026-06-10 19:28 UTC
  To: davem, edumazet, kuba, pabeni, horms, jhs, jiri
  Cc: netdev, anirudhrudr, pctammela, ij, henrist, chia-yu.chang

Several qdiscs (fq_codel, codel and dualpi2) may drop packets while
peeking at their queue. When that happens they call
qdisc_tree_reduce_backlog() to notify the parent of the backlog/qlen
change. The problem is that they do so *before* reincrementing the qlen
that peek had temporarily decremented.

If the qlen momentarily drops to zero while peek still has an skb to
return, qdisc_tree_reduce_backlog() ends up invoking the parent's
qlen_notify() callback even though the child is not actually empty. The
parent then deactivates the class, while the child still holds a packet.
For parents such as QFQ this desync corrupts the active class list and
leads to wild memory accesses and NULL pointer dereferences (see the
per-patch splats). For HFSC it might lead to stalls [1].

Fix all three qdiscs the same way: only call qdisc_tree_reduce_backlog()
once the qlen has been restored, so the parent never observes a
transient empty child during peek.
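
The ordering problem can be reproduced with a toy userspace model. This is
only an illustrative sketch: toy_parent, toy_child, toy_reduce_backlog() and
toy_peek() are hypothetical names, not kernel code, and the notification
logic is reduced to "deactivate the class if the child looks empty".

```c
#include <stdbool.h>

/* Toy userspace model; hypothetical types, not kernel structures. */
struct toy_parent {
	bool class_active;	/* parent's view: is the child's class active? */
};

struct toy_child {
	unsigned int qlen;	/* packets the child currently accounts for */
	struct toy_parent *parent;
};

/* Stand-in for qdisc_tree_reduce_backlog(): the parent is told about the
 * drops and deactivates the class if the child looks empty at that moment.
 */
static void toy_reduce_backlog(struct toy_child *c)
{
	if (c->qlen == 0)
		c->parent->class_active = false;
}

/* Model a peek that drops `drops` packets and holds one packet back.
 * Returns true when the parent's view still matches the child's state.
 */
bool toy_peek(unsigned int drops, bool notify_after_restore)
{
	struct toy_parent p = { .class_active = true };
	struct toy_child c = { .qlen = drops + 1, .parent = &p };

	c.qlen -= drops;	/* packets dropped while peeking */
	c.qlen -= 1;		/* internal dequeue takes the peeked skb */

	if (drops && !notify_after_restore)
		toy_reduce_backlog(&c);	/* old order: qlen is transiently 0 */

	c.qlen += 1;		/* the peeked skb is still part of the queue */

	if (drops && notify_after_restore)
		toy_reduce_backlog(&c);	/* new order: qlen is 1 again */

	return p.class_active == (c.qlen > 0);
}
```

In this model, toy_peek(1, false) leaves the class deactivated while the
child still holds a packet, and toy_peek(1, true) keeps both views in sync.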

Patch 1 fixes this for fq_codel, patch 2 for codel, patch 3 for dualpi2
and patch 4 adds test cases for these 3 setups.

Note: Patch 1 is one of two fixes for the stall reported in [1]; the
companion fix is "net/sched: sch_hfsc: Don't make class passive twice",
sent separately.

Note2: A possibly cleaner fix is to create a new helper function for peek
that only calls qdisc_tree_reduce_backlog() after reincrementing the qlen.
It would be called from the 3 affected qdiscs. However, we thought this
might make backporting harder, so, if people agree, we can submit this
cleaner version to net-next after this series is merged.
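
One possible shape for such a shared helper is sketched below, again as a
userspace toy. It is an assumption about what the helper could look like,
not the version the authors intend to submit. All names (toy_qdisc,
toy_peek_deferred(), toy_dequeue(), toy_flush()) are hypothetical, and the
two callbacks stand in for a qdisc's internal dequeue and its deferred
qdisc_tree_reduce_backlog() call.

```c
/* Toy userspace model; hypothetical types, not kernel structures. */
struct toy_skb {
	unsigned int len;
};

struct toy_qdisc {
	unsigned int qlen;
	unsigned int backlog;
	struct toy_skb *held;		/* stand-in for sch->gso_skb */
	unsigned int pending_drops;	/* drops not yet reported upward */
	unsigned int qlen_at_notify;	/* recorded only so the example can be checked */
};

typedef struct toy_skb *(*toy_dequeue_fn)(struct toy_qdisc *sch);
typedef void (*toy_flush_fn)(struct toy_qdisc *sch);

/* Generic peek: dequeue, put the skb back into the accounting, and only
 * then report the drops, so the parent never sees a transient qlen of 0.
 */
struct toy_skb *toy_peek_deferred(struct toy_qdisc *sch,
				  toy_dequeue_fn dequeue,
				  toy_flush_fn flush_drops)
{
	struct toy_skb *skb = sch->held;

	if (!skb) {
		skb = dequeue(sch);
		if (skb) {
			sch->held = skb;
			sch->backlog += skb->len;	/* still part of the queue */
			sch->qlen++;
		}
		flush_drops(sch);
	}
	return skb;
}

/* Example callbacks: the queue holds one 50-byte packet that gets dropped
 * and one 100-byte packet that gets returned.
 */
struct toy_skb toy_good = { .len = 100 };

struct toy_skb *toy_dequeue(struct toy_qdisc *sch)
{
	sch->qlen -= 2;
	sch->backlog -= 150;
	sch->pending_drops = 1;
	return &toy_good;
}

void toy_flush(struct toy_qdisc *sch)
{
	if (sch->pending_drops) {
		sch->qlen_at_notify = sch->qlen;	/* what the parent would see */
		sch->pending_drops = 0;
	}
}
```

With a queue initialised to qlen 2 and backlog 150, the first call to
toy_peek_deferred() reports the drop while qlen is already back to 1, and a
second call returns the held packet without dequeuing again.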

[1] https://lore.kernel.org/netdev/CAN2cbVe79oj0O9==m4+4x3v+O+qzRagA=2=wkrp9i9=CqYvyZA@mail.gmail.com/

Victor Nogueira (4):
  net/sched: sch_fq_codel: Do not call qdisc_tree_reduce_backlog during
    peek before restoring qlen
  net/sched: sch_codel: Do not call qdisc_tree_reduce_backlog during
    peek before restoring qlen
  net/sched: sch_dualpi2: Do not call qdisc_tree_reduce_backlog during
    peek before restoring qlen
  selftests/tc-testing: Verify child qdisc will not mistakenly
    deactivate QFQ parent

 net/sched/sch_codel.c                         |  48 ++++-
 net/sched/sch_dualpi2.c                       |  41 +++-
 net/sched/sch_fq_codel.c                      |  41 +++-
 .../tc-testing/tc-tests/infra/qdiscs.json     | 184 ++++++++++++++++++
 4 files changed, 305 insertions(+), 9 deletions(-)

-- 
2.54.0



* [PATCH net 1/4] net/sched: sch_fq_codel: Do not call qdisc_tree_reduce_backlog during peek before restoring qlen
From: Victor Nogueira @ 2026-06-10 19:28 UTC
  To: davem, edumazet, kuba, pabeni, horms, jhs, jiri
  Cc: netdev, anirudhrudr, pctammela, ij, henrist, chia-yu.chang

Whenever fq_codel drops packets during peek, it calls
qdisc_tree_reduce_backlog. The problem is that it does so before it
reincrements the qlen. If qlen drops to zero but peek still returns an skb,
the parent's qlen_notify callback is executed even though fq_codel still
has 1 packet on the queue. The parent thus mistakenly deactivates the
class, causing issues like the one in a recent report [1] and a wild
memory access in qfq:

[   29.371146][  T360] Oops: general protection fault, probably for non-canonical address 0xfbd59c0000000024: 0000 [#1] SMP KASAN NOPTI
[   29.371666][  T360] KASAN: maybe wild-memory-access in range [0xdead000000000120-0xdead000000000127]
[   29.371987][  T360] CPU: 6 UID: 0 PID: 360 Comm: tc Not tainted 7.1.0-rc5-00285-gc530e5b2dbc6-dirty #82 PREEMPT(full)
[   29.372384][  T360] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[   29.372620][  T360] RIP: 0010:qfq_deactivate_agg (include/linux/list.h:1029 (discriminator 2) include/linux/list.h:1043 (discriminator 2) net/sched/sch_qfq.c:1369 (discriminator 2) net/sched/sch_qfq.c:1395 (discriminator 2)) sch_qfq
[   29.373544][  T360] RSP: 0018:ffff888102417370 EFLAGS: 00010216
[   29.373800][  T360] RAX: 0000000000000000 RBX: ffff88811224d568 RCX: dffffc0000000000
[   29.374079][  T360] RDX: 1ffff11021fe1543 RSI: ffff88810ff0aa00 RDI: dffffc0000000000
[   29.374368][  T360] RBP: ffff88811224c280 R08: dead000000000122 R09: 1bd5a00000000024
[   29.374649][  T360] R10: fffffbfff7940329 R11: fffffbfff7940329 R12: 0000000000000000
[   29.374926][  T360] R13: dead000000000100 R14: ffff88811224d580 R15: ffff88811224d578
[   29.375207][  T360] FS:  00007f5b794e5780(0000) GS:ffff88815d1e9000(0000) knlGS:0000000000000000
[   29.375545][  T360] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   29.375823][  T360] CR2: 000055ffb091f000 CR3: 000000010a305000 CR4: 0000000000750ef0
[   29.376103][  T360] PKRU: 55555554
[   29.376258][  T360] Call Trace:
[   29.376401][  T360]  <TASK>
...
[   29.376885][  T360] qfq_reset_qdisc (net/sched/sch_qfq.c:357 net/sched/sch_qfq.c:1487) sch_qfq
[   29.377074][  T360]  qdisc_reset (net/sched/sch_generic.c:1057)
[   29.377414][  T360]  __qdisc_destroy (net/sched/sch_generic.c:1096)
[   29.377600][  T360]  qdisc_graft (net/sched/sch_api.c:1062 net/sched/sch_api.c:1053 net/sched/sch_api.c:1159)
[   29.378593][  T360]  tc_get_qdisc (net/sched/sch_api.c:1528 net/sched/sch_api.c:1556)

Fix this by only calling qdisc_tree_reduce_backlog in peek after the
qlen is restored.

[1] http://lore.kernel.org/netdev/CAN2cbVe79oj0O9==m4+4x3v+O+qzRagA=2=wkrp9i9=CqYvyZA@mail.gmail.com/

Fixes: 342debc12183 ("codel: remove sch->q.qlen check before qdisc_tree_reduce_backlog()")
Reported-by: Anirudh Gupta <anirudhrudr@gmail.com>
Closes: https://lore.kernel.org/netdev/CAN2cbVe79oj0O9==m4+4x3v+O+qzRagA=2=wkrp9i9=CqYvyZA@mail.gmail.com/
Tested-by: Anirudh Gupta <anirudhrudr@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Victor Nogueira <victor@mojatatu.com>
---
 net/sched/sch_fq_codel.c | 41 ++++++++++++++++++++++++++++++++++++++--
 1 file changed, 39 insertions(+), 2 deletions(-)

diff --git a/net/sched/sch_fq_codel.c b/net/sched/sch_fq_codel.c
index 24db54684e8a..9aebf25ddc79 100644
--- a/net/sched/sch_fq_codel.c
+++ b/net/sched/sch_fq_codel.c
@@ -280,7 +280,7 @@ static void drop_func(struct sk_buff *skb, void *ctx)
 	qdisc_qstats_drop(sch);
 }
 
-static struct sk_buff *fq_codel_dequeue(struct Qdisc *sch)
+static struct sk_buff *__fq_codel_dequeue(struct Qdisc *sch)
 {
 	struct fq_codel_sched_data *q = qdisc_priv(sch);
 	struct sk_buff *skb;
@@ -317,12 +317,49 @@ static struct sk_buff *fq_codel_dequeue(struct Qdisc *sch)
 	qdisc_bstats_update(sch, skb);
 	WRITE_ONCE(flow->deficit, flow->deficit - qdisc_pkt_len(skb));
 
+	return skb;
+}
+
+static void fq_codel_dequeue_drop(struct Qdisc *sch)
+{
+	struct fq_codel_sched_data *q = qdisc_priv(sch);
+
 	if (q->cstats.drop_count) {
 		qdisc_tree_reduce_backlog(sch, q->cstats.drop_count,
 					  q->cstats.drop_len);
 		q->cstats.drop_count = 0;
 		q->cstats.drop_len = 0;
 	}
+}
+
+static struct sk_buff *fq_codel_dequeue(struct Qdisc *sch)
+{
+	struct sk_buff *skb;
+
+	skb =  __fq_codel_dequeue(sch);
+
+	fq_codel_dequeue_drop(sch);
+
+	return skb;
+}
+
+static struct sk_buff *fq_codel_peek(struct Qdisc *sch)
+{
+	struct sk_buff *skb = skb_peek(&sch->gso_skb);
+
+	if (!skb) {
+		skb = __fq_codel_dequeue(sch);
+
+		if (skb) {
+			__skb_queue_head(&sch->gso_skb, skb);
+			/* it's still part of the queue */
+			qdisc_qstats_backlog_inc(sch, skb);
+			sch->q.qlen++;
+		}
+
+		fq_codel_dequeue_drop(sch);
+	}
+
 	return skb;
 }
 
@@ -725,7 +762,7 @@ static struct Qdisc_ops fq_codel_qdisc_ops __read_mostly = {
 	.priv_size	=	sizeof(struct fq_codel_sched_data),
 	.enqueue	=	fq_codel_enqueue,
 	.dequeue	=	fq_codel_dequeue,
-	.peek		=	qdisc_peek_dequeued,
+	.peek		=	fq_codel_peek,
 	.init		=	fq_codel_init,
 	.reset		=	fq_codel_reset,
 	.destroy	=	fq_codel_destroy,
-- 
2.54.0



* [PATCH net 2/4] net/sched: sch_codel: Do not call qdisc_tree_reduce_backlog during peek before restoring qlen
From: Victor Nogueira @ 2026-06-10 19:28 UTC
  To: davem, edumazet, kuba, pabeni, horms, jhs, jiri
  Cc: netdev, anirudhrudr, pctammela, ij, henrist, chia-yu.chang

Whenever codel drops packets during peek, it calls
qdisc_tree_reduce_backlog. The problem is that it does so before it
reincrements the qlen. If qlen drops to zero but peek still returns an skb,
the parent's qlen_notify callback is executed even though codel still has
1 packet on the queue. The parent thus mistakenly deactivates the class,
causing issues like a wild memory access when qfq has codel as a child:

[   36.339843][  T370] Oops: general protection fault, probably for non-canonical address 0xfbd59c0000000024: 0000 [#1] SMP KASAN NOPTI
[   36.340408][  T370] KASAN: maybe wild-memory-access in range [0xdead000000000120-0xdead000000000127]
[   36.340737][  T370] CPU: 2 UID: 0 PID: 370 Comm: tc Not tainted 7.1.0-rc5-00287-g66e13b626592 #87 PREEMPT(full)
[   36.341113][  T370] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[   36.341357][  T370] RIP: 0010:qfq_deactivate_agg (include/linux/list.h:1029 (discriminator 2) include/linux/list.h:1043 (discriminator 2) net/sched/sch_qfq.c:1369 (discriminator 2) net/sched/sch_qfq.c:1395 (discriminator 2)) sch_qfq
[   36.342221][  T370] RSP: 0018:ffff8881100ef370 EFLAGS: 00010216
[   36.342422][  T370] RAX: 0000000000000000 RBX: ffff8881058a9568 RCX: dffffc0000000000
[   36.342664][  T370] RDX: 1ffff11021064dc3 RSI: ffff888108326e00 RDI: dffffc0000000000
[   36.342905][  T370] RBP: ffff8881058a8280 R08: dead000000000122 R09: 1bd5a00000000024
[   36.343140][  T370] R10: fffffbfff2940329 R11: fffffbfff2940329 R12: 0000000000000000
[   36.343383][  T370] R13: dead000000000100 R14: ffff8881058a9580 R15: ffff8881058a9578
[   36.343631][  T370] FS:  00007fc04b0ca780(0000) GS:ffff888184fef000(0000) knlGS:0000000000000000
[   36.343911][  T370] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   36.344116][  T370] CR2: 0000557c02c02000 CR3: 000000010e0ba000 CR4: 0000000000750ef0
[   36.344359][  T370] PKRU: 55555554
[   36.344481][  T370] Call Trace:
...
[   36.345054][  T370] qfq_reset_qdisc (net/sched/sch_qfq.c:357 net/sched/sch_qfq.c:1487) sch_qfq
[   36.345222][  T370]  qdisc_reset (net/sched/sch_generic.c:1057)
[   36.345503][  T370]  __qdisc_destroy (net/sched/sch_generic.c:1096)
[   36.345677][  T370]  qdisc_graft (net/sched/sch_api.c:1062 net/sched/sch_api.c:1053 net/sched/sch_api.c:1159)
[   36.346335][  T370]  tc_get_qdisc (net/sched/sch_api.c:1528 net/sched/sch_api.c:1556)

Fix this by only calling qdisc_tree_reduce_backlog in peek after the
qlen is restored.

Fixes: 342debc12183 ("codel: remove sch->q.qlen check before qdisc_tree_reduce_backlog()")
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Victor Nogueira <victor@mojatatu.com>
---
 net/sched/sch_codel.c | 48 ++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 43 insertions(+), 5 deletions(-)

diff --git a/net/sched/sch_codel.c b/net/sched/sch_codel.c
index 317aae0ec7bd..7d5076196aff 100644
--- a/net/sched/sch_codel.c
+++ b/net/sched/sch_codel.c
@@ -56,7 +56,7 @@ static void drop_func(struct sk_buff *skb, void *ctx)
 	qdisc_qstats_drop(sch);
 }
 
-static struct sk_buff *codel_qdisc_dequeue(struct Qdisc *sch)
+static struct sk_buff *__codel_qdisc_dequeue(struct Qdisc *sch)
 {
 	struct codel_sched_data *q = qdisc_priv(sch);
 	struct sk_buff *skb;
@@ -65,13 +65,51 @@ static struct sk_buff *codel_qdisc_dequeue(struct Qdisc *sch)
 			    &q->stats, qdisc_pkt_len, codel_get_enqueue_time,
 			    drop_func, dequeue_func);
 
+	if (skb)
+		qdisc_bstats_update(sch, skb);
+	return skb;
+}
+
+static void codel_dequeue_drop(struct Qdisc *sch)
+{
+	struct codel_sched_data *q = qdisc_priv(sch);
+
 	if (q->stats.drop_count) {
-		qdisc_tree_reduce_backlog(sch, q->stats.drop_count, q->stats.drop_len);
+		qdisc_tree_reduce_backlog(sch, q->stats.drop_count,
+					  q->stats.drop_len);
 		q->stats.drop_count = 0;
 		q->stats.drop_len = 0;
 	}
-	if (skb)
-		qdisc_bstats_update(sch, skb);
+}
+
+static struct sk_buff *codel_qdisc_dequeue(struct Qdisc *sch)
+{
+	struct sk_buff *skb;
+
+	skb = __codel_qdisc_dequeue(sch);
+
+	codel_dequeue_drop(sch);
+
+	return skb;
+}
+
+static struct sk_buff *codel_peek(struct Qdisc *sch)
+{
+	struct sk_buff *skb = skb_peek(&sch->gso_skb);
+
+	if (!skb) {
+		skb = __codel_qdisc_dequeue(sch);
+
+		if (skb) {
+			__skb_queue_head(&sch->gso_skb, skb);
+			/* it's still part of the queue */
+			qdisc_qstats_backlog_inc(sch, skb);
+			sch->q.qlen++;
+		}
+
+		codel_dequeue_drop(sch);
+	}
+
 	return skb;
 }
 
@@ -257,7 +295,7 @@ static struct Qdisc_ops codel_qdisc_ops __read_mostly = {
 
 	.enqueue	=	codel_qdisc_enqueue,
 	.dequeue	=	codel_qdisc_dequeue,
-	.peek		=	qdisc_peek_dequeued,
+	.peek		=	codel_peek,
 	.init		=	codel_init,
 	.reset		=	codel_reset,
 	.change 	=	codel_change,
-- 
2.54.0



* [PATCH net 3/4] net/sched: sch_dualpi2: Do not call qdisc_tree_reduce_backlog during peek before restoring qlen
From: Victor Nogueira @ 2026-06-10 19:28 UTC
  To: davem, edumazet, kuba, pabeni, horms, jhs, jiri
  Cc: netdev, anirudhrudr, pctammela, ij, henrist, chia-yu.chang

Whenever dualpi2 drops packets during peek, it calls
qdisc_tree_reduce_backlog. The problem is that it does so before it
reincrements the qlen. If qlen drops to zero but peek still returns an skb,
the parent's qlen_notify callback is executed even though dualpi2 still
has 1 packet on the queue. The parent thus mistakenly deactivates the
class, which leads to a null-ptr-deref:

[  101.427314][  T599] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000009: 0000 [#1] SMP KASAN NOPTI
[  101.427755][  T599] KASAN: null-ptr-deref in range [0x0000000000000048-0x000000000000004f]
[  101.428048][  T599] CPU: 2 UID: 0 PID: 599 Comm: ping Not tainted 7.1.0-rc5-00284-gbce53c430ed7 #102 PREEMPT(full)
[  101.428400][  T599] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[  101.428608][  T599] RIP: 0010:qfq_dequeue (net/sched/sch_qfq.c:1150) sch_qfq
[  101.428821][  T599] Code: 00 fc ff df 80 3c 02 00 0f 85 46 0c 00 00 4c 8d 73 48 48 89 9d b8 02 00 00 48 b8 00 00 00 00 00 fc ff df 4c 89 f2 48 c1 ea 03 <80> 3c 02 00 0f 85 2d 0c 00 00 48 b8 00 00 00 00 00 fc ff df 4c 8b
All code
[  101.429348][  T599] RSP: 0018:ffff8881110df4f0 EFLAGS: 00010216
[  101.429541][  T599] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: dffffc0000000000
[  101.429763][  T599] RDX: 0000000000000009 RSI: 00000024c0000000 RDI: ffff88811436c2b0
[  101.429985][  T599] RBP: ffff88811436c000 R08: ffff88811436c280 R09: 1ffff11021277523
[  101.430206][  T599] R10: 1ffff11021277526 R11: 1ffff11021277527 R12: 00000024c0000000
[  101.430423][  T599] R13: ffff88811436c2b8 R14: 0000000000000048 R15: 0000000020000000
[  101.430642][  T599] FS:  00007f61813e1c40(0000) GS:ffff8881691ef000(0000) knlGS:0000000000000000
[  101.430913][  T599] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  101.431100][  T599] CR2: 00005651650850a8 CR3: 000000010ca0b000 CR4: 0000000000750ef0
[  101.431320][  T599] PKRU: 55555554
[  101.431433][  T599] Call Trace:
[  101.431544][  T599]  <TASK>
[  101.431628][  T599]  __qdisc_run (net/sched/sch_generic.c:322 net/sched/sch_generic.c:427 net/sched/sch_generic.c:445)
[  101.431792][  T599]  ? dev_qdisc_enqueue (./include/trace/events/qdisc.h:49 (discriminator 22) net/core/dev.c:4176 (discriminator 22))
[  101.431941][  T599]  __dev_queue_xmit (./include/net/pkt_sched.h:120 ./include/net/pkt_sched.h:117 net/core/dev.c:4292 net/core/dev.c:4831)

Fix this by only calling qdisc_tree_reduce_backlog in peek after the
qlen is restored.

Fixes: 8f9516daedd6 ("sched: Add enqueue/dequeue of dualpi2 qdisc")
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Victor Nogueira <victor@mojatatu.com>
---
 net/sched/sch_dualpi2.c | 41 +++++++++++++++++++++++++++++++++++++++--
 1 file changed, 39 insertions(+), 2 deletions(-)

diff --git a/net/sched/sch_dualpi2.c b/net/sched/sch_dualpi2.c
index a22489c14458..05285775b454 100644
--- a/net/sched/sch_dualpi2.c
+++ b/net/sched/sch_dualpi2.c
@@ -579,7 +579,7 @@ static void drop_and_retry(struct dualpi2_sched_data *q, struct sk_buff *skb,
 	qdisc_qstats_drop(sch);
 }
 
-static struct sk_buff *dualpi2_qdisc_dequeue(struct Qdisc *sch)
+static struct sk_buff *__dualpi2_qdisc_dequeue(struct Qdisc *sch)
 {
 	struct dualpi2_sched_data *q = qdisc_priv(sch);
 	struct sk_buff *skb;
@@ -605,12 +605,49 @@ static struct sk_buff *dualpi2_qdisc_dequeue(struct Qdisc *sch)
 		break;
 	}
 
+	return skb;
+}
+
+static void dualpi2_dequeue_drop(struct Qdisc *sch)
+{
+	struct dualpi2_sched_data *q = qdisc_priv(sch);
+
 	if (q->deferred_drops_cnt) {
 		qdisc_tree_reduce_backlog(sch, q->deferred_drops_cnt,
 					  q->deferred_drops_len);
 		q->deferred_drops_cnt = 0;
 		q->deferred_drops_len = 0;
 	}
+}
+
+static struct sk_buff *dualpi2_qdisc_dequeue(struct Qdisc *sch)
+{
+	struct sk_buff *skb;
+
+	skb = __dualpi2_qdisc_dequeue(sch);
+
+	dualpi2_dequeue_drop(sch);
+
+	return skb;
+}
+
+static struct sk_buff *dualpi2_peek(struct Qdisc *sch)
+{
+	struct sk_buff *skb = skb_peek(&sch->gso_skb);
+
+	if (!skb) {
+		skb = __dualpi2_qdisc_dequeue(sch);
+
+		if (skb) {
+			__skb_queue_head(&sch->gso_skb, skb);
+			/* it's still part of the queue */
+			qdisc_qstats_backlog_inc(sch, skb);
+			sch->q.qlen++;
+		}
+
+		dualpi2_dequeue_drop(sch);
+	}
+
 	return skb;
 }
 
@@ -1165,7 +1202,7 @@ static struct Qdisc_ops dualpi2_qdisc_ops __read_mostly = {
 	.priv_size	= sizeof(struct dualpi2_sched_data),
 	.enqueue	= dualpi2_qdisc_enqueue,
 	.dequeue	= dualpi2_qdisc_dequeue,
-	.peek		= qdisc_peek_dequeued,
+	.peek		= dualpi2_peek,
 	.init		= dualpi2_init,
 	.destroy	= dualpi2_destroy,
 	.reset		= dualpi2_reset,
-- 
2.54.0



* [PATCH net 4/4] selftests/tc-testing: Verify child qdisc will not mistakenly deactivate QFQ parent
From: Victor Nogueira @ 2026-06-10 19:28 UTC
  To: davem, edumazet, kuba, pabeni, horms, jhs, jiri
  Cc: netdev, anirudhrudr, pctammela, ij, henrist, chia-yu.chang

Create 3 test cases:
- Verify fq_codel won't mistakenly deactivate QFQ parent class during peek
- Verify codel won't mistakenly deactivate QFQ parent class during peek
- Verify dualpi2 won't mistakenly deactivate QFQ parent class during peek

Verify that these 3 qdiscs (fq_codel, codel, dualpi2) will not call
qdisc_tree_reduce_backlog with an incorrect qlen (0) during peek and
mistakenly deactivate a parent class.

Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Victor Nogueira <victor@mojatatu.com>
---
 .../tc-testing/tc-tests/infra/qdiscs.json     | 184 ++++++++++++++++++
 1 file changed, 184 insertions(+)

diff --git a/tools/testing/selftests/tc-testing/tc-tests/infra/qdiscs.json b/tools/testing/selftests/tc-testing/tc-tests/infra/qdiscs.json
index 82c38a13dfbf..e83e31b932dc 100644
--- a/tools/testing/selftests/tc-testing/tc-tests/infra/qdiscs.json
+++ b/tools/testing/selftests/tc-testing/tc-tests/infra/qdiscs.json
@@ -1326,5 +1326,189 @@
         "teardown": [
             "$TC qdisc del dev $DUMMY handle 1: root"
         ]
+    },
+    {
+        "id": "c797",
+        "name": "Verify fq_codel won't mistakenly deactivate QFQ parent class during peek",
+        "category": [
+            "qdisc",
+            "qfq",
+            "fq_codel"
+        ],
+        "plugins": {
+            "requires": "nsPlugin"
+        },
+        "setup": [
+            "$IP link set dev $DUMMY up || true",
+            "$IP addr add 10.10.10.10/24 dev $DUMMY || true",
+            "$TC qdisc add dev $DUMMY root handle 1: qfq",
+            "$TC class add dev $DUMMY parent 1: classid 1:1 qfq weight 1 maxpkt 1000",
+            "$TC class add dev $DUMMY parent 1: classid 1:2 qfq weight 1 maxpkt 1000",
+            "$TC qdisc add dev $DUMMY parent 1:1 handle 2:0 plug limit 1024",
+            "$IP l set dev $DUMMY mtu 1500",
+            "$TC qdisc add dev $DUMMY parent 1:2 handle 10: fq_codel target 1 interval 1 flows 1",
+            "$TC filter add dev $DUMMY parent 1: protocol ip prio 1 u32 match ip dst 10.10.10.1/32 flowid 1:1",
+            "$TC filter add dev $DUMMY parent 1: protocol ip prio 2 u32 match ip dst 10.10.10.2/32 flowid 1:2",
+            "$IP l set dev $DUMMY mtu 65336",
+            "ping -c 1 -I $DUMMY 10.10.10.1 -W0.01 > /dev/null || true",
+            "ping -c 3 -s 2000 -I $DUMMY 10.10.10.2 -W0.01 > /dev/null || true",
+            "sleep 0.1"
+        ],
+        "cmdUnderTest": "$TC qdisc change dev $DUMMY handle 2:0 plug release_indefinite",
+        "expExitCode": "0",
+        "verifyCmd": "$TC -s -j qdisc show dev $DUMMY",
+        "matchJSON": [
+            {
+                "kind": "qfq",
+                "handle": "1:",
+                "packets": 3,
+                "drops": 1,
+                "backlog": 0,
+                "qlen": 0
+            },
+            {
+                "kind": "plug",
+                "handle": "2:",
+                "packets": 1,
+                "drops": 0,
+                "backlog": 0,
+                "qlen": 0
+            },
+            {
+                "kind": "fq_codel",
+                "handle": "10:",
+                "packets": 2,
+                "drops": 1,
+                "backlog": 0,
+                "qlen": 0
+            }
+        ],
+        "teardown": [
+            "$TC qdisc del dev $DUMMY root",
+            "$IP addr del 10.10.10.10/24 dev $DUMMY || true"
+        ]
+    },
+    {
+        "id": "82d9",
+        "name": "Verify codel won't mistakenly deactivate QFQ parent class during peek",
+        "category": [
+            "qdisc",
+            "qfq",
+            "codel"
+        ],
+        "plugins": {
+            "requires": "nsPlugin"
+        },
+        "setup": [
+            "$IP link set dev $DUMMY up || true",
+            "$IP addr add 10.10.10.10/24 dev $DUMMY || true",
+            "$TC qdisc add dev $DUMMY root handle 1: qfq",
+            "$TC class add dev $DUMMY parent 1: classid 1:1 qfq weight 1 maxpkt 1000",
+            "$TC class add dev $DUMMY parent 1: classid 1:2 qfq weight 1 maxpkt 1000",
+            "$TC qdisc add dev $DUMMY parent 1:1 handle 2:0 plug limit 1024",
+            "$IP l set dev $DUMMY mtu 1500",
+            "$TC qdisc add dev $DUMMY parent 1:2 handle 10: codel target 1ms interval 1ms",
+            "$TC filter add dev $DUMMY parent 1: protocol ip prio 1 u32 match ip dst 10.10.10.1/32 flowid 1:1",
+            "$TC filter add dev $DUMMY parent 1: protocol ip prio 2 u32 match ip dst 10.10.10.2/32 flowid 1:2",
+            "$IP l set dev $DUMMY mtu 65336",
+            "ping -c 1 -I $DUMMY 10.10.10.1 -W0.01 > /dev/null || true",
+            "ping -c 3 -s 2000 -I $DUMMY 10.10.10.2 -W0.01 > /dev/null || true",
+            "sleep 0.1"
+        ],
+        "cmdUnderTest": "$TC qdisc change dev $DUMMY handle 2:0 plug release_indefinite",
+        "expExitCode": "0",
+        "verifyCmd": "$TC -s -j qdisc show dev $DUMMY",
+        "matchJSON": [
+            {
+                "kind": "qfq",
+                "handle": "1:",
+                "packets": 3,
+                "drops": 1,
+                "backlog": 0,
+                "qlen": 0
+            },
+            {
+                "kind": "plug",
+                "handle": "2:",
+                "packets": 1,
+                "drops": 0,
+                "backlog": 0,
+                "qlen": 0
+            },
+            {
+                "kind": "codel",
+                "handle": "10:",
+                "packets": 2,
+                "drops": 1,
+                "backlog": 0,
+                "qlen": 0
+            }
+        ],
+        "teardown": [
+            "$TC qdisc del dev $DUMMY root",
+            "$IP addr del 10.10.10.10/24 dev $DUMMY || true"
+        ]
+    },
+    {
+        "id": "d3da",
+        "name": "Verify dualpi2 won't mistakenly deactivate QFQ parent class during peek",
+        "category": [
+            "qdisc",
+            "qfq",
+            "dualpi2"
+        ],
+        "plugins": {
+            "requires": "nsPlugin"
+        },
+        "setup": [
+            "$IP link set dev $DUMMY up || true",
+            "$IP addr add 10.10.10.10/24 dev $DUMMY || true" ,
+            "$TC qdisc add dev $DUMMY root handle 1: qfq",
+            "$TC class add dev $DUMMY parent 1: classid 1:1 qfq weight 1 maxpkt 1000",
+            "$TC class add dev $DUMMY parent 1: classid 1:2 qfq weight 1 maxpkt 1000",
+            "$TC qdisc add dev $DUMMY parent 1:1 handle 2:0 plug limit 1024",
+            "$TC qdisc add dev $DUMMY parent 1:2 handle 10: dualpi2 step_thresh 500ms",
+            "$TC filter add dev $DUMMY parent 10: protocol ip prio 1 matchall classid 10:1 action ok",
+            "$TC filter add dev $DUMMY parent 1: protocol ip prio 1 u32 match ip dst 10.10.10.1/32 flowid 1:1",
+            "$TC filter add dev $DUMMY parent 1: protocol ip prio 2 u32 match ip dst 10.10.10.2/32 flowid 1:2",
+            "ping -c 1 -I $DUMMY 10.10.10.1 -W0.01 || true",
+            "ping -c 3 -i 0.1 -I $DUMMY 10.10.10.2 -W0.01 || true",
+            "sleep 0.7",
+            "ping -c 1 -I $DUMMY 10.10.10.2 -W0.01 || true",
+            "$TC qdisc change dev $DUMMY handle 2:0 plug release_indefinite"
+        ],
+        "cmdUnderTest": "ping -c 1 -I $DUMMY 10.10.10.1 -W0.01",
+        "expExitCode": "1",
+        "verifyCmd": "$TC -s -j qdisc show dev $DUMMY",
+        "matchJSON": [
+            {
+                "kind": "qfq",
+                "handle": "1:",
+                "packets": 4,
+                "drops": 2,
+                "backlog": 0,
+                "qlen": 0
+            },
+            {
+                "kind": "plug",
+                "handle": "2:",
+                "packets": 2,
+                "drops": 0,
+                "backlog": 0,
+                "qlen": 0
+            },
+            {
+                "kind": "dualpi2",
+                "handle": "10:",
+                "packets": 2,
+                "drops": 2,
+                "backlog": 0,
+                "qlen": 0
+            }
+        ],
+        "teardown": [
+            "$TC qdisc del dev $DUMMY root",
+            "$IP addr del 10.10.10.10/24 dev $DUMMY || true"
+        ]
     }
 ]
-- 
2.54.0



* Re: [PATCH net 0/4] Avoid mistaken parent class deactivation during peek
From: patchwork-bot+netdevbpf @ 2026-06-13  0:30 UTC
  To: Victor Nogueira
  Cc: davem, edumazet, kuba, pabeni, horms, jhs, jiri, netdev,
	anirudhrudr, pctammela, ij, henrist, chia-yu.chang

Hello:

This series was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Wed, 10 Jun 2026 16:28:51 -0300 you wrote:
> Several qdiscs (fq_codel, codel and dualpi2) may drop packets while
> peeking at their queue. When that happens they call
> qdisc_tree_reduce_backlog() to notify the parent of the backlog/qlen
> change. The problem is that they do so *before* reincrementing the qlen
> that peek had temporarily decremented.
> 
> If the qlen momentarily drops to zero while peek still has an skb to
> return, qdisc_tree_reduce_backlog() ends up invoking the parent's
> qlen_notify() callback even though the child is not actually empty. The
> parent then deactivates the class, while the child still holds a packet.
> For parents such as QFQ this desync corrupts the active class list and
> leads to wild memory accesses and NULL pointer dereferences (see the
> per-patch splats). For HFSC it might lead to stalls [1].
> 
> [...]

Here is the summary with links:
  - [net,1/4] net/sched: sch_fq_codel: Do not call qdisc_tree_reduce_backlog during peek before restoring qlen
    https://git.kernel.org/netdev/net/c/097f6fc7b1ae
  - [net,2/4] net/sched: sch_codel: Do not call qdisc_tree_reduce_backlog during peek before restoring qlen
    https://git.kernel.org/netdev/net/c/52f1da34c9f4
  - [net,3/4] net/sched: sch_dualpi2: Do not call qdisc_tree_reduce_backlog during peek before restoring qlen
    https://git.kernel.org/netdev/net/c/15cd0c93bf4f
  - [net,4/4] selftests/tc-testing: Verify child qdisc will not mistakenly deactivate QFQ parent
    https://git.kernel.org/netdev/net/c/101f1047c2f6

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html


