* [PATCH net 0/4] Avoid mistaken parent class deactivation during peek
@ 2026-06-10 19:28 Victor Nogueira
From: Victor Nogueira @ 2026-06-10 19:28 UTC
  To: davem, edumazet, kuba, pabeni, horms, jhs, jiri
  Cc: netdev, anirudhrudr, pctammela, ij, henrist, chia-yu.chang

Several qdiscs (fq_codel, codel and dualpi2) may drop packets while
peeking at their queue. When that happens they call
qdisc_tree_reduce_backlog() to notify the parent of the backlog/qlen
change. The problem is that they do so *before* reincrementing the qlen
that peek had temporarily decremented.

If the qlen momentarily drops to zero while peek still has an skb to
return, qdisc_tree_reduce_backlog() ends up invoking the parent's
qlen_notify() callback even though the child is not actually empty. The
parent then deactivates the class, while the child still holds a packet.
For parents such as QFQ this desync corrupts the active class list and
leads to wild memory accesses and NULL pointer dereferences (see the
per-patch splats). For HFSC it might lead to stalls [1].

Fix all three qdiscs the same way: only call qdisc_tree_reduce_backlog()
once the qlen has been restored, so the parent never observes a
transient empty child during peek.

Patch 1 fixes this for fq_codel, patch 2 for codel, patch 3 for dualpi2
and patch 4 adds test cases for these 3 setups.

Note: Patch 1 is one of two fixes for the stall reported in [1]; the
companion fix is "net/sched: sch_hfsc: Don't make class passive twice",
sent separately.

Note 2: A possibly cleaner fix is to create a new peek helper that only
calls qdisc_tree_reduce_backlog() after reincrementing the qlen. The
three affected qdiscs would all call it. However, we thought this might
make backporting harder, so, if people agree, we can submit the cleaner
version to net-next after this series is merged.

[1] https://lore.kernel.org/netdev/CAN2cbVe79oj0O9==m4+4x3v+O+qzRagA=2=wkrp9i9=CqYvyZA@mail.gmail.com/

Victor Nogueira (4):
  net/sched: sch_fq_codel: Do not call qdisc_tree_reduce_backlog during
    peek before restoring qlen
  net/sched: sch_codel: Do not call qdisc_tree_reduce_backlog during
    peek before restoring qlen
  net/sched: sch_dualpi2: Do not call qdisc_tree_reduce_backlog during
    peek before restoring qlen
  selftests/tc-testing: Verify child qdisc will not mistakenly
    deactivate QFQ parent

 net/sched/sch_codel.c                         |  48 ++++-
 net/sched/sch_dualpi2.c                       |  41 +++-
 net/sched/sch_fq_codel.c                      |  41 +++-
 .../tc-testing/tc-tests/infra/qdiscs.json     | 184 ++++++++++++++++++
 4 files changed, 305 insertions(+), 9 deletions(-)

-- 
2.54.0



Thread overview: 6+ messages
2026-06-10 19:28 [PATCH net 0/4] Avoid mistaken parent class deactivation during peek Victor Nogueira
2026-06-10 19:28 ` [PATCH net 1/4] net/sched: sch_fq_codel: Do not call qdisc_tree_reduce_backlog during peek before restoring qlen Victor Nogueira
2026-06-10 19:28 ` [PATCH net 2/4] net/sched: sch_codel: " Victor Nogueira
2026-06-10 19:28 ` [PATCH net 3/4] net/sched: sch_dualpi2: " Victor Nogueira
2026-06-10 19:28 ` [PATCH net 4/4] selftests/tc-testing: Verify child qdisc will not mistakenly deactivate QFQ parent Victor Nogueira
2026-06-13  0:30 ` [PATCH net 0/4] Avoid mistaken parent class deactivation during peek patchwork-bot+netdevbpf
