From: Jamal Hadi Salim <jhs@mojatatu.com>
To: netdev@vger.kernel.org
Cc: Jamal Hadi Salim <jhs@mojatatu.com>,
davem@davemloft.net, kuba@kernel.org, edumazet@google.com,
pabeni@redhat.com, horms@kernel.org, jiri@resnulli.us,
victor@mojatatu.com, pctammela@mojatatu.com,
ghandatmanas@gmail.com, rakshitawasthi17@gmail.com,
security@kernel.org
Subject: [PATCH net 1/3] net/sched: sch_red: Replace direct dequeue call with peek and qdisc_dequeue_peeked
Date: Thu, 30 Apr 2026 11:29:55 -0400 [thread overview]
Message-ID: <20260430152957.194015-2-jhs@mojatatu.com> (raw)
In-Reply-To: <20260430152957.194015-1-jhs@mojatatu.com>
When red qdisc has children (eg qfq qdisc) whose peek() callback is
qdisc_peek_dequeued(), we could get a kernel panic. When the parent of such
qdiscs (eg illustrated in patch #3 as tbf) wants to retrieve an skb from
its child (red in this case), it will do the following:
1a. do a peek() - and when sensing there's an skb the child can offer, then
- the child in this case(red) calls its child's (qfq) peek.
qfq does the right thing and will return the gso_skb queue packet.
Note: if there wasnt a gso_skb entry then qfq will store it there.
1b. invoke a dequeue() on the child (red). And herein lies the problem.
- red will call the child's dequeue() which will essentially just
try to grab something of qfq's queue.
[ 78.667668][ T363] KASAN: null-ptr-deref in range [0x0000000000000048-0x000000000000004f]
[ 78.667927][ T363] CPU: 1 UID: 0 PID: 363 Comm: ping Not tainted 7.1.0-rc1-00033-g46f74a3f7d57-dirty #790 PREEMPT(full)
[ 78.668263][ T363] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 78.668486][ T363] RIP: 0010:qfq_dequeue+0x446/0xc90 [sch_qfq]
[ 78.668718][ T363] Code: 54 c0 e8 dd 90 00 f1 48 c7 c7 e0 03 54 c0 48 89 de e8 ce 90 00 f1 48 8d 7b 48 b8 ff ff 37 00 48 89 fa 48 c1 e0 2a 48 c1 ea 03 <80> 3c 02 00 74 05 e8 ef a1 e1 f1 48 8b 7b 48 48 8d 54 24 58 48 8d
[ 78.669312][ T363] RSP: 0018:ffff88810de573e0 EFLAGS: 00010216
[ 78.669533][ T363] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000000
[ 78.669790][ T363] RDX: 0000000000000009 RSI: 0000000000000004 RDI: 0000000000000048
[ 78.670044][ T363] RBP: ffff888110dc4000 R08: ffffffffb1b0885a R09: fffffbfff6ba9078
[ 78.670297][ T363] R10: 0000000000000003 R11: ffff888110e31c80 R12: 0000001880000000
[ 78.670560][ T363] R13: ffff888110dc4150 R14: ffff888110dc42b8 R15: 0000000000000200
[ 78.670814][ T363] FS: 00007f66a8f09c40(0000) GS:ffff888163428000(0000) knlGS:0000000000000000
[ 78.671110][ T363] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 78.671324][ T363] CR2: 000055db4c6a30a8 CR3: 000000010da67000 CR4: 0000000000750ef0
[ 78.671585][ T363] PKRU: 55555554
[ 78.671713][ T363] Call Trace:
[ 78.671843][ T363] <TASK>
[ 78.671936][ T363] ? __pfx_qfq_dequeue+0x10/0x10 [sch_qfq]
[ 78.672148][ T363] ? __pfx__printk+0x10/0x10
[ 78.672322][ T363] ? srso_alias_return_thunk+0x5/0xfbef5
[ 78.672496][ T363] ? lockdep_hardirqs_on_prepare+0xa8/0x1a0
[ 78.672706][ T363] ? srso_alias_return_thunk+0x5/0xfbef5
[ 78.672875][ T363] ? trace_hardirqs_on+0x19/0x1a0
[ 78.673047][ T363] red_dequeue+0x65/0x270 [sch_red]
[ 78.673217][ T363] ? srso_alias_return_thunk+0x5/0xfbef5
[ 78.673385][ T363] tbf_dequeue.cold+0xb0/0x70c [sch_tbf]
[ 78.673566][ T363] __qdisc_run+0x169/0x1900
The right thing to do in #1b is to grab the skb off gso_skb queue.
This patchset fixes that issue by changing #1b to use qdisc_dequeue_peeked()
method instead.
Fixes: 77be155cba4e ("pkt_sched: Add peek emulation for non-work-conserving qdiscs.")
Reported-by: Manas <ghandatmanas@gmail.com>
Reported-by: Rakshit Awasthi <rakshitawasthi17@gmail.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
---
net/sched/sch_red.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/sched/sch_red.c b/net/sched/sch_red.c
index 432b8a3000a5..4d0e44a2e7c6 100644
--- a/net/sched/sch_red.c
+++ b/net/sched/sch_red.c
@@ -162,7 +162,7 @@ static struct sk_buff *red_dequeue(struct Qdisc *sch)
struct red_sched_data *q = qdisc_priv(sch);
struct Qdisc *child = q->qdisc;
- skb = child->dequeue(child);
+ skb = qdisc_dequeue_peeked(child);
if (skb) {
qdisc_bstats_update(sch, skb);
qdisc_qstats_backlog_dec(sch, skb);
--
2.34.1
next prev parent reply other threads:[~2026-04-30 15:30 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-30 15:29 [PATCH net 0/3] Replace direct dequeue call with qdisc_dequeue_peeked Jamal Hadi Salim
2026-04-30 15:29 ` Jamal Hadi Salim [this message]
2026-04-30 15:41 ` [PATCH net 1/3] net/sched: sch_red: Replace direct dequeue call with peek and qdisc_dequeue_peeked Eric Dumazet
2026-04-30 15:29 ` [PATCH net 2/3] net/sched: sch_sfb: " Jamal Hadi Salim
2026-04-30 15:42 ` Eric Dumazet
2026-04-30 15:29 ` [PATCH net 3/3] selftests/tc-testing: Add tests that force red and sfb to dequeue from child's gso_skb Jamal Hadi Salim
2026-05-02 18:20 ` [PATCH net 0/3] Replace direct dequeue call with qdisc_dequeue_peeked patchwork-bot+netdevbpf
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260430152957.194015-2-jhs@mojatatu.com \
--to=jhs@mojatatu.com \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=ghandatmanas@gmail.com \
--cc=horms@kernel.org \
--cc=jiri@resnulli.us \
--cc=kuba@kernel.org \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=pctammela@mojatatu.com \
--cc=rakshitawasthi17@gmail.com \
--cc=security@kernel.org \
--cc=victor@mojatatu.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox