* [PATCH net 1/2] net/sched: Return NULL when htb_lookup_leaf encounters an empty rbtree
@ 2025-07-14 3:12 William Liu
2025-07-14 3:14 ` [PATCH net 2/2] selftests/tc-testing: Test htb_dequeue_tree with deactivation and row emptying William Liu
0 siblings, 1 reply; 3+ messages in thread
From: William Liu @ 2025-07-14 3:12 UTC (permalink / raw)
To: netdev
Cc: jhs, xiyou.wangcong, pabeni, kuba, savy, jiri, davem, edumazet,
horms, William Liu
htb_lookup_leaf has a BUG_ON that can trigger with the following:
tc qdisc del dev lo root
tc qdisc add dev lo root handle 1: htb default 1
tc class add dev lo parent 1: classid 1:1 htb rate 64bit
tc qdisc add dev lo parent 1:1 handle 2: netem
tc qdisc add dev lo parent 2:1 handle 3: blackhole
ping -I lo -c1 -W0.001 127.0.0.1
The root cause is the following:
1. htb_dequeue calls htb_dequeue_tree which calls the dequeue handler on
the selected leaf qdisc
2. netem_dequeue calls enqueue on the child qdisc
3. blackhole_enqueue drops the packet and returns a value that is not
just NET_XMIT_SUCCESS
4. Because of this, netem_dequeue calls qdisc_tree_reduce_backlog, and
since qlen is now 0, it calls htb_qlen_notify -> htb_deactivate ->
htb_deactiviate_prios -> htb_remove_class_from_row -> htb_safe_rb_erase
5. As this is the only class in the selected hprio rbtree,
__rb_change_child in __rb_erase_augmented sets the rb_root pointer to
NULL
6. Because blackhole_dequeue returns NULL, netem_dequeue returns NULL,
which causes htb_dequeue_tree to call htb_lookup_leaf with the same
hprio rbtree, and fail the BUG_ON
The function graph for this scenario is shown here:
0) | htb_enqueue() {
0) + 13.635 us | netem_enqueue();
0) 4.719 us | htb_activate_prios();
0) # 2249.199 us | }
0) | htb_dequeue() {
0) 2.355 us | htb_lookup_leaf();
0) | netem_dequeue() {
0) + 11.061 us | blackhole_enqueue();
0) | qdisc_tree_reduce_backlog() {
0) | qdisc_lookup_rcu() {
0) 1.873 us | qdisc_match_from_root();
0) 6.292 us | }
0) 1.894 us | htb_search();
0) | htb_qlen_notify() {
0) 2.655 us | htb_deactivate_prios();
0) 6.933 us | }
0) + 25.227 us | }
0) 1.983 us | blackhole_dequeue();
0) + 86.553 us | }
0) # 2932.761 us | qdisc_warn_nonwc();
0) | htb_lookup_leaf() {
0) | BUG_ON();
------------------------------------------
The full original bug report can be seen here [1].
We can fix this just by returning NULL instead of the BUG_ON,
as htb_dequeue_tree returns NULL when htb_lookup_leaf returns
NULL.
[1] https://lore.kernel.org/netdev/pF5XOOIim0IuEfhI-SOxTgRvNoDwuux7UHKnE_Y5-zVd4wmGvNk2ceHjKb8ORnzw0cGwfmVu42g9dL7XyJLf1NEzaztboTWcm0Ogxuojoeo=@willsroot.io/
Fixes: 512bb43eb542 ("pkt_sched: sch_htb: Optimize WARN_ONs in htb_dequeue_tree() etc.")
Signed-off-by: William Liu <will@willsroot.io>
Signed-off-by: Savino Dicanosa <savy@syst3mfailure.io>
---
net/sched/sch_htb.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c
index 14bf71f57057..c968ea763774 100644
--- a/net/sched/sch_htb.c
+++ b/net/sched/sch_htb.c
@@ -821,7 +821,9 @@ static struct htb_class *htb_lookup_leaf(struct htb_prio *hprio, const int prio)
u32 *pid;
} stk[TC_HTB_MAXDEPTH], *sp = stk;
- BUG_ON(!hprio->row.rb_node);
+ if (unlikely(!hprio->row.rb_node))
+ return NULL;
+
sp->root = hprio->row.rb_node;
sp->pptr = &hprio->ptr;
sp->pid = &hprio->last_ptr_id;
--
2.43.0
^ permalink raw reply related [flat|nested] 3+ messages in thread
* [PATCH net 2/2] selftests/tc-testing: Test htb_dequeue_tree with deactivation and row emptying
2025-07-14 3:12 [PATCH net 1/2] net/sched: Return NULL when htb_lookup_leaf encounters an empty rbtree William Liu
@ 2025-07-14 3:14 ` William Liu
2025-07-16 23:53 ` Jakub Kicinski
0 siblings, 1 reply; 3+ messages in thread
From: William Liu @ 2025-07-14 3:14 UTC (permalink / raw)
To: netdev
Cc: jhs, xiyou.wangcong, pabeni, kuba, savy, jiri, davem, edumazet,
horms, William Liu
Ensure that any deactivation and row emptying that occurs
during htb_dequeue_tree does not cause a kernel panic.
This scenario originally triggered a kernel BUG_ON, and
we are checking for a graceful fail now.
Signed-off-by: William Liu <will@willsroot.io>
Signed-off-by: Savino Dicanosa <savy@syst3mfailure.io>
---
.../tc-testing/tc-tests/infra/qdiscs.json | 26 +++++++++++++++++++
1 file changed, 26 insertions(+)
diff --git a/tools/testing/selftests/tc-testing/tc-tests/infra/qdiscs.json b/tools/testing/selftests/tc-testing/tc-tests/infra/qdiscs.json
index 5c6851e8d311..a8b58ffd8404 100644
--- a/tools/testing/selftests/tc-testing/tc-tests/infra/qdiscs.json
+++ b/tools/testing/selftests/tc-testing/tc-tests/infra/qdiscs.json
@@ -672,5 +672,31 @@
"teardown": [
"$TC qdisc del dev $DUMMY root handle 1: drr"
]
+ },
+ {
+ "id": "5456",
+ "name": "Test htb_dequeue_tree with deactivation and row emptying",
+ "category": [
+ "qdisc",
+ "htb"
+ ],
+ "plugins": {
+ "requires": "nsPlugin"
+ },
+ "setup": [
+ "$IP link set dev $DUMMY up || true",
+ "$IP addr add 10.10.11.10/24 dev $DUMMY || true",
+ "$TC qdisc add dev $DUMMY root handle 1: htb default 1",
+ "$TC class add dev $DUMMY parent 1: classid 1:1 htb rate 64bit ",
+ "$TC qdisc add dev $DUMMY parent 1:1 handle 2: netem",
+ "$TC qdisc add dev $DUMMY parent 2:1 handle 3: blackhole"
+ ],
+ "cmdUnderTest": "ping -c1 -W0.01 -I $DUMMY 10.10.11.11",
+ "expExitCode": "1",
+ "verifyCmd": "$TC -j qdisc show dev $DUMMY",
+ "matchJSON": [],
+ "teardown": [
+ "$TC qdisc del dev $DUMMY root"
+ ]
}
]
--
2.43.0
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH net 2/2] selftests/tc-testing: Test htb_dequeue_tree with deactivation and row emptying
2025-07-14 3:14 ` [PATCH net 2/2] selftests/tc-testing: Test htb_dequeue_tree with deactivation and row emptying William Liu
@ 2025-07-16 23:53 ` Jakub Kicinski
0 siblings, 0 replies; 3+ messages in thread
From: Jakub Kicinski @ 2025-07-16 23:53 UTC (permalink / raw)
To: William Liu
Cc: netdev, jhs, xiyou.wangcong, pabeni, savy, jiri, davem, edumazet,
horms
On Mon, 14 Jul 2025 03:14:25 +0000 William Liu wrote:
> Ensure that any deactivation and row emptying that occurs
> during htb_dequeue_tree does not cause a kernel panic.
> This scenario originally triggered a kernel BUG_ON, and
> we are checking for a graceful fail now.
The test doesn't apply anymore, could you respin?
Perhaps try to add the testcase somewhere in the middle rather than
at the end, maybe next to an existing HTB or netem test?
--
pw-bot: cr
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2025-07-16 23:53 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-07-14 3:12 [PATCH net 1/2] net/sched: Return NULL when htb_lookup_leaf encounters an empty rbtree William Liu
2025-07-14 3:14 ` [PATCH net 2/2] selftests/tc-testing: Test htb_dequeue_tree with deactivation and row emptying William Liu
2025-07-16 23:53 ` Jakub Kicinski
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).