From: Cong Wang <xiyou.wangcong@gmail.com>
To: netdev@vger.kernel.org
Cc: jhs@mojatatu.com, will@willsroot.io, stephen@networkplumber.org,
Cong Wang <xiyou.wangcong@gmail.com>,
Savino Dicanosa <savy@syst3mfailure.io>
Subject: [Patch v4 net 1/6] net_sched: Implement the right netem duplication behavior
Date: Sat, 19 Jul 2025 15:03:36 -0700 [thread overview]
Message-ID: <20250719220341.1615951-2-xiyou.wangcong@gmail.com> (raw)
In-Reply-To: <20250719220341.1615951-1-xiyou.wangcong@gmail.com>
In the old behavior, duplicated packets were sent back to the root qdisc,
which could create dangerous infinite loops in hierarchical setups -
imagine a scenario where each level of a multi-stage netem hierarchy kept
feeding duplicates back to the top, potentially causing system instability
or resource exhaustion.
The new behavior elegantly solves this by enqueueing duplicates to the same
qdisc that created them, ensuring that packet duplication occurs exactly
once per netem stage in a controlled, predictable manner. This change
enables users to safely construct complex network emulation scenarios using
netem hierarchies (like the 4x multiplication demonstrated in testing)
without worrying about runaway packet generation, while still preserving
the intended duplication effects.
Another advantage of this approach is that it eliminates the enqueue reentrant
behaviour which triggered many vulnerabilities. See the last patch in this
patchset which updates the test cases for such vulnerabilities.
Now users can confidently chain multiple netem qdiscs together to achieve
sophisticated network impairment combinations, knowing that each stage will
apply its effects exactly once to the packet flow, making network testing
scenarios more reliable and results more deterministic.
I tested netem packet duplication in two configurations:
1. Nest netem-to-netem hierarchy using parent/child attachment
2. Single netem using prio qdisc with netem leaf
Setup commands and results:
Single netem hierarchy (prio + netem):
tc qdisc add dev lo root handle 1: prio bands 3 priomap 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
tc filter add dev lo parent 1:0 protocol ip matchall classid 1:1
tc qdisc add dev lo parent 1:1 handle 10: netem limit 4 duplicate 100%
Result: 2x packet multiplication (1→2 packets)
2 echo requests + 4 echo replies = 6 total packets
Expected behavior: Only one netem stage exists in this hierarchy, so
1 ping becomes 2 packets (100% duplication). The 2 echo requests generate
2 echo replies, which also get duplicated to 4 replies, yielding the
predictable total of 6 packets (2 requests + 4 replies).
Nest netem hierarchy (netem + netem):
tc qdisc add dev lo root handle 1: netem limit 1000 duplicate 100%
tc qdisc add dev lo parent 1: handle 2: netem limit 1000 duplicate 100%
Result: 4x packet multiplication (1→2→4 packets)
4 echo requests + 16 echo replies = 20 total packets
Expected behavior: Root netem duplicates 1 ping to 2 packets, child netem
receives 2 packets and duplicates each to create 4 total packets. Since
ping operates bidirectionally, 4 echo requests generate 4 echo replies,
which also get duplicated through the same hierarchy (4→8→16), resulting
in the predictable total of 20 packets (4 requests + 16 replies).
The new netem duplication behavior does not break the documented
semantics of "creates a copy of the packet before queuing." The man page
description remains true since duplication occurs before the queuing
process, creating both original and duplicate packets that are then
enqueued. The documentation does not specify which qdisc should receive
the duplicates, only that copying happens before queuing. The implementation
choice to enqueue duplicates to the same qdisc (rather than root) is an
internal detail that maintains the documented behavior while preventing
infinite loops in hierarchical configurations.
Fixes: 0afb51e72855 ("[PKT_SCHED]: netem: reinsert for duplication")
Reported-by: William Liu <will@willsroot.io>
Reported-by: Savino Dicanosa <savy@syst3mfailure.io>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
---
net/sched/sch_netem.c | 26 +++++++++++++++-----------
1 file changed, 15 insertions(+), 11 deletions(-)
diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
index fdd79d3ccd8c..191f64bd68ff 100644
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -165,6 +165,7 @@ struct netem_sched_data {
*/
struct netem_skb_cb {
u64 time_to_send;
+ u8 duplicate : 1;
};
static inline struct netem_skb_cb *netem_skb_cb(struct sk_buff *skb)
@@ -460,8 +461,16 @@ static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch,
skb->prev = NULL;
/* Random duplication */
- if (q->duplicate && q->duplicate >= get_crandom(&q->dup_cor, &q->prng))
- ++count;
+ if (q->duplicate) {
+ bool dup = true;
+
+ if (netem_skb_cb(skb)->duplicate) {
+ netem_skb_cb(skb)->duplicate = 0;
+ dup = false;
+ }
+ if (dup && q->duplicate >= get_crandom(&q->dup_cor, &q->prng))
+ ++count;
+ }
/* Drop packet? */
if (loss_event(q)) {
@@ -532,17 +541,12 @@ static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch,
}
/*
- * If doing duplication then re-insert at top of the
- * qdisc tree, since parent queuer expects that only one
- * skb will be queued.
+ * If doing duplication then re-insert at the same qdisc,
+ * as going back to the root would induce loops.
*/
if (skb2) {
- struct Qdisc *rootq = qdisc_root_bh(sch);
- u32 dupsave = q->duplicate; /* prevent duplicating a dup... */
-
- q->duplicate = 0;
- rootq->enqueue(skb2, rootq, to_free);
- q->duplicate = dupsave;
+ netem_skb_cb(skb2)->duplicate = 1;
+ qdisc_enqueue(skb2, sch, to_free);
skb2 = NULL;
}
--
2.34.1
next prev parent reply other threads:[~2025-07-19 22:04 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-07-19 22:03 [Patch v4 net 0/6] netem: Fix skb duplication logic and prevent infinite loops Cong Wang
2025-07-19 22:03 ` Cong Wang [this message]
2025-07-19 22:03 ` [Patch v4 net 2/6] net_sched: Check the return value of qfq_choose_next_agg() Cong Wang
2025-07-20 23:34 ` Xiang Mei
2025-07-19 22:03 ` [Patch v4 net 3/6] selftests/tc-testing: Add a nested netem duplicate test Cong Wang
2025-07-19 22:03 ` [Patch v4 net 4/6] selftests/tc-testing: Add a test case for piro with netem duplicate Cong Wang
2025-07-19 22:03 ` [Patch v4 net 5/6] selftests/tc-testing: Add a test case for mq " Cong Wang
2025-07-19 22:03 ` [Patch v4 net 6/6] selftests/tc-testing: Update test cases " Cong Wang
2025-07-21 14:00 ` [Patch v4 net 0/6] netem: Fix skb duplication logic and prevent infinite loops Jamal Hadi Salim
2025-07-21 14:27 ` Jamal Hadi Salim
2025-07-21 14:54 ` Stephen Hemminger
2025-07-30 19:17 ` Cong Wang
2025-08-01 22:04 ` Stephen Hemminger
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250719220341.1615951-2-xiyou.wangcong@gmail.com \
--to=xiyou.wangcong@gmail.com \
--cc=jhs@mojatatu.com \
--cc=netdev@vger.kernel.org \
--cc=savy@syst3mfailure.io \
--cc=stephen@networkplumber.org \
--cc=will@willsroot.io \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).