* [PATCH v7 0/5] netem: bug fixes
@ 2026-04-15 14:27 Stephen Hemminger
2026-04-15 14:27 ` [PATCH v7 1/5] net/sched: netem: fix probability gaps in 4-state loss model Stephen Hemminger
` (5 more replies)
0 siblings, 6 replies; 7+ messages in thread
From: Stephen Hemminger @ 2026-04-15 14:27 UTC (permalink / raw)
To: netdev; +Cc: Stephen Hemminger
These bugs were found when doing AI assisted review of sch_netem.c
during investigation of the packet duplication recursion problem
addressed in Jamal's series.
The fixes cover:
- probability gaps in the 4-state Markov loss model
- queue limit not accounting for reordered packets
- PRNG reseeded on every tc change, breaking reproducibility
- slot delay configuration not validated for inverted ranges
- slot delay arithmetic overflow for ranges above ~2.1 seconds
v7 - queue limit check Fixes: goes back further to earlier change
- use NL_SET_ERR_MSG_ATTR
Stephen Hemminger (5):
net/sched: netem: fix probability gaps in 4-state loss model
net/sched: netem: fix queue limit check to include reordered packets
net/sched: netem: only reseed PRNG when seed is explicitly provided
net/sched: netem: check for invalid slot range
net/sched: netem: fix slot delay calculation overflow
net/sched/sch_netem.c | 44 +++++++++++++++++++++++++++++++------------
1 file changed, 32 insertions(+), 12 deletions(-)
--
2.53.0
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH v7 1/5] net/sched: netem: fix probability gaps in 4-state loss model
2026-04-15 14:27 [PATCH v7 0/5] netem: bug fixes Stephen Hemminger
@ 2026-04-15 14:27 ` Stephen Hemminger
2026-04-15 14:27 ` [PATCH v7 2/5] net/sched: netem: fix queue limit check to include reordered packets Stephen Hemminger
` (4 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Stephen Hemminger @ 2026-04-15 14:27 UTC (permalink / raw)
To: netdev
Cc: Stephen Hemminger, Simon Horman, Jamal Hadi Salim, Jiri Pirko,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
open list
The 4-state Markov chain in loss_4state() has gaps at the boundaries
between transition probability ranges. The comparisons use:
if (rnd < a4)
else if (a4 < rnd && rnd < a1 + a4)
When rnd equals a boundary value exactly, neither branch matches and
no state transition occurs. The redundant lower-bound check (a4 < rnd)
is already implied by being in the else branch.
Remove the unnecessary lower-bound comparisons so the ranges are
contiguous and every random value produces a transition, matching
the GI (General and Intuitive) loss model specification.
This bug goes back to original implementation of this model.
Fixes: 661b79725fea ("netem: revised correlated loss generator")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Simon Horman <horms@kernel.org>
---
net/sched/sch_netem.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
index 20df1c08b1e9..8ee72cac1faf 100644
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -227,10 +227,10 @@ static bool loss_4state(struct netem_sched_data *q)
if (rnd < clg->a4) {
clg->state = LOST_IN_GAP_PERIOD;
return true;
- } else if (clg->a4 < rnd && rnd < clg->a1 + clg->a4) {
+ } else if (rnd < clg->a1 + clg->a4) {
clg->state = LOST_IN_BURST_PERIOD;
return true;
- } else if (clg->a1 + clg->a4 < rnd) {
+ } else {
clg->state = TX_IN_GAP_PERIOD;
}
@@ -247,9 +247,9 @@ static bool loss_4state(struct netem_sched_data *q)
case LOST_IN_BURST_PERIOD:
if (rnd < clg->a3)
clg->state = TX_IN_BURST_PERIOD;
- else if (clg->a3 < rnd && rnd < clg->a2 + clg->a3) {
+ else if (rnd < clg->a2 + clg->a3) {
clg->state = TX_IN_GAP_PERIOD;
- } else if (clg->a2 + clg->a3 < rnd) {
+ } else {
clg->state = LOST_IN_BURST_PERIOD;
return true;
}
--
2.53.0
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH v7 2/5] net/sched: netem: fix queue limit check to include reordered packets
2026-04-15 14:27 [PATCH v7 0/5] netem: bug fixes Stephen Hemminger
2026-04-15 14:27 ` [PATCH v7 1/5] net/sched: netem: fix probability gaps in 4-state loss model Stephen Hemminger
@ 2026-04-15 14:27 ` Stephen Hemminger
2026-04-15 14:27 ` [PATCH v7 3/5] net/sched: netem: only reseed PRNG when seed is explicitly provided Stephen Hemminger
` (3 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Stephen Hemminger @ 2026-04-15 14:27 UTC (permalink / raw)
To: netdev
Cc: Stephen Hemminger, Simon Horman, Jamal Hadi Salim, Jiri Pirko,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Martin Ottens, open list
The queue limit check in netem_enqueue() uses q->t_len which only
counts packets in the internal tfifo. Packets placed in sch->q by
the reorder path (__qdisc_enqueue_head) are not counted, allowing
the total queue occupancy to exceed sch->limit under reordering.
Include sch->q.qlen in the limit check.
Fixes: f8d4bc455047 ("net/sched: netem: account for backlog updates from child qdisc")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Simon Horman <horms@kernel.org>
---
net/sched/sch_netem.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
index 8ee72cac1faf..d400a730eadd 100644
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -524,7 +524,7 @@ static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch,
1 << get_random_u32_below(8);
}
- if (unlikely(q->t_len >= sch->limit)) {
+ if (unlikely(sch->q.qlen >= sch->limit)) {
/* re-link segs, so that qdisc_drop_all() frees them all */
skb->next = segs;
qdisc_drop_all(skb, sch, to_free);
--
2.53.0
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH v7 3/5] net/sched: netem: only reseed PRNG when seed is explicitly provided
2026-04-15 14:27 [PATCH v7 0/5] netem: bug fixes Stephen Hemminger
2026-04-15 14:27 ` [PATCH v7 1/5] net/sched: netem: fix probability gaps in 4-state loss model Stephen Hemminger
2026-04-15 14:27 ` [PATCH v7 2/5] net/sched: netem: fix queue limit check to include reordered packets Stephen Hemminger
@ 2026-04-15 14:27 ` Stephen Hemminger
2026-04-15 14:27 ` [PATCH v7 4/5] net/sched: netem: check for invalid slot range Stephen Hemminger
` (2 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Stephen Hemminger @ 2026-04-15 14:27 UTC (permalink / raw)
To: netdev
Cc: Stephen Hemminger, Simon Horman, Jamal Hadi Salim, Jiri Pirko,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
François Michel, open list
netem_change() unconditionally reseeds the PRNG on every tc change
command. If TCA_NETEM_PRNG_SEED is not specified, a new random seed
is generated, destroying reproducibility for users who set a
deterministic seed on a previous change.
Move the initial random seed generation to netem_init() and only
reseed in netem_change() when TCA_NETEM_PRNG_SEED is explicitly
provided by the user.
Fixes: 4072d97ddc44 ("netem: add prng attribute to netem_sched_data")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Simon Horman <horms@kernel.org>
---
net/sched/sch_netem.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
index d400a730eadd..556f9747f0e7 100644
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -1112,11 +1112,10 @@ static int netem_change(struct Qdisc *sch, struct nlattr *opt,
/* capping jitter to the range acceptable by tabledist() */
q->jitter = min_t(s64, abs(q->jitter), INT_MAX);
- if (tb[TCA_NETEM_PRNG_SEED])
+ if (tb[TCA_NETEM_PRNG_SEED]) {
q->prng.seed = nla_get_u64(tb[TCA_NETEM_PRNG_SEED]);
- else
- q->prng.seed = get_random_u64();
- prandom_seed_state(&q->prng.prng_state, q->prng.seed);
+ prandom_seed_state(&q->prng.prng_state, q->prng.seed);
+ }
unlock:
sch_tree_unlock(sch);
@@ -1139,6 +1138,9 @@ static int netem_init(struct Qdisc *sch, struct nlattr *opt,
return -EINVAL;
q->loss_model = CLG_RANDOM;
+ q->prng.seed = get_random_u64();
+ prandom_seed_state(&q->prng.prng_state, q->prng.seed);
+
ret = netem_change(sch, opt, extack);
if (ret)
pr_info("netem: change failed\n");
--
2.53.0
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH v7 4/5] net/sched: netem: check for invalid slot range
2026-04-15 14:27 [PATCH v7 0/5] netem: bug fixes Stephen Hemminger
` (2 preceding siblings ...)
2026-04-15 14:27 ` [PATCH v7 3/5] net/sched: netem: only reseed PRNG when seed is explicitly provided Stephen Hemminger
@ 2026-04-15 14:27 ` Stephen Hemminger
2026-04-15 14:27 ` [PATCH v7 5/5] net/sched: netem: fix slot delay calculation overflow Stephen Hemminger
2026-04-17 16:02 ` [PATCH v7 0/5] netem: bug fixes Simon Horman
5 siblings, 0 replies; 7+ messages in thread
From: Stephen Hemminger @ 2026-04-15 14:27 UTC (permalink / raw)
To: netdev
Cc: Stephen Hemminger, Simon Horman, Jamal Hadi Salim, Jiri Pirko,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Yousuk Seung, Neal Cardwell, open list
Reject slot configuration where min_delay exceeds max_delay.
The delay range computation in get_slot_next() underflows in
this case, producing bogus results.
Fixes: 0a9fe5c375b5 ("netem: slotting with non-uniform distribution")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Simon Horman <horms@kernel.org>
---
net/sched/sch_netem.c | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)
diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
index 556f9747f0e7..8593e62f3c6a 100644
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -827,6 +827,19 @@ static int get_dist_table(struct disttable **tbl, const struct nlattr *attr)
return 0;
}
+static int validate_slot(const struct nlattr *attr,
+ struct netlink_ext_ack *extack)
+{
+ const struct tc_netem_slot *c = nla_data(attr);
+
+ if (c->min_delay > c->max_delay) {
+ NL_SET_ERR_MSG_ATTR(extack, attr,
+ "slot min delay greater than max delay");
+ return -EINVAL;
+ }
+ return 0;
+}
+
static void get_slot(struct netem_sched_data *q, const struct nlattr *attr)
{
const struct tc_netem_slot *c = nla_data(attr);
@@ -1040,6 +1053,12 @@ static int netem_change(struct Qdisc *sch, struct nlattr *opt,
goto table_free;
}
+ if (tb[TCA_NETEM_SLOT]) {
+ ret = validate_slot(tb[TCA_NETEM_SLOT], extack);
+ if (ret)
+ goto table_free;
+ }
+
sch_tree_lock(sch);
/* backup q->clg and q->loss_model */
old_clg = q->clg;
--
2.53.0
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH v7 5/5] net/sched: netem: fix slot delay calculation overflow
2026-04-15 14:27 [PATCH v7 0/5] netem: bug fixes Stephen Hemminger
` (3 preceding siblings ...)
2026-04-15 14:27 ` [PATCH v7 4/5] net/sched: netem: check for invalid slot range Stephen Hemminger
@ 2026-04-15 14:27 ` Stephen Hemminger
2026-04-17 16:02 ` [PATCH v7 0/5] netem: bug fixes Simon Horman
5 siblings, 0 replies; 7+ messages in thread
From: Stephen Hemminger @ 2026-04-15 14:27 UTC (permalink / raw)
To: netdev
Cc: Stephen Hemminger, Simon Horman, Jamal Hadi Salim, Jiri Pirko,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Yousuk Seung, Neal Cardwell, open list
get_slot_next() computes a random delay between min_delay and
max_delay using:
get_random_u32() * (max_delay - min_delay) >> 32
This overflows signed 64-bit arithmetic when the delay range exceeds
approximately 2.1 seconds (2^31 nanoseconds), producing a negative
result that effectively disables slot-based pacing. This is a
realistic configuration for WAN emulation (e.g., slot 1s 5s).
Use mul_u64_u32_shr() which handles the widening multiply without
overflow.
Fixes: 0a9fe5c375b5 ("netem: slotting with non-uniform distribution")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Simon Horman <horms@kernel.org>
---
net/sched/sch_netem.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
index 8593e62f3c6a..41e56908ab0c 100644
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -659,9 +659,8 @@ static void get_slot_next(struct netem_sched_data *q, u64 now)
if (!q->slot_dist)
next_delay = q->slot_config.min_delay +
- (get_random_u32() *
- (q->slot_config.max_delay -
- q->slot_config.min_delay) >> 32);
+ mul_u64_u32_shr(q->slot_config.max_delay - q->slot_config.min_delay,
+ get_random_u32(), 32);
else
next_delay = tabledist(q->slot_config.dist_delay,
(s32)(q->slot_config.dist_jitter),
--
2.53.0
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH v7 0/5] netem: bug fixes
2026-04-15 14:27 [PATCH v7 0/5] netem: bug fixes Stephen Hemminger
` (4 preceding siblings ...)
2026-04-15 14:27 ` [PATCH v7 5/5] net/sched: netem: fix slot delay calculation overflow Stephen Hemminger
@ 2026-04-17 16:02 ` Simon Horman
5 siblings, 0 replies; 7+ messages in thread
From: Simon Horman @ 2026-04-17 16:02 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: netdev
On Wed, Apr 15, 2026 at 07:27:03AM -0700, Stephen Hemminger wrote:
> These bugs were found when doing AI assisted review of sch_netem.c
> during investigation of the packet duplication recursion problem
> addressed in Jamal's series.
>
> The fixes cover:
>
> - probability gaps in the 4-state Markov loss model
> - queue limit not accounting for reordered packets
> - PRNG reseeded on every tc change, breaking reproducibility
> - slot delay configuration not validated for inverted ranges
> - slot delay arithmetic overflow for ranges above ~2.1 seconds
>
> v7 - queue limit check Fixes: goes back further to earlier change
> - use NL_SET_ERR_MSG_ATTR
>
> Stephen Hemminger (5):
> net/sched: netem: fix probability gaps in 4-state loss model
> net/sched: netem: fix queue limit check to include reordered packets
> net/sched: netem: only reseed PRNG when seed is explicitly provided
> net/sched: netem: check for invalid slot range
> net/sched: netem: fix slot delay calculation overflow
To the maintainers: I'd like to ask for more time to complete review of this.
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2026-04-17 16:02 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-04-15 14:27 [PATCH v7 0/5] netem: bug fixes Stephen Hemminger
2026-04-15 14:27 ` [PATCH v7 1/5] net/sched: netem: fix probability gaps in 4-state loss model Stephen Hemminger
2026-04-15 14:27 ` [PATCH v7 2/5] net/sched: netem: fix queue limit check to include reordered packets Stephen Hemminger
2026-04-15 14:27 ` [PATCH v7 3/5] net/sched: netem: only reseed PRNG when seed is explicitly provided Stephen Hemminger
2026-04-15 14:27 ` [PATCH v7 4/5] net/sched: netem: check for invalid slot range Stephen Hemminger
2026-04-15 14:27 ` [PATCH v7 5/5] net/sched: netem: fix slot delay calculation overflow Stephen Hemminger
2026-04-17 16:02 ` [PATCH v7 0/5] netem: bug fixes Simon Horman
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.