* [PATCH net v4 1/8] net/sched: netem: fix probability gaps in 4-state loss model
2026-04-06 17:25 [PATCH net v4 0/8] net/sched: netem bug fixes Stephen Hemminger
@ 2026-04-06 17:25 ` Stephen Hemminger
2026-04-06 17:25 ` [PATCH net v4 2/8] net/sched: netem: fix queue limit check to include reordered packets Stephen Hemminger
` (6 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Stephen Hemminger @ 2026-04-06 17:25 UTC (permalink / raw)
To: netdev
Cc: Stephen Hemminger, Simon Horman, Jamal Hadi Salim, Jiri Pirko,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
open list
The 4-state Markov chain in loss_4state() has gaps at the boundaries
between transition probability ranges. The comparisons use:
if (rnd < a4)
else if (a4 < rnd && rnd < a1 + a4)
When rnd equals a boundary value exactly, neither branch matches and
no state transition occurs. The lower-bound check (a4 < rnd) is the
culprit: the else branch already guarantees rnd >= a4, so the explicit
comparison is redundant and wrongly excludes the case rnd == a4.
Remove the unnecessary lower-bound comparisons so the ranges are
contiguous and every random value produces a transition, matching
the GI (General and Intuitive) loss model specification.
This bug goes back to the original implementation of this model.
Fixes: 661b79725fea ("netem: revised correlated loss generator")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Simon Horman <horms@kernel.org>
---
net/sched/sch_netem.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
index 20df1c08b1e9..8ee72cac1faf 100644
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -227,10 +227,10 @@ static bool loss_4state(struct netem_sched_data *q)
if (rnd < clg->a4) {
clg->state = LOST_IN_GAP_PERIOD;
return true;
- } else if (clg->a4 < rnd && rnd < clg->a1 + clg->a4) {
+ } else if (rnd < clg->a1 + clg->a4) {
clg->state = LOST_IN_BURST_PERIOD;
return true;
- } else if (clg->a1 + clg->a4 < rnd) {
+ } else {
clg->state = TX_IN_GAP_PERIOD;
}
@@ -247,9 +247,9 @@ static bool loss_4state(struct netem_sched_data *q)
case LOST_IN_BURST_PERIOD:
if (rnd < clg->a3)
clg->state = TX_IN_BURST_PERIOD;
- else if (clg->a3 < rnd && rnd < clg->a2 + clg->a3) {
+ else if (rnd < clg->a2 + clg->a3) {
clg->state = TX_IN_GAP_PERIOD;
- } else if (clg->a2 + clg->a3 < rnd) {
+ } else {
clg->state = LOST_IN_BURST_PERIOD;
return true;
}
--
2.53.0
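[Editorial note] The gap can be seen in isolation with a toy model of the comparison chain. This is a standalone sketch, not kernel code; the state names and helper functions are invented for illustration:

```c
#include <stdint.h>

enum toy_state { TX_IN_GAP, LOST_IN_GAP, LOST_IN_BURST, NO_TRANSITION };

/* Old comparison chain: strict lower bounds leave rnd == a4 and
 * rnd == a1 + a4 unmatched, so no state transition occurs. */
enum toy_state next_state_old(uint32_t rnd, uint32_t a1, uint32_t a4)
{
	if (rnd < a4)
		return LOST_IN_GAP;
	else if (a4 < rnd && rnd < a1 + a4)
		return LOST_IN_BURST;
	else if (a1 + a4 < rnd)
		return TX_IN_GAP;
	return NO_TRANSITION;	/* boundary values fall through to here */
}

/* Fixed chain: each else already implies the lower bound, so the
 * ranges are contiguous and every rnd value picks a transition. */
enum toy_state next_state_new(uint32_t rnd, uint32_t a1, uint32_t a4)
{
	if (rnd < a4)
		return LOST_IN_GAP;
	else if (rnd < a1 + a4)
		return LOST_IN_BURST;
	else
		return TX_IN_GAP;
}
```

With a1 = 50 and a4 = 100, the old chain returns NO_TRANSITION for rnd of exactly 100 or 150, while the fixed chain maps those to LOST_IN_BURST and TX_IN_GAP respectively.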
* [PATCH net v4 2/8] net/sched: netem: fix queue limit check to include reordered packets
From: Stephen Hemminger @ 2026-04-06 17:25 UTC (permalink / raw)
To: netdev
Cc: Stephen Hemminger, Simon Horman, Jamal Hadi Salim, Jiri Pirko,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
open list
The queue limit check in netem_enqueue() uses q->t_len which only
counts packets in the internal tfifo. Packets placed in sch->q by
the reorder path (__qdisc_enqueue_head) are not counted, allowing
the total queue occupancy to exceed sch->limit under reordering.
Include sch->q.qlen in the limit check.
Fixes: 50612537e9ab ("netem: fix classful handling")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Simon Horman <horms@kernel.org>
---
net/sched/sch_netem.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
index 8ee72cac1faf..d400a730eadd 100644
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -524,7 +524,7 @@ static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch,
1 << get_random_u32_below(8);
}
- if (unlikely(q->t_len >= sch->limit)) {
+ if (unlikely(sch->q.qlen >= sch->limit)) {
/* re-link segs, so that qdisc_drop_all() frees them all */
skb->next = segs;
qdisc_drop_all(skb, sch, to_free);
--
2.53.0
* [PATCH net v4 3/8] net/sched: netem: only reseed PRNG when seed is explicitly provided
From: Stephen Hemminger @ 2026-04-06 17:25 UTC (permalink / raw)
To: netdev
Cc: Stephen Hemminger, Simon Horman, Jamal Hadi Salim, Jiri Pirko,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
François Michel, open list
netem_change() unconditionally reseeds the PRNG on every tc change
command. If TCA_NETEM_PRNG_SEED is not specified, a new random seed
is generated, destroying reproducibility for users who set a
deterministic seed on a previous change.
Move the initial random seed generation to netem_init() and only
reseed in netem_change() when TCA_NETEM_PRNG_SEED is explicitly
provided by the user.
Fixes: 4072d97ddc44 ("netem: add prng attribute to netem_sched_data")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Simon Horman <horms@kernel.org>
---
net/sched/sch_netem.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
index d400a730eadd..556f9747f0e7 100644
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -1112,11 +1112,10 @@ static int netem_change(struct Qdisc *sch, struct nlattr *opt,
/* capping jitter to the range acceptable by tabledist() */
q->jitter = min_t(s64, abs(q->jitter), INT_MAX);
- if (tb[TCA_NETEM_PRNG_SEED])
+ if (tb[TCA_NETEM_PRNG_SEED]) {
q->prng.seed = nla_get_u64(tb[TCA_NETEM_PRNG_SEED]);
- else
- q->prng.seed = get_random_u64();
- prandom_seed_state(&q->prng.prng_state, q->prng.seed);
+ prandom_seed_state(&q->prng.prng_state, q->prng.seed);
+ }
unlock:
sch_tree_unlock(sch);
@@ -1139,6 +1138,9 @@ static int netem_init(struct Qdisc *sch, struct nlattr *opt,
return -EINVAL;
q->loss_model = CLG_RANDOM;
+ q->prng.seed = get_random_u64();
+ prandom_seed_state(&q->prng.prng_state, q->prng.seed);
+
ret = netem_change(sch, opt, extack);
if (ret)
pr_info("netem: change failed\n");
--
2.53.0
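[Editorial note] The before/after behavior reduces to a seed-once-at-init, reseed-only-on-request pattern. The miniature below uses hypothetical names; only the control flow mirrors the patch (has_seed stands in for "tb[TCA_NETEM_PRNG_SEED] != NULL", random_seed for get_random_u64()):

```c
#include <stdbool.h>
#include <stdint.h>

struct toy_prng {
	uint64_t seed;
	uint64_t state;
};

void toy_seed(struct toy_prng *p, uint64_t seed)
{
	p->seed = seed;
	p->state = seed ^ 0x9e3779b97f4a7c15ULL;	/* arbitrary mixing */
}

/* init path: pick a random default seed exactly once */
void toy_init(struct toy_prng *p, uint64_t random_seed)
{
	toy_seed(p, random_seed);
}

/* change path: reseed only when the user explicitly supplied a seed */
void toy_change(struct toy_prng *p, bool has_seed, uint64_t seed)
{
	if (has_seed)
		toy_seed(p, seed);
	/* otherwise keep the existing state, preserving reproducibility */
}
```

Before the fix, the equivalent of toy_change() would call toy_seed() with a fresh random value even when has_seed was false, silently discarding a deterministic seed set earlier.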
* [PATCH net v4 4/8] net/sched: netem: refactor dequeue into helper functions
From: Stephen Hemminger @ 2026-04-06 17:25 UTC (permalink / raw)
To: netdev
Cc: Stephen Hemminger, Jamal Hadi Salim, Jiri Pirko, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
open list
Extract the tfifo removal, slot accounting, and child/direct dequeue
paths from the monolithic netem_dequeue() into separate helpers:
netem_pull_tfifo() - remove head packet from tfifo
netem_slot_account() - update slot pacing counters
netem_dequeue_child() - enqueue to child, then dequeue from child
netem_dequeue_direct() - dequeue from tfifo when no child
This replaces the goto-based control flow with straightforward function
calls, making the code easier to follow and modify.
No functional change intended.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
net/sched/sch_netem.c | 190 +++++++++++++++++++++++++++---------------
1 file changed, 123 insertions(+), 67 deletions(-)
diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
index 556f9747f0e7..e264f7aefb97 100644
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -689,99 +689,155 @@ static struct sk_buff *netem_peek(struct netem_sched_data *q)
return q->t_head;
}
-static void netem_erase_head(struct netem_sched_data *q, struct sk_buff *skb)
+/*
+ * Pop the head packet from the tfifo and prepare it for delivery.
+ * skb->dev shares the rbnode area and must be restored after removal.
+ */
+static struct sk_buff *netem_pull_tfifo(struct netem_sched_data *q,
+ struct Qdisc *sch)
{
- if (skb == q->t_head) {
+ struct sk_buff *skb;
+
+ if (q->t_head) {
+ skb = q->t_head;
q->t_head = skb->next;
if (!q->t_head)
q->t_tail = NULL;
} else {
- rb_erase(&skb->rbnode, &q->t_root);
+ struct rb_node *p = rb_first(&q->t_root);
+
+ if (!p)
+ return NULL;
+ skb = rb_to_skb(p);
+ rb_erase(p, &q->t_root);
}
+
+ q->t_len--;
+ skb->next = NULL;
+ skb->prev = NULL;
+ skb->dev = qdisc_dev(sch);
+
+ return skb;
}
-static struct sk_buff *netem_dequeue(struct Qdisc *sch)
+/* Update slot pacing counters after releasing a packet */
+static void netem_slot_account(struct netem_sched_data *q,
+ const struct sk_buff *skb, u64 now)
+{
+ if (!q->slot.slot_next)
+ return;
+
+ q->slot.packets_left--;
+ q->slot.bytes_left -= qdisc_pkt_len(skb);
+ if (q->slot.packets_left <= 0 || q->slot.bytes_left <= 0)
+ get_slot_next(q, now);
+}
+
+/*
+ * Transfer time-ready packets from the tfifo into the child qdisc,
+ * then dequeue from the child.
+ */
+static struct sk_buff *netem_dequeue_child(struct Qdisc *sch)
{
struct netem_sched_data *q = qdisc_priv(sch);
+ u64 now = ktime_get_ns();
struct sk_buff *skb;
-tfifo_dequeue:
- skb = __qdisc_dequeue_head(&sch->q);
- if (skb) {
-deliver:
- qdisc_qstats_backlog_dec(sch, skb);
- qdisc_bstats_update(sch, skb);
- return skb;
- }
skb = netem_peek(q);
if (skb) {
- u64 time_to_send;
- u64 now = ktime_get_ns();
+ u64 time_to_send = netem_skb_cb(skb)->time_to_send;
- /* if more time remaining? */
- time_to_send = netem_skb_cb(skb)->time_to_send;
if (q->slot.slot_next && q->slot.slot_next < time_to_send)
get_slot_next(q, now);
if (time_to_send <= now && q->slot.slot_next <= now) {
- netem_erase_head(q, skb);
- q->t_len--;
- skb->next = NULL;
- skb->prev = NULL;
- /* skb->dev shares skb->rbnode area,
- * we need to restore its value.
- */
- skb->dev = qdisc_dev(sch);
-
- if (q->slot.slot_next) {
- q->slot.packets_left--;
- q->slot.bytes_left -= qdisc_pkt_len(skb);
- if (q->slot.packets_left <= 0 ||
- q->slot.bytes_left <= 0)
- get_slot_next(q, now);
- }
-
- if (q->qdisc) {
- unsigned int pkt_len = qdisc_pkt_len(skb);
- struct sk_buff *to_free = NULL;
- int err;
-
- err = qdisc_enqueue(skb, q->qdisc, &to_free);
- kfree_skb_list(to_free);
- if (err != NET_XMIT_SUCCESS) {
- if (net_xmit_drop_count(err))
- qdisc_qstats_drop(sch);
- sch->qstats.backlog -= pkt_len;
- sch->q.qlen--;
- qdisc_tree_reduce_backlog(sch, 1, pkt_len);
- }
- goto tfifo_dequeue;
- }
- sch->q.qlen--;
- goto deliver;
- }
-
- if (q->qdisc) {
- skb = q->qdisc->ops->dequeue(q->qdisc);
- if (skb) {
+ struct sk_buff *to_free = NULL;
+ unsigned int pkt_len;
+ int err;
+
+ skb = netem_pull_tfifo(q, sch);
+ netem_slot_account(q, skb, now);
+
+ pkt_len = qdisc_pkt_len(skb);
+ err = qdisc_enqueue(skb, q->qdisc, &to_free);
+ kfree_skb_list(to_free);
+ if (err != NET_XMIT_SUCCESS) {
+ if (net_xmit_drop_count(err))
+ qdisc_qstats_drop(sch);
+ sch->qstats.backlog -= pkt_len;
sch->q.qlen--;
- goto deliver;
+ qdisc_tree_reduce_backlog(sch, 1, pkt_len);
}
}
-
- qdisc_watchdog_schedule_ns(&q->watchdog,
- max(time_to_send,
- q->slot.slot_next));
}
- if (q->qdisc) {
- skb = q->qdisc->ops->dequeue(q->qdisc);
- if (skb) {
- sch->q.qlen--;
- goto deliver;
- }
+ skb = q->qdisc->ops->dequeue(q->qdisc);
+ if (skb)
+ sch->q.qlen--;
+
+ return skb;
+}
+
+/* Dequeue directly from the tfifo when no child qdisc is configured. */
+static struct sk_buff *netem_dequeue_direct(struct Qdisc *sch)
+{
+ struct netem_sched_data *q = qdisc_priv(sch);
+ struct sk_buff *skb;
+ u64 time_to_send;
+ u64 now;
+
+ skb = netem_peek(q);
+ if (!skb)
+ return NULL;
+
+ now = ktime_get_ns();
+ time_to_send = netem_skb_cb(skb)->time_to_send;
+
+ if (q->slot.slot_next && q->slot.slot_next < time_to_send)
+ get_slot_next(q, now);
+
+ if (time_to_send > now || q->slot.slot_next > now)
+ return NULL;
+
+ skb = netem_pull_tfifo(q, sch);
+ netem_slot_account(q, skb, now);
+ sch->q.qlen--;
+
+ return skb;
+}
+
+static struct sk_buff *netem_dequeue(struct Qdisc *sch)
+{
+ struct netem_sched_data *q = qdisc_priv(sch);
+ struct sk_buff *skb;
+
+ /* First check the reorder queue */
+ skb = __qdisc_dequeue_head(&sch->q);
+ if (skb)
+ goto deliver;
+
+ if (q->qdisc)
+ skb = netem_dequeue_child(sch);
+ else
+ skb = netem_dequeue_direct(sch);
+
+ if (skb)
+ goto deliver;
+
+ /* Nothing ready — schedule watchdog for next packet */
+ skb = netem_peek(q);
+ if (skb) {
+ u64 time_to_send = netem_skb_cb(skb)->time_to_send;
+
+ qdisc_watchdog_schedule_ns(&q->watchdog,
+ max(time_to_send, q->slot.slot_next));
}
return NULL;
+
+deliver:
+ qdisc_qstats_backlog_dec(sch, skb);
+ qdisc_bstats_update(sch, skb);
+ return skb;
}
static void netem_reset(struct Qdisc *sch)
--
2.53.0
* [PATCH net v4 5/8] net/sched: netem: batch-transfer ready packets to avoid child re-entrancy
From: Stephen Hemminger @ 2026-04-06 17:25 UTC (permalink / raw)
To: netdev
Cc: Stephen Hemminger, Jamal Hadi Salim, Jiri Pirko, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
open list
netem_dequeue_child() previously transferred one packet from the tfifo
to the child qdisc per dequeue call. Parents like HFSC that track
class active/inactive state on qlen transitions could see an enqueue
during dequeue, causing double-insertion into the eltree
(CVE-2025-37890, CVE-2025-38001). Non-work-conserving children like
TBF could also refuse to return a just-enqueued packet, making netem
return NULL despite having backlog, which causes parents like DRR to
incorrectly deactivate the class.
Move all time-ready packets into the child before calling its dequeue.
This separates the enqueue and dequeue phases so the parent sees
consistent qlen transitions.
Fixes: 50612537e9ab ("netem: fix classful handling")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
net/sched/sch_netem.c | 49 +++++++++++++++++++++----------------------
1 file changed, 24 insertions(+), 25 deletions(-)
diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
index e264f7aefb97..b93f0e886a2b 100644
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -734,8 +734,10 @@ static void netem_slot_account(struct netem_sched_data *q,
}
/*
- * Transfer time-ready packets from the tfifo into the child qdisc,
- * then dequeue from the child.
+ * Transfer all time-ready packets from the tfifo into the child qdisc,
+ * then dequeue from the child. Batching the transfers avoids calling
+ * qdisc_enqueue() inside the parent's dequeue path, which confuses
+ * parents that track active/inactive state on qlen transitions (HFSC).
*/
static struct sk_buff *netem_dequeue_child(struct Qdisc *sch)
{
@@ -743,31 +745,28 @@ static struct sk_buff *netem_dequeue_child(struct Qdisc *sch)
u64 now = ktime_get_ns();
struct sk_buff *skb;
- skb = netem_peek(q);
- if (skb) {
- u64 time_to_send = netem_skb_cb(skb)->time_to_send;
-
- if (q->slot.slot_next && q->slot.slot_next < time_to_send)
- get_slot_next(q, now);
-
- if (time_to_send <= now && q->slot.slot_next <= now) {
- struct sk_buff *to_free = NULL;
- unsigned int pkt_len;
- int err;
+ while ((skb = netem_peek(q)) != NULL) {
+ struct sk_buff *to_free = NULL;
+ unsigned int pkt_len;
+ int err;
- skb = netem_pull_tfifo(q, sch);
- netem_slot_account(q, skb, now);
+ if (netem_skb_cb(skb)->time_to_send > now)
+ break;
+ if (q->slot.slot_next && q->slot.slot_next > now)
+ break;
- pkt_len = qdisc_pkt_len(skb);
- err = qdisc_enqueue(skb, q->qdisc, &to_free);
- kfree_skb_list(to_free);
- if (err != NET_XMIT_SUCCESS) {
- if (net_xmit_drop_count(err))
- qdisc_qstats_drop(sch);
- sch->qstats.backlog -= pkt_len;
- sch->q.qlen--;
- qdisc_tree_reduce_backlog(sch, 1, pkt_len);
- }
+ skb = netem_pull_tfifo(q, sch);
+ netem_slot_account(q, skb, now);
+
+ pkt_len = qdisc_pkt_len(skb);
+ err = qdisc_enqueue(skb, q->qdisc, &to_free);
+ kfree_skb_list(to_free);
+ if (unlikely(err != NET_XMIT_SUCCESS)) {
+ if (net_xmit_drop_count(err))
+ qdisc_qstats_drop(sch);
+ sch->qstats.backlog -= pkt_len;
+ sch->q.qlen--;
+ qdisc_tree_reduce_backlog(sch, 1, pkt_len);
}
}
--
2.53.0
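[Editorial note] The two-phase shape of the fix (drain everything that is time-ready, then dequeue) can be sketched generically. The fixed-size FIFOs and timestamps below are invented for illustration and do not model skbs or qdiscs:

```c
#define CAP 16

struct toy_fifo {
	long deadline[CAP];	/* release time of each queued packet */
	int head, len;
};

int toy_push(struct toy_fifo *f, long deadline)
{
	if (f->len == CAP)
		return -1;
	f->deadline[(f->head + f->len++) % CAP] = deadline;
	return 0;
}

int toy_pop(struct toy_fifo *f, long *deadline)
{
	if (f->len == 0)
		return -1;
	*deadline = f->deadline[f->head];
	f->head = (f->head + 1) % CAP;
	f->len--;
	return 0;
}

/* Phase 1: move every time-ready entry from tfifo into child.
 * Phase 2 (dequeuing from child) then runs with no enqueue interleaved,
 * so a parent watching queue-length transitions sees a consistent view. */
int toy_drain_ready(struct toy_fifo *tfifo, struct toy_fifo *child, long now)
{
	int moved = 0;
	long d;

	while (tfifo->len && tfifo->deadline[tfifo->head] <= now) {
		toy_pop(tfifo, &d);
		toy_push(child, d);
		moved++;
	}
	return moved;
}
```

The one-packet-per-call version interleaved an enqueue into the child inside each dequeue, which is exactly the pattern that confused HFSC and friends.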
* [PATCH net v4 6/8] net/sched: netem: null-terminate tfifo linear queue tail
From: Stephen Hemminger @ 2026-04-06 17:25 UTC (permalink / raw)
To: netdev
Cc: Stephen Hemminger, Simon Horman, Jamal Hadi Salim, Jiri Pirko,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Peter Oskolkov, open list
When tfifo_enqueue() appends a packet to the linear queue tail,
nskb->next is never set to NULL. The list terminates correctly
only by accident if the skb arrived with next already NULL.
Explicitly null-terminate the tail to prevent list corruption.
Fixes: d66280b12bd7 ("net: netem: use a list in addition to rbtree")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Simon Horman <horms@kernel.org>
---
net/sched/sch_netem.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
index b93f0e886a2b..e405bf862163 100644
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -398,6 +398,7 @@ static void tfifo_enqueue(struct sk_buff *nskb, struct Qdisc *sch)
q->t_tail->next = nskb;
else
q->t_head = nskb;
+ nskb->next = NULL;
q->t_tail = nskb;
} else {
struct rb_node **p = &q->t_root.rb_node, *parent = NULL;
--
2.53.0
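[Editorial note] A minimal list-append sketch shows why the explicit termination matters: a node arriving from another list may carry a stale next pointer. The types here are invented; only the append logic mirrors the fixed tfifo_enqueue():

```c
#include <stddef.h>

struct toy_node {
	struct toy_node *next;
};

struct toy_list {
	struct toy_node *head, *tail;
};

/* Append at tail: n->next is set to NULL unconditionally rather than
 * trusting whatever value the node arrived with. */
void toy_tail_append(struct toy_list *l, struct toy_node *n)
{
	if (l->tail)
		l->tail->next = n;
	else
		l->head = n;
	n->next = NULL;
	l->tail = n;
}

int toy_list_len(const struct toy_list *l)
{
	const struct toy_node *n;
	int len = 0;

	for (n = l->head; n; n = n->next)
		len++;
	return len;
}
```

Without the n->next = NULL line, appending a node whose next still points into an old list would leave the tail non-terminated, and a walk would run off into stale memory.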
* [PATCH net v4 7/8] net/sched: netem: check for invalid slot range
From: Stephen Hemminger @ 2026-04-06 17:25 UTC (permalink / raw)
To: netdev
Cc: Stephen Hemminger, Simon Horman, Jamal Hadi Salim, Jiri Pirko,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Neal Cardwell, Yousuk Seung, open list
Reject slot configuration where min_delay exceeds max_delay.
The delay range computation in get_slot_next() underflows in
this case, producing bogus results.
Fixes: 0a9fe5c375b5 ("netem: slotting with non-uniform distribution")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Simon Horman <horms@kernel.org>
---
net/sched/sch_netem.c | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
index e405bf862163..c82f76af41aa 100644
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -883,6 +883,18 @@ static int get_dist_table(struct disttable **tbl, const struct nlattr *attr)
return 0;
}
+static int validate_slot(const struct nlattr *attr,
+ struct netlink_ext_ack *extack)
+{
+ const struct tc_netem_slot *c = nla_data(attr);
+
+ if (c->min_delay > c->max_delay) {
+ NL_SET_ERR_MSG(extack, "slot min delay greater than max delay");
+ return -EINVAL;
+ }
+ return 0;
+}
+
static void get_slot(struct netem_sched_data *q, const struct nlattr *attr)
{
const struct tc_netem_slot *c = nla_data(attr);
@@ -1096,6 +1108,12 @@ static int netem_change(struct Qdisc *sch, struct nlattr *opt,
goto table_free;
}
+ if (tb[TCA_NETEM_SLOT]) {
+ ret = validate_slot(tb[TCA_NETEM_SLOT], extack);
+ if (ret)
+ goto table_free;
+ }
+
sch_tree_lock(sch);
/* backup q->clg and q->loss_model */
old_clg = q->clg;
--
2.53.0
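[Editorial note] The check itself reduces to a simple range validation, sketched standalone below with -EINVAL spelled as a literal constant:

```c
#include <stdint.h>

#define TOY_EINVAL 22

/* Reject inverted slot ranges before they reach the delay computation,
 * where max_delay - min_delay would otherwise go negative. */
int toy_validate_slot(int64_t min_delay, int64_t max_delay)
{
	if (min_delay > max_delay)
		return -TOY_EINVAL;
	return 0;
}
```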
* [PATCH net v4 8/8] net/sched: netem: fix slot delay calculation overflow
From: Stephen Hemminger @ 2026-04-06 17:25 UTC (permalink / raw)
To: netdev
Cc: Stephen Hemminger, Simon Horman, Jamal Hadi Salim, Jiri Pirko,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Yousuk Seung, Neal Cardwell, open list
get_slot_next() computes a random delay between min_delay and
max_delay using:
get_random_u32() * (max_delay - min_delay) >> 32
This overflows signed 64-bit arithmetic when the delay range exceeds
approximately 2.1 seconds (2^31 nanoseconds), producing a negative
result that effectively disables slot-based pacing. This is a
realistic configuration for WAN emulation (e.g., slot 1s 5s).
Use mul_u64_u32_shr() which handles the widening multiply without
overflow.
Fixes: 0a9fe5c375b5 ("netem: slotting with non-uniform distribution")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Simon Horman <horms@kernel.org>
---
net/sched/sch_netem.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
index c82f76af41aa..d47e1b0d7942 100644
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -660,9 +660,8 @@ static void get_slot_next(struct netem_sched_data *q, u64 now)
if (!q->slot_dist)
next_delay = q->slot_config.min_delay +
- (get_random_u32() *
- (q->slot_config.max_delay -
- q->slot_config.min_delay) >> 32);
+ mul_u64_u32_shr(q->slot_config.max_delay - q->slot_config.min_delay,
+ get_random_u32(), 32);
else
next_delay = tabledist(q->slot_config.dist_delay,
(s32)(q->slot_config.dist_jitter),
--
2.53.0
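[Editorial note] On compilers with a 128-bit integer type, the widening multiply behaves like the sketch below. The kernel's mul_u64_u32_shr() has per-architecture implementations; this reimplementation is only for illustration:

```c
#include <stdint.h>

/* Scale a by the fraction mul / 2^shift with a 128-bit intermediate,
 * so the product cannot overflow. With shift == 32 and a random u32
 * mul, the result is uniformly distributed in [0, a). */
uint64_t toy_mul_u64_u32_shr(uint64_t a, uint32_t mul, unsigned int shift)
{
	return (uint64_t)(((unsigned __int128)a * mul) >> shift);
}
```

For a 5-second range (5000000000 ns), the old signed 64-bit multiply against a full-range u32 overflows, while the widened version stays exact: a mul of 0x80000000 (half the u32 range) yields exactly half the range.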