* [PATCH net 0/6] net/sched: Fix packet loops in mirred and netem
@ 2026-01-11 16:39 Jamal Hadi Salim
2026-01-11 16:39 ` [PATCH net 1/6] net: Introduce skb ttl field to track packet loops Jamal Hadi Salim
` (9 more replies)
0 siblings, 10 replies; 26+ messages in thread
From: Jamal Hadi Salim @ 2026-01-11 16:39 UTC (permalink / raw)
To: davem, edumazet, kuba, pabeni, horms, andrew+netdev
Cc: netdev, xiyou.wangcong, jiri, victor, dcaratti, lariel, daniel,
pablo, kadlec, fw, phil, netfilter-devel, coreteam, zyc199902,
lrGerlinde, jschung2, Jamal Hadi Salim
We introduce a 2-bit global skb->ttl counter.Patch #1 describes how we puti
together those bits. Patches #2 and patch #5 use these bits.
I added Fixes tags to patch #1 in case it is useful for backporting.
Patch #3 and #4 revert William's earlier netem commits. Patch #6 introduces
tdc test cases.
Jamal Hadi Salim (5):
net: Introduce skb ttl field to track packet loops
net/sched: Fix ethx:ingress -> ethy:egress -> ethx:ingress mirred loop
Revert "net/sched: Restrict conditions for adding duplicating netems
to qdisc tree"
Revert "selftests/tc-testing: Add tests for restrictions on netem
duplication"
net/sched: fix packet loop on netem when duplicate is on
Victor Nogueira (1):
selftests/tc-testing: Add netem/mirred test cases exercising loops
drivers/net/ifb.c | 2 +-
include/linux/skbuff.h | 24 +-
include/net/sch_generic.h | 22 +
net/netfilter/nft_fwd_netdev.c | 1 +
net/sched/act_mirred.c | 45 +-
net/sched/sch_netem.c | 47 +-
.../tc-testing/tc-tests/actions/mirred.json | 616 +++++++++++++++++-
.../tc-testing/tc-tests/infra/qdiscs.json | 5 +-
.../tc-testing/tc-tests/qdiscs/netem.json | 96 +--
9 files changed, 698 insertions(+), 160 deletions(-)
--
2.34.1
^ permalink raw reply [flat|nested] 26+ messages in thread
* [PATCH net 1/6] net: Introduce skb ttl field to track packet loops
2026-01-11 16:39 [PATCH net 0/6] net/sched: Fix packet loops in mirred and netem Jamal Hadi Salim
@ 2026-01-11 16:39 ` Jamal Hadi Salim
2026-01-13 19:44 ` Cong Wang
2026-01-11 16:39 ` [PATCH net 2/6] net/sched: Fix ethx:ingress -> ethy:egress -> ethx:ingress mirred loop Jamal Hadi Salim
` (8 subsequent siblings)
9 siblings, 1 reply; 26+ messages in thread
From: Jamal Hadi Salim @ 2026-01-11 16:39 UTC (permalink / raw)
To: davem, edumazet, kuba, pabeni, horms, andrew+netdev
Cc: netdev, xiyou.wangcong, jiri, victor, dcaratti, lariel, daniel,
pablo, kadlec, fw, phil, netfilter-devel, coreteam, zyc199902,
lrGerlinde, jschung2, Jamal Hadi Salim
In order to keep track of loops across the stack, in particular when going
across from egress->ingress and back,we need to _remember the global loop
state in the skb_.
We introduce a per-skb ttl field to keep track of this state.
This patch liberates two bits:
1) The bit "skb->from_ingress" is reclaimed for ttl. Since it is currently
only used for ifb, it is safe to move this to local-per-layer skb/tc state
on the qdisc_skb_cb struct.
2) A second bit that was available on the skb.
Use cases:
1) Mirred increments the ttl whenever it sees an skb. If the skb shows
up multiple times we catch it when it exceeds MIRRED_NEST_LIMIT iterations
of the loop.
2) netem increments when using the "duplicate" feature and catches it when
it sees the packet the second time.
Fixes: fe946a751d9b ("net/sched: act_mirred: add loop detection")
Fixes: 0afb51e72855 ("[PKT_SCHED]: netem: reinsert for duplication")
Tested-by: Victor Nogueira <victor@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
---
drivers/net/ifb.c | 2 +-
include/linux/skbuff.h | 24 ++----------------------
include/net/sch_generic.h | 22 ++++++++++++++++++++++
net/netfilter/nft_fwd_netdev.c | 1 +
4 files changed, 26 insertions(+), 23 deletions(-)
diff --git a/drivers/net/ifb.c b/drivers/net/ifb.c
index d3dc0914450a..137a20e4bf8c 100644
--- a/drivers/net/ifb.c
+++ b/drivers/net/ifb.c
@@ -124,7 +124,7 @@ static void ifb_ri_tasklet(struct tasklet_struct *t)
rcu_read_unlock();
skb->skb_iif = txp->dev->ifindex;
- if (!skb->from_ingress) {
+ if (!qdisc_skb_cb(skb)->from_ingress) {
dev_queue_xmit(skb);
} else {
skb_pull_rcsum(skb, skb->mac_len);
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 86737076101d..7f18b0c28728 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -840,6 +840,7 @@ enum skb_tstamp_type {
* @no_fcs: Request NIC to treat last 4 bytes as Ethernet FCS
* @encapsulation: indicates the inner headers in the skbuff are valid
* @encap_hdr_csum: software checksum is needed
+ * @ttl: time to live count when a packet loops.
* @csum_valid: checksum is already valid
* @csum_not_inet: use CRC32c to resolve CHECKSUM_PARTIAL
* @csum_complete_sw: checksum was completed by software
@@ -1000,6 +1001,7 @@ struct sk_buff {
/* Indicates the inner headers are valid in the skbuff. */
__u8 encapsulation:1;
__u8 encap_hdr_csum:1;
+ __u8 ttl:2;
__u8 csum_valid:1;
#ifdef CONFIG_IPV6_NDISC_NODETYPE
__u8 ndisc_nodetype:2;
@@ -1016,9 +1018,6 @@ struct sk_buff {
__u8 offload_l3_fwd_mark:1;
#endif
__u8 redirected:1;
-#ifdef CONFIG_NET_REDIRECT
- __u8 from_ingress:1;
-#endif
#ifdef CONFIG_NETFILTER_SKIP_EGRESS
__u8 nf_skip_egress:1;
#endif
@@ -5352,30 +5351,11 @@ static inline bool skb_is_redirected(const struct sk_buff *skb)
return skb->redirected;
}
-static inline void skb_set_redirected(struct sk_buff *skb, bool from_ingress)
-{
- skb->redirected = 1;
-#ifdef CONFIG_NET_REDIRECT
- skb->from_ingress = from_ingress;
- if (skb->from_ingress)
- skb_clear_tstamp(skb);
-#endif
-}
-
static inline void skb_reset_redirect(struct sk_buff *skb)
{
skb->redirected = 0;
}
-static inline void skb_set_redirected_noclear(struct sk_buff *skb,
- bool from_ingress)
-{
- skb->redirected = 1;
-#ifdef CONFIG_NET_REDIRECT
- skb->from_ingress = from_ingress;
-#endif
-}
-
static inline bool skb_csum_is_sctp(struct sk_buff *skb)
{
#if IS_ENABLED(CONFIG_IP_SCTP)
diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index c3a7268b567e..42d8a1a9db4c 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -459,6 +459,9 @@ struct qdisc_skb_cb {
u8 post_ct:1;
u8 post_ct_snat:1;
u8 post_ct_dnat:1;
+#ifdef CONFIG_NET_REDIRECT
+ u8 from_ingress:1;
+#endif
};
typedef void tcf_chain_head_change_t(struct tcf_proto *tp_head, void *priv);
@@ -1140,6 +1143,25 @@ static inline void qdisc_dequeue_drop(struct Qdisc *q, struct sk_buff *skb,
q->to_free = skb;
}
+static inline void skb_set_redirected(struct sk_buff *skb, bool from_ingress)
+{
+ skb->redirected = 1;
+#ifdef CONFIG_NET_REDIRECT
+ qdisc_skb_cb(skb)->from_ingress = from_ingress;
+ if (qdisc_skb_cb(skb)->from_ingress)
+ skb_clear_tstamp(skb);
+#endif
+}
+
+static inline void skb_set_redirected_noclear(struct sk_buff *skb,
+ bool from_ingress)
+{
+ skb->redirected = 1;
+#ifdef CONFIG_NET_REDIRECT
+ qdisc_skb_cb(skb)->from_ingress = from_ingress;
+#endif
+}
+
/* Instead of calling kfree_skb() while root qdisc lock is held,
* queue the skb for future freeing at end of __dev_xmit_skb()
*/
diff --git a/net/netfilter/nft_fwd_netdev.c b/net/netfilter/nft_fwd_netdev.c
index 152a9fb4d23a..d62c856ef96a 100644
--- a/net/netfilter/nft_fwd_netdev.c
+++ b/net/netfilter/nft_fwd_netdev.c
@@ -16,6 +16,7 @@
#include <net/netfilter/nf_dup_netdev.h>
#include <net/neighbour.h>
#include <net/ip.h>
+#include <net/sch_generic.h>
struct nft_fwd_netdev {
u8 sreg_dev;
--
2.34.1
^ permalink raw reply related [flat|nested] 26+ messages in thread
* [PATCH net 2/6] net/sched: Fix ethx:ingress -> ethy:egress -> ethx:ingress mirred loop
2026-01-11 16:39 [PATCH net 0/6] net/sched: Fix packet loops in mirred and netem Jamal Hadi Salim
2026-01-11 16:39 ` [PATCH net 1/6] net: Introduce skb ttl field to track packet loops Jamal Hadi Salim
@ 2026-01-11 16:39 ` Jamal Hadi Salim
2026-01-11 16:39 ` [PATCH net 3/6] Revert "net/sched: Restrict conditions for adding duplicating netems to qdisc tree" Jamal Hadi Salim
` (7 subsequent siblings)
9 siblings, 0 replies; 26+ messages in thread
From: Jamal Hadi Salim @ 2026-01-11 16:39 UTC (permalink / raw)
To: davem, edumazet, kuba, pabeni, horms, andrew+netdev
Cc: netdev, xiyou.wangcong, jiri, victor, dcaratti, lariel, daniel,
pablo, kadlec, fw, phil, netfilter-devel, coreteam, zyc199902,
lrGerlinde, jschung2, Jamal Hadi Salim
When mirred redirects to ingress (from either ingress or egress) the loop
state from sched_mirred_dev array dev is lost because of 1) the packet
deferral into the backlog and 2) the fact the sched_mirred_dev array is
cleared. In such cases, if there was a loop we won't discover it.
Here's a simple test to reproduce:
ip a add dev port0 10.10.10.11/24
tc qdisc add dev port0 clsact
tc filter add dev port0 egress protocol ip \
prio 10 matchall action mirred ingress redirect dev port1
tc qdisc add dev port1 clsact
tc filter add dev port1 ingress protocol ip \
prio 10 matchall action mirred egress redirect dev port0
ping -c 1 -W0.01 10.10.10.10
Fixes: fe946a751d9b ("net/sched: act_mirred: add loop detection")
Tested-by: Victor Nogueira <victor@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
---
net/sched/act_mirred.c | 45 ++++++++++++++++++++++++++----------------
1 file changed, 28 insertions(+), 17 deletions(-)
diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c
index 05e0b14b5773..9ef261e19e40 100644
--- a/net/sched/act_mirred.c
+++ b/net/sched/act_mirred.c
@@ -26,6 +26,8 @@
#include <net/tc_act/tc_mirred.h>
#include <net/tc_wrapper.h>
+#define MIRRED_DEFER_LIMIT 3
+
static LIST_HEAD(mirred_list);
static DEFINE_SPINLOCK(mirred_list_lock);
@@ -234,12 +236,15 @@ tcf_mirred_forward(bool at_ingress, bool want_ingress, struct sk_buff *skb)
{
int err;
- if (!want_ingress)
+ if (!want_ingress) {
err = tcf_dev_queue_xmit(skb, dev_queue_xmit);
- else if (!at_ingress)
- err = netif_rx(skb);
- else
- err = netif_receive_skb(skb);
+ } else {
+ skb->ttl++;
+ if (!at_ingress)
+ err = netif_rx(skb);
+ else
+ err = netif_receive_skb(skb);
+ }
return err;
}
@@ -426,6 +431,7 @@ TC_INDIRECT_SCOPE int tcf_mirred_act(struct sk_buff *skb,
struct netdev_xmit *xmit;
bool m_mac_header_xmit;
struct net_device *dev;
+ bool want_ingress;
int i, m_eaction;
u32 blockid;
@@ -434,7 +440,8 @@ TC_INDIRECT_SCOPE int tcf_mirred_act(struct sk_buff *skb,
#else
xmit = this_cpu_ptr(&softnet_data.xmit);
#endif
- if (unlikely(xmit->sched_mirred_nest >= MIRRED_NEST_LIMIT)) {
+ if (unlikely(xmit->sched_mirred_nest >= MIRRED_NEST_LIMIT ||
+ skb->ttl >= MIRRED_DEFER_LIMIT)) {
net_warn_ratelimited("Packet exceeded mirred recursion limit on dev %s\n",
netdev_name(skb->dev));
return TC_ACT_SHOT;
@@ -453,23 +460,27 @@ TC_INDIRECT_SCOPE int tcf_mirred_act(struct sk_buff *skb,
tcf_action_inc_overlimit_qstats(&m->common);
return retval;
}
- for (i = 0; i < xmit->sched_mirred_nest; i++) {
- if (xmit->sched_mirred_dev[i] != dev)
- continue;
- pr_notice_once("tc mirred: loop on device %s\n",
- netdev_name(dev));
- tcf_action_inc_overlimit_qstats(&m->common);
- return retval;
- }
- xmit->sched_mirred_dev[xmit->sched_mirred_nest++] = dev;
+ m_eaction = READ_ONCE(m->tcfm_eaction);
+ want_ingress = tcf_mirred_act_wants_ingress(m_eaction);
+ if (!want_ingress) {
+ for (i = 0; i < xmit->sched_mirred_nest; i++) {
+ if (xmit->sched_mirred_dev[i] != dev)
+ continue;
+ pr_notice_once("tc mirred: loop on device %s\n",
+ netdev_name(dev));
+ tcf_action_inc_overlimit_qstats(&m->common);
+ return retval;
+ }
+ xmit->sched_mirred_dev[xmit->sched_mirred_nest++] = dev;
+ }
m_mac_header_xmit = READ_ONCE(m->tcfm_mac_header_xmit);
- m_eaction = READ_ONCE(m->tcfm_eaction);
retval = tcf_mirred_to_dev(skb, m, dev, m_mac_header_xmit, m_eaction,
retval);
- xmit->sched_mirred_nest--;
+ if (!want_ingress)
+ xmit->sched_mirred_nest--;
return retval;
}
--
2.34.1
^ permalink raw reply related [flat|nested] 26+ messages in thread
* [PATCH net 3/6] Revert "net/sched: Restrict conditions for adding duplicating netems to qdisc tree"
2026-01-11 16:39 [PATCH net 0/6] net/sched: Fix packet loops in mirred and netem Jamal Hadi Salim
2026-01-11 16:39 ` [PATCH net 1/6] net: Introduce skb ttl field to track packet loops Jamal Hadi Salim
2026-01-11 16:39 ` [PATCH net 2/6] net/sched: Fix ethx:ingress -> ethy:egress -> ethx:ingress mirred loop Jamal Hadi Salim
@ 2026-01-11 16:39 ` Jamal Hadi Salim
2026-01-11 16:39 ` [PATCH net 4/6] Revert "selftests/tc-testing: Add tests for restrictions on netem duplication" Jamal Hadi Salim
` (6 subsequent siblings)
9 siblings, 0 replies; 26+ messages in thread
From: Jamal Hadi Salim @ 2026-01-11 16:39 UTC (permalink / raw)
To: davem, edumazet, kuba, pabeni, horms, andrew+netdev
Cc: netdev, xiyou.wangcong, jiri, victor, dcaratti, lariel, daniel,
pablo, kadlec, fw, phil, netfilter-devel, coreteam, zyc199902,
lrGerlinde, jschung2, Jamal Hadi Salim
This reverts commit ec8e0e3d7adef940cdf9475e2352c0680189d14e.
Reported-by: Ji-Soo Chung <jschung2@proton.me>
Reported-by: Gerlinde <lrGerlinde@mailfence.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220774
Reported-by: zyc zyc <zyc199902@zohomail.cn>
Closes: https://lore.kernel.org/all/19adda5a1e2.12410b78222774.9191120410578703463@zohomail.cn/
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
---
net/sched/sch_netem.c | 40 ----------------------------------------
1 file changed, 40 deletions(-)
diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
index 32a5f3304046..a9ea40c13527 100644
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -974,41 +974,6 @@ static int parse_attr(struct nlattr *tb[], int maxtype, struct nlattr *nla,
return 0;
}
-static const struct Qdisc_class_ops netem_class_ops;
-
-static int check_netem_in_tree(struct Qdisc *sch, bool duplicates,
- struct netlink_ext_ack *extack)
-{
- struct Qdisc *root, *q;
- unsigned int i;
-
- root = qdisc_root_sleeping(sch);
-
- if (sch != root && root->ops->cl_ops == &netem_class_ops) {
- if (duplicates ||
- ((struct netem_sched_data *)qdisc_priv(root))->duplicate)
- goto err;
- }
-
- if (!qdisc_dev(root))
- return 0;
-
- hash_for_each(qdisc_dev(root)->qdisc_hash, i, q, hash) {
- if (sch != q && q->ops->cl_ops == &netem_class_ops) {
- if (duplicates ||
- ((struct netem_sched_data *)qdisc_priv(q))->duplicate)
- goto err;
- }
- }
-
- return 0;
-
-err:
- NL_SET_ERR_MSG(extack,
- "netem: cannot mix duplicating netems with other netems in tree");
- return -EINVAL;
-}
-
/* Parse netlink message to set options */
static int netem_change(struct Qdisc *sch, struct nlattr *opt,
struct netlink_ext_ack *extack)
@@ -1067,11 +1032,6 @@ static int netem_change(struct Qdisc *sch, struct nlattr *opt,
q->gap = qopt->gap;
q->counter = 0;
q->loss = qopt->loss;
-
- ret = check_netem_in_tree(sch, qopt->duplicate, extack);
- if (ret)
- goto unlock;
-
q->duplicate = qopt->duplicate;
/* for compatibility with earlier versions.
--
2.34.1
^ permalink raw reply related [flat|nested] 26+ messages in thread
* [PATCH net 4/6] Revert "selftests/tc-testing: Add tests for restrictions on netem duplication"
2026-01-11 16:39 [PATCH net 0/6] net/sched: Fix packet loops in mirred and netem Jamal Hadi Salim
` (2 preceding siblings ...)
2026-01-11 16:39 ` [PATCH net 3/6] Revert "net/sched: Restrict conditions for adding duplicating netems to qdisc tree" Jamal Hadi Salim
@ 2026-01-11 16:39 ` Jamal Hadi Salim
2026-01-11 16:39 ` [PATCH net 5/6] net/sched: fix packet loop on netem when duplicate is on Jamal Hadi Salim
` (5 subsequent siblings)
9 siblings, 0 replies; 26+ messages in thread
From: Jamal Hadi Salim @ 2026-01-11 16:39 UTC (permalink / raw)
To: davem, edumazet, kuba, pabeni, horms, andrew+netdev
Cc: netdev, xiyou.wangcong, jiri, victor, dcaratti, lariel, daniel,
pablo, kadlec, fw, phil, netfilter-devel, coreteam, zyc199902,
lrGerlinde, jschung2, Jamal Hadi Salim
This reverts commit ecdec65ec78d67d3ebd17edc88b88312054abe0d.
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
---
.../tc-testing/tc-tests/infra/qdiscs.json | 5 +-
.../tc-testing/tc-tests/qdiscs/netem.json | 81 -------------------
2 files changed, 3 insertions(+), 83 deletions(-)
diff --git a/tools/testing/selftests/tc-testing/tc-tests/infra/qdiscs.json b/tools/testing/selftests/tc-testing/tc-tests/infra/qdiscs.json
index 6a39640aa2a8..ceb993ed04b2 100644
--- a/tools/testing/selftests/tc-testing/tc-tests/infra/qdiscs.json
+++ b/tools/testing/selftests/tc-testing/tc-tests/infra/qdiscs.json
@@ -702,6 +702,7 @@
"$TC qdisc add dev $DUMMY parent 1:1 handle 2:0 netem duplicate 100%",
"$TC filter add dev $DUMMY parent 1:0 protocol ip prio 1 u32 match ip dst 10.10.10.1/32 flowid 1:1",
"$TC class add dev $DUMMY parent 1:0 classid 1:2 hfsc ls m2 10Mbit",
+ "$TC qdisc add dev $DUMMY parent 1:2 handle 3:0 netem duplicate 100%",
"$TC filter add dev $DUMMY parent 1:0 protocol ip prio 2 u32 match ip dst 10.10.10.2/32 flowid 1:2",
"ping -c 1 10.10.10.1 -I$DUMMY > /dev/null || true",
"$TC filter del dev $DUMMY parent 1:0 protocol ip prio 1",
@@ -714,8 +715,8 @@
{
"kind": "hfsc",
"handle": "1:",
- "bytes": 294,
- "packets": 3
+ "bytes": 392,
+ "packets": 4
}
],
"matchCount": "1",
diff --git a/tools/testing/selftests/tc-testing/tc-tests/qdiscs/netem.json b/tools/testing/selftests/tc-testing/tc-tests/qdiscs/netem.json
index 718d2df2aafa..3c4444961488 100644
--- a/tools/testing/selftests/tc-testing/tc-tests/qdiscs/netem.json
+++ b/tools/testing/selftests/tc-testing/tc-tests/qdiscs/netem.json
@@ -336,86 +336,5 @@
"teardown": [
"$TC qdisc del dev $DUMMY handle 1: root"
]
- },
- {
- "id": "d34d",
- "name": "NETEM test qdisc duplication restriction in qdisc tree in netem_change root",
- "category": ["qdisc", "netem"],
- "plugins": {
- "requires": "nsPlugin"
- },
- "setup": [
- "$TC qdisc add dev $DUMMY root handle 1: netem limit 1",
- "$TC qdisc add dev $DUMMY parent 1: handle 2: netem limit 1"
- ],
- "cmdUnderTest": "$TC qdisc change dev $DUMMY handle 1: netem duplicate 50%",
- "expExitCode": "2",
- "verifyCmd": "$TC -s qdisc show dev $DUMMY",
- "matchPattern": "qdisc netem",
- "matchCount": "2",
- "teardown": [
- "$TC qdisc del dev $DUMMY handle 1:0 root"
- ]
- },
- {
- "id": "b33f",
- "name": "NETEM test qdisc duplication restriction in qdisc tree in netem_change non-root",
- "category": ["qdisc", "netem"],
- "plugins": {
- "requires": "nsPlugin"
- },
- "setup": [
- "$TC qdisc add dev $DUMMY root handle 1: netem limit 1",
- "$TC qdisc add dev $DUMMY parent 1: handle 2: netem limit 1"
- ],
- "cmdUnderTest": "$TC qdisc change dev $DUMMY handle 2: netem duplicate 50%",
- "expExitCode": "2",
- "verifyCmd": "$TC -s qdisc show dev $DUMMY",
- "matchPattern": "qdisc netem",
- "matchCount": "2",
- "teardown": [
- "$TC qdisc del dev $DUMMY handle 1:0 root"
- ]
- },
- {
- "id": "cafe",
- "name": "NETEM test qdisc duplication restriction in qdisc tree",
- "category": ["qdisc", "netem"],
- "plugins": {
- "requires": "nsPlugin"
- },
- "setup": [
- "$TC qdisc add dev $DUMMY root handle 1: netem limit 1 duplicate 100%"
- ],
- "cmdUnderTest": "$TC qdisc add dev $DUMMY parent 1: handle 2: netem duplicate 100%",
- "expExitCode": "2",
- "verifyCmd": "$TC -s qdisc show dev $DUMMY",
- "matchPattern": "qdisc netem",
- "matchCount": "1",
- "teardown": [
- "$TC qdisc del dev $DUMMY handle 1:0 root"
- ]
- },
- {
- "id": "1337",
- "name": "NETEM test qdisc duplication restriction in qdisc tree across branches",
- "category": ["qdisc", "netem"],
- "plugins": {
- "requires": "nsPlugin"
- },
- "setup": [
- "$TC qdisc add dev $DUMMY parent root handle 1:0 hfsc",
- "$TC class add dev $DUMMY parent 1:0 classid 1:1 hfsc rt m2 10Mbit",
- "$TC qdisc add dev $DUMMY parent 1:1 handle 2:0 netem",
- "$TC class add dev $DUMMY parent 1:0 classid 1:2 hfsc rt m2 10Mbit"
- ],
- "cmdUnderTest": "$TC qdisc add dev $DUMMY parent 1:2 handle 3:0 netem duplicate 100%",
- "expExitCode": "2",
- "verifyCmd": "$TC -s qdisc show dev $DUMMY",
- "matchPattern": "qdisc netem",
- "matchCount": "1",
- "teardown": [
- "$TC qdisc del dev $DUMMY handle 1:0 root"
- ]
}
]
--
2.34.1
^ permalink raw reply related [flat|nested] 26+ messages in thread
* [PATCH net 5/6] net/sched: fix packet loop on netem when duplicate is on
2026-01-11 16:39 [PATCH net 0/6] net/sched: Fix packet loops in mirred and netem Jamal Hadi Salim
` (3 preceding siblings ...)
2026-01-11 16:39 ` [PATCH net 4/6] Revert "selftests/tc-testing: Add tests for restrictions on netem duplication" Jamal Hadi Salim
@ 2026-01-11 16:39 ` Jamal Hadi Salim
2026-01-11 20:22 ` William Liu
2026-01-11 20:39 ` Cong Wang
2026-01-11 16:39 ` [PATCH net 6/6] selftests/tc-testing: Add netem/mirred test cases exercising loops Jamal Hadi Salim
` (4 subsequent siblings)
9 siblings, 2 replies; 26+ messages in thread
From: Jamal Hadi Salim @ 2026-01-11 16:39 UTC (permalink / raw)
To: davem, edumazet, kuba, pabeni, horms, andrew+netdev
Cc: netdev, xiyou.wangcong, jiri, victor, dcaratti, lariel, daniel,
pablo, kadlec, fw, phil, netfilter-devel, coreteam, zyc199902,
lrGerlinde, jschung2, Jamal Hadi Salim, William Liu,
Savino Dicanosa
As stated by William [1]:
"netem_enqueue's duplication prevention logic breaks when a netem
resides in a qdisc tree with other netems - this can lead to a
soft lockup and OOM loop in netem_dequeue, as seen in [2].
Ensure that a duplicating netem cannot exist in a tree with other
netems."
In this patch, we use the first approach suggested in [1] (the skb
ttl field) to detect and stop a possible netem duplicate infinite loop.
[1] https://lore.kernel.org/netdev/20250708164141.875402-1-will@willsroot.io/
[2] https://lore.kernel.org/netdev/8DuRWwfqjoRDLDmBMlIfbrsZg9Gx50DHJc1ilxsEBNe2D6NMoigR_eIRIG0LOjMc3r10nUUZtArXx4oZBIdUfZQrwjcQhdinnMis_0G7VEk=@willsroot.io/
Fixes: 0afb51e72855 ("[PKT_SCHED]: netem: reinsert for duplication")
Reported-by: William Liu <will@willsroot.io>
Reported-by: Savino Dicanosa <savy@syst3mfailure.io>
Closes: https://lore.kernel.org/netdev/8DuRWwfqjoRDLDmBMlIfbrsZg9Gx50DHJc1ilxsEBNe2D6NMoigR_eIRIG0LOjMc3r10nUUZtArXx4oZBIdUfZQrwjcQhdinnMis_0G7VEk=@willsroot.io/
Co-developed-by: Victor Nogueira <victor@mojatatu.com>
Signed-off-by: Victor Nogueira <victor@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
---
net/sched/sch_netem.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
index a9ea40c13527..4a65fb841a98 100644
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -461,7 +461,8 @@ static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch,
skb->prev = NULL;
/* Random duplication */
- if (q->duplicate && q->duplicate >= get_crandom(&q->dup_cor, &q->prng))
+ if (q->duplicate && !skb->ttl &&
+ q->duplicate >= get_crandom(&q->dup_cor, &q->prng))
++count;
/* Drop packet? */
@@ -539,11 +540,9 @@ static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch,
*/
if (skb2) {
struct Qdisc *rootq = qdisc_root_bh(sch);
- u32 dupsave = q->duplicate; /* prevent duplicating a dup... */
- q->duplicate = 0;
+ skb2->ttl++; /* prevent duplicating a dup... */
rootq->enqueue(skb2, rootq, to_free);
- q->duplicate = dupsave;
skb2 = NULL;
}
--
2.34.1
^ permalink raw reply related [flat|nested] 26+ messages in thread
* [PATCH net 6/6] selftests/tc-testing: Add netem/mirred test cases exercising loops
2026-01-11 16:39 [PATCH net 0/6] net/sched: Fix packet loops in mirred and netem Jamal Hadi Salim
` (4 preceding siblings ...)
2026-01-11 16:39 ` [PATCH net 5/6] net/sched: fix packet loop on netem when duplicate is on Jamal Hadi Salim
@ 2026-01-11 16:39 ` Jamal Hadi Salim
2026-01-13 6:20 ` [PATCH net 0/6] net/sched: Fix packet loops in mirred and netem Stephen Hemminger
` (3 subsequent siblings)
9 siblings, 0 replies; 26+ messages in thread
From: Jamal Hadi Salim @ 2026-01-11 16:39 UTC (permalink / raw)
To: davem, edumazet, kuba, pabeni, horms, andrew+netdev
Cc: netdev, xiyou.wangcong, jiri, victor, dcaratti, lariel, daniel,
pablo, kadlec, fw, phil, netfilter-devel, coreteam, zyc199902,
lrGerlinde, jschung2, Jamal Hadi Salim
From: Victor Nogueira <victor@mojatatu.com>
Add mirred loop test cases to validate that those will be caught and other
test cases that were previously misinterpreted as loops by mirred.
Also add a netem nested duplicate test case to validate that it won't
cause an infinite loop
This commit adds 14 test cases:
- Redirect multiport: dummy egress -> dev1 ingress -> dummy egress (Loop)
- Redirect multiport: dev1 ingress -> dev1 egress -> dev1 ingress (Loop)
- Redirect multiport: dev1 ingress -> dummy ingress -> dev1 egress (No Loop)
- Redirect multiport: dev1 ingress -> dummy ingress -> dev1 ingress (Loop)
- Redirect multiport: dev1 ingress -> dummy egress -> dev1 ingress (Loop)
- Redirect multiport: dummy egress -> dev1 ingress -> dummy egress (Loop)
- Redirect multiport: dev1 ingress -> dummy ingress -> dummy egress -> dev1 egress (No Loop)
- Redirect multiport: dev1 ingress -> dummy egress -> dev1 egress (No Loop)
- Redirect multiport: dev1 ingress -> dummy egress -> dev1 egress (No Loop)
- Redirect multiport: dev1 ingress -> dummy egress -> dummy ingress (No Loop)
- Redirect singleport: dev1 ingress -> dev1 ingress (Loop)
- Redirect singleport: dummy egress -> dummy ingress (No Loop)
- Redirect multiport: dev1 ingress -> dummy ingress -> dummy egress (No Loop)
- Test netem's recursive duplicate
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Victor Nogueira <victor@mojatatu.com>
---
.../tc-testing/tc-tests/actions/mirred.json | 616 +++++++++++++++++-
.../tc-testing/tc-tests/qdiscs/netem.json | 33 +-
2 files changed, 647 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/tc-testing/tc-tests/actions/mirred.json b/tools/testing/selftests/tc-testing/tc-tests/actions/mirred.json
index b056eb966871..9eb32cdf1ed3 100644
--- a/tools/testing/selftests/tc-testing/tc-tests/actions/mirred.json
+++ b/tools/testing/selftests/tc-testing/tc-tests/actions/mirred.json
@@ -1144,6 +1144,620 @@
"teardown": [
"$TC qdisc del dev $DUMMY clsact"
]
+ },
+ {
+ "id": "531c",
+ "name": "Redirect multiport: dummy egress -> dev1 ingress -> dummy egress (Loop)",
+ "category": [
+ "filter",
+ "mirred"
+ ],
+ "plugins": {
+ "requires": [
+ "nsPlugin"
+ ]
+ },
+ "setup": [
+ "$IP link set dev $DUMMY up || true",
+ "$IP addr add 10.10.10.10/24 dev $DUMMY || true",
+ "$TC qdisc add dev $DUMMY clsact",
+ "$TC filter add dev $DUMMY egress protocol ip prio 10 matchall action mirred ingress redirect dev $DEV1 index 1",
+ "$TC qdisc add dev $DEV1 clsact",
+ "$TC filter add dev $DEV1 ingress protocol ip prio 10 matchall action mirred egress redirect dev $DUMMY index 2"
+ ],
+ "cmdUnderTest": "ping -c1 -W0.01 -I $DUMMY 10.10.10.1",
+ "expExitCode": "1",
+ "verifyCmd": "$TC -j -s actions get action mirred index 1",
+ "matchJSON": [
+ {
+ "total acts": 0
+ },
+ {
+ "actions": [
+ {
+ "order": 1,
+ "kind": "mirred",
+ "mirred_action": "redirect",
+ "direction": "ingress",
+ "index": 1,
+ "stats": {
+ "packets": 3
+ },
+ "not_in_hw": true
+ }
+ ]
+ }
+ ],
+ "teardown": [
+ "$TC qdisc del dev $DUMMY clsact",
+ "$TC qdisc del dev $DEV1 clsact"
+ ]
+ },
+ {
+ "id": "b1d7",
+ "name": "Redirect multiport: dev1 ingress -> dev1 egress -> dev1 ingress (Loop)",
+ "category": [
+ "filter",
+ "mirred"
+ ],
+ "plugins": {
+ "requires": [
+ "nsPlugin",
+ "scapyPlugin"
+ ]
+ },
+ "setup": [
+ "$TC qdisc add dev $DEV1 clsact",
+ "$TC filter add dev $DEV1 ingress protocol ip prio 10 matchall action mirred egress redirect dev $DEV1 index 1"
+ ],
+ "cmdUnderTest": "$TC filter add dev $DEV1 egress protocol ip prio 11 matchall action mirred ingress redirect dev $DEV1 index 2",
+ "scapy": [
+ {
+ "iface": "$DEV0",
+ "count": 1,
+ "packet": "Ether()/IP(dst='10.10.10.1', src='10.10.10.10')/ICMP()"
+ }
+ ],
+ "expExitCode": "0",
+ "verifyCmd": "$TC -j -s actions get action mirred index 1",
+ "matchJSON": [
+ {
+ "total acts": 0
+ },
+ {
+ "actions": [
+ {
+ "order": 1,
+ "kind": "mirred",
+ "mirred_action": "redirect",
+ "direction": "egress",
+ "index": 1,
+ "stats": {
+ "packets": 3
+ },
+ "not_in_hw": true
+ }
+ ]
+ }
+ ],
+ "teardown": [
+ "$TC qdisc del dev $DEV1 clsact"
+ ]
+ },
+ {
+ "id": "c66d",
+ "name": "Redirect multiport: dev1 ingress -> dummy ingress -> dev1 egress (No Loop)",
+ "category": [
+ "filter",
+ "mirred"
+ ],
+ "plugins": {
+ "requires": [
+ "nsPlugin",
+ "scapyPlugin"
+ ]
+ },
+ "setup": [
+ "$TC qdisc add dev $DEV1 clsact",
+ "$TC filter add dev $DEV1 ingress protocol ip prio 10 matchall action mirred ingress redirect dev $DUMMY index 1",
+ "$TC qdisc add dev $DUMMY clsact"
+ ],
+ "cmdUnderTest": "$TC filter add dev $DUMMY ingress protocol ip prio 11 matchall action mirred egress redirect dev $DEV1 index 2",
+ "scapy": [
+ {
+ "iface": "$DEV0",
+ "count": 1,
+ "packet": "Ether()/IP(dst='10.10.10.1', src='10.10.10.10')/ICMP()"
+ }
+ ],
+ "expExitCode": "0",
+ "verifyCmd": "$TC -j -s actions get action mirred index 1",
+ "matchJSON": [
+ {
+ "total acts": 0
+ },
+ {
+ "actions": [
+ {
+ "order": 1,
+ "kind": "mirred",
+ "mirred_action": "redirect",
+ "direction": "ingress",
+ "index": 1,
+ "stats": {
+ "packets": 1
+ },
+ "not_in_hw": true
+ }
+ ]
+ }
+ ],
+ "teardown": [
+ "$TC qdisc del dev $DEV1 clsact",
+ "$TC qdisc del dev $DUMMY clsact"
+ ]
+ },
+ {
+ "id": "aa99",
+ "name": "Redirect multiport: dev1 ingress -> dummy ingress -> dev1 ingress (Loop)",
+ "category": [
+ "filter",
+ "mirred"
+ ],
+ "plugins": {
+ "requires": [
+ "nsPlugin",
+ "scapyPlugin"
+ ]
+ },
+ "setup": [
+ "$TC qdisc add dev $DEV1 clsact",
+ "$TC filter add dev $DEV1 ingress protocol ip prio 10 matchall action mirred ingress redirect dev $DUMMY index 1",
+ "$TC qdisc add dev $DUMMY clsact"
+ ],
+ "cmdUnderTest": "$TC filter add dev $DUMMY ingress protocol ip prio 11 matchall action mirred ingress redirect dev $DEV1 index 2",
+ "scapy": [
+ {
+ "iface": "$DEV0",
+ "count": 1,
+ "packet": "Ether()/IP(dst='10.10.10.1', src='10.10.10.10')/ICMP()"
+ }
+ ],
+ "expExitCode": "0",
+ "verifyCmd": "$TC -j -s actions get action mirred index 1",
+ "matchJSON": [
+ {
+ "total acts": 0
+ },
+ {
+ "actions": [
+ {
+ "order": 1,
+ "kind": "mirred",
+ "mirred_action": "redirect",
+ "direction": "ingress",
+ "index": 1,
+ "stats": {
+ "packets": 2,
+ "overlimits": 1
+ },
+ "not_in_hw": true
+ }
+ ]
+ }
+ ],
+ "teardown": [
+ "$TC qdisc del dev $DEV1 clsact",
+ "$TC qdisc del dev $DUMMY clsact"
+ ]
+ },
+ {
+ "id": "37d7",
+ "name": "Redirect multiport: dev1 ingress -> dummy egress -> dev1 ingress (Loop)",
+ "category": [
+ "filter",
+ "mirred"
+ ],
+ "plugins": {
+ "requires": [
+ "nsPlugin",
+ "scapyPlugin"
+ ]
+ },
+ "setup": [
+ "$TC qdisc add dev $DEV1 clsact",
+ "$TC filter add dev $DEV1 ingress protocol ip prio 10 matchall action mirred egress redirect dev $DUMMY index 1",
+ "$TC qdisc add dev $DUMMY clsact"
+ ],
+ "cmdUnderTest": "$TC filter add dev $DUMMY egress protocol ip prio 11 matchall action mirred ingress redirect dev $DEV1 index 2",
+ "scapy": [
+ {
+ "iface": "$DEV0",
+ "count": 1,
+ "packet": "Ether()/IP(dst='10.10.10.1', src='10.10.10.10')/ICMP()"
+ }
+ ],
+ "expExitCode": "0",
+ "verifyCmd": "$TC -j -s actions get action mirred index 1",
+ "matchJSON": [
+ {
+ "total acts": 0
+ },
+ {
+ "actions": [
+ {
+ "order": 1,
+ "kind": "mirred",
+ "mirred_action": "redirect",
+ "direction": "egress",
+ "index": 1,
+ "stats": {
+ "packets": 3
+ },
+ "not_in_hw": true
+ }
+ ]
+ }
+ ],
+ "teardown": [
+ "$TC qdisc del dev $DEV1 clsact",
+ "$TC qdisc del dev $DUMMY clsact"
+ ]
+ },
+ {
+ "id": "6d02",
+ "name": "Redirect multiport: dummy egress -> dev1 ingress -> dummy egress (Loop)",
+ "category": [
+ "filter",
+ "mirred"
+ ],
+ "plugins": {
+ "requires": [
+ "nsPlugin"
+ ]
+ },
+ "setup": [
+ "$IP link set dev $DUMMY up || true",
+ "$IP addr add 10.10.10.10/24 dev $DUMMY || true",
+ "$TC qdisc add dev $DUMMY clsact",
+ "$TC filter add dev $DUMMY egress protocol ip prio 10 matchall action mirred ingress redirect dev $DEV1 index 1",
+ "$TC qdisc add dev $DEV1 clsact",
+ "$TC filter add dev $DEV1 ingress protocol ip prio 11 matchall action mirred egress redirect dev $DUMMY index 2"
+ ],
+ "cmdUnderTest": "ping -c1 -W0.01 -I $DUMMY 10.10.10.1",
+ "expExitCode": "1",
+ "verifyCmd": "$TC -j -s actions get action mirred index 1",
+ "matchJSON": [
+ {
+ "total acts": 0
+ },
+ {
+ "actions": [
+ {
+ "order": 1,
+ "kind": "mirred",
+ "mirred_action": "redirect",
+ "direction": "ingress",
+ "index": 1,
+ "stats": {
+ "packets": 3
+ },
+ "not_in_hw": true
+ }
+ ]
+ }
+ ],
+ "teardown": [
+ "$TC qdisc del dev $DUMMY clsact",
+ "$TC qdisc del dev $DEV1 clsact"
+ ]
+ },
+ {
+ "id": "8115",
+ "name": "Redirect multiport: dev1 ingress -> dummy ingress -> dummy egress -> dev1 egress (No Loop)",
+ "category": [
+ "filter",
+ "mirred"
+ ],
+ "plugins": {
+ "requires": [
+ "nsPlugin",
+ "scapyPlugin"
+ ]
+ },
+ "setup": [
+ "$TC qdisc add dev $DEV1 clsact",
+ "$TC filter add dev $DEV1 ingress protocol ip prio 10 matchall action mirred ingress redirect dev $DUMMY index 1",
+ "$TC qdisc add dev $DUMMY clsact",
+ "$TC filter add dev $DUMMY ingress protocol ip prio 11 matchall action mirred egress redirect dev $DUMMY index 2"
+ ],
+ "cmdUnderTest": "$TC filter add dev $DUMMY egress protocol ip prio 12 matchall action mirred egress redirect dev $DEV1 index 3",
+ "scapy": [
+ {
+ "iface": "$DEV0",
+ "count": 1,
+ "packet": "Ether()/IP(dst='10.10.10.1', src='10.10.10.10')/ICMP()"
+ }
+ ],
+ "expExitCode": "0",
+ "verifyCmd": "$TC -j -s actions get action mirred index 1",
+ "matchJSON": [
+ {
+ "total acts": 0
+ },
+ {
+ "actions": [
+ {
+ "order": 1,
+ "kind": "mirred",
+ "mirred_action": "redirect",
+ "direction": "ingress",
+ "index": 1,
+ "stats": {
+ "packets": 1
+ },
+ "not_in_hw": true
+ }
+ ]
+ }
+ ],
+ "teardown": [
+ "$TC qdisc del dev $DEV1 clsact",
+ "$TC qdisc del dev $DUMMY clsact"
+ ]
+ },
+ {
+ "id": "9eb3",
+ "name": "Redirect multiport: dev1 ingress -> dummy egress -> dev1 egress (No Loop)",
+ "category": [
+ "filter",
+ "mirred"
+ ],
+ "plugins": {
+ "requires": [
+ "nsPlugin",
+ "scapyPlugin"
+ ]
+ },
+ "setup": [
+ "$TC qdisc add dev $DEV1 clsact",
+ "$TC filter add dev $DEV1 ingress protocol ip prio 10 matchall action mirred egress redirect dev $DUMMY index 1",
+ "$TC qdisc add dev $DUMMY clsact"
+ ],
+ "cmdUnderTest": "$TC filter add dev $DUMMY egress protocol ip prio 11 matchall action mirred egress redirect dev $DEV1 index 2",
+ "scapy": [
+ {
+ "iface": "$DEV0",
+ "count": 1,
+ "packet": "Ether()/IP(dst='10.10.10.1', src='10.10.10.10')/ICMP()"
+ }
+ ],
+ "expExitCode": "0",
+ "verifyCmd": "$TC -j -s actions get action mirred index 1",
+ "matchJSON": [
+ {
+ "total acts": 0
+ },
+ {
+ "actions": [
+ {
+ "order": 1,
+ "kind": "mirred",
+ "mirred_action": "redirect",
+ "direction": "egress",
+ "index": 1,
+ "stats": {
+ "packets": 1
+ },
+ "not_in_hw": true
+ }
+ ]
+ }
+ ],
+ "teardown": [
+ "$TC qdisc del dev $DEV1 clsact",
+ "$TC qdisc del dev $DUMMY clsact"
+ ]
+ },
+ {
+ "id": "d837",
+ "name": "Redirect multiport: dev1 ingress -> dummy egress -> dummy ingress (No Loop)",
+ "category": [
+ "filter",
+ "mirred"
+ ],
+ "plugins": {
+ "requires": [
+ "nsPlugin",
+ "scapyPlugin"
+ ]
+ },
+ "setup": [
+ "$TC qdisc add dev $DEV1 clsact",
+ "$TC filter add dev $DEV1 ingress protocol ip prio 10 matchall action mirred egress redirect dev $DUMMY index 1",
+ "$TC qdisc add dev $DUMMY clsact"
+ ],
+ "cmdUnderTest": "$TC filter add dev $DUMMY egress protocol ip prio 11 matchall action mirred ingress redirect dev $DUMMY index 2",
+ "scapy": [
+ {
+ "iface": "$DEV0",
+ "count": 1,
+ "packet": "Ether()/IP(dst='10.10.10.1', src='10.10.10.10')/ICMP()"
+ }
+ ],
+ "expExitCode": "0",
+ "verifyCmd": "$TC -j -s actions get action mirred index 1",
+ "matchJSON": [
+ {
+ "total acts": 0
+ },
+ {
+ "actions": [
+ {
+ "order": 1,
+ "kind": "mirred",
+ "mirred_action": "redirect",
+ "direction": "egress",
+ "index": 1,
+ "stats": {
+ "packets": 1
+ },
+ "not_in_hw": true
+ }
+ ]
+ }
+ ],
+ "teardown": [
+ "$TC qdisc del dev $DEV1 clsact",
+ "$TC qdisc del dev $DUMMY clsact"
+ ]
+ },
+ {
+ "id": "2071",
+ "name": "Redirect multiport: dev1 ingress -> dev1 ingress (Loop)",
+ "category": [
+ "filter",
+ "mirred"
+ ],
+ "plugins": {
+ "requires": [
+ "nsPlugin",
+ "scapyPlugin"
+ ]
+ },
+ "setup": [
+ "$TC qdisc add dev $DEV1 clsact"
+ ],
+ "cmdUnderTest": "$TC filter add dev $DEV1 ingress protocol ip prio 10 matchall action mirred ingress redirect dev $DEV1 index 1",
+ "scapy": [
+ {
+ "iface": "$DEV0",
+ "count": 1,
+ "packet": "Ether()/IP(dst='10.10.10.1', src='10.10.10.10')/ICMP()"
+ }
+ ],
+ "expExitCode": "0",
+ "verifyCmd": "$TC -j -s actions get action mirred index 1",
+ "matchJSON": [
+ {
+ "total acts": 0
+ },
+ {
+ "actions": [
+ {
+ "order": 1,
+ "kind": "mirred",
+ "mirred_action": "redirect",
+ "direction": "ingress",
+ "index": 1,
+ "stats": {
+ "packets": 1,
+ "overlimits": 1
+ },
+ "not_in_hw": true
+ }
+ ]
+ }
+ ],
+ "teardown": [
+ "$TC qdisc del dev $DEV1 clsact"
+ ]
+ },
+ {
+ "id": "0101",
+ "name": "Redirect singleport: dummy egress -> dummy ingress (No Loop)",
+ "category": [
+ "filter",
+ "mirred"
+ ],
+ "plugins": {
+ "requires": [
+ "nsPlugin"
+ ]
+ },
+ "setup": [
+ "$IP addr add 10.10.10.10/24 dev $DUMMY || true",
+ "$TC qdisc add dev $DUMMY clsact",
+ "$TC filter add dev $DUMMY egress protocol ip prio 11 matchall action mirred ingress redirect dev $DUMMY index 1"
+ ],
+ "cmdUnderTest": "ping -c1 -W0.01 -I $DUMMY 10.10.10.1",
+ "expExitCode": "1",
+ "verifyCmd": "$TC -j -s actions get action mirred index 1",
+ "matchJSON": [
+ {
+ "total acts": 0
+ },
+ {
+ "actions": [
+ {
+ "order": 1,
+ "kind": "mirred",
+ "mirred_action": "redirect",
+ "direction": "ingress",
+ "index": 1,
+ "stats": {
+ "packets": 1
+ },
+ "not_in_hw": true
+ }
+ ]
+ }
+ ],
+ "teardown": [
+ "$TC qdisc del dev $DUMMY clsact"
+ ]
+ },
+ {
+ "id": "cf97",
+ "name": "Redirect multiport: dev1 ingress -> dummy ingress -> dummy egress (No Loop)",
+ "category": [
+ "filter",
+ "mirred"
+ ],
+ "plugins": {
+ "requires": [
+ "nsPlugin",
+ "scapyPlugin"
+ ]
+ },
+ "setup": [
+ "$TC qdisc add dev $DEV1 clsact",
+ "$TC filter add dev $DEV1 ingress protocol ip prio 10 matchall action mirred ingress redirect dev $DUMMY index 1",
+ "$TC qdisc add dev $DUMMY clsact"
+ ],
+ "cmdUnderTest": "$TC filter add dev $DUMMY ingress protocol ip prio 11 matchall action mirred egress redirect dev $DUMMY index 2",
+ "scapy": [
+ {
+ "iface": "$DEV0",
+ "count": 1,
+ "packet": "Ether()/IP(dst='10.10.10.1', src='10.10.10.10')/ICMP()"
+ }
+ ],
+ "expExitCode": "0",
+ "verifyCmd": "$TC -j -s actions get action mirred index 1",
+ "matchJSON": [
+ {
+ "total acts": 0
+ },
+ {
+ "actions": [
+ {
+ "order": 1,
+ "kind": "mirred",
+ "mirred_action": "redirect",
+ "direction": "ingress",
+ "index": 1,
+ "stats": {
+ "packets": 1
+ },
+ "not_in_hw": true
+ }
+ ]
+ }
+ ],
+ "teardown": [
+ "$TC qdisc del dev $DEV1 clsact",
+ "$TC qdisc del dev $DUMMY clsact"
+ ]
}
-
]
diff --git a/tools/testing/selftests/tc-testing/tc-tests/qdiscs/netem.json b/tools/testing/selftests/tc-testing/tc-tests/qdiscs/netem.json
index 3c4444961488..7c954989069d 100644
--- a/tools/testing/selftests/tc-testing/tc-tests/qdiscs/netem.json
+++ b/tools/testing/selftests/tc-testing/tc-tests/qdiscs/netem.json
@@ -336,5 +336,36 @@
"teardown": [
"$TC qdisc del dev $DUMMY handle 1: root"
]
- }
+ },
+ {
+ "id": "8c17",
+ "name": "Test netem's recursive duplicate",
+ "category": [
+ "qdisc",
+ "netem"
+ ],
+ "plugins": {
+ "requires": "nsPlugin"
+ },
+ "setup": [
+ "$IP link set dev $DUMMY up || true",
+ "$IP addr add 10.10.11.10/24 dev $DUMMY || true",
+ "$TC qdisc add dev $DUMMY root handle 1: netem limit 1 duplicate 100%",
+ "$TC qdisc add dev $DUMMY parent 1: handle 2: netem duplicate 100%"
+ ],
+ "cmdUnderTest": "ping -c 1 10.10.11.11 -W 0.01",
+ "expExitCode": "1",
+ "verifyCmd": "$TC -s -j qdisc ls dev $DUMMY root",
+ "matchJSON": [
+ {
+ "kind": "netem",
+ "handle": "1:",
+ "bytes": 294,
+ "packets": 3
+ }
+ ],
+ "teardown": [
+ "$TC qdisc del dev $DUMMY handle 1: root"
+ ]
+ }
]
--
2.34.1
^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: [PATCH net 5/6] net/sched: fix packet loop on netem when duplicate is on
2026-01-11 16:39 ` [PATCH net 5/6] net/sched: fix packet loop on netem when duplicate is on Jamal Hadi Salim
@ 2026-01-11 20:22 ` William Liu
2026-01-11 20:39 ` Cong Wang
1 sibling, 0 replies; 26+ messages in thread
From: William Liu @ 2026-01-11 20:22 UTC (permalink / raw)
To: Jamal Hadi Salim
Cc: davem, edumazet, kuba, pabeni, horms, andrew+netdev, netdev,
xiyou.wangcong, jiri, victor, dcaratti, lariel, daniel, pablo,
kadlec, fw, phil, netfilter-devel, coreteam, zyc199902,
lrGerlinde, jschung2, Savino Dicanosa
On Sunday, January 11th, 2026 at 4:40 PM, Jamal Hadi Salim <jhs@mojatatu.com> wrote:
>
>
> As stated by William [1]:
>
> "netem_enqueue's duplication prevention logic breaks when a netem
> resides in a qdisc tree with other netems - this can lead to a
> soft lockup and OOM loop in netem_dequeue, as seen in [2].
> Ensure that a duplicating netem cannot exist in a tree with other
> netems."
>
> In this patch, we use the first approach suggested in [1] (the skb
> ttl field) to detect and stop a possible netem duplicate infinite loop.
>
> [1] https://lore.kernel.org/netdev/20250708164141.875402-1-will@willsroot.io/
> [2] https://lore.kernel.org/netdev/8DuRWwfqjoRDLDmBMlIfbrsZg9Gx50DHJc1ilxsEBNe2D6NMoigR_eIRIG0LOjMc3r10nUUZtArXx4oZBIdUfZQrwjcQhdinnMis_0G7VEk=@willsroot.io/
>
> Fixes: 0afb51e72855 ("[PKT_SCHED]: netem: reinsert for duplication")
> Reported-by: William Liu will@willsroot.io
>
> Reported-by: Savino Dicanosa savy@syst3mfailure.io
>
> Closes: https://lore.kernel.org/netdev/8DuRWwfqjoRDLDmBMlIfbrsZg9Gx50DHJc1ilxsEBNe2D6NMoigR_eIRIG0LOjMc3r10nUUZtArXx4oZBIdUfZQrwjcQhdinnMis_0G7VEk=@willsroot.io/
> Co-developed-by: Victor Nogueira victor@mojatatu.com
>
> Signed-off-by: Victor Nogueira victor@mojatatu.com
>
> Signed-off-by: Jamal Hadi Salim jhs@mojatatu.com
>
> ---
> net/sched/sch_netem.c | 7 +++----
> 1 file changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
> index a9ea40c13527..4a65fb841a98 100644
> --- a/net/sched/sch_netem.c
> +++ b/net/sched/sch_netem.c
> @@ -461,7 +461,8 @@ static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch,
> skb->prev = NULL;
>
>
> /* Random duplication */
> - if (q->duplicate && q->duplicate >= get_crandom(&q->dup_cor, &q->prng))
>
> + if (q->duplicate && !skb->ttl &&
>
> + q->duplicate >= get_crandom(&q->dup_cor, &q->prng))
>
> ++count;
>
> /* Drop packet? */
> @@ -539,11 +540,9 @@ static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch,
> */
> if (skb2) {
> struct Qdisc rootq = qdisc_root_bh(sch);
> - u32 dupsave = q->duplicate; / prevent duplicating a dup... */
>
>
> - q->duplicate = 0;
>
> + skb2->ttl++; /* prevent duplicating a dup... */
>
> rootq->enqueue(skb2, rootq, to_free);
>
> - q->duplicate = dupsave;
>
> skb2 = NULL;
> }
>
> --
> 2.34.1
Reviewed-by: William Liu <will@willsroot.io>
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH net 5/6] net/sched: fix packet loop on netem when duplicate is on
2026-01-11 16:39 ` [PATCH net 5/6] net/sched: fix packet loop on netem when duplicate is on Jamal Hadi Salim
2026-01-11 20:22 ` William Liu
@ 2026-01-11 20:39 ` Cong Wang
2026-01-11 21:56 ` Jamal Hadi Salim
1 sibling, 1 reply; 26+ messages in thread
From: Cong Wang @ 2026-01-11 20:39 UTC (permalink / raw)
To: Jamal Hadi Salim
Cc: davem, edumazet, kuba, pabeni, horms, andrew+netdev, netdev, jiri,
victor, dcaratti, lariel, daniel, pablo, kadlec, fw, phil,
netfilter-devel, coreteam, zyc199902, lrGerlinde, jschung2,
William Liu, Savino Dicanosa
On Sun, Jan 11, 2026 at 8:40 AM Jamal Hadi Salim <jhs@mojatatu.com> wrote:
> - q->duplicate = 0;
> + skb2->ttl++; /* prevent duplicating a dup... */
> rootq->enqueue(skb2, rootq, to_free);
> - q->duplicate = dupsave;
As I already explained many times, the ROOT cause is enqueuing
to the root qdisc, not anything else.
We need to completely forget all the kernel knowledge and ask
a very simple question here: is enqueuing to root qdisc a reasonable
use? More importantly, could we really define it?
I already provided my answer in my patch description, sorry for not
keeping repeating it for at least the 3rd time.
Therefore, I still don't think you fix the root cause here. The
problematic behavior of enqueuing to root qdisc should be corrected,
regardless of any kernel detail.
Thanks.
Cong
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH net 5/6] net/sched: fix packet loop on netem when duplicate is on
2026-01-11 20:39 ` Cong Wang
@ 2026-01-11 21:56 ` Jamal Hadi Salim
2026-01-13 19:32 ` Cong Wang
0 siblings, 1 reply; 26+ messages in thread
From: Jamal Hadi Salim @ 2026-01-11 21:56 UTC (permalink / raw)
To: Cong Wang
Cc: davem, edumazet, kuba, pabeni, horms, andrew+netdev, netdev, jiri,
victor, dcaratti, lariel, daniel, pablo, kadlec, fw, phil,
netfilter-devel, coreteam, zyc199902, lrGerlinde, jschung2,
William Liu, Savino Dicanosa
On Sun, Jan 11, 2026 at 3:39 PM Cong Wang <xiyou.wangcong@gmail.com> wrote:
>
> On Sun, Jan 11, 2026 at 8:40 AM Jamal Hadi Salim <jhs@mojatatu.com> wrote:
> > - q->duplicate = 0;
> > + skb2->ttl++; /* prevent duplicating a dup... */
> > rootq->enqueue(skb2, rootq, to_free);
> > - q->duplicate = dupsave;
>
> As I already explained many times, the ROOT cause is enqueuing
> to the root qdisc, not anything else.
>
> We need to completely forget all the kernel knowledge and ask
> a very simple question here: is enqueuing to root qdisc a reasonable
> use? More importantly, could we really define it?
>
> I already provided my answer in my patch description, sorry for not
> keeping repeating it for at least the 3rd time.
>
> Therefore, I still don't think you fix the root cause here. The
> problematic behavior of enqueuing to root qdisc should be corrected,
> regardless of any kernel detail.
>
The root cause is a loop in the existing code, present since the
duplication feature was introduced into netem about 20 years ago. That
code enqueues to the root qdisc.
cheers,
jamal
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH net 0/6] net/sched: Fix packet loops in mirred and netem
2026-01-11 16:39 [PATCH net 0/6] net/sched: Fix packet loops in mirred and netem Jamal Hadi Salim
` (5 preceding siblings ...)
2026-01-11 16:39 ` [PATCH net 6/6] selftests/tc-testing: Add netem/mirred test cases exercising loops Jamal Hadi Salim
@ 2026-01-13 6:20 ` Stephen Hemminger
2026-01-13 19:35 ` Cong Wang
2026-01-13 20:10 ` Cong Wang
` (2 subsequent siblings)
9 siblings, 1 reply; 26+ messages in thread
From: Stephen Hemminger @ 2026-01-13 6:20 UTC (permalink / raw)
To: Jamal Hadi Salim
Cc: davem, edumazet, kuba, pabeni, horms, andrew+netdev, netdev,
xiyou.wangcong, jiri, victor, dcaratti, lariel, daniel, pablo,
kadlec, fw, phil, netfilter-devel, coreteam, zyc199902,
lrGerlinde, jschung2
On Sun, 11 Jan 2026 11:39:41 -0500
Jamal Hadi Salim <jhs@mojatatu.com> wrote:
> We introduce a 2-bit global skb->ttl counter.Patch #1 describes how we puti
> together those bits. Patches #2 and patch #5 use these bits.
> I added Fixes tags to patch #1 in case it is useful for backporting.
> Patch #3 and #4 revert William's earlier netem commits. Patch #6 introduces
> tdc test cases.
>
> Jamal Hadi Salim (5):
> net: Introduce skb ttl field to track packet loops
> net/sched: Fix ethx:ingress -> ethy:egress -> ethx:ingress mirred loop
> Revert "net/sched: Restrict conditions for adding duplicating netems
> to qdisc tree"
> Revert "selftests/tc-testing: Add tests for restrictions on netem
> duplication"
> net/sched: fix packet loop on netem when duplicate is on
>
> Victor Nogueira (1):
> selftests/tc-testing: Add netem/mirred test cases exercising loops
>
> drivers/net/ifb.c | 2 +-
> include/linux/skbuff.h | 24 +-
> include/net/sch_generic.h | 22 +
> net/netfilter/nft_fwd_netdev.c | 1 +
> net/sched/act_mirred.c | 45 +-
> net/sched/sch_netem.c | 47 +-
> .../tc-testing/tc-tests/actions/mirred.json | 616 +++++++++++++++++-
> .../tc-testing/tc-tests/infra/qdiscs.json | 5 +-
> .../tc-testing/tc-tests/qdiscs/netem.json | 96 +--
> 9 files changed, 698 insertions(+), 160 deletions(-)
>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
This is a complex patch series so I decided to get a second opinion using AI.
It is worth reading (but not completely trusting). Review prompt is Chris Mason's
Claude review prompts.
Summary: Patch Series Analysis
Patches Reviewed
6-patch series from Jamal Hadi Salim fixing loop detection in mirred and netem:
Patch 1/6: Introduces skb->ttl (2-bit field) for cross-deferral loop tracking, moves from_ingress to qdisc_skb_cb
Patch 2/6: Fixes mirred ingress→egress→ingress loop detection using ttl
Patch 3/6: Reverts netem duplication restrictions (preparation for proper fix)
Patch 4/6: Reverts associated selftests
Patch 5/6: Fixes netem duplicate infinite loop using ttl
Patch 6/6: (Email thread discussion, not code)
Key Findings
No regressions identified. The analysis covered:
AreaResultSKB structure change (ttl field)✓ Safe - properly initialized via zeroingfrom_ingress relocation✓ Safe - written immediately before readLoop detection logic✓ Correct - ttl tracks across async boundariesNetem duplicate fix✓ Improvement over old q->duplicate hackLocking✓ Correct softirq/per-cpu patternsResource management✓ No leaks identified
Design Assessment
The approach is sound:
Egress paths: Continue using per-cpu sched_mirred_dev[] array for immediate loop detection
Ingress paths: Use skb->ttl to track loops across netif_rx() deferral boundaries
Netem: ttl-based dup prevention works across entire qdisc tree (better than old local-only fix)
Recommendation
Yes, the patch is OK to merge.
The series correctly fixes real bugs (CVE-worthy loop conditions) with a minimal, well-designed solution. The 2-bit ttl field is sufficient for the use case (limit of 3 ingress redirects), and the changes maintain backward compatibility for existing configurations while closing the loop detection gaps.
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH net 5/6] net/sched: fix packet loop on netem when duplicate is on
2026-01-11 21:56 ` Jamal Hadi Salim
@ 2026-01-13 19:32 ` Cong Wang
0 siblings, 0 replies; 26+ messages in thread
From: Cong Wang @ 2026-01-13 19:32 UTC (permalink / raw)
To: Jamal Hadi Salim
Cc: davem, edumazet, kuba, pabeni, horms, andrew+netdev, netdev, jiri,
victor, dcaratti, lariel, daniel, pablo, kadlec, fw, phil,
netfilter-devel, coreteam, zyc199902, lrGerlinde, jschung2,
William Liu, Savino Dicanosa
On Sun, Jan 11, 2026 at 1:57 PM Jamal Hadi Salim <jhs@mojatatu.com> wrote:
>
> On Sun, Jan 11, 2026 at 3:39 PM Cong Wang <xiyou.wangcong@gmail.com> wrote:
> >
> > On Sun, Jan 11, 2026 at 8:40 AM Jamal Hadi Salim <jhs@mojatatu.com> wrote:
> > > - q->duplicate = 0;
> > > + skb2->ttl++; /* prevent duplicating a dup... */
> > > rootq->enqueue(skb2, rootq, to_free);
> > > - q->duplicate = dupsave;
> >
> > As I already explained many times, the ROOT cause is enqueuing
> > to the root qdisc, not anything else.
> >
> > We need to completely forget all the kernel knowledge and ask
> > a very simple question here: is enqueuing to root qdisc a reasonable
> > use? More importantly, could we really define it?
> >
> > I already provided my answer in my patch description, sorry for not
> > keeping repeating it for at least the 3rd time.
> >
> > Therefore, I still don't think you fix the root cause here. The
> > problematic behavior of enqueuing to root qdisc should be corrected,
> > regardless of any kernel detail.
> >
>
> The root cause is a loop in the existing code, present since the
That's the symptom, not the cause.
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH net 0/6] net/sched: Fix packet loops in mirred and netem
2026-01-13 6:20 ` [PATCH net 0/6] net/sched: Fix packet loops in mirred and netem Stephen Hemminger
@ 2026-01-13 19:35 ` Cong Wang
0 siblings, 0 replies; 26+ messages in thread
From: Cong Wang @ 2026-01-13 19:35 UTC (permalink / raw)
To: Stephen Hemminger
Cc: Jamal Hadi Salim, davem, edumazet, kuba, pabeni, horms,
andrew+netdev, netdev, jiri, victor, dcaratti, lariel, daniel,
pablo, kadlec, fw, phil, netfilter-devel, coreteam, zyc199902,
lrGerlinde, jschung2
On Mon, Jan 12, 2026 at 10:20 PM Stephen Hemminger
<stephen@networkplumber.org> wrote:
> This is a complex patch series so I decided to get a second opinion using AI.
> It is worth reading (but not completely trusting). Review prompt is Chris Mason's
> Claude review prompts.
Please prompt your AI what's a symptom and what is the cause. :)
Infinite loop is a symptom, not cause. Fixing symptoms does not
make sense, it only covers out the root cause even deeper.
Regards,
Cong
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH net 1/6] net: Introduce skb ttl field to track packet loops
2026-01-11 16:39 ` [PATCH net 1/6] net: Introduce skb ttl field to track packet loops Jamal Hadi Salim
@ 2026-01-13 19:44 ` Cong Wang
0 siblings, 0 replies; 26+ messages in thread
From: Cong Wang @ 2026-01-13 19:44 UTC (permalink / raw)
To: Jamal Hadi Salim
Cc: davem, edumazet, kuba, pabeni, horms, andrew+netdev, netdev, jiri,
victor, dcaratti, lariel, daniel, pablo, kadlec, fw, phil,
netfilter-devel, coreteam, zyc199902, lrGerlinde, jschung2
On Sun, Jan 11, 2026 at 8:40 AM Jamal Hadi Salim <jhs@mojatatu.com> wrote:
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index 86737076101d..7f18b0c28728 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -840,6 +840,7 @@ enum skb_tstamp_type {
> * @no_fcs: Request NIC to treat last 4 bytes as Ethernet FCS
> * @encapsulation: indicates the inner headers in the skbuff are valid
> * @encap_hdr_csum: software checksum is needed
> + * @ttl: time to live count when a packet loops.
> * @csum_valid: checksum is already valid
> * @csum_not_inet: use CRC32c to resolve CHECKSUM_PARTIAL
> * @csum_complete_sw: checksum was completed by software
> @@ -1000,6 +1001,7 @@ struct sk_buff {
> /* Indicates the inner headers are valid in the skbuff. */
> __u8 encapsulation:1;
> __u8 encap_hdr_csum:1;
> + __u8 ttl:2;
In the worst case it increases sk_buff size.
┌──────────────────────────────┬───────────────────┬───────────────────┬─────────┐
│ Config scenario │ Current │ After change
│ Delta │
├──────────────────────────────┼───────────────────┼───────────────────┼─────────┤
│ All configs enabled │ 17 bits (3 bytes) │ 18 bits (3
bytes) │ 0 │
├──────────────────────────────┼───────────────────┼───────────────────┼─────────┤
│ CONFIG_NET_REDIRECT disabled │ 16 bits │ 18 bits
│ +2 bits │
├──────────────────────────────┼───────────────────┼───────────────────┼─────────┤
│ Minimal config │ 7 bits (1 byte) │ 9 bits (2
bytes) │ +1 byte │
└──────────────────────────────┴───────────────────┴───────────────────┴─────────┘
I think this would be a clear stopper in netdev.
Good luck to you on fighting with increasing sk_buff size!
Regards,
Cong
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH net 0/6] net/sched: Fix packet loops in mirred and netem
2026-01-11 16:39 [PATCH net 0/6] net/sched: Fix packet loops in mirred and netem Jamal Hadi Salim
` (6 preceding siblings ...)
2026-01-13 6:20 ` [PATCH net 0/6] net/sched: Fix packet loops in mirred and netem Stephen Hemminger
@ 2026-01-13 20:10 ` Cong Wang
2026-01-14 16:33 ` Jamal Hadi Salim
2026-01-14 2:07 ` Jakub Kicinski
2026-01-15 10:22 ` Paolo Abeni
9 siblings, 1 reply; 26+ messages in thread
From: Cong Wang @ 2026-01-13 20:10 UTC (permalink / raw)
To: Jamal Hadi Salim
Cc: davem, edumazet, kuba, pabeni, horms, andrew+netdev, netdev, jiri,
victor, dcaratti, lariel, daniel, pablo, kadlec, fw, phil,
netfilter-devel, coreteam, zyc199902, lrGerlinde, jschung2
On Sun, Jan 11, 2026 at 8:40 AM Jamal Hadi Salim <jhs@mojatatu.com> wrote:
>
>
> We introduce a 2-bit global skb->ttl counter.Patch #1 describes how we puti
> together those bits. Patches #2 and patch #5 use these bits.
> I added Fixes tags to patch #1 in case it is useful for backporting.
> Patch #3 and #4 revert William's earlier netem commits. Patch #6 introduces
> tdc test cases.
3 reasons why this patchset should be rejected:
1) It increases sk_buff size potentially by 1 byte with minimal config
2) Infinite loop is the symptom caused by enqueuing to the root qdisc,
fixing the infinite loop itself is fixing the symptom and covering up the
root cause deeper.
3) Using skb->ttl makes netem duplication behavior less predictable
for users. With a TTL-based approach, the duplication depth is limited
by a kernel-internal constant that is invisible to userspace. Users
configuring nested netem hierarchies cannot determine from tc
commands alone whether their packets will be duplicated at each
stage or silently pass through when TTL is exhausted.
NACKed-by: Cong Wang <xiyou.wangcong@gmail.com>
(I hope I am still better than your AI.)
Regards,
Cong
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH net 0/6] net/sched: Fix packet loops in mirred and netem
2026-01-11 16:39 [PATCH net 0/6] net/sched: Fix packet loops in mirred and netem Jamal Hadi Salim
` (7 preceding siblings ...)
2026-01-13 20:10 ` Cong Wang
@ 2026-01-14 2:07 ` Jakub Kicinski
2026-01-14 16:40 ` Jamal Hadi Salim
2026-01-15 10:22 ` Paolo Abeni
9 siblings, 1 reply; 26+ messages in thread
From: Jakub Kicinski @ 2026-01-14 2:07 UTC (permalink / raw)
To: Jamal Hadi Salim
Cc: davem, edumazet, pabeni, horms, andrew+netdev, netdev,
xiyou.wangcong, jiri, victor, dcaratti, lariel, daniel, pablo,
kadlec, fw, phil, netfilter-devel, coreteam
On Sun, 11 Jan 2026 11:39:41 -0500 Jamal Hadi Salim wrote:
> We introduce a 2-bit global skb->ttl counter.Patch #1 describes how we puti
> together those bits. Patches #2 and patch #5 use these bits.
> I added Fixes tags to patch #1 in case it is useful for backporting.
> Patch #3 and #4 revert William's earlier netem commits. Patch #6 introduces
> tdc test cases.
TC is not the only way one can loop packets in the kernel forever.
Are we now supposed to find and prevent them all?
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH net 0/6] net/sched: Fix packet loops in mirred and netem
2026-01-13 20:10 ` Cong Wang
@ 2026-01-14 16:33 ` Jamal Hadi Salim
2026-01-15 20:16 ` Cong Wang
0 siblings, 1 reply; 26+ messages in thread
From: Jamal Hadi Salim @ 2026-01-14 16:33 UTC (permalink / raw)
To: Cong Wang
Cc: davem, edumazet, kuba, pabeni, horms, andrew+netdev, netdev, jiri,
victor, dcaratti, lariel, daniel, pablo, kadlec, fw, phil,
netfilter-devel, coreteam, zyc199902, lrGerlinde, jschung2,
William Liu, Savy
On Tue, Jan 13, 2026 at 3:10 PM Cong Wang <xiyou.wangcong@gmail.com> wrote:
>
> On Sun, Jan 11, 2026 at 8:40 AM Jamal Hadi Salim <jhs@mojatatu.com> wrote:
> >
> >
> > We introduce a 2-bit global skb->ttl counter.Patch #1 describes how we puti
> > together those bits. Patches #2 and patch #5 use these bits.
> > I added Fixes tags to patch #1 in case it is useful for backporting.
> > Patch #3 and #4 revert William's earlier netem commits. Patch #6 introduces
> > tdc test cases.
>
> 3 reasons why this patchset should be rejected:
>
> 1) It increases sk_buff size potentially by 1 byte with minimal config
>
All distro vendors turn all options. So no change in size happens.
Regardless, it's a non-arguement there is no way to resolve the mirred
issue without global state.
It's a twofer - fixing mirred and netem.
> 2) Infinite loop is the symptom caused by enqueuing to the root qdisc,
> fixing the infinite loop itself is fixing the symptom and covering up the
> root cause deeper.
>
The behavior of sending to the root has been around for ~20 years.
I just saw your patches - do you mind explaining why you didnt Cc me on them?
> 3) Using skb->ttl makes netem duplication behavior less predictable
> for users. With a TTL-based approach, the duplication depth is limited
> by a kernel-internal constant that is invisible to userspace. Users
> configuring nested netem hierarchies cannot determine from tc
> commands alone whether their packets will be duplicated at each
> stage or silently pass through when TTL is exhausted.
>
The patch is not using the ttl as a counter for netem, it's being
treated as boolean (just like your patch is doing). We are only using
this as a counter for the mirred loop use case.
> NACKed-by: Cong Wang <xiyou.wangcong@gmail.com>
Calm down please.
cheers
jamal
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH net 0/6] net/sched: Fix packet loops in mirred and netem
2026-01-14 2:07 ` Jakub Kicinski
@ 2026-01-14 16:40 ` Jamal Hadi Salim
2026-01-15 3:28 ` Jakub Kicinski
0 siblings, 1 reply; 26+ messages in thread
From: Jamal Hadi Salim @ 2026-01-14 16:40 UTC (permalink / raw)
To: Jakub Kicinski
Cc: davem, edumazet, pabeni, horms, andrew+netdev, netdev,
xiyou.wangcong, jiri, victor, dcaratti, lariel, daniel, pablo,
kadlec, fw, phil, netfilter-devel, coreteam
On Tue, Jan 13, 2026 at 9:07 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Sun, 11 Jan 2026 11:39:41 -0500 Jamal Hadi Salim wrote:
> > We introduce a 2-bit global skb->ttl counter.Patch #1 describes how we puti
> > together those bits. Patches #2 and patch #5 use these bits.
> > I added Fixes tags to patch #1 in case it is useful for backporting.
> > Patch #3 and #4 revert William's earlier netem commits. Patch #6 introduces
> > tdc test cases.
>
> TC is not the only way one can loop packets in the kernel forever.
> Are we now supposed to find and prevent them all?
These two are trivial to reproduce with simple configs. They consume
both CPU and memory resources.
I am not aware of other forever packet loops - but if you are we can
look into them.
cheers,
jamal
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH net 0/6] net/sched: Fix packet loops in mirred and netem
2026-01-14 16:40 ` Jamal Hadi Salim
@ 2026-01-15 3:28 ` Jakub Kicinski
0 siblings, 0 replies; 26+ messages in thread
From: Jakub Kicinski @ 2026-01-15 3:28 UTC (permalink / raw)
To: Jamal Hadi Salim
Cc: davem, edumazet, pabeni, horms, andrew+netdev, netdev,
xiyou.wangcong, jiri, victor, dcaratti, lariel, daniel, pablo,
kadlec, fw, phil, netfilter-devel, coreteam
On Wed, 14 Jan 2026 11:40:18 -0500 Jamal Hadi Salim wrote:
> On Tue, Jan 13, 2026 at 9:07 PM Jakub Kicinski <kuba@kernel.org> wrote:
> > On Sun, 11 Jan 2026 11:39:41 -0500 Jamal Hadi Salim wrote:
> > > We introduce a 2-bit global skb->ttl counter.Patch #1 describes how we puti
> > > together those bits. Patches #2 and patch #5 use these bits.
> > > I added Fixes tags to patch #1 in case it is useful for backporting.
> > > Patch #3 and #4 revert William's earlier netem commits. Patch #6 introduces
> > > tdc test cases.
> >
> > TC is not the only way one can loop packets in the kernel forever.
> > Are we now supposed to find and prevent them all?
>
> These two are trivial to reproduce with simple configs. They consume
> both CPU and memory resources.
> I am not aware of other forever packet loops - but if you are we can
> look into them.
Is there loop prevention in BPF redirect? Plugging two ends of veth
into a bridge? Routing loops with an action to bump TTL back up?
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH net 0/6] net/sched: Fix packet loops in mirred and netem
2026-01-11 16:39 [PATCH net 0/6] net/sched: Fix packet loops in mirred and netem Jamal Hadi Salim
` (8 preceding siblings ...)
2026-01-14 2:07 ` Jakub Kicinski
@ 2026-01-15 10:22 ` Paolo Abeni
2026-01-15 14:32 ` Jamal Hadi Salim
9 siblings, 1 reply; 26+ messages in thread
From: Paolo Abeni @ 2026-01-15 10:22 UTC (permalink / raw)
To: Jamal Hadi Salim, davem, edumazet, kuba, horms, andrew+netdev
Cc: netdev, xiyou.wangcong, jiri, victor, dcaratti, lariel, daniel,
pablo, kadlec, fw, phil, netfilter-devel, coreteam, zyc199902,
lrGerlinde, jschung2
On 1/11/26 5:39 PM, Jamal Hadi Salim wrote:
> We introduce a 2-bit global skb->ttl counter.Patch #1 describes how we puti
> together those bits. Patches #2 and patch #5 use these bits.
> I added Fixes tags to patch #1 in case it is useful for backporting.
> Patch #3 and #4 revert William's earlier netem commits. Patch #6 introduces
> tdc test cases.
Generally speaking I think that a more self-encapsulated solution should
be preferable.
I [mis?]understand that your main concern with Cong's series is the
possible parent qlen corruption in case of duplication and the last
iteration of such series includes a self-test for that, is there
anything missing there?
The new sk_buff field looks a bit controversial. Adding such field
opens/implies using it for other/all loop detection; a 2 bits counter
will not be enough for that, and the struct sk_buff will increase for
typical build otherwise.
FTR I don't think that sk_buff the size increase for minimal config is
very relevant, as most/all of the binary layout optimization and not
thought for such build.
/P
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH net 0/6] net/sched: Fix packet loops in mirred and netem
2026-01-15 10:22 ` Paolo Abeni
@ 2026-01-15 14:32 ` Jamal Hadi Salim
2026-01-29 5:06 ` Willem de Bruijn
0 siblings, 1 reply; 26+ messages in thread
From: Jamal Hadi Salim @ 2026-01-15 14:32 UTC (permalink / raw)
To: Paolo Abeni
Cc: davem, edumazet, kuba, horms, andrew+netdev, netdev,
xiyou.wangcong, jiri, victor, dcaratti, lariel, daniel, pablo,
kadlec, fw, phil, netfilter-devel, coreteam, zyc199902,
lrGerlinde, jschung2
On Thu, Jan 15, 2026 at 5:23 AM Paolo Abeni <pabeni@redhat.com> wrote:
>
> On 1/11/26 5:39 PM, Jamal Hadi Salim wrote:
> > We introduce a 2-bit global skb->ttl counter.Patch #1 describes how we puti
> > together those bits. Patches #2 and patch #5 use these bits.
> > I added Fixes tags to patch #1 in case it is useful for backporting.
> > Patch #3 and #4 revert William's earlier netem commits. Patch #6 introduces
> > tdc test cases.
>
> Generally speaking I think that a more self-encapsulated solution should
> be preferable.
>
I dont see a way to do that with mirred. I am more than happy if
someone else solves that issue or gives me an idea how to.
> I [mis?]understand that your main concern with Cong's series is the
> possible parent qlen corruption in case of duplication and the last
> iteration of such series includes a self-test for that, is there
> anything missing there?
i dont read the list when I am busy, but I will read emails when Cced
to me. I had not seen Cong's patches before yesterday.
But to answer your question, I presented a much simpler fix and more
importantly after looking at Cong's post i notice it changes a 20 year
old approach (which returned things to the root qdisc). William had
already pointed this to him. The simple change i posted doesn't
require that.
In any case if Stephen or you or Jakub want to push that change go
ahead - we'll wait to see what the bots find.
I am more interested in the mirred one because the current approach
has both loops and false positive(example claiming a loop when there
is none).
> The new sk_buff field looks a bit controversial. Adding such field
> opens/implies using it for other/all loop detection; a 2 bits counter
> will not be enough for that, and the struct sk_buff will increase for
> typical build otherwise.
My logic is: two bits is better than zero bits. More bits the better.
I am not sure i see sharing across the stack - and if we do hit that
situation, something will drop and we can debug.
At the moment I know of these two bugs - which are trivial to fix as
shown. I don't think it's fair to ask me to fix all potential (and
hypotheical) loops; i can fix them if someone shows an example setup.
> FTR I don't think that sk_buff the size increase for minimal config is
> very relevant, as most/all of the binary layout optimization and not
> thought for such build.
It is not really. Nobody turns off options that are ifdef just to say
"i need to save 1B".
cheers,
jamal
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH net 0/6] net/sched: Fix packet loops in mirred and netem
2026-01-14 16:33 ` Jamal Hadi Salim
@ 2026-01-15 20:16 ` Cong Wang
2026-01-16 15:04 ` Jamal Hadi Salim
0 siblings, 1 reply; 26+ messages in thread
From: Cong Wang @ 2026-01-15 20:16 UTC (permalink / raw)
To: Jamal Hadi Salim
Cc: davem, edumazet, kuba, pabeni, horms, andrew+netdev, netdev, jiri,
victor, dcaratti, lariel, daniel, pablo, kadlec, fw, phil,
netfilter-devel, coreteam, zyc199902, lrGerlinde, jschung2,
William Liu, Savy
On Wed, Jan 14, 2026 at 8:34 AM Jamal Hadi Salim <jhs@mojatatu.com> wrote:
>
> On Tue, Jan 13, 2026 at 3:10 PM Cong Wang <xiyou.wangcong@gmail.com> wrote:
> >
> > On Sun, Jan 11, 2026 at 8:40 AM Jamal Hadi Salim <jhs@mojatatu.com> wrote:
> > >
> > >
> > > We introduce a 2-bit global skb->ttl counter.Patch #1 describes how we puti
> > > together those bits. Patches #2 and patch #5 use these bits.
> > > I added Fixes tags to patch #1 in case it is useful for backporting.
> > > Patch #3 and #4 revert William's earlier netem commits. Patch #6 introduces
> > > tdc test cases.
> >
> > 3 reasons why this patchset should be rejected:
> >
> > 1) It increases sk_buff size potentially by 1 byte with minimal config
> >
>
> All distro vendors turn all options. So no change in size happens.
> Regardless, it's a non-arguement there is no way to resolve the mirred
> issue without global state.
> It's a twofer - fixing mirred and netem.
This makes little sense, because otherwise people could easily add:
struct sk_buff {
....
#ifdef CONFIG_NOT_ENABLED_BY_DEFAULT
struct a_huge_field very_big;
#endif
};
What's the boundary?
>
> > 2) Infinite loop is the symptom caused by enqueuing to the root qdisc,
> > fixing the infinite loop itself is fixing the symptom and covering up the
> > root cause deeper.
> >
>
> The behavior of sending to the root has been around for ~20 years.
So what?
> I just saw your patches - do you mind explaining why you didnt Cc me on them?
You were the one who refused anyone's feedback on your broken and
hard-coded policy in the kernel.
Please enlighten me on how we should talk to a person who refused
any feedback? More importantly, why should we waste time on that?
BTW, I am sure you are on netdev.
>
> > 3) Using skb->ttl makes netem duplication behavior less predictable
> > for users. With a TTL-based approach, the duplication depth is limited
> > by a kernel-internal constant that is invisible to userspace. Users
> > configuring nested netem hierarchies cannot determine from tc
> > commands alone whether their packets will be duplicated at each
> > stage or silently pass through when TTL is exhausted.
> >
>
> The patch is not using the ttl as a counter for netem, it's being
> treated as boolean (just like your patch is doing). We are only using
> this as a counter for the mirred loop use case.
This does not change this argument for a bit. It is still hidden
and users are still unable to figure it out (even before your patch).
Regards,
Cong
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH net 0/6] net/sched: Fix packet loops in mirred and netem
2026-01-15 20:16 ` Cong Wang
@ 2026-01-16 15:04 ` Jamal Hadi Salim
0 siblings, 0 replies; 26+ messages in thread
From: Jamal Hadi Salim @ 2026-01-16 15:04 UTC (permalink / raw)
To: Cong Wang
Cc: davem, edumazet, kuba, pabeni, horms, andrew+netdev, netdev, jiri,
victor, dcaratti, lariel, daniel, pablo, kadlec, fw, phil,
netfilter-devel, coreteam, zyc199902, lrGerlinde, jschung2,
William Liu, Savy
On Thu, Jan 15, 2026 at 3:17 PM Cong Wang <xiyou.wangcong@gmail.com> wrote:
>
> On Wed, Jan 14, 2026 at 8:34 AM Jamal Hadi Salim <jhs@mojatatu.com> wrote:
> >
> > On Tue, Jan 13, 2026 at 3:10 PM Cong Wang <xiyou.wangcong@gmail.com> wrote:
> > >
> > > On Sun, Jan 11, 2026 at 8:40 AM Jamal Hadi Salim <jhs@mojatatu.com> wrote:
> > > >
> > > >
> > > > We introduce a 2-bit global skb->ttl counter.Patch #1 describes how we puti
> > > > together those bits. Patches #2 and patch #5 use these bits.
> > > > I added Fixes tags to patch #1 in case it is useful for backporting.
> > > > Patch #3 and #4 revert William's earlier netem commits. Patch #6 introduces
> > > > tdc test cases.
> > >
> > > 3 reasons why this patchset should be rejected:
> > >
> > > 1) It increases sk_buff size potentially by 1 byte with minimal config
> > >
> >
> > All distro vendors turn all options. So no change in size happens.
> > Regardless, it's a non-arguement there is no way to resolve the mirred
> > issue without global state.
> > It's a twofer - fixing mirred and netem.
>
> This makes little sense, because otherwise people could easily add:
>
> struct sk_buff {
> ....
> #ifdef CONFIG_NOT_ENABLED_BY_DEFAULT
> struct a_huge_field very_big;
> #endif
> };
>
> What's the boundary?
>
> >
> > > 2) Infinite loop is the symptom caused by enqueuing to the root qdisc,
> > > fixing the infinite loop itself is fixing the symptom and covering up the
> > > root cause deeper.
> > >
> >
> > The behavior of sending to the root has been around for ~20 years.
>
> So what?
>
Let's say you have a filter and action (or ebpf program) that needs to
see every packet as part of its setup. That filter is attached to the
root qdisc. The filter is no longer seeing the duplicated packets.
> > I just saw your patches - do you mind explaining why you didnt Cc me on them?
>
> You were the one who refused anyone's feedback on your broken and
> hard-coded policy in the kernel.
>
Ok, I think ive had it with you. Your claim is laughable at best. I am
the one who wasnt taking feedback? Seriously? you literally scared
people who could be potentially contributing to tc by your drama. You
received feedback on all variations of your four-to-five patche and
you didnt listen to any. It would be a good idea to use an AI to
summarize mailing list discussions and i hope such discussions can be
captured as part of commits.
> Please enlighten me on how we should talk to a person who refused
> any feedback? More importantly, why should we waste time on that?
>
> BTW, I am sure you are on netdev.
I read netdev emails only when i have time. Emails directed at me will
be read much much sooner.
We have rules: if you send patches, you must copy every stakeholder.
This cant just be based on your emotions on when this rule applies or
not. Please make sure you do this going forward.
> >
> > > 3) Using skb->ttl makes netem duplication behavior less predictable
> > > for users. With a TTL-based approach, the duplication depth is limited
> > > by a kernel-internal constant that is invisible to userspace. Users
> > > configuring nested netem hierarchies cannot determine from tc
> > > commands alone whether their packets will be duplicated at each
> > > stage or silently pass through when TTL is exhausted.
> > >
> >
> > The patch is not using the ttl as a counter for netem, it's being
> > treated as boolean (just like your patch is doing). We are only using
> > this as a counter for the mirred loop use case.
>
> This does not change this argument for a bit. It is still hidden
> and users are still unable to figure it out (even before your patch).
>
I am trying to make sense of what you are saying.
The ttl being boolean is exactly as in your patch with cb.
The goal of your patch should be to stop the loop. You are making an
additional change so that your cb changes work and you are implying
that the user can only understand it if better you made these extra
changes?
cheers,
jamal
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH net 0/6] net/sched: Fix packet loops in mirred and netem
2026-01-15 14:32 ` Jamal Hadi Salim
@ 2026-01-29 5:06 ` Willem de Bruijn
2026-01-29 21:22 ` Jamal Hadi Salim
0 siblings, 1 reply; 26+ messages in thread
From: Willem de Bruijn @ 2026-01-29 5:06 UTC (permalink / raw)
To: Jamal Hadi Salim
Cc: Paolo Abeni, davem, edumazet, kuba, horms, andrew+netdev, netdev,
xiyou.wangcong, jiri, victor, dcaratti, lariel, daniel, pablo,
kadlec, fw, phil, netfilter-devel, coreteam, zyc199902,
lrGerlinde, jschung2
On Thu, Jan 15, 2026 at 6:33 AM Jamal Hadi Salim <jhs@mojatatu.com> wrote:
>
> On Thu, Jan 15, 2026 at 5:23 AM Paolo Abeni <pabeni@redhat.com> wrote:
> >
> > On 1/11/26 5:39 PM, Jamal Hadi Salim wrote:
> > > We introduce a 2-bit global skb->ttl counter.Patch #1 describes how we puti
> > > together those bits. Patches #2 and patch #5 use these bits.
> > > I added Fixes tags to patch #1 in case it is useful for backporting.
> > > Patch #3 and #4 revert William's earlier netem commits. Patch #6 introduces
> > > tdc test cases.
> >
> > Generally speaking I think that a more self-encapsulated solution should
> > be preferable.
> >
>
> I dont see a way to do that with mirred. I am more than happy if
> someone else solves that issue or gives me an idea how to.
It might be informative that there used to be a redirect ttl field. I
removed it as part of tc_verd in commit aec745e2c520 ("net-tc: remove
unused tc_verd fields"). It was already unused by that time. The
mechanism was removed earlier in commit c19ae86a510c ("tc: remove
unused redirect ttl").
The IFB specific redirect logic remains (tc_at_ingress,
tc_skip_classify, from_ingress, redirected). Maybe some of those bits
can be used more efficiently. The cover letter for commit aec745e2c520
had a few suggestions [1].
Another redirect limit we have is XMIT_RECURSION_LIMIT. Though this
assumes running in a single call stack. BPF redirect uses it as of
commit a70b506efe89 ("bpf: enforce recursion limit on redirects"). For
tx. Rx takes netif_rx, so enqueue_to_backlog. So it does not seem that
this can address the entire request.
[1] https://lists.openwall.net/netdev/2017/01/07/92
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH net 0/6] net/sched: Fix packet loops in mirred and netem
2026-01-29 5:06 ` Willem de Bruijn
@ 2026-01-29 21:22 ` Jamal Hadi Salim
2026-02-28 13:05 ` Jamal Hadi Salim
0 siblings, 1 reply; 26+ messages in thread
From: Jamal Hadi Salim @ 2026-01-29 21:22 UTC (permalink / raw)
To: Willem de Bruijn
Cc: Paolo Abeni, davem, edumazet, kuba, horms, andrew+netdev, netdev,
xiyou.wangcong, jiri, victor, dcaratti, lariel, daniel, pablo,
kadlec, fw, phil, netfilter-devel, coreteam, zyc199902,
lrGerlinde, jschung2
On Thu, Jan 29, 2026 at 12:06 AM Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
>
> On Thu, Jan 15, 2026 at 6:33 AM Jamal Hadi Salim <jhs@mojatatu.com> wrote:
> >
> > On Thu, Jan 15, 2026 at 5:23 AM Paolo Abeni <pabeni@redhat.com> wrote:
> > >
> > > On 1/11/26 5:39 PM, Jamal Hadi Salim wrote:
> > > > We introduce a 2-bit global skb->ttl counter.Patch #1 describes how we puti
> > > > together those bits. Patches #2 and patch #5 use these bits.
> > > > I added Fixes tags to patch #1 in case it is useful for backporting.
> > > > Patch #3 and #4 revert William's earlier netem commits. Patch #6 introduces
> > > > tdc test cases.
> > >
> > > Generally speaking I think that a more self-encapsulated solution should
> > > be preferable.
> > >
> >
> > I dont see a way to do that with mirred. I am more than happy if
> > someone else solves that issue or gives me an idea how to.
>
> It might be informative that there used to be a redirect ttl field. I
> removed it as part of tc_verd in commit aec745e2c520 ("net-tc: remove
> unused tc_verd fields"). It was already unused by that time. The
> mechanism was removed earlier in commit c19ae86a510c ("tc: remove
> unused redirect ttl").
True - we used to have 3 bits, but we can leave with two;
Unfortunately, a single bit won't work - otherwise we could have just
used the one available.
Only one bit was available, so we recouped another (skb->from_ingress)
and adapted it to use the tc cb struct.
There is still a chance that could cause issues - all the users of
this field were CC'ed and i was hoping they would respond.
> The IFB specific redirect logic remains (tc_at_ingress,
> tc_skip_classify, from_ingress, redirected). Maybe some of those bits
> can be used more efficiently. The cover letter for commit aec745e2c520
> had a few suggestions [1].
>
Using the iif check against netdev feels a bit "dirty," but if there's
a strong view that it's okay, we could liberate that bit (some testing
needed) and in that case i would leave skb->from_ingress alone.
> Another redirect limit we have is XMIT_RECURSION_LIMIT. Though this
> assumes running in a single call stack. BPF redirect uses it as of
> commit a70b506efe89 ("bpf: enforce recursion limit on redirects"). For
> tx. Rx takes netif_rx, so enqueue_to_backlog. So it does not seem that
> this can address the entire request.
True.
cheers,
jamal
> [1] https://lists.openwall.net/netdev/2017/01/07/92
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH net 0/6] net/sched: Fix packet loops in mirred and netem
2026-01-29 21:22 ` Jamal Hadi Salim
@ 2026-02-28 13:05 ` Jamal Hadi Salim
0 siblings, 0 replies; 26+ messages in thread
From: Jamal Hadi Salim @ 2026-02-28 13:05 UTC (permalink / raw)
To: Willem de Bruijn
Cc: Paolo Abeni, davem, edumazet, kuba, horms, andrew+netdev, netdev,
jiri, victor, dcaratti, lariel, daniel, pablo, kadlec, fw, phil,
netfilter-devel, coreteam, zyc199902, lrGerlinde, jschung2,
Stephen Hemminger
()
Sorry, I was busied out so dropped the ball on this one...
On Thu, Jan 29, 2026 at 4:22 PM Jamal Hadi Salim <jhs@mojatatu.com> wrote:
>
> On Thu, Jan 29, 2026 at 12:06 AM Willem de Bruijn
> <willemdebruijn.kernel@gmail.com> wrote:
> >
> > On Thu, Jan 15, 2026 at 6:33 AM Jamal Hadi Salim <jhs@mojatatu.com> wrote:
> > >
> > > On Thu, Jan 15, 2026 at 5:23 AM Paolo Abeni <pabeni@redhat.com> wrote:
[..]
> > > > Generally speaking I think that a more self-encapsulated solution should
> > > > be preferable.
> > > >
> > >
> > > I dont see a way to do that with mirred. I am more than happy if
> > > someone else solves that issue or gives me an idea how to.
> >
> > It might be informative that there used to be a redirect ttl field. I
> > removed it as part of tc_verd in commit aec745e2c520 ("net-tc: remove
> > unused tc_verd fields"). It was already unused by that time. The
> > mechanism was removed earlier in commit c19ae86a510c ("tc: remove
> > unused redirect ttl").
>
> True - we used to have 3 bits, but we can leave with two;
> Unfortunately, a single bit won't work - otherwise we could have just
> used the one available.
> Only one bit was available, so we recouped another (skb->from_ingress)
> and adapted it to use the tc cb struct.
> There is still a chance that could cause issues - all the users of
> this field were CC'ed and i was hoping they would respond.
>
And indeed reclaiming that bit is problematic. So the AI didnt catch this ;->
Stephen, see if you can improve your prompt to catch this in the
future. Should be noted this would be nice to include as well in the
netdev type AI to catch cb stomping bugs. Turning skb->from_ingress
into an skb->cb field is problematic because the bit gets written way
before we hit tc. First the mlnx driver writes it in
mlx5e_tc_int_port_dev_fwd() and then the GRO code trumples over it. So
by the time it gets to tc it could have an arbitrary value. So I guess
our overlords are not ready to take over ;->
> > The IFB specific redirect logic remains (tc_at_ingress,
> > tc_skip_classify, from_ingress, redirected). Maybe some of those bits
> > can be used more efficiently. The cover letter for commit aec745e2c520
> > had a few suggestions [1].
> >
>
> Using the iif check against netdev feels a bit "dirty," but if there's
> a strong view that it's okay, we could liberate that bit (some testing
> needed) and in that case i would leave skb->from_ingress alone.
>
Let me try to reclaim skb->tc_skip_classify first and see if that
causes any issues.
cheers,
jamal
^ permalink raw reply [flat|nested] 26+ messages in thread
end of thread, other threads:[~2026-02-28 13:05 UTC | newest]
Thread overview: 26+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-01-11 16:39 [PATCH net 0/6] net/sched: Fix packet loops in mirred and netem Jamal Hadi Salim
2026-01-11 16:39 ` [PATCH net 1/6] net: Introduce skb ttl field to track packet loops Jamal Hadi Salim
2026-01-13 19:44 ` Cong Wang
2026-01-11 16:39 ` [PATCH net 2/6] net/sched: Fix ethx:ingress -> ethy:egress -> ethx:ingress mirred loop Jamal Hadi Salim
2026-01-11 16:39 ` [PATCH net 3/6] Revert "net/sched: Restrict conditions for adding duplicating netems to qdisc tree" Jamal Hadi Salim
2026-01-11 16:39 ` [PATCH net 4/6] Revert "selftests/tc-testing: Add tests for restrictions on netem duplication" Jamal Hadi Salim
2026-01-11 16:39 ` [PATCH net 5/6] net/sched: fix packet loop on netem when duplicate is on Jamal Hadi Salim
2026-01-11 20:22 ` William Liu
2026-01-11 20:39 ` Cong Wang
2026-01-11 21:56 ` Jamal Hadi Salim
2026-01-13 19:32 ` Cong Wang
2026-01-11 16:39 ` [PATCH net 6/6] selftests/tc-testing: Add netem/mirred test cases exercising loops Jamal Hadi Salim
2026-01-13 6:20 ` [PATCH net 0/6] net/sched: Fix packet loops in mirred and netem Stephen Hemminger
2026-01-13 19:35 ` Cong Wang
2026-01-13 20:10 ` Cong Wang
2026-01-14 16:33 ` Jamal Hadi Salim
2026-01-15 20:16 ` Cong Wang
2026-01-16 15:04 ` Jamal Hadi Salim
2026-01-14 2:07 ` Jakub Kicinski
2026-01-14 16:40 ` Jamal Hadi Salim
2026-01-15 3:28 ` Jakub Kicinski
2026-01-15 10:22 ` Paolo Abeni
2026-01-15 14:32 ` Jamal Hadi Salim
2026-01-29 5:06 ` Willem de Bruijn
2026-01-29 21:22 ` Jamal Hadi Salim
2026-02-28 13:05 ` Jamal Hadi Salim
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox