* [PATCH net 0/3] Fix broken TC_ACT_REDIRECT from qdiscs
@ 2026-06-30 12:33 Daniel Borkmann
2026-06-30 12:33 ` [PATCH net 1/3] bpf: Reject redirect helpers without a bpf_net_context Daniel Borkmann
` (3 more replies)
0 siblings, 4 replies; 9+ messages in thread
From: Daniel Borkmann @ 2026-06-30 12:33 UTC (permalink / raw)
To: kuba; +Cc: pabeni, jhs, bigeasy, andrii, memxor, bpf, netdev
This is an alternative fix to [0] in order to not uglify
__dev_queue_xmit() with sprinkled ifdefs given this can be
simplified and isolated through a simple test into the BPF
redirect helper itself.
I've also added a proper BPF selftest, so there is no need
to check-in a binary BPF object into selftests given we do
have BPF infra for all of this.
[0] https://lore.kernel.org/netdev/20260629102157.737306-1-jhs@mojatatu.com/
[1] https://lore.kernel.org/netdev/20260629102157.737306-4-jhs@mojatatu.com/
Daniel Borkmann (2):
bpf: Reject redirect helpers without a bpf_net_context
selftests/bpf: Add test for redirect from qdisc qevent block
Jamal Hadi Salim (1):
net/sched: Handle TC_ACT_REDIRECT from qdisc filter chains
include/net/pkt_cls.h | 14 ++-
net/core/filter.c | 17 ++-
net/sched/cls_api.c | 6 +-
net/sched/sch_cake.c | 2 +-
net/sched/sch_drr.c | 2 +-
net/sched/sch_dualpi2.c | 2 +-
net/sched/sch_ets.c | 2 +-
net/sched/sch_fq_codel.c | 2 +-
net/sched/sch_fq_pie.c | 2 +-
net/sched/sch_hfsc.c | 2 +-
net/sched/sch_htb.c | 2 +-
net/sched/sch_multiq.c | 2 +-
net/sched/sch_prio.c | 2 +-
net/sched/sch_qfq.c | 2 +-
net/sched/sch_sfb.c | 2 +-
net/sched/sch_sfq.c | 2 +-
tools/testing/selftests/bpf/config | 1 +
.../selftests/bpf/prog_tests/tc_qevent.c | 113 ++++++++++++++++++
.../selftests/bpf/progs/test_tc_qevent.c | 23 ++++
19 files changed, 175 insertions(+), 25 deletions(-)
create mode 100644 tools/testing/selftests/bpf/prog_tests/tc_qevent.c
create mode 100644 tools/testing/selftests/bpf/progs/test_tc_qevent.c
--
2.43.0
^ permalink raw reply [flat|nested] 9+ messages in thread* [PATCH net 1/3] bpf: Reject redirect helpers without a bpf_net_context 2026-06-30 12:33 [PATCH net 0/3] Fix broken TC_ACT_REDIRECT from qdiscs Daniel Borkmann @ 2026-06-30 12:33 ` Daniel Borkmann 2026-06-30 12:33 ` [PATCH net 2/3] net/sched: Handle TC_ACT_REDIRECT from qdisc filter chains Daniel Borkmann ` (2 subsequent siblings) 3 siblings, 0 replies; 9+ messages in thread From: Daniel Borkmann @ 2026-06-30 12:33 UTC (permalink / raw) To: kuba; +Cc: pabeni, jhs, bigeasy, andrii, memxor, bpf, netdev The bpf_redirect*() helpers and skb_do_redirect() obtain the per-task bpf_redirect_info via bpf_net_ctx_get_ri(), which dereferences the current->bpf_net_context unconditionally. That context is established on the paths that run tc BPF such as sch_handle_{ingress,egress}(), *except* for the case where {cls,act}_bpf was attached to a proper qdisc. A program running from there reaches the NULL deref in two ways: * It calls bpf_redirect() directly, which dereferences the context at the top of the helper: tc qdisc add dev eth0 root handle 1: red limit 1MB min 10KB max 20KB \ avpkt 1000 burst 100 qevent early_drop block 10 tc filter add block 10 pref 1 bpf obj redirect.o * It simply returns TC_ACT_REDIRECT without helper call: tcf_qevent_handle() then dispatches to skb_do_redirect(), which dereferences the context Rather than extending bpf_net_context management into the qdisc path, make the redirect helpers refuse to operate when no context exists, and have tcf_qevent_handle() drop a TC_ACT_REDIRECT verdict instead of calling skb_do_redirect(). Previous behaviour was a crash, so nothing regresses by not supporting it. Fixes: 401cb7dae813 ("net: Reference bpf_redirect_info via task_struct on PREEMPT_RT.") Fixes: 3625750f05ec ("net: sched: Introduce helpers for qevent blocks") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> --- net/core/filter.c | 17 +++++++++++------ net/sched/cls_api.c | 6 ++---- 2 files changed, 13 insertions(+), 10 deletions(-) diff --git a/net/core/filter.c b/net/core/filter.c index b446aa8be5c3..11bb0d236822 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -2552,11 +2552,13 @@ int skb_do_redirect(struct sk_buff *skb) BPF_CALL_2(bpf_redirect, u32, ifindex, u64, flags) { - struct bpf_redirect_info *ri = bpf_net_ctx_get_ri(); + struct bpf_redirect_info *ri; - if (unlikely(flags & (~(BPF_F_INGRESS) | BPF_F_REDIRECT_INTERNAL))) + if (unlikely(!bpf_net_ctx_get() || + (flags & (~(BPF_F_INGRESS) | BPF_F_REDIRECT_INTERNAL)))) return TC_ACT_SHOT; + ri = bpf_net_ctx_get_ri(); ri->flags = flags; ri->tgt_index = ifindex; @@ -2573,11 +2575,12 @@ static const struct bpf_func_proto bpf_redirect_proto = { BPF_CALL_2(bpf_redirect_peer, u32, ifindex, u64, flags) { - struct bpf_redirect_info *ri = bpf_net_ctx_get_ri(); + struct bpf_redirect_info *ri; - if (unlikely(flags)) + if (unlikely(!bpf_net_ctx_get() || flags)) return TC_ACT_SHOT; + ri = bpf_net_ctx_get_ri(); ri->flags = BPF_F_PEER; ri->tgt_index = ifindex; @@ -2595,11 +2598,13 @@ static const struct bpf_func_proto bpf_redirect_peer_proto = { BPF_CALL_4(bpf_redirect_neigh, u32, ifindex, struct bpf_redir_neigh *, params, int, plen, u64, flags) { - struct bpf_redirect_info *ri = bpf_net_ctx_get_ri(); + struct bpf_redirect_info *ri; - if (unlikely((plen && plen < sizeof(*params)) || flags)) + if (unlikely((plen && plen < sizeof(*params)) || + !bpf_net_ctx_get() || flags)) return TC_ACT_SHOT; + ri = bpf_net_ctx_get_ri(); ri->flags = BPF_F_NEIGH | (plen ? BPF_F_NEXTHOP : 0); ri->tgt_index = ifindex; diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c index 3e67600a4a1a..ac49ca6d9a0c 100644 --- a/net/sched/cls_api.c +++ b/net/sched/cls_api.c @@ -4034,6 +4034,8 @@ struct sk_buff *tcf_qevent_handle(struct tcf_qevent *qe, struct Qdisc *sch, stru fl = rcu_dereference_bh(qe->filter_chain); switch (tcf_classify(skb, NULL, fl, &cl_res, false)) { + case TC_ACT_REDIRECT: + fallthrough; case TC_ACT_SHOT: qdisc_qstats_drop(sch); __qdisc_drop(skb, to_free); @@ -4045,10 +4047,6 @@ struct sk_buff *tcf_qevent_handle(struct tcf_qevent *qe, struct Qdisc *sch, stru __qdisc_drop(skb, to_free); *ret = __NET_XMIT_STOLEN; return NULL; - case TC_ACT_REDIRECT: - skb_do_redirect(skb); - *ret = __NET_XMIT_STOLEN; - return NULL; case TC_ACT_CONSUMED: *ret = __NET_XMIT_STOLEN; return NULL; -- 2.43.0 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH net 2/3] net/sched: Handle TC_ACT_REDIRECT from qdisc filter chains 2026-06-30 12:33 [PATCH net 0/3] Fix broken TC_ACT_REDIRECT from qdiscs Daniel Borkmann 2026-06-30 12:33 ` [PATCH net 1/3] bpf: Reject redirect helpers without a bpf_net_context Daniel Borkmann @ 2026-06-30 12:33 ` Daniel Borkmann 2026-06-30 15:16 ` Jamal Hadi Salim 2026-06-30 12:33 ` [PATCH net 3/3] selftests/bpf: Add test for redirect from qdisc qevent block Daniel Borkmann 2026-06-30 14:37 ` [PATCH net 0/3] Fix broken TC_ACT_REDIRECT from qdiscs Sebastian Andrzej Siewior 3 siblings, 1 reply; 9+ messages in thread From: Daniel Borkmann @ 2026-06-30 12:33 UTC (permalink / raw) To: kuba; +Cc: pabeni, jhs, bigeasy, andrii, memxor, bpf, netdev, Victor Nogueira From: Jamal Hadi Salim <jhs@mojatatu.com> When a TC filter attached to a qdisc filter chain returns TC_ACT_REDIRECT (ex: via an eBPF program calling bpf_redirect() or an act_bpf action), the redirect was silently lost i.e no qdisc classify function handled TC_ACT_REDIRECT, so the packet fell through the switch and was enqueued normally instead of being redirected. This has been broken since bpf_redirect() was introduced for TC in commit 27b29f63058d ("bpf: add bpf_redirect() helper"). We got lucky for a long time because bpf_net_context was a per-CPU variable that was always available. commit 401cb7dae813 ("net: Reference bpf_redirect_info via task_struct on PREEMPT_RT.") turned bpf_net_context into a task_struct member that is only set up by explicit callers. Without a caller setting it up, bpf_redirect() itself crashes with a NULL pointer dereference in bpf_net_ctx_get_ri(). However, even with bpf_net_context available, TC_ACT_REDIRECT from qdisc filter chains cannot be honored without adding skb_do_redirect() calls to every qdisc classify function, which would require changes across net/sched/. Isolate it to ebpf core where it belongs. Instead, add a tcf_classify_qdisc() inline helper in pkt_cls.h, as a wrapper around tcf_classify() for use by qdisc classify functions and tcf_qevent_handle(). When the classify verdict is TC_ACT_REDIRECT, the wrapper converts it to TC_ACT_SHOT, dropping the packet rather than letting it continue silently. Dropping is preferred over letting the packet through because the user immediately sees packet loss. Silently passing the packet through would hide the problem and leave the user wondering why their redirect is not working. The clsact fast path, tc_run() continues to call tcf_classify() directly and is unaffected: TC_ACT_REDIRECT is returned as-is and handled by sch_handle_egress/ingress() calling skb_do_redirect() as before. Fixes: 27b29f63058d ("bpf: add bpf_redirect() helper") Fixes: 401cb7dae813 ("net: Reference bpf_redirect_info via task_struct on PREEMPT_RT.") Tested-by: Victor Nogueira <victor@mojatatu.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> --- include/net/pkt_cls.h | 14 +++++++++++++- net/sched/cls_api.c | 4 +--- net/sched/sch_cake.c | 2 +- net/sched/sch_drr.c | 2 +- net/sched/sch_dualpi2.c | 2 +- net/sched/sch_ets.c | 2 +- net/sched/sch_fq_codel.c | 2 +- net/sched/sch_fq_pie.c | 2 +- net/sched/sch_hfsc.c | 2 +- net/sched/sch_htb.c | 2 +- net/sched/sch_multiq.c | 2 +- net/sched/sch_prio.c | 2 +- net/sched/sch_qfq.c | 2 +- net/sched/sch_sfb.c | 2 +- net/sched/sch_sfq.c | 2 +- 15 files changed, 27 insertions(+), 17 deletions(-) diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h index 3bd08d7f39c1..5f5cb36439fe 100644 --- a/include/net/pkt_cls.h +++ b/include/net/pkt_cls.h @@ -156,8 +156,20 @@ static inline int tcf_classify(struct sk_buff *skb, { return TC_ACT_UNSPEC; } - #endif +static inline int tcf_classify_qdisc(struct sk_buff *skb, + const struct tcf_proto *tp, + struct tcf_result *res, bool compat_mode) +{ + int ret = tcf_classify(skb, NULL, tp, res, compat_mode); + + /* TC_ACT_REDIRECT from qdisc filter chains is not supported. + * Use BPF via tcx or mirred redirect instead. + */ + if (unlikely(ret == TC_ACT_REDIRECT)) + ret = TC_ACT_SHOT; + return ret; +} static inline unsigned long __cls_set_class(unsigned long *clp, unsigned long cl) diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c index ac49ca6d9a0c..3ca56d060e28 100644 --- a/net/sched/cls_api.c +++ b/net/sched/cls_api.c @@ -4033,9 +4033,7 @@ struct sk_buff *tcf_qevent_handle(struct tcf_qevent *qe, struct Qdisc *sch, stru fl = rcu_dereference_bh(qe->filter_chain); - switch (tcf_classify(skb, NULL, fl, &cl_res, false)) { - case TC_ACT_REDIRECT: - fallthrough; + switch (tcf_classify_qdisc(skb, fl, &cl_res, false)) { case TC_ACT_SHOT: qdisc_qstats_drop(sch); __qdisc_drop(skb, to_free); diff --git a/net/sched/sch_cake.c b/net/sched/sch_cake.c index a3c185505afc..94eb47ac54ee 100644 --- a/net/sched/sch_cake.c +++ b/net/sched/sch_cake.c @@ -1730,7 +1730,7 @@ static u32 cake_classify(struct Qdisc *sch, struct cake_tin_data **t, goto hash; *qerr = NET_XMIT_SUCCESS | __NET_XMIT_BYPASS; - result = tcf_classify(skb, NULL, filter, &res, false); + result = tcf_classify_qdisc(skb, filter, &res, false); if (result >= 0) { #ifdef CONFIG_NET_CLS_ACT diff --git a/net/sched/sch_drr.c b/net/sched/sch_drr.c index 020657f959b5..91b1ef824afa 100644 --- a/net/sched/sch_drr.c +++ b/net/sched/sch_drr.c @@ -312,7 +312,7 @@ static struct drr_class *drr_classify(struct sk_buff *skb, struct Qdisc *sch, *qerr = NET_XMIT_SUCCESS | __NET_XMIT_BYPASS; fl = rcu_dereference_bh(q->filter_list); - result = tcf_classify(skb, NULL, fl, &res, false); + result = tcf_classify_qdisc(skb, fl, &res, false); if (result >= 0) { #ifdef CONFIG_NET_CLS_ACT switch (result) { diff --git a/net/sched/sch_dualpi2.c b/net/sched/sch_dualpi2.c index 5434df6ca8ef..98364f74211e 100644 --- a/net/sched/sch_dualpi2.c +++ b/net/sched/sch_dualpi2.c @@ -364,7 +364,7 @@ static int dualpi2_skb_classify(struct dualpi2_sched_data *q, return NET_XMIT_SUCCESS; } - result = tcf_classify(skb, NULL, fl, &res, false); + result = tcf_classify_qdisc(skb, fl, &res, false); if (result >= 0) { #ifdef CONFIG_NET_CLS_ACT switch (result) { diff --git a/net/sched/sch_ets.c b/net/sched/sch_ets.c index cb8cf437ce87..25fcf4079fec 100644 --- a/net/sched/sch_ets.c +++ b/net/sched/sch_ets.c @@ -391,7 +391,7 @@ static struct ets_class *ets_classify(struct sk_buff *skb, struct Qdisc *sch, *qerr = NET_XMIT_SUCCESS | __NET_XMIT_BYPASS; if (TC_H_MAJ(skb->priority) != sch->handle) { fl = rcu_dereference_bh(q->filter_list); - err = tcf_classify(skb, NULL, fl, &res, false); + err = tcf_classify_qdisc(skb, fl, &res, false); #ifdef CONFIG_NET_CLS_ACT switch (err) { case TC_ACT_STOLEN: diff --git a/net/sched/sch_fq_codel.c b/net/sched/sch_fq_codel.c index cafd1f943d99..6cce86ba383c 100644 --- a/net/sched/sch_fq_codel.c +++ b/net/sched/sch_fq_codel.c @@ -91,7 +91,7 @@ static unsigned int fq_codel_classify(struct sk_buff *skb, struct Qdisc *sch, return fq_codel_hash(q, skb) + 1; *qerr = NET_XMIT_SUCCESS | __NET_XMIT_BYPASS; - result = tcf_classify(skb, NULL, filter, &res, false); + result = tcf_classify_qdisc(skb, filter, &res, false); if (result >= 0) { #ifdef CONFIG_NET_CLS_ACT switch (result) { diff --git a/net/sched/sch_fq_pie.c b/net/sched/sch_fq_pie.c index 72f48fa4010b..069e1facd413 100644 --- a/net/sched/sch_fq_pie.c +++ b/net/sched/sch_fq_pie.c @@ -96,7 +96,7 @@ static unsigned int fq_pie_classify(struct sk_buff *skb, struct Qdisc *sch, return fq_pie_hash(q, skb) + 1; *qerr = NET_XMIT_SUCCESS | __NET_XMIT_BYPASS; - result = tcf_classify(skb, NULL, filter, &res, false); + result = tcf_classify_qdisc(skb, filter, &res, false); if (result >= 0) { #ifdef CONFIG_NET_CLS_ACT switch (result) { diff --git a/net/sched/sch_hfsc.c b/net/sched/sch_hfsc.c index 7e537295b8b6..e87f5021a199 100644 --- a/net/sched/sch_hfsc.c +++ b/net/sched/sch_hfsc.c @@ -1143,7 +1143,7 @@ hfsc_classify(struct sk_buff *skb, struct Qdisc *sch, int *qerr) *qerr = NET_XMIT_SUCCESS | __NET_XMIT_BYPASS; head = &q->root; tcf = rcu_dereference_bh(q->root.filter_list); - while (tcf && (result = tcf_classify(skb, NULL, tcf, &res, false)) >= 0) { + while (tcf && (result = tcf_classify_qdisc(skb, tcf, &res, false)) >= 0) { #ifdef CONFIG_NET_CLS_ACT switch (result) { case TC_ACT_QUEUED: diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c index 908b9ba9ba2e..fdac0dc8f35a 100644 --- a/net/sched/sch_htb.c +++ b/net/sched/sch_htb.c @@ -243,7 +243,7 @@ static struct htb_class *htb_classify(struct sk_buff *skb, struct Qdisc *sch, } *qerr = NET_XMIT_SUCCESS | __NET_XMIT_BYPASS; - while (tcf && (result = tcf_classify(skb, NULL, tcf, &res, false)) >= 0) { + while (tcf && (result = tcf_classify_qdisc(skb, tcf, &res, false)) >= 0) { #ifdef CONFIG_NET_CLS_ACT switch (result) { case TC_ACT_QUEUED: diff --git a/net/sched/sch_multiq.c b/net/sched/sch_multiq.c index 4e465d11e3d7..004f0d275caf 100644 --- a/net/sched/sch_multiq.c +++ b/net/sched/sch_multiq.c @@ -36,7 +36,7 @@ multiq_classify(struct sk_buff *skb, struct Qdisc *sch, int *qerr) int err; *qerr = NET_XMIT_SUCCESS | __NET_XMIT_BYPASS; - err = tcf_classify(skb, NULL, fl, &res, false); + err = tcf_classify_qdisc(skb, fl, &res, false); #ifdef CONFIG_NET_CLS_ACT switch (err) { case TC_ACT_STOLEN: diff --git a/net/sched/sch_prio.c b/net/sched/sch_prio.c index e4dd56a89072..79437c587e7e 100644 --- a/net/sched/sch_prio.c +++ b/net/sched/sch_prio.c @@ -39,7 +39,7 @@ prio_classify(struct sk_buff *skb, struct Qdisc *sch, int *qerr) *qerr = NET_XMIT_SUCCESS | __NET_XMIT_BYPASS; if (TC_H_MAJ(skb->priority) != sch->handle) { fl = rcu_dereference_bh(q->filter_list); - err = tcf_classify(skb, NULL, fl, &res, false); + err = tcf_classify_qdisc(skb, fl, &res, false); #ifdef CONFIG_NET_CLS_ACT switch (err) { case TC_ACT_STOLEN: diff --git a/net/sched/sch_qfq.c b/net/sched/sch_qfq.c index cb56787e1d25..6f3b7273cb16 100644 --- a/net/sched/sch_qfq.c +++ b/net/sched/sch_qfq.c @@ -709,7 +709,7 @@ static struct qfq_class *qfq_classify(struct sk_buff *skb, struct Qdisc *sch, *qerr = NET_XMIT_SUCCESS | __NET_XMIT_BYPASS; fl = rcu_dereference_bh(q->filter_list); - result = tcf_classify(skb, NULL, fl, &res, false); + result = tcf_classify_qdisc(skb, fl, &res, false); if (result >= 0) { #ifdef CONFIG_NET_CLS_ACT switch (result) { diff --git a/net/sched/sch_sfb.c b/net/sched/sch_sfb.c index b1d465094276..ed39869199c0 100644 --- a/net/sched/sch_sfb.c +++ b/net/sched/sch_sfb.c @@ -260,7 +260,7 @@ static bool sfb_classify(struct sk_buff *skb, struct tcf_proto *fl, struct tcf_result res; int result; - result = tcf_classify(skb, NULL, fl, &res, false); + result = tcf_classify_qdisc(skb, fl, &res, false); if (result >= 0) { #ifdef CONFIG_NET_CLS_ACT switch (result) { diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c index 758b88f21865..77675f9a4c46 100644 --- a/net/sched/sch_sfq.c +++ b/net/sched/sch_sfq.c @@ -171,7 +171,7 @@ static unsigned int sfq_classify(struct sk_buff *skb, struct Qdisc *sch, return sfq_hash(q, skb) + 1; *qerr = NET_XMIT_SUCCESS | __NET_XMIT_BYPASS; - result = tcf_classify(skb, NULL, fl, &res, false); + result = tcf_classify_qdisc(skb, fl, &res, false); if (result >= 0) { #ifdef CONFIG_NET_CLS_ACT switch (result) { -- 2.43.0 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH net 2/3] net/sched: Handle TC_ACT_REDIRECT from qdisc filter chains 2026-06-30 12:33 ` [PATCH net 2/3] net/sched: Handle TC_ACT_REDIRECT from qdisc filter chains Daniel Borkmann @ 2026-06-30 15:16 ` Jamal Hadi Salim 2026-06-30 15:23 ` Daniel Borkmann 0 siblings, 1 reply; 9+ messages in thread From: Jamal Hadi Salim @ 2026-06-30 15:16 UTC (permalink / raw) To: Daniel Borkmann Cc: kuba, pabeni, bigeasy, andrii, memxor, bpf, netdev, Victor Nogueira On Tue, Jun 30, 2026 at 8:33 AM Daniel Borkmann <daniel@iogearbox.net> wrote: > > From: Jamal Hadi Salim <jhs@mojatatu.com> > > When a TC filter attached to a qdisc filter chain returns > TC_ACT_REDIRECT (ex: via an eBPF program calling bpf_redirect() or an > act_bpf action), the redirect was silently lost i.e no qdisc classify > function handled TC_ACT_REDIRECT, so the packet fell through the > switch and was enqueued normally instead of being redirected. > > This has been broken since bpf_redirect() was introduced for TC in > commit 27b29f63058d ("bpf: add bpf_redirect() helper"). We got lucky > for a long time because bpf_net_context was a per-CPU variable that > was always available. > > commit 401cb7dae813 ("net: Reference bpf_redirect_info via task_struct > on PREEMPT_RT.") turned bpf_net_context into a task_struct member that > is only set up by explicit callers. Without a caller setting it up, > bpf_redirect() itself crashes with a NULL pointer dereference in > bpf_net_ctx_get_ri(). However, even with bpf_net_context available, > TC_ACT_REDIRECT from qdisc filter chains cannot be honored without > adding skb_do_redirect() calls to every qdisc classify function, which > would require changes across net/sched/. Isolate it to ebpf core where > it belongs. > > Instead, add a tcf_classify_qdisc() inline helper in pkt_cls.h, as a > wrapper around tcf_classify() for use by qdisc classify functions and > tcf_qevent_handle(). When the classify verdict is TC_ACT_REDIRECT, > the wrapper converts it to TC_ACT_SHOT, dropping the packet rather > than letting it continue silently. Dropping is preferred over > letting the packet through because the user immediately sees packet > loss. Silently passing the packet through would hide the problem and > leave the user wondering why their redirect is not working. > > The clsact fast path, tc_run() continues to call tcf_classify() directly > and is unaffected: TC_ACT_REDIRECT is returned as-is and handled by > sch_handle_egress/ingress() calling skb_do_redirect() as before. > > Fixes: 27b29f63058d ("bpf: add bpf_redirect() helper") > Fixes: 401cb7dae813 ("net: Reference bpf_redirect_info via task_struct on PREEMPT_RT.") > Tested-by: Victor Nogueira <victor@mojatatu.com> > Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> > Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> > --- > include/net/pkt_cls.h | 14 +++++++++++++- > net/sched/cls_api.c | 4 +--- > net/sched/sch_cake.c | 2 +- > net/sched/sch_drr.c | 2 +- > net/sched/sch_dualpi2.c | 2 +- > net/sched/sch_ets.c | 2 +- > net/sched/sch_fq_codel.c | 2 +- > net/sched/sch_fq_pie.c | 2 +- > net/sched/sch_hfsc.c | 2 +- > net/sched/sch_htb.c | 2 +- > net/sched/sch_multiq.c | 2 +- > net/sched/sch_prio.c | 2 +- > net/sched/sch_qfq.c | 2 +- > net/sched/sch_sfb.c | 2 +- > net/sched/sch_sfq.c | 2 +- > 15 files changed, 27 insertions(+), 17 deletions(-) > > diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h > index 3bd08d7f39c1..5f5cb36439fe 100644 > --- a/include/net/pkt_cls.h > +++ b/include/net/pkt_cls.h > @@ -156,8 +156,20 @@ static inline int tcf_classify(struct sk_buff *skb, > { > return TC_ACT_UNSPEC; > } > - > #endif > +static inline int tcf_classify_qdisc(struct sk_buff *skb, > + const struct tcf_proto *tp, > + struct tcf_result *res, bool compat_mode) > +{ > + int ret = tcf_classify(skb, NULL, tp, res, compat_mode); > + > + /* TC_ACT_REDIRECT from qdisc filter chains is not supported. > + * Use BPF via tcx or mirred redirect instead. > + */ > + if (unlikely(ret == TC_ACT_REDIRECT)) > + ret = TC_ACT_SHOT; > + return ret; > +} > Why did you remove the warning? A lesser issue is that you introduced a space above #endif cheers, jamal > static inline unsigned long > __cls_set_class(unsigned long *clp, unsigned long cl) > diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c > index ac49ca6d9a0c..3ca56d060e28 100644 > --- a/net/sched/cls_api.c > +++ b/net/sched/cls_api.c > @@ -4033,9 +4033,7 @@ struct sk_buff *tcf_qevent_handle(struct tcf_qevent *qe, struct Qdisc *sch, stru > > fl = rcu_dereference_bh(qe->filter_chain); > > - switch (tcf_classify(skb, NULL, fl, &cl_res, false)) { > - case TC_ACT_REDIRECT: > - fallthrough; > + switch (tcf_classify_qdisc(skb, fl, &cl_res, false)) { > case TC_ACT_SHOT: > qdisc_qstats_drop(sch); > __qdisc_drop(skb, to_free); > diff --git a/net/sched/sch_cake.c b/net/sched/sch_cake.c > index a3c185505afc..94eb47ac54ee 100644 > --- a/net/sched/sch_cake.c > +++ b/net/sched/sch_cake.c > @@ -1730,7 +1730,7 @@ static u32 cake_classify(struct Qdisc *sch, struct cake_tin_data **t, > goto hash; > > *qerr = NET_XMIT_SUCCESS | __NET_XMIT_BYPASS; > - result = tcf_classify(skb, NULL, filter, &res, false); > + result = tcf_classify_qdisc(skb, filter, &res, false); > > if (result >= 0) { > #ifdef CONFIG_NET_CLS_ACT > diff --git a/net/sched/sch_drr.c b/net/sched/sch_drr.c > index 020657f959b5..91b1ef824afa 100644 > --- a/net/sched/sch_drr.c > +++ b/net/sched/sch_drr.c > @@ -312,7 +312,7 @@ static struct drr_class *drr_classify(struct sk_buff *skb, struct Qdisc *sch, > > *qerr = NET_XMIT_SUCCESS | __NET_XMIT_BYPASS; > fl = rcu_dereference_bh(q->filter_list); > - result = tcf_classify(skb, NULL, fl, &res, false); > + result = tcf_classify_qdisc(skb, fl, &res, false); > if (result >= 0) { > #ifdef CONFIG_NET_CLS_ACT > switch (result) { > diff --git a/net/sched/sch_dualpi2.c b/net/sched/sch_dualpi2.c > index 5434df6ca8ef..98364f74211e 100644 > --- a/net/sched/sch_dualpi2.c > +++ b/net/sched/sch_dualpi2.c > @@ -364,7 +364,7 @@ static int dualpi2_skb_classify(struct dualpi2_sched_data *q, > return NET_XMIT_SUCCESS; > } > > - result = tcf_classify(skb, NULL, fl, &res, false); > + result = tcf_classify_qdisc(skb, fl, &res, false); > if (result >= 0) { > #ifdef CONFIG_NET_CLS_ACT > switch (result) { > diff --git a/net/sched/sch_ets.c b/net/sched/sch_ets.c > index cb8cf437ce87..25fcf4079fec 100644 > --- a/net/sched/sch_ets.c > +++ b/net/sched/sch_ets.c > @@ -391,7 +391,7 @@ static struct ets_class *ets_classify(struct sk_buff *skb, struct Qdisc *sch, > *qerr = NET_XMIT_SUCCESS | __NET_XMIT_BYPASS; > if (TC_H_MAJ(skb->priority) != sch->handle) { > fl = rcu_dereference_bh(q->filter_list); > - err = tcf_classify(skb, NULL, fl, &res, false); > + err = tcf_classify_qdisc(skb, fl, &res, false); > #ifdef CONFIG_NET_CLS_ACT > switch (err) { > case TC_ACT_STOLEN: > diff --git a/net/sched/sch_fq_codel.c b/net/sched/sch_fq_codel.c > index cafd1f943d99..6cce86ba383c 100644 > --- a/net/sched/sch_fq_codel.c > +++ b/net/sched/sch_fq_codel.c > @@ -91,7 +91,7 @@ static unsigned int fq_codel_classify(struct sk_buff *skb, struct Qdisc *sch, > return fq_codel_hash(q, skb) + 1; > > *qerr = NET_XMIT_SUCCESS | __NET_XMIT_BYPASS; > - result = tcf_classify(skb, NULL, filter, &res, false); > + result = tcf_classify_qdisc(skb, filter, &res, false); > if (result >= 0) { > #ifdef CONFIG_NET_CLS_ACT > switch (result) { > diff --git a/net/sched/sch_fq_pie.c b/net/sched/sch_fq_pie.c > index 72f48fa4010b..069e1facd413 100644 > --- a/net/sched/sch_fq_pie.c > +++ b/net/sched/sch_fq_pie.c > @@ -96,7 +96,7 @@ static unsigned int fq_pie_classify(struct sk_buff *skb, struct Qdisc *sch, > return fq_pie_hash(q, skb) + 1; > > *qerr = NET_XMIT_SUCCESS | __NET_XMIT_BYPASS; > - result = tcf_classify(skb, NULL, filter, &res, false); > + result = tcf_classify_qdisc(skb, filter, &res, false); > if (result >= 0) { > #ifdef CONFIG_NET_CLS_ACT > switch (result) { > diff --git a/net/sched/sch_hfsc.c b/net/sched/sch_hfsc.c > index 7e537295b8b6..e87f5021a199 100644 > --- a/net/sched/sch_hfsc.c > +++ b/net/sched/sch_hfsc.c > @@ -1143,7 +1143,7 @@ hfsc_classify(struct sk_buff *skb, struct Qdisc *sch, int *qerr) > *qerr = NET_XMIT_SUCCESS | __NET_XMIT_BYPASS; > head = &q->root; > tcf = rcu_dereference_bh(q->root.filter_list); > - while (tcf && (result = tcf_classify(skb, NULL, tcf, &res, false)) >= 0) { > + while (tcf && (result = tcf_classify_qdisc(skb, tcf, &res, false)) >= 0) { > #ifdef CONFIG_NET_CLS_ACT > switch (result) { > case TC_ACT_QUEUED: > diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c > index 908b9ba9ba2e..fdac0dc8f35a 100644 > --- a/net/sched/sch_htb.c > +++ b/net/sched/sch_htb.c > @@ -243,7 +243,7 @@ static struct htb_class *htb_classify(struct sk_buff *skb, struct Qdisc *sch, > } > > *qerr = NET_XMIT_SUCCESS | __NET_XMIT_BYPASS; > - while (tcf && (result = tcf_classify(skb, NULL, tcf, &res, false)) >= 0) { > + while (tcf && (result = tcf_classify_qdisc(skb, tcf, &res, false)) >= 0) { > #ifdef CONFIG_NET_CLS_ACT > switch (result) { > case TC_ACT_QUEUED: > diff --git a/net/sched/sch_multiq.c b/net/sched/sch_multiq.c > index 4e465d11e3d7..004f0d275caf 100644 > --- a/net/sched/sch_multiq.c > +++ b/net/sched/sch_multiq.c > @@ -36,7 +36,7 @@ multiq_classify(struct sk_buff *skb, struct Qdisc *sch, int *qerr) > int err; > > *qerr = NET_XMIT_SUCCESS | __NET_XMIT_BYPASS; > - err = tcf_classify(skb, NULL, fl, &res, false); > + err = tcf_classify_qdisc(skb, fl, &res, false); > #ifdef CONFIG_NET_CLS_ACT > switch (err) { > case TC_ACT_STOLEN: > diff --git a/net/sched/sch_prio.c b/net/sched/sch_prio.c > index e4dd56a89072..79437c587e7e 100644 > --- a/net/sched/sch_prio.c > +++ b/net/sched/sch_prio.c > @@ -39,7 +39,7 @@ prio_classify(struct sk_buff *skb, struct Qdisc *sch, int *qerr) > *qerr = NET_XMIT_SUCCESS | __NET_XMIT_BYPASS; > if (TC_H_MAJ(skb->priority) != sch->handle) { > fl = rcu_dereference_bh(q->filter_list); > - err = tcf_classify(skb, NULL, fl, &res, false); > + err = tcf_classify_qdisc(skb, fl, &res, false); > #ifdef CONFIG_NET_CLS_ACT > switch (err) { > case TC_ACT_STOLEN: > diff --git a/net/sched/sch_qfq.c b/net/sched/sch_qfq.c > index cb56787e1d25..6f3b7273cb16 100644 > --- a/net/sched/sch_qfq.c > +++ b/net/sched/sch_qfq.c > @@ -709,7 +709,7 @@ static struct qfq_class *qfq_classify(struct sk_buff *skb, struct Qdisc *sch, > > *qerr = NET_XMIT_SUCCESS | __NET_XMIT_BYPASS; > fl = rcu_dereference_bh(q->filter_list); > - result = tcf_classify(skb, NULL, fl, &res, false); > + result = tcf_classify_qdisc(skb, fl, &res, false); > if (result >= 0) { > #ifdef CONFIG_NET_CLS_ACT > switch (result) { > diff --git a/net/sched/sch_sfb.c b/net/sched/sch_sfb.c > index b1d465094276..ed39869199c0 100644 > --- a/net/sched/sch_sfb.c > +++ b/net/sched/sch_sfb.c > @@ -260,7 +260,7 @@ static bool sfb_classify(struct sk_buff *skb, struct tcf_proto *fl, > struct tcf_result res; > int result; > > - result = tcf_classify(skb, NULL, fl, &res, false); > + result = tcf_classify_qdisc(skb, fl, &res, false); > if (result >= 0) { > #ifdef CONFIG_NET_CLS_ACT > switch (result) { > diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c > index 758b88f21865..77675f9a4c46 100644 > --- a/net/sched/sch_sfq.c > +++ b/net/sched/sch_sfq.c > @@ -171,7 +171,7 @@ static unsigned int sfq_classify(struct sk_buff *skb, struct Qdisc *sch, > return sfq_hash(q, skb) + 1; > > *qerr = NET_XMIT_SUCCESS | __NET_XMIT_BYPASS; > - result = tcf_classify(skb, NULL, fl, &res, false); > + result = tcf_classify_qdisc(skb, fl, &res, false); > if (result >= 0) { > #ifdef CONFIG_NET_CLS_ACT > switch (result) { > -- > 2.43.0 > ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH net 2/3] net/sched: Handle TC_ACT_REDIRECT from qdisc filter chains 2026-06-30 15:16 ` Jamal Hadi Salim @ 2026-06-30 15:23 ` Daniel Borkmann 2026-07-01 15:35 ` Jamal Hadi Salim 0 siblings, 1 reply; 9+ messages in thread From: Daniel Borkmann @ 2026-06-30 15:23 UTC (permalink / raw) To: Jamal Hadi Salim Cc: kuba, pabeni, bigeasy, andrii, memxor, bpf, netdev, Victor Nogueira On 6/30/26 5:16 PM, Jamal Hadi Salim wrote: > On Tue, Jun 30, 2026 at 8:33 AM Daniel Borkmann <daniel@iogearbox.net> wrote: >> From: Jamal Hadi Salim <jhs@mojatatu.com> >> >> When a TC filter attached to a qdisc filter chain returns >> TC_ACT_REDIRECT (ex: via an eBPF program calling bpf_redirect() or an >> act_bpf action), the redirect was silently lost i.e no qdisc classify >> function handled TC_ACT_REDIRECT, so the packet fell through the >> switch and was enqueued normally instead of being redirected. >> >> This has been broken since bpf_redirect() was introduced for TC in >> commit 27b29f63058d ("bpf: add bpf_redirect() helper"). We got lucky >> for a long time because bpf_net_context was a per-CPU variable that >> was always available. >> >> commit 401cb7dae813 ("net: Reference bpf_redirect_info via task_struct >> on PREEMPT_RT.") turned bpf_net_context into a task_struct member that >> is only set up by explicit callers. Without a caller setting it up, >> bpf_redirect() itself crashes with a NULL pointer dereference in >> bpf_net_ctx_get_ri(). However, even with bpf_net_context available, >> TC_ACT_REDIRECT from qdisc filter chains cannot be honored without >> adding skb_do_redirect() calls to every qdisc classify function, which >> would require changes across net/sched/. Isolate it to ebpf core where >> it belongs. >> >> Instead, add a tcf_classify_qdisc() inline helper in pkt_cls.h, as a >> wrapper around tcf_classify() for use by qdisc classify functions and >> tcf_qevent_handle(). When the classify verdict is TC_ACT_REDIRECT, >> the wrapper converts it to TC_ACT_SHOT, dropping the packet rather >> than letting it continue silently. Dropping is preferred over >> letting the packet through because the user immediately sees packet >> loss. Silently passing the packet through would hide the problem and >> leave the user wondering why their redirect is not working. >> >> The clsact fast path, tc_run() continues to call tcf_classify() directly >> and is unaffected: TC_ACT_REDIRECT is returned as-is and handled by >> sch_handle_egress/ingress() calling skb_do_redirect() as before. >> >> Fixes: 27b29f63058d ("bpf: add bpf_redirect() helper") >> Fixes: 401cb7dae813 ("net: Reference bpf_redirect_info via task_struct on PREEMPT_RT.") >> Tested-by: Victor Nogueira <victor@mojatatu.com> >> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> >> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> >> --- >> include/net/pkt_cls.h | 14 +++++++++++++- >> net/sched/cls_api.c | 4 +--- >> net/sched/sch_cake.c | 2 +- >> net/sched/sch_drr.c | 2 +- >> net/sched/sch_dualpi2.c | 2 +- >> net/sched/sch_ets.c | 2 +- >> net/sched/sch_fq_codel.c | 2 +- >> net/sched/sch_fq_pie.c | 2 +- >> net/sched/sch_hfsc.c | 2 +- >> net/sched/sch_htb.c | 2 +- >> net/sched/sch_multiq.c | 2 +- >> net/sched/sch_prio.c | 2 +- >> net/sched/sch_qfq.c | 2 +- >> net/sched/sch_sfb.c | 2 +- >> net/sched/sch_sfq.c | 2 +- >> 15 files changed, 27 insertions(+), 17 deletions(-) >> >> diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h >> index 3bd08d7f39c1..5f5cb36439fe 100644 >> --- a/include/net/pkt_cls.h >> +++ b/include/net/pkt_cls.h >> @@ -156,8 +156,20 @@ static inline int tcf_classify(struct sk_buff *skb, >> { >> return TC_ACT_UNSPEC; >> } >> - >> #endif >> +static inline int tcf_classify_qdisc(struct sk_buff *skb, >> + const struct tcf_proto *tp, >> + struct tcf_result *res, bool compat_mode) >> +{ >> + int ret = tcf_classify(skb, NULL, tp, res, compat_mode); >> + >> + /* TC_ACT_REDIRECT from qdisc filter chains is not supported. >> + * Use BPF via tcx or mirred redirect instead. >> + */ >> + if (unlikely(ret == TC_ACT_REDIRECT)) >> + ret = TC_ACT_SHOT; >> + return ret; >> +} > > Why did you remove the warning? > A lesser issue is that you introduced a space above #endif I don't think we need to put out an extra warn to spam the log, this can be easily traced either via bpftrace or qdisc stats etc; plus the guidance to eBPF with clsact is also obsolete. Given noone has run into this the last 10y, I don't think it really matters tbh. Thanks, Daniel ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH net 2/3] net/sched: Handle TC_ACT_REDIRECT from qdisc filter chains 2026-06-30 15:23 ` Daniel Borkmann @ 2026-07-01 15:35 ` Jamal Hadi Salim 0 siblings, 0 replies; 9+ messages in thread From: Jamal Hadi Salim @ 2026-07-01 15:35 UTC (permalink / raw) To: Daniel Borkmann Cc: Jakub Kicinski, Paolo Abeni, Sebastian Andrzej Siewior, Andrii Nakryiko, Kumar Kartikeya Dwivedi, bpf, Linux Kernel Network Developers, Victor Nogueira On Tue, Jun 30, 2026 at 11:23 AM Daniel Borkmann <daniel@iogearbox.net> wrote: > > On 6/30/26 5:16 PM, Jamal Hadi Salim wrote: > > On Tue, Jun 30, 2026 at 8:33 AM Daniel Borkmann <daniel@iogearbox.net> wrote: > >> From: Jamal Hadi Salim <jhs@mojatatu.com> > >> > >> When a TC filter attached to a qdisc filter chain returns > >> TC_ACT_REDIRECT (ex: via an eBPF program calling bpf_redirect() or an > >> act_bpf action), the redirect was silently lost i.e no qdisc classify > >> function handled TC_ACT_REDIRECT, so the packet fell through the > >> switch and was enqueued normally instead of being redirected. > >> > >> This has been broken since bpf_redirect() was introduced for TC in > >> commit 27b29f63058d ("bpf: add bpf_redirect() helper"). We got lucky > >> for a long time because bpf_net_context was a per-CPU variable that > >> was always available. > >> > >> commit 401cb7dae813 ("net: Reference bpf_redirect_info via task_struct > >> on PREEMPT_RT.") turned bpf_net_context into a task_struct member that > >> is only set up by explicit callers. Without a caller setting it up, > >> bpf_redirect() itself crashes with a NULL pointer dereference in > >> bpf_net_ctx_get_ri(). However, even with bpf_net_context available, > >> TC_ACT_REDIRECT from qdisc filter chains cannot be honored without > >> adding skb_do_redirect() calls to every qdisc classify function, which > >> would require changes across net/sched/. Isolate it to ebpf core where > >> it belongs. > >> > >> Instead, add a tcf_classify_qdisc() inline helper in pkt_cls.h, as a > >> wrapper around tcf_classify() for use by qdisc classify functions and > >> tcf_qevent_handle(). When the classify verdict is TC_ACT_REDIRECT, > >> the wrapper converts it to TC_ACT_SHOT, dropping the packet rather > >> than letting it continue silently. Dropping is preferred over > >> letting the packet through because the user immediately sees packet > >> loss. Silently passing the packet through would hide the problem and > >> leave the user wondering why their redirect is not working. > >> > >> The clsact fast path, tc_run() continues to call tcf_classify() directly > >> and is unaffected: TC_ACT_REDIRECT is returned as-is and handled by > >> sch_handle_egress/ingress() calling skb_do_redirect() as before. > >> > >> Fixes: 27b29f63058d ("bpf: add bpf_redirect() helper") > >> Fixes: 401cb7dae813 ("net: Reference bpf_redirect_info via task_struct on PREEMPT_RT.") > >> Tested-by: Victor Nogueira <victor@mojatatu.com> > >> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> > >> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> > >> --- > >> include/net/pkt_cls.h | 14 +++++++++++++- > >> net/sched/cls_api.c | 4 +--- > >> net/sched/sch_cake.c | 2 +- > >> net/sched/sch_drr.c | 2 +- > >> net/sched/sch_dualpi2.c | 2 +- > >> net/sched/sch_ets.c | 2 +- > >> net/sched/sch_fq_codel.c | 2 +- > >> net/sched/sch_fq_pie.c | 2 +- > >> net/sched/sch_hfsc.c | 2 +- > >> net/sched/sch_htb.c | 2 +- > >> net/sched/sch_multiq.c | 2 +- > >> net/sched/sch_prio.c | 2 +- > >> net/sched/sch_qfq.c | 2 +- > >> net/sched/sch_sfb.c | 2 +- > >> net/sched/sch_sfq.c | 2 +- > >> 15 files changed, 27 insertions(+), 17 deletions(-) > >> > >> diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h > >> index 3bd08d7f39c1..5f5cb36439fe 100644 > >> --- a/include/net/pkt_cls.h > >> +++ b/include/net/pkt_cls.h > >> @@ -156,8 +156,20 @@ static inline int tcf_classify(struct sk_buff *skb, > >> { > >> return TC_ACT_UNSPEC; > >> } > >> - > >> #endif > >> +static inline int tcf_classify_qdisc(struct sk_buff *skb, > >> + const struct tcf_proto *tp, > >> + struct tcf_result *res, bool compat_mode) > >> +{ > >> + int ret = tcf_classify(skb, NULL, tp, res, compat_mode); > >> + > >> + /* TC_ACT_REDIRECT from qdisc filter chains is not supported. > >> + * Use BPF via tcx or mirred redirect instead. > >> + */ > >> + if (unlikely(ret == TC_ACT_REDIRECT)) > >> + ret = TC_ACT_SHOT; > >> + return ret; > >> +} > > > > Why did you remove the warning? > > A lesser issue is that you introduced a space above #endif > > I don't think we need to put out an extra warn to spam the log, this > can be easily traced either via bpftrace or qdisc stats etc; plus the > guidance to eBPF with clsact is also obsolete. Given noone has run > into this the last 10y, I don't think it really matters tbh. It's a bit entitled for you to change someone else's patch to fit your philosophy without a mention. It's only one message per qdisc. But fine, let's keep it that way. Sashiko is hallucinating on TC_ACT_CONSUMED for patch 1. Let's ignore it. cheers, jamal > Thanks, > Daniel ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH net 3/3] selftests/bpf: Add test for redirect from qdisc qevent block 2026-06-30 12:33 [PATCH net 0/3] Fix broken TC_ACT_REDIRECT from qdiscs Daniel Borkmann 2026-06-30 12:33 ` [PATCH net 1/3] bpf: Reject redirect helpers without a bpf_net_context Daniel Borkmann 2026-06-30 12:33 ` [PATCH net 2/3] net/sched: Handle TC_ACT_REDIRECT from qdisc filter chains Daniel Borkmann @ 2026-06-30 12:33 ` Daniel Borkmann 2026-06-30 14:37 ` [PATCH net 0/3] Fix broken TC_ACT_REDIRECT from qdiscs Sebastian Andrzej Siewior 3 siblings, 0 replies; 9+ messages in thread From: Daniel Borkmann @ 2026-06-30 12:33 UTC (permalink / raw) To: kuba; +Cc: pabeni, jhs, bigeasy, andrii, memxor, bpf, netdev Add a regression test for the NULL current->bpf_net_context deref hit when a BPF classifier attached to a qdisc qevent block asks for a redirect. The classifier runs from tcf_qevent_handle() on the qdisc enqueue path, outside any bpf_net_context. # LDLIBS=-static PKG_CONFIG='pkg-config --static' ./vmtest.sh -- ./test_progs -t qevent [...] + /etc/rcS.d/S50-startup ./test_progs -t qevent #496/1 tc_qevent/redirect_verdict:OK #496/2 tc_qevent/redirect_helper:OK #496 tc_qevent:OK Summary: 1/2 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> --- tools/testing/selftests/bpf/config | 1 + .../selftests/bpf/prog_tests/tc_qevent.c | 113 ++++++++++++++++++ .../selftests/bpf/progs/test_tc_qevent.c | 23 ++++ 3 files changed, 137 insertions(+) create mode 100644 tools/testing/selftests/bpf/prog_tests/tc_qevent.c create mode 100644 tools/testing/selftests/bpf/progs/test_tc_qevent.c diff --git a/tools/testing/selftests/bpf/config b/tools/testing/selftests/bpf/config index adb25146e88c..ea7044f30adc 100644 --- a/tools/testing/selftests/bpf/config +++ b/tools/testing/selftests/bpf/config @@ -82,6 +82,7 @@ CONFIG_NET_SCH_BPF=y CONFIG_NET_SCH_FQ=y CONFIG_NET_SCH_INGRESS=y CONFIG_NET_SCH_HTB=y +CONFIG_NET_SCH_RED=y CONFIG_NET_SCHED=y CONFIG_NETDEVSIM=y CONFIG_NETFILTER=y diff --git a/tools/testing/selftests/bpf/prog_tests/tc_qevent.c b/tools/testing/selftests/bpf/prog_tests/tc_qevent.c new file mode 100644 index 000000000000..67e1d17567ab --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/tc_qevent.c @@ -0,0 +1,113 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include <test_progs.h> +#include <network_helpers.h> +#include <sys/socket.h> +#include <netinet/in.h> +#include <arpa/inet.h> +#include <unistd.h> +#include <string.h> + +#include "test_tc_qevent.skel.h" + +#define NS_TX "tc_qevent_tx" +#define NS_RX "tc_qevent_rx" +#define IP_TX "10.255.0.1" +#define IP_RX "10.255.0.2" +#define PIN_PATH "/sys/fs/bpf/tc_qevent_redirect" + +static void blast_udp(void) +{ + struct sockaddr_in dst = {}; + char buf[1400] = {}; + int fd, i; + + fd = socket(AF_INET, SOCK_DGRAM, 0); + if (!ASSERT_GE(fd, 0, "udp socket")) + return; + + dst.sin_family = AF_INET; + dst.sin_port = htons(12345); + inet_pton(AF_INET, IP_RX, &dst.sin_addr); + + /* + * Push far more than the RED queue can hold. Once qavg crosses qth_min + * every further packet hits the congestion_drop / early_drop qevent. + */ + for (i = 0; i < 50000; i++) + sendto(fd, buf, sizeof(buf), MSG_DONTWAIT, + (struct sockaddr *)&dst, sizeof(dst)); + + close(fd); +} + +static void run_qevent_redirect(struct bpf_program *prog, __u64 *counter) +{ + struct nstoken *tok = NULL; + int err; + + SYS_NOFAIL("ip netns del %s", NS_TX); + SYS_NOFAIL("ip netns del %s", NS_RX); + unlink(PIN_PATH); + + err = bpf_program__pin(prog, PIN_PATH); + if (!ASSERT_OK(err, "pin prog")) + return; + + SYS(unpin, "ip netns add %s", NS_TX); + SYS(del_tx, "ip netns add %s", NS_RX); + SYS(del_rx, "ip -n %s link add veth0 type veth peer name veth1 netns %s", NS_TX, NS_RX); + SYS(del_rx, "ip -n %s addr add %s/24 dev veth0", NS_TX, IP_TX); + SYS(del_rx, "ip -n %s link set veth0 up", NS_TX); + SYS(del_rx, "ip -n %s addr add %s/24 dev veth1", NS_RX, IP_RX); + SYS(del_rx, "ip -n %s link set veth1 up", NS_RX); + + tok = open_netns(NS_TX); + if (!ASSERT_OK_PTR(tok, "open_netns")) + goto del_rx; + + SYS(close_ns, "tc qdisc add dev veth0 root handle 1: htb default 1"); + SYS(close_ns, "tc class add dev veth0 parent 1: classid 1:1 htb rate 1mbit ceil 1mbit"); + + if (system("tc qdisc add dev veth0 parent 1:1 handle 11: red " + "limit 500000 avpkt 1000 probability 1 min 5000 max 6000 " + "burst 6 qevent early_drop block 10 2>/dev/null")) { + test__skip(); + goto close_ns; + } + + if (system("tc filter add block 10 bpf da object-pinned " + PIN_PATH " 2>/dev/null")) { + test__skip(); + goto close_ns; + } + + blast_udp(); + ASSERT_GT(*counter, 0, "qevent classifier ran"); +close_ns: + close_netns(tok); +del_rx: + SYS_NOFAIL("ip netns del %s", NS_RX); +del_tx: + SYS_NOFAIL("ip netns del %s", NS_TX); +unpin: + bpf_program__unpin(prog, PIN_PATH); +} + +void test_tc_qevent(void) +{ + struct test_tc_qevent *skel; + + skel = test_tc_qevent__open_and_load(); + if (!ASSERT_OK_PTR(skel, "open_and_load")) + return; + + if (test__start_subtest("redirect_verdict")) + run_qevent_redirect(skel->progs.qevent_redirect_verdict, + &skel->bss->verdict_calls); + if (test__start_subtest("redirect_helper")) + run_qevent_redirect(skel->progs.qevent_redirect_helper, + &skel->bss->helper_calls); + + test_tc_qevent__destroy(skel); +} diff --git a/tools/testing/selftests/bpf/progs/test_tc_qevent.c b/tools/testing/selftests/bpf/progs/test_tc_qevent.c new file mode 100644 index 000000000000..1529c111f4aa --- /dev/null +++ b/tools/testing/selftests/bpf/progs/test_tc_qevent.c @@ -0,0 +1,23 @@ +// SPDX-License-Identifier: GPL-2.0 +#include "vmlinux.h" +#include <bpf/bpf_helpers.h> + +int redirect_ifindex = 1; +__u64 verdict_calls = 0; +__u64 helper_calls = 0; + +SEC("tc") +int qevent_redirect_verdict(struct __sk_buff *skb) +{ + __sync_fetch_and_add(&verdict_calls, 1); + return TCX_REDIRECT; +} + +SEC("tc") +int qevent_redirect_helper(struct __sk_buff *skb) +{ + __sync_fetch_and_add(&helper_calls, 1); + return bpf_redirect(redirect_ifindex, 0); +} + +char _license[] SEC("license") = "GPL"; -- 2.43.0 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH net 0/3] Fix broken TC_ACT_REDIRECT from qdiscs 2026-06-30 12:33 [PATCH net 0/3] Fix broken TC_ACT_REDIRECT from qdiscs Daniel Borkmann ` (2 preceding siblings ...) 2026-06-30 12:33 ` [PATCH net 3/3] selftests/bpf: Add test for redirect from qdisc qevent block Daniel Borkmann @ 2026-06-30 14:37 ` Sebastian Andrzej Siewior 2026-06-30 15:09 ` Daniel Borkmann 3 siblings, 1 reply; 9+ messages in thread From: Sebastian Andrzej Siewior @ 2026-06-30 14:37 UTC (permalink / raw) To: Daniel Borkmann; +Cc: kuba, pabeni, jhs, andrii, memxor, bpf, netdev On 2026-06-30 14:33:28 [+0200], Daniel Borkmann wrote: > This is an alternative fix to [0] in order to not uglify > __dev_queue_xmit() with sprinkled ifdefs given this can be > simplified and isolated through a simple test into the BPF > redirect helper itself. > > I've also added a proper BPF selftest, so there is no need > to check-in a binary BPF object into selftests given we do > have BPF infra for all of this. 1/3 makes sense. Assuming we wouldn't have this per-task memory assignment, wouldn't then the state from one redirect leak into another? For what it's worth: Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Sebastian ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH net 0/3] Fix broken TC_ACT_REDIRECT from qdiscs 2026-06-30 14:37 ` [PATCH net 0/3] Fix broken TC_ACT_REDIRECT from qdiscs Sebastian Andrzej Siewior @ 2026-06-30 15:09 ` Daniel Borkmann 0 siblings, 0 replies; 9+ messages in thread From: Daniel Borkmann @ 2026-06-30 15:09 UTC (permalink / raw) To: Sebastian Andrzej Siewior; +Cc: kuba, pabeni, jhs, andrii, memxor, bpf, netdev On 6/30/26 4:37 PM, Sebastian Andrzej Siewior wrote: > On 2026-06-30 14:33:28 [+0200], Daniel Borkmann wrote: >> This is an alternative fix to [0] in order to not uglify >> __dev_queue_xmit() with sprinkled ifdefs given this can be >> simplified and isolated through a simple test into the BPF >> redirect helper itself. >> >> I've also added a proper BPF selftest, so there is no need >> to check-in a binary BPF object into selftests given we do >> have BPF infra for all of this. > > 1/3 makes sense. Assuming we wouldn't have this per-task memory > assignment, wouldn't then the state from one redirect leak into another? For the normal/functional case out of tcx / cls_bpf via clsact not given skb_do_redirect() is called right after return. (For the full qdisc case it was never working in the first place.) > For what it's worth: > Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> > > Sebastian ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2026-07-01 15:35 UTC | newest] Thread overview: 9+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2026-06-30 12:33 [PATCH net 0/3] Fix broken TC_ACT_REDIRECT from qdiscs Daniel Borkmann 2026-06-30 12:33 ` [PATCH net 1/3] bpf: Reject redirect helpers without a bpf_net_context Daniel Borkmann 2026-06-30 12:33 ` [PATCH net 2/3] net/sched: Handle TC_ACT_REDIRECT from qdisc filter chains Daniel Borkmann 2026-06-30 15:16 ` Jamal Hadi Salim 2026-06-30 15:23 ` Daniel Borkmann 2026-07-01 15:35 ` Jamal Hadi Salim 2026-06-30 12:33 ` [PATCH net 3/3] selftests/bpf: Add test for redirect from qdisc qevent block Daniel Borkmann 2026-06-30 14:37 ` [PATCH net 0/3] Fix broken TC_ACT_REDIRECT from qdiscs Sebastian Andrzej Siewior 2026-06-30 15:09 ` Daniel Borkmann
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox