From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
To: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: netdev@vger.kernel.org, bpf@vger.kernel.org, davem@davemloft.net,
edumazet@google.com, kuba@kernel.org, pabeni@redhat.com,
horms@kernel.org, toke@toke.dk, jiri@resnulli.us,
clrkwllms@kernel.org, rostedt@goodmis.org, kuniyu@google.com,
sdf.kernel@gmail.com, skhawaja@google.com, liuhangbin@gmail.com,
krikku@gmail.com, mkarsten@uwaterloo.ca, victor@mojatatu.com,
ast@kernel.org, hawk@kernel.org, john.fastabend@gmail.com,
daniel@iogearbox.net, Sashiko <sashiko-bot@kernel.org>
Subject: Re: [PATCH net 1/3] net: Extend bpf_net_context lifetime to cover qdisc enqueue
Date: Mon, 29 Jun 2026 12:29:17 +0200
Message-ID: <20260629102917.Ag2Vd7LR@linutronix.de>
In-Reply-To: <20260626165156.169012-2-jhs@mojatatu.com>
On 2026-06-26 12:51:54 [-0400], Jamal Hadi Salim wrote:
> The bpf_net_context used by sch_handle_egress() is stack-allocated and torn
> down when that function returns. By the time tcf_qevent_handle() runs,
> current->bpf_net_context is NULL.
>
> When a filter attached to a qevent block (e.g. RED's early_drop or mark
> qevents, which always use shared blocks) returns TC_ACT_REDIRECT,
> tcf_qevent_handle() calls skb_do_redirect(), which in turn calls bpf helper
> bpf_net_ctx_get_ri(). That helper unconditionally dereferences
> current->bpf_net_context resulting in a NULL pointer dereference.
>
> Note: The same holds for actions that invoke BPF redirect helpers
> (e.g. act_bpf running a program that calls bpf_redirect()) during qevent
> classification itself. And as a matter of fact the same assumption is
> made in the code outside of tc.
>
> Fix:
> Move the bpf_net_context lifecycle out of sch_handle_egress() into
> __dev_queue_xmit(), so that it spans both the egress TC fast path and the
> qdisc enqueue. The setup is placed outside the egress_needed_key static
> branch because qevents are independent of clsact/NF egress hooks and
> that key may stay disabled when only a qevent-bearing qdisc is
> configured. Unfortunately this adds a small unconditional penalty to the
> code path _per packet_ only guarded by CONFIG_NET_XGRESS (two writes and
> one read for bpf_net_ctx_set, plus one write for bpf_net_ctx_clear).

I fail to understand this but you and sashiko have an understanding...
If tc_run() returns TC_ACT_REDIRECT, then the skb is NULL and, upon
return from sch_handle_egress(), the control flow goes to the out label.
As a fix you move the bpf_net_ctx assignment to before CONFIG_NET_EGRESS
and clear it on exit. What am I missing here?

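For reference, the lifetime problem the quoted commit message describes can be modeled in user space roughly as below. This is only a sketch under stated assumptions: every name in it (struct net_ctx, current_ctx, xmit_old(), xmit_new() and so on) is a hypothetical stand-in rather than kernel code, and the model returns -1 where the kernel would hit the NULL dereference.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct bpf_net_context. */
struct net_ctx {
	int redirect_ifindex;
};

/* Models current->bpf_net_context. */
static struct net_ctx *current_ctx;

/*
 * Models bpf_net_ctx_get_ri(). Per the quoted text the real helper
 * dereferences the pointer unconditionally; this model reports -1
 * instead of crashing.
 */
static int get_redirect_info(void)
{
	if (!current_ctx)
		return -1;
	return current_ctx->redirect_ifindex;
}

/* Models the qdisc enqueue -> qevent handler -> redirect path. */
static int qdisc_enqueue(void)
{
	return get_redirect_info();
}

/* Old scheme: the context only lives inside the egress hook. */
static void egress_hook_old(void)
{
	struct net_ctx ctx = { .redirect_ifindex = 2 };

	current_ctx = &ctx;
	/* the egress classifier would run here */
	current_ctx = NULL;
}

static int xmit_old(void)
{
	egress_hook_old();
	return qdisc_enqueue();	/* context is already gone */
}

/* New scheme: the context spans both the egress hook and the enqueue. */
static int xmit_new(void)
{
	struct net_ctx ctx = { .redirect_ifindex = 2 };
	int ret;

	current_ctx = &ctx;
	/* the egress classifier would run here */
	ret = qdisc_enqueue();
	current_ctx = NULL;
	return ret;
}
```

In this model xmit_old() reaches the enqueue step with no context set, while xmit_new() still has it. Whether that matches what actually happens on the TC_ACT_REDIRECT path asked about above is exactly the open question.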
> This keeps all bpf_net_context management in net/core/dev.c, i.e. the
> existing boundary between tc core and BPF, without requiring any net/sched/
> code to know about BPF plumbing.
>
> Reproducer (see the accompanying tdc test):
>
> tc qdisc add dev eth0 root handle 1: red limit 1MB min 10KB max 20KB \
> avpkt 1000 burst 100 qevent early_drop block 10
> tc qdisc add dev eth0 clsact
> tc filter add block 10 pref 1 bpf obj redirect.o

Stupid question: how do I get this redirect.o? Just something simple to
reproduce this…

> tc filter add dev eth0 egress protocol ip prio 1 matchall \
> action gact pass
>
> traffic through eth0 triggers red_enqueue() -> tcf_qevent_handle() and,
> on a redirect verdict, a NULL deref in skb_do_redirect().
Sebastian
Thread overview: 12+ messages
2026-06-26 16:51 [PATCH net 0/3] Fix broken TC_ACT_REDIRECT Jamal Hadi Salim
2026-06-26 16:51 ` [PATCH net 1/3] net: Extend bpf_net_context lifetime to cover qdisc enqueue Jamal Hadi Salim
2026-06-27 16:52 ` sashiko-bot
2026-06-28 12:16 ` Jamal Hadi Salim
2026-06-29 10:29 ` Sebastian Andrzej Siewior [this message]
2026-06-29 10:47 ` Jamal Hadi Salim
2026-06-26 16:51 ` [PATCH net 2/3] net/sched: Handle TC_ACT_REDIRECT from qdisc filter chains Jamal Hadi Salim
2026-06-27 16:52 ` sashiko-bot
2026-06-28 12:28 ` Jamal Hadi Salim
2026-06-26 16:51 ` [PATCH net 3/3] selftests/tc-testing: Verify bpf redirect on RED block with preceding clsact (egress) classifier Jamal Hadi Salim
2026-06-27 16:52 ` sashiko-bot
2026-06-28 12:36 ` Jamal Hadi Salim