Netdev List
From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
To: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: netdev@vger.kernel.org, bpf@vger.kernel.org, davem@davemloft.net,
	edumazet@google.com, kuba@kernel.org, pabeni@redhat.com,
	horms@kernel.org, toke@toke.dk, jiri@resnulli.us,
	clrkwllms@kernel.org, rostedt@goodmis.org, kuniyu@google.com,
	sdf.kernel@gmail.com, skhawaja@google.com, liuhangbin@gmail.com,
	krikku@gmail.com, mkarsten@uwaterloo.ca, victor@mojatatu.com,
	ast@kernel.org, hawk@kernel.org, john.fastabend@gmail.com,
	daniel@iogearbox.net, Sashiko <sashiko-bot@kernel.org>
Subject: Re: [PATCH net 1/3] net: Extend bpf_net_context lifetime to cover qdisc enqueue
Date: Mon, 29 Jun 2026 12:29:17 +0200
Message-ID: <20260629102917.Ag2Vd7LR@linutronix.de>
In-Reply-To: <20260626165156.169012-2-jhs@mojatatu.com>

On 2026-06-26 12:51:54 [-0400], Jamal Hadi Salim wrote:
> The bpf_net_context used by sch_handle_egress() is stack-allocated and torn
> down when that function returns. By the time tcf_qevent_handle() runs,
> current->bpf_net_context is NULL.
> 
> When a filter attached to a qevent block (e.g. RED's early_drop or mark
> qevents, which always use shared blocks) returns TC_ACT_REDIRECT,
> tcf_qevent_handle() calls skb_do_redirect(), which in turn calls bpf helper
> bpf_net_ctx_get_ri(). That helper unconditionally dereferences
> current->bpf_net_context resulting in a NULL pointer dereference.
> 
> Note: The same holds for actions that invoke BPF redirect helpers
> (e.g. act_bpf running a program that calls bpf_redirect()) during qevent
> classification itself. And as a matter of fact the same assumption is
> made in the code outside of tc.
> 
> Fix:
> Move the bpf_net_context lifecycle out of sch_handle_egress() into
> __dev_queue_xmit(), so that it spans both the egress TC fast path and the
> qdisc enqueue. The setup is placed outside the egress_needed_key static
> branch because qevents are independent of clsact/NF egress hooks and
> that key may stay disabled when only a qevent-bearing qdisc is
> configured. Unfortunately this adds a small unconditional penalty to the
> code path _per packet_ only guarded by CONFIG_NET_XGRESS (two writes and
> one read for bpf_net_ctx_set, plus one write for bpf_net_ctx_clear).

I fail to understand this, but you and Sashiko seem to have an
understanding...
If tc_run() returns TC_ACT_REDIRECT, then the skb is NULL and as such,
upon return from sch_handle_egress(), the control flow goes to the out
label.
As a fix you move the bpf_net_ctx assignment to before CONFIG_NET_EGRESS
and clear it on exit. What am I missing here?
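
For reference, a minimal userspace sketch of the lifetime problem as the
quoted description states it: the redirect in question comes from the
qevent filter during enqueue, after sch_handle_egress() has already
returned with a valid skb. Every name below (net_ctx, cur_ctx,
do_redirect, xmit_before, xmit_after) is a hypothetical stand-in, not
the kernel's code, and the NULL dereference is modelled as a -1 return.

```c
#include <stddef.h>

/* Hypothetical stand-in for the per-task BPF network context. */
struct net_ctx {
	int redirect_ifindex;
};

/* Models current->bpf_net_context: points at a stack object or is NULL. */
static struct net_ctx *cur_ctx;

/*
 * Models the redirect path, which needs a live context.
 * Returns 0 on success and -1 where the kernel would dereference NULL.
 */
static int do_redirect(void)
{
	if (!cur_ctx)
		return -1;
	cur_ctx->redirect_ifindex = 0;
	return 0;
}

/* Models an egress hook that sets up and tears down its own context. */
static void egress_hook_scoped(void)
{
	struct net_ctx ctx = { 0 };

	cur_ctx = &ctx;
	/* Egress classification would run here and let the packet pass. */
	cur_ctx = NULL;
}

/* Models qdisc enqueue where a qevent filter returns a redirect verdict. */
static int enqueue_with_qevent(void)
{
	return do_redirect();
}

/* Before: the context only covers the egress hook. */
int xmit_before(void)
{
	egress_hook_scoped();
	return enqueue_with_qevent();
}

/* After: the caller owns the context, so it spans egress and enqueue. */
int xmit_after(void)
{
	struct net_ctx ctx = { 0 };
	int ret;

	cur_ctx = &ctx;
	/* Egress classification would run here without its own setup. */
	ret = enqueue_with_qevent();
	cur_ctx = NULL;
	return ret;
}
```

In this model xmit_before() returns -1 because the context is already
gone when the enqueue-time redirect runs, while xmit_after() returns 0.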

> This keeps all bpf_net_context management in net/core/dev.c, i.e. the
> existing boundary between tc core and BPF, without requiring any net/sched/
> code to know about BPF plumbing.
> 
> Reproducer (see the accompanying tdc test):
> 
>   tc qdisc add dev eth0 root handle 1: red limit 1MB min 10KB max 20KB \
>       avpkt 1000 burst 100 qevent early_drop block 10
>   tc qdisc add dev eth0 clsact
>   tc filter add block 10 pref 1 bpf obj redirect.o

Stupid question: how do I get this redirect.o? Just a simple thing to
reproduce this…

>   tc filter add dev eth0 egress protocol ip prio 1 matchall \
>       action gact pass
> 
>   traffic through eth0 triggers red_enqueue() -> tcf_qevent_handle() and,
>   on a redirect verdict, a NULL deref in skb_do_redirect().

Sebastian

Thread overview: 8+ messages
2026-06-26 16:51 [PATCH net 0/3] Fix broken TC_ACT_REDIRECT Jamal Hadi Salim
2026-06-26 16:51 ` [PATCH net 1/3] net: Extend bpf_net_context lifetime to cover qdisc enqueue Jamal Hadi Salim
2026-06-29 10:29   ` Sebastian Andrzej Siewior [this message]
2026-06-29 10:47     ` Jamal Hadi Salim
2026-06-26 16:51 ` [PATCH net 2/3] net/sched: Handle TC_ACT_REDIRECT from qdisc filter chains Jamal Hadi Salim
     [not found]   ` <20260627165220.096B61F00A3A@smtp.kernel.org>
2026-06-28 12:28     ` Jamal Hadi Salim
2026-06-26 16:51 ` [PATCH net 3/3] selftests/tc-testing: Verify bpf redirect on RED block with preceding clsact (egress) classifier Jamal Hadi Salim
     [not found]   ` <20260627165220.CA0CD1F000E9@smtp.kernel.org>
2026-06-28 12:36     ` Jamal Hadi Salim