BPF List
 help / color / mirror / Atom feed
From: "Kumar Kartikeya Dwivedi" <memxor@gmail.com>
To: "Justin Suess" <utilityemal77@gmail.com>,
	"Kumar Kartikeya Dwivedi" <memxor@gmail.com>
Cc: <bpf@vger.kernel.org>, "Alexei Starovoitov" <ast@kernel.org>,
	"Andrii Nakryiko" <andrii@kernel.org>,
	"Daniel Borkmann" <daniel@iogearbox.net>,
	"Eduard Zingerman" <eddyz87@gmail.com>,
	"Emil Tsalapatis" <emil@etsalapatis.com>, <kkd@meta.com>,
	<kernel-team@meta.com>
Subject: Re: [PATCH bpf-next v1 1/5] bpf: Treat non-iterator tracing progs as tracing
Date: Mon, 08 Jun 2026 20:53:03 +0200	[thread overview]
Message-ID: <DJ3W8IZ08WBT.VTRTP72FSWVO@gmail.com> (raw)
In-Reply-To: <aibxqG77bragcPos@zenbox>

On Mon Jun 8, 2026 at 7:47 PM CEST, Justin Suess wrote:
> On Mon, Jun 08, 2026 at 04:48:34PM +0200, Kumar Kartikeya Dwivedi wrote:
>> The is_tracing_prog_type() predicate omitted BPF_PROG_TYPE_TRACING even
>> though fentry, fexit, fmod_ret, raw_tp BTF and similar programs have the
>> same execution-context concerns as the tracing program types already
>> covered by the helper.
>>
>> This matters for map compatibility checks that reject bpf_spin_lock,
>> bpf_list_head and bpf_rb_root in tracing contexts. BPF_PROG_TYPE_TRACING
>> programs can run from arbitrary instrumented contexts, including places
>> where taking these locks or manipulating graph roots is not safe.
>> BPF_TRACE_ITER is different: iterator programs run from task context, so
>> we continue to exclude them.
>>
>> This can reject existing fentry/fexit-style programs that use map values
>> with these fields. Such programs were accepted only because the
>> predicate missed this program type; their use depends on semantics the
>> verifier already rejects for equivalent tracing hooks.
>>
>> Move is_tracing_prog_type() checks from check_map_prog_compatibility()
>> to points where the fields are actually used to avoid preemptively
>> rejecting tracing programs that use maps with such fields but do not
>> touch these fields.
>>
>> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
>> ---
> Thanks for sending this, looks pretty good to me.
>>  kernel/bpf/verifier.c | 37 ++++++++++++++++++++++---------------
>>  1 file changed, 22 insertions(+), 15 deletions(-)
>>
>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>> index ed7ba0e6a9ce..26bfb4465725 100644
>> --- a/kernel/bpf/verifier.c
>> +++ b/kernel/bpf/verifier.c
>> @@ -6967,6 +6967,11 @@ static int process_spin_lock(struct bpf_verifier_env *env, struct bpf_reg_state
>>  	u32 spin_lock_off;
>>  	int err;
>>
>> +	if (is_tracing_prog_type(env->prog)) {
>> +		verbose(env, "tracing progs cannot use bpf_spin_lock yet\n");
>> +		return -EINVAL;
>> +	}
>> +
>>  	if (!is_const) {
>>  		verbose(env,
>>  			"%s doesn't have constant offset. %s_lock has to be at the constant offset\n",
>> @@ -12222,6 +12227,10 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_
>>  				return ret;
>>  			break;
>>  		case KF_ARG_PTR_TO_LIST_HEAD:
>> +			if (is_tracing_prog_type(env->prog)) {
>> +				verbose(env, "tracing progs cannot use bpf_{list_head,rb_root} yet\n");
>> +				return -EINVAL;
>> +			}
>>  			if (reg->type != PTR_TO_MAP_VALUE &&
>>  			    reg->type != (PTR_TO_BTF_ID | MEM_ALLOC)) {
>>  				verbose(env, "%s expected pointer to map value or allocated object\n",
>> @@ -12238,6 +12247,10 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_
>>  				return ret;
>>  			break;
>>  		case KF_ARG_PTR_TO_RB_ROOT:
>> +			if (is_tracing_prog_type(env->prog)) {
>> +				verbose(env, "tracing progs cannot use bpf_{list_head,rb_root} yet\n");
>> +				return -EINVAL;
>> +			}
>>  			if (reg->type != PTR_TO_MAP_VALUE &&
>>  			    reg->type != (PTR_TO_BTF_ID | MEM_ALLOC)) {
>>  				verbose(env, "%s expected pointer to map value or allocated object\n",
>> @@ -17664,9 +17677,11 @@ static int check_pseudo_btf_id(struct bpf_verifier_env *env,
>>  	return __add_used_btf(env, btf);
>>  }
>>
>> -static bool is_tracing_prog_type(enum bpf_prog_type type)
>> +static bool is_tracing_prog_type(const struct bpf_prog *prog)
>>  {
>> -	switch (type) {
>> +	switch (resolve_prog_type(prog)) {
>> +	case BPF_PROG_TYPE_TRACING:
>> +		return prog->expected_attach_type != BPF_TRACE_ITER;
> Seems reasonable to exclude this attach type as they run always in task
> context as stated. This seems like the safest approach.
>
> In practice, *most* tracing type programs will be unreachable from
> NMI. But the edge cases are difficult to enumerate...
>
> fentry / fexit are nearly impossible to exclude, since there's no
> good way to answer "can this function be called from an nmi handler"
> at verification time. Such a list would be unmaintainable.
>
> Raw tracepoints are similar. Indeed, it would be *possible* to
> specifically exclude the tracepoints that cannot be called from NMI. But
> this would require maintaining a large list and would be brittle.
>
> fmodify_return attach types are whitelisted to LSM hooks and
> ALLOW_ERROR_INJECTION functions. A quick grep for ALLOW_ERROR_INJECTION
> doesn't bring anything that looks obviously NMI-relevant, but again, I
> think allowing it would be asking for trouble later.
>
> So I think this is a good balance...
>>  	case BPF_PROG_TYPE_KPROBE:
>>  	case BPF_PROG_TYPE_TRACEPOINT:
>>  	case BPF_PROG_TYPE_PERF_EVENT:
>> @@ -17697,24 +17712,16 @@ static int check_map_prog_compatibility(struct bpf_verifier_env *env,
>>  		return -EACCES;
>>  	}
>>
>> -	if (btf_record_has_field(map->record, BPF_LIST_HEAD) ||
>> -	    btf_record_has_field(map->record, BPF_RB_ROOT)) {
>> -		if (is_tracing_prog_type(prog_type)) {
>> -			verbose(env, "tracing progs cannot use bpf_{list_head,rb_root} yet\n");
>> -			return -EINVAL;
>> -		}
>> -	}
>> -
>>  	if (btf_record_has_field(map->record, BPF_SPIN_LOCK | BPF_RES_SPIN_LOCK)) {
>>  		if (prog_type == BPF_PROG_TYPE_SOCKET_FILTER) {
>>  			verbose(env, "socket filter progs cannot use bpf_spin_lock yet\n");
>>  			return -EINVAL;
>>  		}
>> -
>> -		if (is_tracing_prog_type(prog_type)) {
>> -			verbose(env, "tracing progs cannot use bpf_spin_lock yet\n");
>> -			return -EINVAL;
>> -		}
>> +		/*
>> +		 * Rejecting tracing progs accessing maps with bpf_spin_lock in
>> +		 * them here would be too conservative; let's defer rejection
>> +		 * until seeing first use.
>> +		 */
>
> In practice, I think most programs using those spinlocks w/ fentry
> weren't broken.
>
> They were saved only by nature of the fact that the particular
> functions that attached to were uncallable / never were
> called from NMI. Which is the case for 99% of those programs.
>
> So this case is where fixing the other 1% of cases are going
> to have consequences for the remainder.
>
> This is a tough decision to have to make. There's options, but all of
> them come with downsides.
>
> 1. (this patch) break tracing programs using bpf_spin_lock.
> 2. Somehow prevent NMI reentrancy of tracing progs on the same cpu. probably
> by just ignoring the reentrant calls, much how we handle the recursive
> tracepoint prevention for bpf. (complicated, may have side effects)
> 3. Inject a fixup runtime check for in_nmi for every tracing program
> and tail callable program, exiting before executing anything
> (still breakage, just at runtime now)
>
> If breakage is acceptible here, the approach this patch takes is the
> most *correct* way to do it, and the way it should have been done from
> day one, but having users relying on the "broken" behavior makes it
> harder to fix now.
>
> This seems like a maintainer call.

Yeah, you outlined the tradeoffs perfectly. What I'm leaning toward is splitting
this one out of the series, focus on the other two for now in v2, and send a new
set with this fix + opening up bpf_res_spin_lock() for use as a replacement. We
can then emit some guidance for the user to switch when their program breaks to
a working replacement which hopefully allows a reasonable migration path while
reducing this technical debt.

  reply	other threads:[~2026-06-08 18:53 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-08 14:48 [PATCH bpf-next v1 0/5] Fix kptr dtor deadlock Kumar Kartikeya Dwivedi
2026-06-08 14:48 ` [PATCH bpf-next v1 1/5] bpf: Treat non-iterator tracing progs as tracing Kumar Kartikeya Dwivedi
2026-06-08 14:51   ` Kumar Kartikeya Dwivedi
2026-06-08 15:13   ` sashiko-bot
2026-06-08 15:44   ` bot+bpf-ci
2026-06-08 17:47   ` Justin Suess
2026-06-08 18:53     ` Kumar Kartikeya Dwivedi [this message]
2026-06-08 14:48 ` [PATCH bpf-next v1 2/5] bpf: Reject bpf_obj_drop() from tracing progs Kumar Kartikeya Dwivedi
2026-06-08 15:40   ` sashiko-bot
2026-06-08 14:48 ` [PATCH bpf-next v1 3/5] bpf: Cancel special fields on map value recycle Kumar Kartikeya Dwivedi
2026-06-08 15:44   ` bot+bpf-ci
2026-06-08 15:56   ` sashiko-bot
2026-06-08 18:01   ` Justin Suess
2026-06-08 18:50     ` Kumar Kartikeya Dwivedi
2026-06-08 14:48 ` [PATCH bpf-next v1 4/5] selftests/bpf: Exercise unsafe obj drops from tracing progs Kumar Kartikeya Dwivedi
2026-06-08 16:16   ` sashiko-bot
2026-06-08 14:48 ` [PATCH bpf-next v1 5/5] selftests/bpf: Exercise kptr map update lifetime Kumar Kartikeya Dwivedi
2026-06-08 16:40   ` sashiko-bot
2026-06-08 14:58 ` [PATCH bpf-next v1 0/5] Fix kptr dtor deadlock Kumar Kartikeya Dwivedi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=DJ3W8IZ08WBT.VTRTP72FSWVO@gmail.com \
    --to=memxor@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=eddyz87@gmail.com \
    --cc=emil@etsalapatis.com \
    --cc=kernel-team@meta.com \
    --cc=kkd@meta.com \
    --cc=utilityemal77@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox