Re: [PATCH bpf-next v1 3/8] bpf: Support for kfuncs with KF_MAGIC_ARGS

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Ihor Solodrai <ihor.solodrai@linux.dev>
To: Eduard Zingerman <eddyz87@gmail.com>,
	bpf@vger.kernel.org, andrii@kernel.org, ast@kernel.org
Cc: dwarves@vger.kernel.org, alan.maguire@oracle.com,
	acme@kernel.org, tj@kernel.org, kernel-team@meta.com
Subject: Re: [PATCH bpf-next v1 3/8] bpf: Support for kfuncs with KF_MAGIC_ARGS
Date: Thu, 30 Oct 2025 09:31:42 -0700	[thread overview]
Message-ID: <da20bc30-85be-44ab-b837-19aa97ebc431@linux.dev> (raw)
In-Reply-To: <b667472aeb77ac63a3de82dae77012c0285e0286.camel@gmail.com>

Hi Eduard, thank you for a quick review.

On 10/29/25 4:54 PM, Eduard Zingerman wrote:
> On Wed, 2025-10-29 at 12:01 -0700, Ihor Solodrai wrote:
>> A kernel function bpf_foo with KF_MAGIC_ARGS flag is expected to have
>                                  ^^^^^^^^^^^^^
> 		I don't like this name very much.
> 		It bears very little context.
> 		Imo, KF_IMPLICIT_ARGS fits the use case much better.

I know, naming is hard...

The issue is that it's not only the flag, across the code we need
descriptive names for every "magic" thing:
  * a flagged function
    * how do we call it? kfunc_with_impl_args?
  * a function that exists only in BTF (_impl)
    * it's not an "implicit" function
    * it's not exactly an "implementation" function
    * "fake" is even worse than "magic" IMO, because it's not fake,
      but you could argue it's magical :D
    * btf_only_kfunc?
  * describing arguments is simpler: "implicit" seems ok, although as
    Alexei pointed out in previous iteration they are very much
    explicit in the kernel [1]

For me, "(BPF) interface" and "(kernel) implementation" pair of terms
makes sense, but then I think it would be logical to have both
declarations in the kernel.

The advantage of "magic" in this context is that it doesn't have
loaded meaning. But I agree this is a stretch, so can't insist.

[1] https://lore.kernel.org/bpf/CAADnVQLvuubey0A0Fk=bzN-=JG2UUQHRqBijZpuvqMQ+xy4W4g@mail.gmail.com/


> 
>> two types in BTF:
>>   * `bpf_foo` with a function prototype that omits __magic arguments
>>   * `bpf_foo_impl` with a function prototype that matches kernel
>>      declaration, but doesn't have a ksym associated with its name
> 
> Could you please start with an example here?
> Stating how `bpf_foo` needs to be declared in kernel, and what are the
> options to invoke it from bpf.  Then proceed with BTF details, etc.

Ok. I think I can reshuffle explanations between the cover letter and
commit message, it's a bit redundant already.

> 
>> In order to support magic kfuncs the verifier has to know how to
>> resolve calls both of `bpf_foo` and `bpf_foo_impl` to the correct BTF
>> function prototype and address.
>>
>> In add_kfunc_call() kfunc flags are inspected to detect a magic kfunc
>> or its _impl, and then the address and func_proto are adjusted for the
>> kfunc descriptor.
>>
>> In fetch_kfunc_meta() similar logic is used to fixup the contents of
>> struct bpf_kfunc_call_arg_meta.
>>
>> In check_kfunc_call() reset the subreg_def of registers holding magic
>> arguments to correctly track zero extensions.
>>
>> Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
>> ---
>>  include/linux/btf.h   |   1 +
>>  kernel/bpf/verifier.c | 123 ++++++++++++++++++++++++++++++++++++++++--
>>  2 files changed, 120 insertions(+), 4 deletions(-)
>>
>> diff --git a/include/linux/btf.h b/include/linux/btf.h
>> index 9c64bc5e5789..3fe20514692f 100644
>> --- a/include/linux/btf.h
>> +++ b/include/linux/btf.h
>> @@ -79,6 +79,7 @@
>>  #define KF_ARENA_RET    (1 << 13) /* kfunc returns an arena pointer */
>>  #define KF_ARENA_ARG1   (1 << 14) /* kfunc takes an arena pointer as its first argument */
>>  #define KF_ARENA_ARG2   (1 << 15) /* kfunc takes an arena pointer as its second argument */
>> +#define KF_MAGIC_ARGS   (1 << 16) /* kfunc signature is different from its BPF signature */
>>  
>>  /*
>>   * Tag marking a kernel function as a kfunc. This is meant to minimize the
>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>> index cb1b483be0fa..fcf0872b9e3d 100644
>> --- a/kernel/bpf/verifier.c
>> +++ b/kernel/bpf/verifier.c
>> @@ -3263,17 +3263,68 @@ static struct btf *find_kfunc_desc_btf(struct bpf_verifier_env *env, s16 offset)
>>  	return btf_vmlinux ?: ERR_PTR(-ENOENT);
>>  }
>>  
>> +/*
>> + * magic_kfuncs is used as a list of (foo, foo_impl) pairs
>> + */
>> +BTF_ID_LIST(magic_kfuncs)
>> +BTF_ID_UNUSED
>> +BTF_ID_LIST_END(magic_kfuncs)
>> +
>> +static s32 magic_kfunc_by_impl(s32 impl_func_id)
>> +{
>> +	int i;
>> +
>> +	for (i = 1; i < BTF_ID_LIST_SIZE(magic_kfuncs); i += 2) {
>> +		if (magic_kfuncs[i] == impl_func_id)
>                                        ^^^^^
> Nit: similarly, I'd rename this to something like "implicit_func_id"
>      or "fake_func_id. "impl" is confusing because this id has nothing
>      to do with implementation.
> 
>> +			return magic_kfuncs[i - 1];
>> +	}
>> +	return -ENOENT;
>> +}
>> +
>> +static s32 impl_by_magic_kfunc(s32 func_id)
>> +{
>> +	int i;
>> +
>> +	for (i = 0; i < BTF_ID_LIST_SIZE(magic_kfuncs); i += 2) {
>> +		if (magic_kfuncs[i] == func_id)
>> +			return magic_kfuncs[i + 1];
>> +	}
>> +	return -ENOENT;
>> +}
>> +
>> +static const struct btf_type *find_magic_kfunc_proto(struct btf *desc_btf, s32 func_id)
>> +{
>> +	const struct btf_type *impl_func, *func_proto;
>> +	u32 impl_func_id;
>> +
>> +	impl_func_id = impl_by_magic_kfunc(func_id);
>> +	if (impl_func_id < 0)
>> +		return NULL;
>> +
>> +	impl_func = btf_type_by_id(desc_btf, impl_func_id);
>> +	if (!impl_func || !btf_type_is_func(impl_func))
>> +		return NULL;
>> +
>> +	func_proto = btf_type_by_id(desc_btf, impl_func->type);
>> +	if (!func_proto || !btf_type_is_func_proto(func_proto))
>> +		return NULL;
>> +
>> +	return func_proto;
>> +}
>> +
>>  static int add_kfunc_call(struct bpf_verifier_env *env, u32 func_id, s16 offset)
>>  {
>> -	const struct btf_type *func, *func_proto;
>> +	const struct btf_type *func, *func_proto, *tmp_func;
>>  	struct bpf_kfunc_btf_tab *btf_tab;
>> +	const char *func_name, *tmp_name;
>>  	struct btf_func_model func_model;
>>  	struct bpf_kfunc_desc_tab *tab;
>>  	struct bpf_prog_aux *prog_aux;
>>  	struct bpf_kfunc_desc *desc;
>> -	const char *func_name;
>>  	struct btf *desc_btf;
>>  	unsigned long addr;
>> +	u32 *kfunc_flags;
>> +	s32 tmp_func_id;
>>  	int err;
>>  
>>  	prog_aux = env->prog->aux;
>> @@ -3349,8 +3400,37 @@ static int add_kfunc_call(struct bpf_verifier_env *env, u32 func_id, s16 offset)
>>  		return -EINVAL;
>>  	}
>>  
>> +	kfunc_flags = btf_kfunc_flags(desc_btf, func_id, env->prog);
>>  	func_name = btf_name_by_offset(desc_btf, func->name_off);
>>  	addr = kallsyms_lookup_name(func_name);
>> +
>> +	/* This may be an _impl kfunc with KF_MAGIC_ARGS counterpart */
>> +	if (unlikely(!addr && !kfunc_flags)) {
>> +		tmp_func_id = magic_kfunc_by_impl(func_id);
> 
> I think there is no need to hide magic_kfunc_by_impl() call behind the
> above condition. It can be moved before kfunc_flags assignment.
> Then it wont be necessary to textually repeat btf_name_by_offset() and
> kallsyms_lookup_name() calls.

Not sure I follow...

Yes, !addr is enough to detect potential _impl function, but there is
no way around name lookup in BTF and then another address lookup.

The _impl function doesn't have an address, so after failed
  kallsyms_lookup_name("kfunc_impl");
we must do
  kallsyms_lookup_name("kfunc");
to find the correct address.

Or do you suggest doing something like:

  tmp_func_id = magic_kfunc_by_impl(func_id);
  if (tmp_func_id > 0)
      func_id = tmp_func_id;

at the beginning of add_kfunc_call()?


> 
>> +		if (tmp_func_id < 0)
>> +			return -EACCES;
> 
> Nit: this skips proper error reporting: "cannot find address for kernel function %s\n".
> 
>> +		tmp_func = btf_type_by_id(desc_btf, tmp_func_id);
>> +		if (!tmp_func || !btf_type_is_func(tmp_func))
> 
> Nit: this condition indicates a verifier bug, should it be reported as such?
> 
>> +			return -EACCES;
>> +		tmp_name = btf_name_by_offset(desc_btf, tmp_func->name_off);
>> +		addr = kallsyms_lookup_name(tmp_name);
>> +	}
>> +
>> +	/*
>> +	 * Note that kfunc_flags may be NULL at this point, which means that we couldn't find
>> +	 * func_id in any relevant kfunc_id_set. This most likely indicates an invalid kfunc call.
>> +	 * However we don't want to fail the verification here, because invalid calls may be
>> +	 * eliminated as dead code later.
>> +	 */
>> +	if (unlikely(kfunc_flags && KF_MAGIC_ARGS & *kfunc_flags)) {
>> +		func_proto = find_magic_kfunc_proto(desc_btf, func_id);
>> +		if (!func_proto) {
>> +			verbose(env, "cannot find _impl proto for kernel function %s\n",
>> +			func_name);
>> +			return -EINVAL;
>> +		}
>> +	}
>> +
>>  	if (!addr) {
>>  		verbose(env, "cannot find address for kernel function %s\n",
>>  			func_name);
>> @@ -12051,6 +12131,11 @@ static bool is_kfunc_arg_irq_flag(const struct btf *btf, const struct btf_param
>>  	return btf_param_match_suffix(btf, arg, "__irq_flag");
>>  }
>>  
>> +static bool is_kfunc_arg_magic(const struct btf *btf, const struct btf_param *arg)
>> +{
>> +	return btf_param_match_suffix(btf, arg, "__magic");
>> +}
>> +
>>  static bool is_kfunc_arg_prog(const struct btf *btf, const struct btf_param *arg)
>>  {
>>  	return btf_param_match_suffix(btf, arg, "__prog");
>> @@ -13613,6 +13698,7 @@ static int fetch_kfunc_meta(struct bpf_verifier_env *env,
>>  	u32 func_id, *kfunc_flags;
>>  	const char *func_name;
>>  	struct btf *desc_btf;
>> +	s32 tmp_func_id;
>>  
>>  	if (kfunc_name)
>>  		*kfunc_name = NULL;
>> @@ -13632,10 +13718,28 @@ static int fetch_kfunc_meta(struct bpf_verifier_env *env,
>>  	func_proto = btf_type_by_id(desc_btf, func->type);
>>  
>>  	kfunc_flags = btf_kfunc_flags_if_allowed(desc_btf, func_id, env->prog);
>> -	if (!kfunc_flags) {
>> -		return -EACCES;
>> +	if (unlikely(!kfunc_flags)) {
> 
> What if we patch insn->imm to use the "fake" function id in add_kfunc_call()?
> Then modifications to fetch_kfunc_meta() wont be necessary.


I considered this. I wasn't sure it's safe to patch insn->imm at this
stage of verification. Also I thought it may be harder to debug the
verifier if we do btf id replacement in the calls pre-verification
(because we lose the original btf id).

Maybe I was too causious.

Alexei, Andrii, what do you think?


> 
>> +		/*
>> +		 * An _impl kfunc with KF_MAGIC_ARGS counterpart
>> +		 * does not have its own kfunc flags.
>> +		 */
>> +		tmp_func_id = magic_kfunc_by_impl(func_id);
>> +		if (tmp_func_id < 0)
>> +			return -EACCES;
>> +		kfunc_flags = btf_kfunc_flags_if_allowed(desc_btf, tmp_func_id, env->prog);
>> +		if (!kfunc_flags)
>> +			return -EACCES;
>> +	} else if (unlikely(KF_MAGIC_ARGS & *kfunc_flags)) {
>> +		/*
>> +		 * An actual func_proto of a kfunc with KF_MAGIC_ARGS flag
>> +		 * can be found through the corresponding _impl kfunc.
>> +		 */
>> +		func_proto = find_magic_kfunc_proto(desc_btf, func_id);
>>  	}
>>  
>> +	if (!func_proto)
>> +		return -EACCES;
>> +
>>  	memset(meta, 0, sizeof(*meta));
>>  	meta->btf = desc_btf;
>>  	meta->func_id = func_id;
>> @@ -14189,6 +14293,17 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
>>  	for (i = 0; i < nargs; i++) {
>>  		u32 regno = i + 1;
>>  
>> +		/*
>> +		 * Magic arguments are set after main verification pass.
>> +		 * For correct tracking of zero-extensions we have to reset subreg_def for such
>> +		 * args. Otherwise mark_btf_func_reg_size() will be inspecting subreg_def of regs
>> +		 * from an earlier (irrelevant) point in the program, which may lead to an error
>> +		 * in opt_subreg_zext_lo32_rnd_hi32().
>> +		 */
>> +		if (unlikely(KF_MAGIC_ARGS & meta.kfunc_flags
>> +				&& is_kfunc_arg_magic(desc_btf, &args[i])))
>> +			regs[regno].subreg_def = DEF_NOT_SUBREG;
>> +
>>  		t = btf_type_skip_modifiers(desc_btf, args[i].type, NULL);
>>  		if (btf_type_is_ptr(t))
>>  			mark_btf_func_reg_size(env, regno, sizeof(void *));

next prev parent reply	other threads:[~2025-10-30 16:32 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-10-29 19:01 [PATCH bpf-next v1 0/8] bpf: magic kernel functions Ihor Solodrai
2025-10-29 19:01 ` [PATCH bpf-next v1 1/8] bpf: Add BTF_ID_LIST_END and BTF_ID_LIST_SIZE macros Ihor Solodrai
2025-10-29 19:41   ` bot+bpf-ci
2025-10-29 20:44     ` Ihor Solodrai
2025-10-29 23:54   ` Eduard Zingerman
2025-10-29 19:01 ` [PATCH bpf-next v1 2/8] bpf: Refactor btf_kfunc_id_set_contains Ihor Solodrai
2025-10-29 23:55   ` Eduard Zingerman
2025-10-29 19:01 ` [PATCH bpf-next v1 3/8] bpf: Support for kfuncs with KF_MAGIC_ARGS Ihor Solodrai
2025-10-29 19:41   ` bot+bpf-ci
2025-10-29 20:49     ` Ihor Solodrai
2025-10-29 23:59       ` Eduard Zingerman
2025-10-29 23:54   ` Eduard Zingerman
2025-10-30  0:03     ` Alexei Starovoitov
2025-10-30 16:31     ` Ihor Solodrai [this message]
2025-10-30 17:26       ` Eduard Zingerman
2025-10-30 10:24   ` kernel test robot
2025-10-30 11:58   ` kernel test robot
2025-10-30 13:54   ` kernel test robot
2025-10-29 19:01 ` [PATCH bpf-next v1 4/8] bpf: Support __magic prog_aux arguments for kfuncs Ihor Solodrai
2025-10-29 19:01 ` [PATCH bpf-next v1 5/8] bpf: Re-define bpf_wq_set_callback as magic kfunc Ihor Solodrai
2025-10-30  0:16   ` Eduard Zingerman
2025-10-29 19:01 ` [PATCH bpf-next v1 6/8] bpf,docs: Document KF_MAGIC_ARGS flag and __magic annotation Ihor Solodrai
2025-10-30  0:21   ` Eduard Zingerman
2025-10-29 19:01 ` [PATCH bpf-next v1 7/8] bpf: Re-define bpf_task_work_schedule_* kfuncs as magic Ihor Solodrai
2025-10-29 19:01 ` [PATCH bpf-next v1 8/8] bpf: Re-define bpf_stream_vprintk as a magic kfunc Ihor Solodrai
2025-10-30  0:44 ` [PATCH bpf-next v1 0/8] bpf: magic kernel functions Eduard Zingerman
2025-10-30  6:11   ` Eduard Zingerman
2025-10-30 18:14     ` Eduard Zingerman
2025-10-30 18:24       ` Ihor Solodrai
2025-10-30 18:37         ` Eduard Zingerman
2025-10-30 18:26       ` Alan Maguire
2025-10-30 18:42         ` Eduard Zingerman
2025-10-30 18:46         ` Ihor Solodrai
2025-10-30 19:47           ` Andrii Nakryiko
2025-10-30 20:02             ` Ihor Solodrai
2025-10-30 20:38               ` Andrii Nakryiko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=da20bc30-85be-44ab-b837-19aa97ebc431@linux.dev \
    --to=ihor.solodrai@linux.dev \
    --cc=acme@kernel.org \
    --cc=alan.maguire@oracle.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=dwarves@vger.kernel.org \
    --cc=eddyz87@gmail.com \
    --cc=kernel-team@meta.com \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.