public inbox for bpf@vger.kernel.org
From: Martin KaFai Lau <martin.lau@linux.dev>
To: thinker.li@gmail.com
Cc: sinquersw@gmail.com, kuifeng@meta.com, bpf@vger.kernel.org,
	ast@kernel.org, song@kernel.org, kernel-team@meta.com,
	andrii@kernel.org, drosen@google.com
Subject: Re: [PATCH bpf-next v11 07/13] bpf: pass attached BTF to the bpf_struct_ops subsystem
Date: Thu, 9 Nov 2023 18:04:22 -0800
Message-ID: <5cbae302-7fa6-5625-921a-c6f548bcc3a2@linux.dev>
In-Reply-To: <20231106201252.1568931-8-thinker.li@gmail.com>

On 11/6/23 12:12 PM, thinker.li@gmail.com wrote:
> From: Kui-Feng Lee <thinker.li@gmail.com>
> 
> Every kernel module has its BTF, comprising information on types defined in
> the module. The BTF fd (attr->value_type_btf_obj_fd) passed from userspace

I would highlight that this patch adds value_type_btf_obj_fd.

> helps the bpf_struct_ops to lookup type information and description of the
> struct_ops type, which is necessary for parsing the layout of map element
> values and registering maps. The descriptions are looked up by matching a
> type id (attr->btf_vmlinux_value_type_id) against bpf_struct_ops_desc(s)
> defined in a BTF. If the struct_ops type is defined in a module, the
> bpf_struct_ops needs to know the module BTF to lookup the
> bpf_struct_ops_desc.
> 
> The bpf_prog includes attach_btf in aux which is passed along with the
> bpf_attr when loading the program. The purpose of attach_btf is to

I read this as "attach_btf" being passed in the bpf_attr. This has been in my
head for a while, but since I roughly know what the actual uapi is, I had not
gotten around to raising it.

We have already discussed this a bit offline. I think it means
attr->attach_btf_obj_fd here.

This patch is mainly about how userspace passes the kmod's btf to the kernel
during map creation and prog load, and which uapi it uses for that. The commit
message should mention that this patch reuses the existing
attr->attach_btf_obj_fd for userspace to pass the kmod's btf when loading
the struct_ops prog. As it stands, I need to go back to the syscall.c code to
figure this out and also jump ahead to the later libbpf patch to confirm it.
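
To make the two paths concrete, here is a minimal userspace sketch of which BTF
each id would be resolved against. The structs are hypothetical stand-ins that
carry only the fields named in this thread; they are not the real union
bpf_attr layout. The map side follows the check in this patch (a non-zero
value_type_btf_obj_fd selects the kmod's btf, otherwise btf_vmlinux). Treating
a zero attach_btf_obj_fd the same way on the prog load side is an assumption
made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the relevant bpf_attr fields (not the uapi). */
struct mock_map_create_attr {
	uint32_t btf_vmlinux_value_type_id; /* id of the struct_ops value type */
	uint32_t value_type_btf_obj_fd;     /* 0: btf_vmlinux, else kmod's btf */
};

struct mock_prog_load_attr {
	uint32_t attach_btf_id;             /* id of the struct_ops type */
	uint32_t attach_btf_obj_fd;         /* assumed: 0 means btf_vmlinux */
};

/* Which BTF a type id is looked up in. */
enum mock_btf_source { MOCK_BTF_VMLINUX, MOCK_BTF_MODULE };

static enum mock_btf_source mock_pick_btf(uint32_t btf_obj_fd)
{
	return btf_obj_fd ? MOCK_BTF_MODULE : MOCK_BTF_VMLINUX;
}

/* Map creation: btf_vmlinux_value_type_id is resolved in this BTF. */
static enum mock_btf_source
mock_map_btf(const struct mock_map_create_attr *attr)
{
	assert(attr != NULL);
	return mock_pick_btf(attr->value_type_btf_obj_fd);
}

/* Prog load: attach_btf_id is resolved in this BTF. */
static enum mock_btf_source
mock_prog_btf(const struct mock_prog_load_attr *attr)
{
	assert(attr != NULL);
	return mock_pick_btf(attr->attach_btf_obj_fd);
}
```

In this model both ids must be interpreted against the same kmod's btf for a
struct_ops type defined in a module, which is why the commit message should
name both fds.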

I depend on the commit message to help with the review. It would be much
appreciated if the commit message were clear and accurate about what the patch
wants to do, how it does it (additions/deletions/changes), and what the major
changes are.

> determine the btf type of attach_btf_id. The attach_btf_id is then used to
> identify the traced function for a trace program. In the case of struct_ops
> programs, it is used to identify the struct_ops type of the struct_ops
> object that a program is attached to.
> 
> Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
> ---
>   include/uapi/linux/bpf.h       |  5 +++
>   kernel/bpf/bpf_struct_ops.c    | 57 ++++++++++++++++++++++++----------
>   kernel/bpf/syscall.c           |  2 +-
>   kernel/bpf/verifier.c          |  9 ++++--
>   tools/include/uapi/linux/bpf.h |  5 +++
>   5 files changed, 57 insertions(+), 21 deletions(-)
> 
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 0f6cdf52b1da..fd20c52606b2 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -1398,6 +1398,11 @@ union bpf_attr {
>   		 * to using 5 hash functions).
>   		 */
>   		__u64	map_extra;
> +
> +		__u32   value_type_btf_obj_fd;	/* fd pointing to a BTF
> +						 * type data for
> +						 * btf_vmlinux_value_type_id.
> +						 */
>   	};
>   
>   	struct { /* anonymous struct used by BPF_MAP_*_ELEM commands */
> diff --git a/kernel/bpf/bpf_struct_ops.c b/kernel/bpf/bpf_struct_ops.c
> index 4ba6181ed1c4..2fb1b21f989a 100644
> --- a/kernel/bpf/bpf_struct_ops.c
> +++ b/kernel/bpf/bpf_struct_ops.c
> @@ -635,6 +635,7 @@ static void __bpf_struct_ops_map_free(struct bpf_map *map)
>   		bpf_jit_uncharge_modmem(PAGE_SIZE);
>   	}
>   	bpf_map_area_free(st_map->uvalue);
> +	btf_put(st_map->btf);
>   	bpf_map_area_free(st_map);
>   }
>   
> @@ -675,15 +676,30 @@ static struct bpf_map *bpf_struct_ops_map_alloc(union bpf_attr *attr)
>   	struct bpf_struct_ops_map *st_map;
>   	const struct btf_type *t, *vt;
>   	struct bpf_map *map;
> +	struct btf *btf;
>   	int ret;
>   
> -	st_ops_desc = bpf_struct_ops_find_value(btf_vmlinux, attr->btf_vmlinux_value_type_id);
> -	if (!st_ops_desc)
> -		return ERR_PTR(-ENOTSUPP);
> +	if (attr->value_type_btf_obj_fd) {
> +		/* The map holds btf for its whole life time. */
> +		btf = btf_get_by_fd(attr->value_type_btf_obj_fd);
> +		if (IS_ERR(btf))
> +			return ERR_PTR(PTR_ERR(btf));
> +	} else {
> +		btf = btf_vmlinux;
> +		btf_get(btf);
> +	}
> +
> +	st_ops_desc = bpf_struct_ops_find_value(btf, attr->btf_vmlinux_value_type_id);
> +	if (!st_ops_desc) {
> +		ret = -ENOTSUPP;
> +		goto errout;
> +	}
>   
>   	vt = st_ops_desc->value_type;
> -	if (attr->value_size != vt->size)
> -		return ERR_PTR(-EINVAL);
> +	if (attr->value_size != vt->size) {
> +		ret = -EINVAL;
> +		goto errout;
> +	}
>   
>   	t = st_ops_desc->type;
>   
> @@ -694,17 +710,18 @@ static struct bpf_map *bpf_struct_ops_map_alloc(union bpf_attr *attr)
>   		(vt->size - sizeof(struct bpf_struct_ops_value));
>   
>   	st_map = bpf_map_area_alloc(st_map_size, NUMA_NO_NODE);
> -	if (!st_map)
> -		return ERR_PTR(-ENOMEM);
> +	if (!st_map) {
> +		ret = -ENOMEM;
> +		goto errout;
> +	}
>   
> +	st_map->btf = btf;

How about doing the "st_map->btf = btf;" assignment in the same place as the
current code does (a few lines below)? Would that avoid the new "btf = NULL;"
dance in the error case?

nit: if moving a line, I would also move the following "st_map->st_ops_desc =
st_ops_desc;" down, close to where "st_map->btf = btf;" is.
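
A minimal userspace sketch of the ordering suggested above, assuming the map is
zero-initialized on allocation and that the put helper tolerates NULL. All
mock_* names are hypothetical stand-ins for the kernel functions, and the
fail_early/fail_late flags exist only to exercise the error paths. Because the
map takes over the btf reference only after the last failure point, the free
helper has nothing to put on error and both labels can fall through to a single
put.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct btf: only a reference count. */
struct mock_btf { int refcnt; };

static void mock_btf_get(struct mock_btf *btf) { btf->refcnt++; }
static void mock_btf_put(struct mock_btf *btf) { if (btf) btf->refcnt--; }

struct mock_st_map {
	struct mock_btf *btf; /* NULL until the map owns the reference */
	void *uvalue;
};

/* Stand-in for __bpf_struct_ops_map_free(): drops what the map owns. */
static void mock_map_free(struct mock_st_map *st_map)
{
	assert(st_map != NULL);
	free(st_map->uvalue);
	mock_btf_put(st_map->btf); /* no-op while st_map->btf is NULL */
	free(st_map);
}

static struct mock_st_map *mock_map_alloc(struct mock_btf *btf, int fail_early,
					  int fail_late, int *err)
{
	struct mock_st_map *st_map;
	int ret;

	mock_btf_get(btf); /* reference held by this function for now */

	if (fail_early) { /* e.g. the value_size mismatch */
		ret = -EINVAL;
		goto errout;
	}

	st_map = calloc(1, sizeof(*st_map));
	if (!st_map) {
		ret = -ENOMEM;
		goto errout;
	}

	st_map->uvalue = fail_late ? NULL : malloc(16);
	if (!st_map->uvalue) {
		ret = -ENOMEM;
		goto errout_free;
	}

	/* Nothing can fail past this point: hand the reference to the map. */
	st_map->btf = btf;
	*err = 0;
	return st_map;

errout_free:
	mock_map_free(st_map); /* st_map->btf is still NULL here */
errout:
	mock_btf_put(btf); /* single put, no "btf = NULL" needed */
	*err = ret;
	return NULL;
}
```

With this ordering a failed allocation leaves the reference count where it
started, and a successful one leaves exactly one extra reference that
mock_map_free() drops later.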

>   	st_map->st_ops_desc = st_ops_desc;
>   	map = &st_map->map;
>   
>   	ret = bpf_jit_charge_modmem(PAGE_SIZE);
> -	if (ret) {
> -		__bpf_struct_ops_map_free(map);
> -		return ERR_PTR(ret);
> -	}
> +	if (ret)
> +		goto errout_free;
>   
>   	st_map->image = bpf_jit_alloc_exec(PAGE_SIZE);
>   	if (!st_map->image) {
> @@ -713,25 +730,31 @@ static struct bpf_map *bpf_struct_ops_map_alloc(union bpf_attr *attr)
>   		 * here.
>   		 */
>   		bpf_jit_uncharge_modmem(PAGE_SIZE);
> -		__bpf_struct_ops_map_free(map);
> -		return ERR_PTR(-ENOMEM);
> +		ret = -ENOMEM;
> +		goto errout_free;
>   	}
>   	st_map->uvalue = bpf_map_area_alloc(vt->size, NUMA_NO_NODE);
>   	st_map->links =
>   		bpf_map_area_alloc(btf_type_vlen(t) * sizeof(struct bpf_links *),
>   				   NUMA_NO_NODE);
>   	if (!st_map->uvalue || !st_map->links) {
> -		__bpf_struct_ops_map_free(map);
> -		return ERR_PTR(-ENOMEM);
> +		ret = -ENOMEM;
> +		goto errout_free;
>   	}
>   
> -	st_map->btf = btf_vmlinux;

The old code initializes "st_map->btf" here.

> -
>   	mutex_init(&st_map->lock);
>   	set_vm_flush_reset_perms(st_map->image);
>   	bpf_map_init_from_attr(map, attr);
>   
>   	return map;
> +
> +errout_free:
> +	__bpf_struct_ops_map_free(map);
> +	btf = NULL;		/* has been released */
> +errout:
> +	btf_put(btf);
> +
> +	return ERR_PTR(ret);
>   }
>   
>   static u64 bpf_struct_ops_map_mem_usage(const struct bpf_map *map)
> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> index 0ed286b8a0f0..974651fe2bee 100644
> --- a/kernel/bpf/syscall.c
> +++ b/kernel/bpf/syscall.c
> @@ -1096,7 +1096,7 @@ static int map_check_btf(struct bpf_map *map, const struct btf *btf,
>   	return ret;
>   }
>   
> -#define BPF_MAP_CREATE_LAST_FIELD map_extra
> +#define BPF_MAP_CREATE_LAST_FIELD value_type_btf_obj_fd
>   /* called via syscall */
>   static int map_create(union bpf_attr *attr)
>   {
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index bdd166cab977..3f446f76d4bf 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -20086,6 +20086,7 @@ static int check_struct_ops_btf_id(struct bpf_verifier_env *env)
>   	const struct btf_member *member;
>   	struct bpf_prog *prog = env->prog;
>   	u32 btf_id, member_idx;
> +	struct btf *btf;
>   	const char *mname;
>   
>   	if (!prog->gpl_compatible) {
> @@ -20093,8 +20094,10 @@ static int check_struct_ops_btf_id(struct bpf_verifier_env *env)
>   		return -EINVAL;
>   	}
>   
> +	btf = prog->aux->attach_btf;
> +
>   	btf_id = prog->aux->attach_btf_id;
> -	st_ops_desc = bpf_struct_ops_find(btf_vmlinux, btf_id);
> +	st_ops_desc = bpf_struct_ops_find(btf, btf_id);
>   	if (!st_ops_desc) {
>   		verbose(env, "attach_btf_id %u is not a supported struct\n",
>   			btf_id);
> @@ -20111,8 +20114,8 @@ static int check_struct_ops_btf_id(struct bpf_verifier_env *env)
>   	}
>   
>   	member = &btf_type_member(t)[member_idx];
> -	mname = btf_name_by_offset(btf_vmlinux, member->name_off);
> -	func_proto = btf_type_resolve_func_ptr(btf_vmlinux, member->type,
> +	mname = btf_name_by_offset(btf, member->name_off);
> +	func_proto = btf_type_resolve_func_ptr(btf, member->type,
>   					       NULL);
>   	if (!func_proto) {
>   		verbose(env, "attach to invalid member %s(@idx %u) of struct %s\n",
> diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
> index 0f6cdf52b1da..fd20c52606b2 100644
> --- a/tools/include/uapi/linux/bpf.h
> +++ b/tools/include/uapi/linux/bpf.h
> @@ -1398,6 +1398,11 @@ union bpf_attr {
>   		 * to using 5 hash functions).
>   		 */
>   		__u64	map_extra;
> +
> +		__u32   value_type_btf_obj_fd;	/* fd pointing to a BTF
> +						 * type data for
> +						 * btf_vmlinux_value_type_id.
> +						 */
>   	};
>   
>   	struct { /* anonymous struct used by BPF_MAP_*_ELEM commands */


