From: Kui-Feng Lee <sinquersw@gmail.com>
To: Martin KaFai Lau <martin.lau@linux.dev>, thinker.li@gmail.com
Cc: kuifeng@meta.com, bpf@vger.kernel.org, ast@kernel.org,
song@kernel.org, kernel-team@meta.com, andrii@kernel.org,
drosen@google.com
Subject: Re: [PATCH bpf-next v11 07/13] bpf: pass attached BTF to the bpf_struct_ops subsystem
Date: Wed, 22 Nov 2023 14:33:59 -0800
Message-ID: <180568df-308f-4bc5-8a54-a9f224123429@gmail.com>
In-Reply-To: <5cbae302-7fa6-5625-921a-c6f548bcc3a2@linux.dev>
On 11/9/23 18:04, Martin KaFai Lau wrote:
> On 11/6/23 12:12 PM, thinker.li@gmail.com wrote:
>> From: Kui-Feng Lee <thinker.li@gmail.com>
>>
>> Every kernel module has its BTF, comprising information on types defined
>> in the module. The BTF fd (attr->value_type_btf_obj_fd) passed from userspace
>
> I would highlight this patch (adds) value_type_btf_obj_fd.
>
>> helps the bpf_struct_ops to lookup type information and description of the
>> struct_ops type, which is necessary for parsing the layout of map element
>> values and registering maps. The descriptions are looked up by matching a
>> type id (attr->btf_vmlinux_value_type_id) against bpf_struct_ops_desc(s)
>> defined in a BTF. If the struct_ops type is defined in a module, the
>> bpf_struct_ops needs to know the module BTF to lookup the
>> bpf_struct_ops_desc.
>>
>> The bpf_prog includes attach_btf in aux which is passed along with the
>> bpf_attr when loading the program. The purpose of attach_btf is to
>
> I read it as "attach_btf" is passed in the bpf_attr. This has been in my
> head for a while. I sort of know what is the actual uapi, so didn't get
> to it yet.
>
> We have already discussed a bit of this offline. I think it meant
> attr->attach_btf_obj_fd here.
>
> This patch is mainly about how the userspace passing kmod's btf to the
> kernel during map creation and prog load and also what uapi does it use.
> The commit message should mention this patch is reusing the existing
> attr->attach_btf_obj_fd for the userspace to pass the kmod's btf when
> loading the struct_ops prog. I need to go back to the syscall.c code to
> figure out and also leap forward to the later libbpf patch to confirm it.
>
> I depend on the commit message to help the review. It is much
> appreciated if the commit message is clear and accurate on things like:
> what it wants to do, how it does it (addition/deletion/changes), and
> what are the major changes.
Got it! I will rewrite the commit log so that it states what the patch
does and how it does it, which should make the patch easier to review.
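For the rewritten log, a rough sketch of the two uapi paths may help.
The structs below are hypothetical, trimmed-down stand-ins for the
relevant members of union bpf_attr. Only the field names come from this
series and the existing uapi; the layout and the sketch_ helpers are
made up for illustration. The fallback rule mirrors the map-creation
hunk in this patch, where a zero value_type_btf_obj_fd selects
btf_vmlinux.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the BPF_MAP_CREATE part of union bpf_attr. */
struct sketch_map_create_attr {
	uint32_t value_size;
	uint32_t btf_vmlinux_value_type_id; /* type id of the value type */
	uint32_t value_type_btf_obj_fd;     /* added by this patch; 0 = vmlinux */
};

/* Hypothetical stand-in for the BPF_PROG_LOAD part of union bpf_attr. */
struct sketch_prog_load_attr {
	uint32_t attach_btf_id;     /* type id of the struct_ops type */
	uint32_t attach_btf_obj_fd; /* existing field, reused for the kmod's BTF */
};

/* Map creation: pass the module BTF fd next to the value type id. */
static void sketch_fill_map_attr(struct sketch_map_create_attr *attr,
				 uint32_t mod_btf_fd, uint32_t value_type_id,
				 uint32_t value_size)
{
	assert(attr != NULL);
	memset(attr, 0, sizeof(*attr));
	attr->value_size = value_size;
	attr->btf_vmlinux_value_type_id = value_type_id;
	attr->value_type_btf_obj_fd = mod_btf_fd;
}

/* Program load: reuse attach_btf_obj_fd to name the module BTF. */
static void sketch_fill_prog_attr(struct sketch_prog_load_attr *attr,
				  uint32_t mod_btf_fd, uint32_t type_id)
{
	assert(attr != NULL);
	memset(attr, 0, sizeof(*attr));
	attr->attach_btf_id = type_id;
	attr->attach_btf_obj_fd = mod_btf_fd;
}

/* Same test as the "if (attr->value_type_btf_obj_fd)" branch in the patch. */
static bool sketch_map_uses_vmlinux_btf(const struct sketch_map_create_attr *attr)
{
	return attr->value_type_btf_obj_fd == 0;
}
```

The point of the sketch is only that map creation gets a new field while
program load reuses an existing one, which is what the commit log should
say explicitly.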
>
>> determine the btf type of attach_btf_id. The attach_btf_id is then used to
>> identify the traced function for a trace program. In the case of struct_ops
>> programs, it is used to identify the struct_ops type of the struct_ops
>> object that a program is attached to.
>>
>> Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
>> ---
>> include/uapi/linux/bpf.h | 5 +++
>> kernel/bpf/bpf_struct_ops.c | 57 ++++++++++++++++++++++++----------
>> kernel/bpf/syscall.c | 2 +-
>> kernel/bpf/verifier.c | 9 ++++--
>> tools/include/uapi/linux/bpf.h | 5 +++
>> 5 files changed, 57 insertions(+), 21 deletions(-)
>>
>> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
>> index 0f6cdf52b1da..fd20c52606b2 100644
>> --- a/include/uapi/linux/bpf.h
>> +++ b/include/uapi/linux/bpf.h
>> @@ -1398,6 +1398,11 @@ union bpf_attr {
>> * to using 5 hash functions).
>> */
>> __u64 map_extra;
>> +
>> + __u32 value_type_btf_obj_fd; /* fd pointing to a BTF
>> + * type data for
>> + * btf_vmlinux_value_type_id.
>> + */
>> };
>> struct { /* anonymous struct used by BPF_MAP_*_ELEM commands */
>> diff --git a/kernel/bpf/bpf_struct_ops.c b/kernel/bpf/bpf_struct_ops.c
>> index 4ba6181ed1c4..2fb1b21f989a 100644
>> --- a/kernel/bpf/bpf_struct_ops.c
>> +++ b/kernel/bpf/bpf_struct_ops.c
>> @@ -635,6 +635,7 @@ static void __bpf_struct_ops_map_free(struct bpf_map *map)
>> bpf_jit_uncharge_modmem(PAGE_SIZE);
>> }
>> bpf_map_area_free(st_map->uvalue);
>> + btf_put(st_map->btf);
>> bpf_map_area_free(st_map);
>> }
>> @@ -675,15 +676,30 @@ static struct bpf_map *bpf_struct_ops_map_alloc(union bpf_attr *attr)
>> struct bpf_struct_ops_map *st_map;
>> const struct btf_type *t, *vt;
>> struct bpf_map *map;
>> + struct btf *btf;
>> int ret;
>> - st_ops_desc = bpf_struct_ops_find_value(btf_vmlinux, attr->btf_vmlinux_value_type_id);
>> - if (!st_ops_desc)
>> - return ERR_PTR(-ENOTSUPP);
>> + if (attr->value_type_btf_obj_fd) {
>> + /* The map holds btf for its whole life time. */
>> + btf = btf_get_by_fd(attr->value_type_btf_obj_fd);
>> + if (IS_ERR(btf))
>> + return ERR_PTR(PTR_ERR(btf));
>> + } else {
>> + btf = btf_vmlinux;
>> + btf_get(btf);
>> + }
>> +
>> + st_ops_desc = bpf_struct_ops_find_value(btf, attr->btf_vmlinux_value_type_id);
>> + if (!st_ops_desc) {
>> + ret = -ENOTSUPP;
>> + goto errout;
>> + }
>> vt = st_ops_desc->value_type;
>> - if (attr->value_size != vt->size)
>> - return ERR_PTR(-EINVAL);
>> + if (attr->value_size != vt->size) {
>> + ret = -EINVAL;
>> + goto errout;
>> + }
>> t = st_ops_desc->type;
>> @@ -694,17 +710,18 @@ static struct bpf_map *bpf_struct_ops_map_alloc(union bpf_attr *attr)
>> (vt->size - sizeof(struct bpf_struct_ops_value));
>> st_map = bpf_map_area_alloc(st_map_size, NUMA_NO_NODE);
>> - if (!st_map)
>> - return ERR_PTR(-ENOMEM);
>> + if (!st_map) {
>> + ret = -ENOMEM;
>> + goto errout;
>> + }
>> + st_map->btf = btf;
>
> How about do the "st_map->btf = btf;" assignment the same as where the
> current code is doing (a few lines below). Would it avoid the new "btf =
> NULL;" dance during the error case?
>
> nit, if moving a line, I would move the following "st_map->st_ops_desc =
> st_ops_desc;" to the later and close to where "st_map->btf = btf;" is.
It would work, but I would also need to initialize st_map->btf to NULL.
Otherwise, if I read it correctly, the errout_free path may call
btf_put() on an invalid pointer.
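To make the concern concrete, here is a small userspace sketch of the
error path with the late assignment. Everything in it is a hypothetical
stand-in (sketch_btf, sketch_map, the fail_at knob), not the kernel
code. It assumes the map allocation is zeroed, which calloc() guarantees
here; whether the kernel allocation gives the same guarantee is exactly
what I need to check. It also assumes the put helper ignores NULL, which
the "btf = NULL;" line in this patch already relies on.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical refcounted object standing in for struct btf. */
struct sketch_btf {
	int refcnt;
};

static void sketch_btf_get(struct sketch_btf *btf)
{
	btf->refcnt++;
}

static void sketch_btf_put(struct sketch_btf *btf)
{
	if (btf) /* NULL is ignored */
		btf->refcnt--;
}

/* Hypothetical stand-in for struct bpf_struct_ops_map. */
struct sketch_map {
	struct sketch_btf *btf;
	void *uvalue;
};

static void sketch_map_free(struct sketch_map *m)
{
	free(m->uvalue);
	sketch_btf_put(m->btf); /* no-op while m->btf is still NULL */
	free(m);
}

/*
 * fail_at: 0 = succeed, 1 = fail before the map is allocated,
 *          2 = fail after the map is allocated.
 */
static struct sketch_map *sketch_map_alloc(struct sketch_btf *btf, int fail_at)
{
	struct sketch_map *m;

	sketch_btf_get(btf); /* local reference, handed to the map on success */
	if (fail_at == 1)
		goto errout;

	m = calloc(1, sizeof(*m)); /* zeroed, so m->btf starts out NULL */
	if (!m)
		goto errout;

	if (fail_at == 2)
		goto errout_free;

	m->uvalue = malloc(16);
	if (!m->uvalue)
		goto errout_free;

	m->btf = btf; /* late assignment: the map owns the reference from here */
	return m;

errout_free:
	sketch_map_free(m); /* m->btf is NULL on this path, nothing is dropped */
errout:
	sketch_btf_put(btf); /* drop the local reference exactly once */
	return NULL;
}
```

With a zeroed map, the late assignment needs no "btf = NULL;" before
errout. Without that guarantee, the free helper would read garbage from
m->btf, which is the case I was worried about.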
>
>> st_map->st_ops_desc = st_ops_desc;
>> map = &st_map->map;
>> ret = bpf_jit_charge_modmem(PAGE_SIZE);
>> - if (ret) {
>> - __bpf_struct_ops_map_free(map);
>> - return ERR_PTR(ret);
>> - }
>> + if (ret)
>> + goto errout_free;
>> st_map->image = bpf_jit_alloc_exec(PAGE_SIZE);
>> if (!st_map->image) {
>> @@ -713,25 +730,31 @@ static struct bpf_map *bpf_struct_ops_map_alloc(union bpf_attr *attr)
>> * here.
>> */
>> bpf_jit_uncharge_modmem(PAGE_SIZE);
>> - __bpf_struct_ops_map_free(map);
>> - return ERR_PTR(-ENOMEM);
>> + ret = -ENOMEM;
>> + goto errout_free;
>> }
>> st_map->uvalue = bpf_map_area_alloc(vt->size, NUMA_NO_NODE);
>> st_map->links =
>> bpf_map_area_alloc(btf_type_vlen(t) * sizeof(struct bpf_links *),
>> NUMA_NO_NODE);
>> if (!st_map->uvalue || !st_map->links) {
>> - __bpf_struct_ops_map_free(map);
>> - return ERR_PTR(-ENOMEM);
>> + ret = -ENOMEM;
>> + goto errout_free;
>> }
>> - st_map->btf = btf_vmlinux;
>
> The old code initializes "st_map->btf" here.
>
>> -
>> mutex_init(&st_map->lock);
>> set_vm_flush_reset_perms(st_map->image);
>> bpf_map_init_from_attr(map, attr);
>> return map;
>> +
>> +errout_free:
>> + __bpf_struct_ops_map_free(map);
>> + btf = NULL; /* has been released */
>> +errout:
>> + btf_put(btf);
>> +
>> + return ERR_PTR(ret);
>> }
>> static u64 bpf_struct_ops_map_mem_usage(const struct bpf_map *map)
>> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
>> index 0ed286b8a0f0..974651fe2bee 100644
>> --- a/kernel/bpf/syscall.c
>> +++ b/kernel/bpf/syscall.c
>> @@ -1096,7 +1096,7 @@ static int map_check_btf(struct bpf_map *map, const struct btf *btf,
>> return ret;
>> }
>> -#define BPF_MAP_CREATE_LAST_FIELD map_extra
>> +#define BPF_MAP_CREATE_LAST_FIELD value_type_btf_obj_fd
>> /* called via syscall */
>> static int map_create(union bpf_attr *attr)
>> {
>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>> index bdd166cab977..3f446f76d4bf 100644
>> --- a/kernel/bpf/verifier.c
>> +++ b/kernel/bpf/verifier.c
>> @@ -20086,6 +20086,7 @@ static int check_struct_ops_btf_id(struct bpf_verifier_env *env)
>> const struct btf_member *member;
>> struct bpf_prog *prog = env->prog;
>> u32 btf_id, member_idx;
>> + struct btf *btf;
>> const char *mname;
>> if (!prog->gpl_compatible) {
>> @@ -20093,8 +20094,10 @@ static int check_struct_ops_btf_id(struct bpf_verifier_env *env)
>> return -EINVAL;
>> }
>> + btf = prog->aux->attach_btf;
>> +
>> btf_id = prog->aux->attach_btf_id;
>> - st_ops_desc = bpf_struct_ops_find(btf_vmlinux, btf_id);
>> + st_ops_desc = bpf_struct_ops_find(btf, btf_id);
>> if (!st_ops_desc) {
>> verbose(env, "attach_btf_id %u is not a supported struct\n",
>> btf_id);
>> @@ -20111,8 +20114,8 @@ static int check_struct_ops_btf_id(struct bpf_verifier_env *env)
>> }
>> member = &btf_type_member(t)[member_idx];
>> - mname = btf_name_by_offset(btf_vmlinux, member->name_off);
>> - func_proto = btf_type_resolve_func_ptr(btf_vmlinux, member->type,
>> + mname = btf_name_by_offset(btf, member->name_off);
>> + func_proto = btf_type_resolve_func_ptr(btf, member->type,
>> NULL);
>> if (!func_proto) {
>> verbose(env, "attach to invalid member %s(@idx %u) of struct %s\n",
>> diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
>> index 0f6cdf52b1da..fd20c52606b2 100644
>> --- a/tools/include/uapi/linux/bpf.h
>> +++ b/tools/include/uapi/linux/bpf.h
>> @@ -1398,6 +1398,11 @@ union bpf_attr {
>> * to using 5 hash functions).
>> */
>> __u64 map_extra;
>> +
>> + __u32 value_type_btf_obj_fd; /* fd pointing to a BTF
>> + * type data for
>> + * btf_vmlinux_value_type_id.
>> + */
>> };
>> struct { /* anonymous struct used by BPF_MAP_*_ELEM commands */
>
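For reference when reading the verifier hunk, the lookup described in
the commit log can be pictured as a per-BTF table search. This is only
an illustrative sketch: sketch_desc, sketch_btf_tab and the two finders
are hypothetical names, and the real descriptors carry more than a pair
of ids. It shows why the caller has to supply the right BTF: a type id
is matched only against the descriptors registered in that BTF.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced form of a struct_ops descriptor. */
struct sketch_desc {
	uint32_t type_id;       /* id of the struct_ops type */
	uint32_t value_type_id; /* id of the map value type */
	const char *name;
};

/* Hypothetical per-BTF table of descriptors. */
struct sketch_btf_tab {
	const struct sketch_desc *descs;
	size_t cnt;
};

/* Program load side: match attach_btf_id against the given BTF's table. */
static const struct sketch_desc *
sketch_find(const struct sketch_btf_tab *btf, uint32_t type_id)
{
	size_t i;

	assert(btf != NULL);
	for (i = 0; i < btf->cnt; i++)
		if (btf->descs[i].type_id == type_id)
			return &btf->descs[i];
	return NULL;
}

/* Map creation side: match the value type id against the given BTF's table. */
static const struct sketch_desc *
sketch_find_value(const struct sketch_btf_tab *btf, uint32_t value_type_id)
{
	size_t i;

	assert(btf != NULL);
	for (i = 0; i < btf->cnt; i++)
		if (btf->descs[i].value_type_id == value_type_id)
			return &btf->descs[i];
	return NULL;
}
```

A lookup that always used the vmlinux table would miss a descriptor
registered only in a module's table, which is the case this patch
handles by passing the attached BTF down.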
Thread overview: 32+ messages
2023-11-06 20:12 [PATCH bpf-next v11 00/13] Registrating struct_ops types from modules thinker.li
2023-11-06 20:12 ` [PATCH bpf-next v11 01/13] bpf: refactory struct_ops type initialization to a function thinker.li
2023-11-10 1:11 ` Martin KaFai Lau
2023-11-21 23:53 ` Kui-Feng Lee
2023-11-06 20:12 ` [PATCH bpf-next v11 02/13] bpf: get type information with BPF_ID_LIST thinker.li
2023-11-06 20:12 ` [PATCH bpf-next v11 03/13] bpf, net: introduce bpf_struct_ops_desc thinker.li
2023-11-06 20:12 ` [PATCH bpf-next v11 04/13] bpf: add struct_ops_tab to btf thinker.li
2023-11-10 1:35 ` Martin KaFai Lau
2023-11-22 2:27 ` Kui-Feng Lee
2023-11-06 20:12 ` [PATCH bpf-next v11 05/13] bpf: make struct_ops_map support btfs other than btf_vmlinux thinker.li
2023-11-10 1:40 ` Martin KaFai Lau
2023-11-22 2:28 ` Kui-Feng Lee
2023-11-06 20:12 ` [PATCH bpf-next v11 06/13] bpf: lookup struct_ops types from a given module BTF thinker.li
2023-11-06 20:12 ` [PATCH bpf-next v11 07/13] bpf: pass attached BTF to the bpf_struct_ops subsystem thinker.li
2023-11-10 2:04 ` Martin KaFai Lau
2023-11-22 22:33 ` Kui-Feng Lee [this message]
2023-11-27 22:08 ` Martin KaFai Lau
2023-11-06 20:12 ` [PATCH bpf-next v11 08/13] bpf: hold module for bpf_struct_ops_map thinker.li
2023-11-06 20:12 ` [PATCH bpf-next v11 09/13] bpf: validate value_type thinker.li
2023-11-10 2:11 ` Martin KaFai Lau
2023-11-22 23:47 ` Kui-Feng Lee
2023-11-06 20:12 ` [PATCH bpf-next v11 10/13] bpf, net: switch to dynamic registration thinker.li
2023-11-10 2:19 ` Martin KaFai Lau
2023-11-22 23:53 ` Kui-Feng Lee
2023-11-06 20:12 ` [PATCH bpf-next v11 11/13] libbpf: Find correct module BTFs for struct_ops maps and progs thinker.li
2023-11-06 20:12 ` [PATCH bpf-next v11 12/13] bpf: export btf_ctx_access to modules thinker.li
2023-11-06 20:12 ` [PATCH bpf-next v11 13/13] selftests/bpf: test case for register_bpf_struct_ops() thinker.li
2023-11-10 2:23 ` Martin KaFai Lau
2023-11-22 23:59 ` Kui-Feng Lee
2023-11-17 10:45 ` Hou Tao
2023-11-23 0:00 ` Kui-Feng Lee
2023-11-10 6:56 ` [PATCH bpf-next v11 00/13] Registrating struct_ops types from modules Martin KaFai Lau