From: Kui-Feng Lee <sinquersw@gmail.com>
To: Martin KaFai Lau <martin.lau@linux.dev>, thinker.li@gmail.com
Cc: kuifeng@meta.com, bpf@vger.kernel.org, ast@kernel.org,
song@kernel.org, kernel-team@meta.com, andrii@kernel.org,
drosen@google.com
Subject: Re: [PATCH bpf-next v13 08/14] bpf: hold module for bpf_struct_ops_map.
Date: Fri, 15 Dec 2023 15:25:42 -0800
Message-ID: <7ddc9157-1eba-41b1-a3fd-bbf315f9cfb9@gmail.com>
In-Reply-To: <0a8849cd-b8ca-4219-b7cc-5331c42fc190@linux.dev>

On 12/14/23 21:54, Martin KaFai Lau wrote:
> On 12/8/23 4:27 PM, thinker.li@gmail.com wrote:
>> From: Kui-Feng Lee <thinker.li@gmail.com>
>>
>> To ensure that a module remains accessible whenever a struct_ops
>> object of
>> a struct_ops type provided by the module is still in use.
>>
>> struct bpf_strct_ops_map doesn't hold a refcnt to btf anymore sicne a
>
> s /bpf_strct_/bpf_struct_/
>
> s/sicne/since/
>
>> module will hold a refcnt to it's btf already. But, struct_ops
>> programs are
>> different. They hold their associated btf, not the module since they need
>> only btf to assure their types (signatures).
>
> The patch subject is not accurate. The patch holds the module refcnt
> when verifying the bpf prog also. May be "hold module refcnt in
> struct_ops map creation and prog verification".
>
> The commit message also is inaccurate on the prog load. It did not
> mention the module is also held when loading struct_ops prog but it is
> only held during the verification time. Please explain why it is only
> needed during the verification time.
>
>>
>> Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
>> ---
>> include/linux/bpf.h | 1 +
>> include/linux/bpf_verifier.h | 1 +
>> kernel/bpf/bpf_struct_ops.c | 28 +++++++++++++++++++++++-----
>> kernel/bpf/verifier.c | 10 ++++++++++
>> 4 files changed, 35 insertions(+), 5 deletions(-)
>>
>> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
>> index 91bcd62d6fcf..c5c7cc4552f5 100644
>> --- a/include/linux/bpf.h
>> +++ b/include/linux/bpf.h
>> @@ -1681,6 +1681,7 @@ struct bpf_struct_ops {
>> void (*unreg)(void *kdata);
>> int (*update)(void *kdata, void *old_kdata);
>> int (*validate)(void *kdata);
>> + struct module *owner;
>> const char *name;
>> struct btf_func_model func_models[BPF_STRUCT_OPS_MAX_NR_MEMBERS];
>> };
>> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
>> index 314b679fb494..01113bcdd479 100644
>> --- a/include/linux/bpf_verifier.h
>> +++ b/include/linux/bpf_verifier.h
>> @@ -651,6 +651,7 @@ struct bpf_verifier_env {
>> u32 prev_insn_idx;
>> struct bpf_prog *prog; /* eBPF program being verified */
>> const struct bpf_verifier_ops *ops;
>> + struct module *attach_btf_mod; /* The owner module of
>> prog->aux->attach_btf */
>> struct bpf_verifier_stack_elem *head; /* stack of verifier
>> states to be processed */
>> int stack_size; /* number of states to be processed */
>> bool strict_alignment; /* perform strict pointer
>> alignment checks */
>> diff --git a/kernel/bpf/bpf_struct_ops.c b/kernel/bpf/bpf_struct_ops.c
>> index f943f8378e76..a838f7c7d583 100644
>> --- a/kernel/bpf/bpf_struct_ops.c
>> +++ b/kernel/bpf/bpf_struct_ops.c
>> @@ -641,12 +641,15 @@ static void __bpf_struct_ops_map_free(struct
>> bpf_map *map)
>> bpf_jit_uncharge_modmem(PAGE_SIZE);
>> }
>> bpf_map_area_free(st_map->uvalue);
>> - btf_put(st_map->btf);
>> bpf_map_area_free(st_map);
>> }
>> static void bpf_struct_ops_map_free(struct bpf_map *map)
>> {
>> + struct bpf_struct_ops_map *st_map = (struct bpf_struct_ops_map
>> *)map;
>> +
>> + module_put(st_map->st_ops_desc->st_ops->owner);
>
> The module_get was not done on st_ops->owner when st_map->btf is
> btf_vmlinux (i.e. not module). Although it probably does not matter, I
> would feel more comfortable if it only releases for the things that it
> did acquire earlier.
>
> /* st_ops->owner was acquired during map_alloc to implicitly holds
> * the btf's refcnt. The acquire was only done when btf_is_module()
> * st_map->btf cannot be NULL here.
> */
> if (btf_is_module(st_map->btf))
> module_put(st_map->st_ops_desc->st_ops->owner);
Sure! I will update it.
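
To make the intended pairing concrete, here is a minimal userspace sketch of the rule "release only what was acquired". It is not kernel code: the struct layouts and the model_* helpers are hypothetical stand-ins, and a NULL owner is assumed to represent btf_vmlinux.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel types; only the fields needed here. */
struct module { int refcnt; };
struct btf { struct module *owner; };	/* owner == NULL models btf_vmlinux */
struct st_map { struct btf *btf; struct module *mod; };

/* Hypothetical model of btf_is_module(). */
static bool model_btf_is_module(const struct btf *btf)
{
	return btf->owner != NULL;
}

/* Models map_alloc: take a module reference only for a module BTF. */
static void model_map_alloc(struct st_map *m, struct btf *btf)
{
	m->btf = btf;
	m->mod = NULL;
	if (model_btf_is_module(btf)) {
		btf->owner->refcnt++;
		m->mod = btf->owner;
	}
}

/* Models map_free: release only the reference that map_alloc acquired. */
static void model_map_free(struct st_map *m)
{
	if (model_btf_is_module(m->btf)) {
		assert(m->mod->refcnt > 0);
		m->mod->refcnt--;
	}
	m->btf = NULL;
	m->mod = NULL;
}
```

In this model a map backed by btf_vmlinux never touches a module refcount, on either the alloc path or the free path.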
>
>> +
>> /* The struct_ops's function may switch to another struct_ops.
>> *
>> * For example, bpf_tcp_cc_x->init() may switch to
>> @@ -681,6 +684,7 @@ static struct bpf_map
>> *bpf_struct_ops_map_alloc(union bpf_attr *attr)
>> size_t st_map_size;
>> struct bpf_struct_ops_map *st_map;
>> const struct btf_type *t, *vt;
>> + struct module *mod = NULL;
>> struct bpf_map *map;
>> struct btf *btf;
>> int ret;
>> @@ -690,10 +694,20 @@ static struct bpf_map
>> *bpf_struct_ops_map_alloc(union bpf_attr *attr)
>> btf = btf_get_by_fd(attr->value_type_btf_obj_fd);
>> if (IS_ERR(btf))
>> return ERR_PTR(PTR_ERR(btf));
>> - } else {
>> +
>> + if (btf != btf_vmlinux) {
>> + mod = btf_try_get_module(btf);
>> + if (!mod) {
>> + btf_put(btf);
>> + return ERR_PTR(-EINVAL);
>> + }
>> + }
>> + /* mod (NULL for btf_vmlinux) holds a refcnt to btf. We
>> + * don't need an extra refcnt here.
>> + */
>> + btf_put(btf);
>> + } else
>> btf = btf_vmlinux;
>> - btf_get(btf);
>> - }
>> st_ops_desc = bpf_struct_ops_find_value(btf,
>> attr->btf_vmlinux_value_type_id);
>> if (!st_ops_desc) {
>> @@ -756,7 +770,7 @@ static struct bpf_map
>> *bpf_struct_ops_map_alloc(union bpf_attr *attr)
>> errout_free:
>> __bpf_struct_ops_map_free(map);
>> errout:
>> - btf_put(btf);
>> + module_put(mod);
>> return ERR_PTR(ret);
>> }
>> @@ -886,6 +900,10 @@ static int bpf_struct_ops_map_link_update(struct
>> bpf_link *link, struct bpf_map
>> if (!bpf_struct_ops_valid_to_reg(new_map))
>> return -EINVAL;
>> + /* The old map is holding the refcount for the owner module. The
>> + * ownership of the owner module refcount is going to be
>> + * transferred from the old map to the new map.
>> + */
>
> This part I don't understand. Both old and new map hold its own module's
> refcount at map_alloc time and release its own module refcnt during
> map_free().
> Where the module refcount transfer happened?
Sorry! This comment is no longer valid. I will remove it.
>
>> if (!st_map->st_ops_desc->st_ops->update)
>> return -EOPNOTSUPP;
>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>> index 795c16f9cf57..c303cf2fb5ff 100644
>> --- a/kernel/bpf/verifier.c
>> +++ b/kernel/bpf/verifier.c
>> @@ -20079,6 +20079,14 @@ static int check_struct_ops_btf_id(struct
>> bpf_verifier_env *env)
>> }
>> btf = prog->aux->attach_btf;
>> + if (btf != btf_vmlinux) {
>
> if (btf_is_module(btf)) {
>
Got it!
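
As a rough illustration of holding the module only for the duration of verification, here is a userspace sketch. Everything in it is a hypothetical model rather than the verifier itself: model_try_get_module() stands in for btf_try_get_module() and is assumed to fail when the module is no longer live, and model_module_put() mirrors the fact that module_put(NULL) is a no-op.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins; "live" models whether the module can still be pinned. */
struct module { int refcnt; int live; };
struct btf { struct module *owner; };	/* owner == NULL models btf_vmlinux */
struct verifier_env { struct module *attach_btf_mod; };

/* Hypothetical model of btf_try_get_module(): NULL if no live owner. */
static struct module *model_try_get_module(struct btf *btf)
{
	struct module *mod = btf->owner;

	if (!mod || !mod->live)
		return NULL;
	mod->refcnt++;
	return mod;
}

/* Like module_put(), accepts NULL and does nothing in that case. */
static void model_module_put(struct module *mod)
{
	if (!mod)
		return;
	assert(mod->refcnt > 0);
	mod->refcnt--;
}

/* Start of verification: pin the owner module for a module BTF only.
 * Returns 0 on success, -1 if the owner module cannot be pinned.
 */
static int model_verify_begin(struct verifier_env *env, struct btf *btf)
{
	env->attach_btf_mod = NULL;
	if (btf->owner) {	/* models btf_is_module(btf) */
		env->attach_btf_mod = model_try_get_module(btf);
		if (!env->attach_btf_mod)
			return -1;
	}
	return 0;
}

/* End of verification: drop the pin; a no-op for btf_vmlinux. */
static void model_verify_end(struct verifier_env *env)
{
	model_module_put(env->attach_btf_mod);
	env->attach_btf_mod = NULL;
}
```

Because the put helper tolerates NULL, the end-of-verification path in this model needs no separate branch for the btf_vmlinux case.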
>> + /* Make sure st_ops is valid through the lifetime of env */
>> + env->attach_btf_mod = btf_try_get_module(btf);
>> + if (!env->attach_btf_mod) {
>> + verbose(env, "owner module of btf is not found\n");
>> + return -ENOTSUPP;
>> + }
>> + }
>> btf_id = prog->aux->attach_btf_id;
>> st_ops_desc = bpf_struct_ops_find(btf, btf_id);
>> @@ -20792,6 +20800,8 @@ int bpf_check(struct bpf_prog **prog, union
>> bpf_attr *attr, bpfptr_t uattr, __u3
>> env->prog->expected_attach_type = 0;
>> *prog = env->prog;
>> +
>> + module_put(env->attach_btf_mod);
>> err_unlock:
>> if (!is_priv)
>> mutex_unlock(&bpf_verifier_lock);
>