From: Martin KaFai Lau <martin.lau@linux.dev>
To: Kui-Feng Lee <sinquersw@gmail.com>, Kui-Feng Lee <thinker.li@gmail.com>
Cc: kuifeng@meta.com, bpf@vger.kernel.org, ast@kernel.org,
song@kernel.org, kernel-team@meta.com, andrii@kernel.org
Subject: Re: [PATCH bpf-next v2 2/6] bpf: enable detaching links of struct_ops objects.
Date: Wed, 8 May 2024 17:36:08 -0700 [thread overview]
Message-ID: <799553a1-b916-4926-819f-c30aa6aa4d2a@linux.dev> (raw)
In-Reply-To: <7f285130-d6bf-43ea-b70a-eddc5c419d3e@gmail.com>
On 5/8/24 5:14 PM, Kui-Feng Lee wrote:
>
>
> On 5/8/24 16:22, Martin KaFai Lau wrote:
>> On 5/6/24 10:55 PM, Kui-Feng Lee wrote:
>>> Implement the detach callback in bpf_link_ops for struct_ops. The
>>> subsystems that struct_ops objects are registered to can use this callback
>>> to detach the links being passed to them.
>>
>> The user space can also use the detach. The subsystem is merely reusing the
>> similar detach callback if it stores the link during reg().
>
> Sure!
>
>>
>>>
>>> Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
>>> ---
>>> kernel/bpf/bpf_struct_ops.c | 50 ++++++++++++++++++++++++++++++++-----
>>> 1 file changed, 44 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/kernel/bpf/bpf_struct_ops.c b/kernel/bpf/bpf_struct_ops.c
>>> index 390f8c155135..bd2602982e4d 100644
>>> --- a/kernel/bpf/bpf_struct_ops.c
>>> +++ b/kernel/bpf/bpf_struct_ops.c
>>> @@ -1057,9 +1057,6 @@ static void bpf_struct_ops_map_link_dealloc(struct
>>> bpf_link *link)
>>> st_map = (struct bpf_struct_ops_map *)
>>> rcu_dereference_protected(st_link->map, true);
>>> if (st_map) {
>>> - /* st_link->map can be NULL if
>>> - * bpf_struct_ops_link_create() fails to register.
>>> - */
>>> st_map->st_ops_desc->st_ops->unreg(&st_map->kvalue.data, st_link);
>>> bpf_map_put(&st_map->map);
>>> }
>>> @@ -1075,7 +1072,8 @@ static void bpf_struct_ops_map_link_show_fdinfo(const
>>> struct bpf_link *link,
>>> st_link = container_of(link, struct bpf_struct_ops_link, link);
>>> rcu_read_lock();
>>> map = rcu_dereference(st_link->map);
>>> - seq_printf(seq, "map_id:\t%d\n", map->id);
>>> + if (map)
>>> + seq_printf(seq, "map_id:\t%d\n", map->id);
>>> rcu_read_unlock();
>>> }
>>> @@ -1088,7 +1086,8 @@ static int bpf_struct_ops_map_link_fill_link_info(const
>>> struct bpf_link *link,
>>> st_link = container_of(link, struct bpf_struct_ops_link, link);
>>> rcu_read_lock();
>>> map = rcu_dereference(st_link->map);
>>> - info->struct_ops.map_id = map->id;
>>> + if (map)
>>> + info->struct_ops.map_id = map->id;
>>> rcu_read_unlock();
>>> return 0;
>>> }
>>> @@ -1113,6 +1112,10 @@ static int bpf_struct_ops_map_link_update(struct
>>> bpf_link *link, struct bpf_map
>>> mutex_lock(&update_mutex);
>>> old_map = rcu_dereference_protected(st_link->map,
>>> lockdep_is_held(&update_mutex));
>>> + if (!old_map) {
>>> + err = -EINVAL;
>>> + goto err_out;
>>> + }
>>> if (expected_old_map && old_map != expected_old_map) {
>>> err = -EPERM;
>>> goto err_out;
>>> @@ -1139,8 +1142,37 @@ static int bpf_struct_ops_map_link_update(struct
>>> bpf_link *link, struct bpf_map
>>> return err;
>>> }
>>> +static int bpf_struct_ops_map_link_detach(struct bpf_link *link)
>>> +{
>>> + struct bpf_struct_ops_link *st_link = container_of(link, struct
>>> bpf_struct_ops_link, link);
>>> + struct bpf_struct_ops_map *st_map;
>>> + struct bpf_map *map;
>>> +
>>> + mutex_lock(&update_mutex);
>>> +
>>> + map = rcu_dereference_protected(st_link->map, true);
>>
>> nit. s/true/lockdep_is_held(&update_mutex)/
>
>
> I thought it is protected by the refcount holding by the caller.
> WDYT?
st_link->map is the one with __rcu tag and "!map" is tested next. I don't see
how these imply the map pointer is protected by refcount. Can you explain?
>
>
>>
>>> + if (!map) {
>>> + mutex_unlock(&update_mutex);
>>> + return -EINVAL;
>>> + }
>>> + st_map = container_of(map, struct bpf_struct_ops_map, map);
>>> +
>>> + st_map->st_ops_desc->st_ops->unreg(&st_map->kvalue.data, link);
>>> +
>>> + rcu_assign_pointer(st_link->map, NULL);
>>> + /* Pair with bpf_map_get() in bpf_struct_ops_link_create() or
>>> + * bpf_map_inc() in bpf_struct_ops_map_link_update().
>>> + */
>>> + bpf_map_put(&st_map->map);
>>> +
>>> + mutex_unlock(&update_mutex);
>>> +
>>> + return 0;
>>> +}
>>> +
>>> static const struct bpf_link_ops bpf_struct_ops_map_lops = {
>>> .dealloc = bpf_struct_ops_map_link_dealloc,
>>> + .detach = bpf_struct_ops_map_link_detach,
>>> .show_fdinfo = bpf_struct_ops_map_link_show_fdinfo,
>>> .fill_link_info = bpf_struct_ops_map_link_fill_link_info,
>>> .update_map = bpf_struct_ops_map_link_update,
>>> @@ -1176,13 +1208,19 @@ int bpf_struct_ops_link_create(union bpf_attr *attr)
>>> if (err)
>>> goto err_out;
>>> + /* Init link->map before calling reg() in case being detached
>>> + * immediately.
>>> + */
>>> + RCU_INIT_POINTER(link->map, map);
>>> +
>>> err = st_map->st_ops_desc->st_ops->reg(st_map->kvalue.data, &link->link);
>>> if (err) {
>>> + rcu_assign_pointer(link->map, NULL);
>>
>> nit. RCU_INIT_POINTER(link->map, NULL) is fine.
>
> Got it!
>
>>
>> There is a merge conflict with patch 4 also.
>
> What do you mean here? Do you mean the patch 4 can not be applied on top
> of the patch 2?
Please monitor the bpf CI report.
bpf CI complains:
https://patchwork.kernel.org/project/netdevbpf/patch/20240507055600.2382627-2-thinker.li@gmail.com/
snippet of the error:
Applying: bpf: enable detaching links of struct_ops objects.
Applying: bpf: support epoll from bpf struct_ops links.
Applying: selftests/bpf: test struct_ops with epoll
Patch failed at 0004 selftests/bpf: test struct_ops with epoll
>
>>
>> pw-bot: cr
>>
>>> bpf_link_cleanup(&link_primer);
>>> + /* The link has been free by bpf_link_cleanup() */
>>> link = NULL;
>>> goto err_out;
>>> }
>>> - RCU_INIT_POINTER(link->map, map);
>>> return bpf_link_settle(&link_primer);
>>
next prev parent reply other threads:[~2024-05-09 0:36 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-05-07 5:55 [PATCH bpf-next v2 0/6] Notify user space when a struct_ops object is detached/unregistered Kui-Feng Lee
2024-05-07 5:55 ` [PATCH bpf-next v2 1/6] bpf: pass bpf_struct_ops_link to callbacks in bpf_struct_ops Kui-Feng Lee
2024-05-07 5:55 ` [PATCH bpf-next v2 2/6] bpf: enable detaching links of struct_ops objects Kui-Feng Lee
2024-05-08 23:22 ` Martin KaFai Lau
2024-05-09 0:14 ` Kui-Feng Lee
2024-05-09 0:36 ` Martin KaFai Lau [this message]
2024-05-09 16:59 ` Kui-Feng Lee
2024-05-09 0:46 ` Kui-Feng Lee
2024-05-07 5:55 ` [PATCH bpf-next v2 3/6] bpf: support epoll from bpf struct_ops links Kui-Feng Lee
2024-05-07 5:55 ` [PATCH bpf-next v2 4/6] selftests/bpf: test struct_ops with epoll Kui-Feng Lee
2024-05-08 23:34 ` Martin KaFai Lau
2024-05-09 0:22 ` Kui-Feng Lee
2024-05-07 5:55 ` [PATCH bpf-next v2 5/6] selftests/bpf: detach a struct_ops link from the subsystem managing it Kui-Feng Lee
2024-05-08 23:50 ` Martin KaFai Lau
2024-05-09 5:50 ` Kui-Feng Lee
2024-05-07 5:56 ` [PATCH bpf-next v2 6/6] selftests/bpf: make sure bpf_testmod handling racing link destroying well Kui-Feng Lee
2024-05-09 0:04 ` Martin KaFai Lau
2024-05-09 17:02 ` Kui-Feng Lee
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=799553a1-b916-4926-819f-c30aa6aa4d2a@linux.dev \
--to=martin.lau@linux.dev \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=kernel-team@meta.com \
--cc=kuifeng@meta.com \
--cc=sinquersw@gmail.com \
--cc=song@kernel.org \
--cc=thinker.li@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.