From: "Toke Høiland-Jørgensen" <toke@kernel.org>
To: Hou Tao <houtao@huaweicloud.com>, Hou Tao <hotforest@gmail.com>,
bpf@vger.kernel.org, rcu@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Andrii Nakryiko <andrii@kernel.org>,
Martin KaFai Lau <martin.lau@linux.dev>,
Eduard Zingerman <eddyz87@gmail.com>, Song Liu <song@kernel.org>,
Yonghong Song <yonghong.song@linux.dev>,
John Fastabend <john.fastabend@gmail.com>,
KP Singh <kpsingh@kernel.org>,
Stanislav Fomichev <sdf@fomichev.me>, Hao Luo <haoluo@google.com>,
Jiri Olsa <jolsa@kernel.org>,
"Paul E . McKenney" <paulmck@kernel.org>
Subject: Re: [RESEND] [PATCH bpf-next 2/3] bpf: Overwrite the element in hash map atomically
Date: Thu, 06 Feb 2025 16:05:02 +0100
Message-ID: <8734gr3yht.fsf@toke.dk>
In-Reply-To: <cca6daf2-48f4-57b9-59a9-75578bb755b9@huaweicloud.com>

Hou Tao <houtao@huaweicloud.com> writes:

> +cc Cody Haas
>
> Sorry for the resend. I sent the reply in the HTML format.
>
> On 2/4/2025 4:28 PM, Hou Tao wrote:
>> Currently, the update of existing element in hash map involves two
>> steps:
>> 1) insert the new element at the head of the hash list
>> 2) remove the old element
>>
>> It is possible that the concurrent lookup operation may fail to find
>> either the old element or the new element if the lookup operation starts
>> before the addition and continues after the removal.
>>
>> Therefore, replacing the two-step update with an atomic update. After
>> the change, the update will be atomic in the perspective of the lookup
>> operation: it will either find the old element or the new element.
>>
>> Signed-off-by: Hou Tao <hotforest@gmail.com>
>> ---
>> kernel/bpf/hashtab.c | 14 ++++++++------
>> 1 file changed, 8 insertions(+), 6 deletions(-)
>>
>> diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
>> index 4a9eeb7aef85..a28b11ce74c6 100644
>> --- a/kernel/bpf/hashtab.c
>> +++ b/kernel/bpf/hashtab.c
>> @@ -1179,12 +1179,14 @@ static long htab_map_update_elem(struct bpf_map *map, void *key, void *value,
>>  		goto err;
>>  	}
>>
>> -	/* add new element to the head of the list, so that
>> -	 * concurrent search will find it before old elem
>> -	 */
>> -	hlist_nulls_add_head_rcu(&l_new->hash_node, head);
>> -	if (l_old) {
>> -		hlist_nulls_del_rcu(&l_old->hash_node);
>> +	if (!l_old) {
>> +		hlist_nulls_add_head_rcu(&l_new->hash_node, head);
>> +	} else {
>> +		/* Replace the old element atomically, so that
>> +		 * concurrent search will find either the new element or
>> +		 * the old element.
>> +		 */
>> +		hlist_nulls_replace_rcu(&l_new->hash_node, &l_old->hash_node);
>>
>>  		/* l_old has already been stashed in htab->extra_elems, free
>>  		 * its special fields before it is available for reuse. Also
>>
>
> After thinking about it the second time, the atomic list replacement on
> the update side is enough to make lookup operation always find the
> existing element. However, due to the immediate reuse, the lookup may
> find an unexpected value. Maybe we should disable the immediate reuse
> for specific map (e.g., htab of maps).

Hmm, in an RCU-protected data structure, reusing the memory before an
RCU grace period has elapsed is just as wrong as freeing it, isn't it?
I.e., the reuse logic should have some kind of call_rcu redirection to
be completely correct?

-Toke

Thread overview: 17+ messages
2025-02-04 8:28 [PATCH bpf-next 0/3] bpf: Overwrite the htab element atomically Hou Tao
2025-02-04 8:28 ` [PATCH bpf-next 1/3] rculist: add hlist_nulls_replace_rcu() helper Hou Tao
2025-02-04 8:28 ` [PATCH bpf-next 2/3] bpf: Overwrite the element in hash map atomically Hou Tao
2025-02-05 1:38 ` [RESEND] " Hou Tao
2025-02-06 15:05 ` Toke Høiland-Jørgensen [this message]
2025-02-08 10:16 ` Hou Tao
2025-02-26 3:24 ` Alexei Starovoitov
2025-02-26 4:05 ` Hou Tao
2025-02-26 5:42 ` Alexei Starovoitov
2025-02-26 23:17 ` Zvi Effron
2025-02-27 1:48 ` Hou Tao
2025-02-27 1:59 ` Alexei Starovoitov
2025-02-27 2:43 ` Hou Tao
2025-02-27 3:17 ` Alexei Starovoitov
2025-02-27 4:08 ` Hou Tao
2025-03-06 10:22 ` Nick Zavaritsky
2025-02-04 8:28 ` [PATCH bpf-next 3/3] selftests/bpf: Add test case for atomic htab update Hou Tao