From: Martin KaFai Lau <martin.lau@linux.dev>
To: bpf@vger.kernel.org
Cc: Alexei Starovoitov <ast@kernel.org>,
Andrii Nakryiko <andrii@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Kui-Feng Lee <thinker.li@gmail.com>,
kernel-team@meta.com
Subject: [PATCH v5 bpf-next 05/12] bpf: Postpone bpf_obj_free_fields to the rcu callback
Date: Mon, 14 Oct 2024 17:49:55 -0700
Message-ID: <20241015005008.767267-6-martin.lau@linux.dev>
In-Reply-To: <20241015005008.767267-1-martin.lau@linux.dev>
From: Martin KaFai Lau <martin.lau@kernel.org>
A later patch will enable uptr usage in the task_local_storage map.
A uptr may still be in use by a bpf prog, so unpin_user_page() must
not run until the rcu tasks trace gp has passed.
bpf_obj_free_fields() is the function that will call
unpin_user_page(), so this patch postpones calling
bpf_obj_free_fields() to the rcu callback.
bpf_obj_free_fields() only needs to be done in the rcu callback
when smap->bpf_ma==true and reuse_now==false.
smap->bpf_ma==true is required because uptr will only be enabled
in task storage, which has already been moved to bpf_mem_alloc.
The smap->bpf_ma==false case can also be supported in the future
if there is a need.
reuse_now==false when the selem (aka storage) is deleted
by a bpf prog (bpf_task_storage_delete) or by the syscall delete_elem().
In both cases, bpf_obj_free_fields() needs to wait for
the rcu gp.
A few words on reuse_now==true. reuse_now==true when the
storage's owner (i.e. the task_struct) is being destroyed or the map
itself is doing map_free(). In both cases, no bpf prog should
have a hold on the selem and its uptrs, so there is no need to
postpone bpf_obj_free_fields(). reuse_now==true should be the
common case for local storage usage, where the storage exists
throughout the lifetime of its owner (task_struct).
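
The cases described above can be summarized with a small userspace model.
This is only an illustrative sketch, not kernel code: struct model_selem,
model_free_fields(), model_selem_free() and model_rcu_callback() are
hypothetical names that stand in for the selem, bpf_obj_free_fields(),
bpf_selem_free() and bpf_selem_free_rcu(). It assumes the only thing that
matters is whether the fields are freed immediately or in the deferred
callback.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: none of these names exist in the kernel. */
struct model_selem {
	bool fields_freed;	/* bpf_obj_free_fields() stand-in has run */
	bool rcu_pending;	/* a deferred free has been queued */
};

/* Stand-in for bpf_obj_free_fields(); must run exactly once per selem. */
static void model_free_fields(struct model_selem *selem)
{
	assert(!selem->fields_freed);
	selem->fields_freed = true;
}

/* Collapses the branches of bpf_selem_free() into the timing decision. */
static void model_selem_free(struct model_selem *selem, bool bpf_ma,
			     bool reuse_now)
{
	if (!bpf_ma || reuse_now) {
		/*
		 * Either the map cannot have uptrs (!bpf_ma) or no bpf prog
		 * can hold the selem (reuse_now): free the fields now.
		 */
		model_free_fields(selem);
		return;
	}
	/* Deleted by a bpf prog or a syscall: wait for the grace period. */
	selem->rcu_pending = true;
}

/* Stand-in for bpf_selem_free_rcu() running after the grace period. */
static void model_rcu_callback(struct model_selem *selem)
{
	assert(selem->rcu_pending);
	selem->rcu_pending = false;
	model_free_fields(selem);
}
```

Only the (bpf_ma==true, reuse_now==false) combination leaves the fields
untouched until model_rcu_callback() runs; the other three combinations
free them right away.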
bpf_obj_free_fields() needs to use map->record. Doing
bpf_obj_free_fields() in an rcu callback requires
bpf_local_storage_map_free() to wait for rcu_barrier. An optimization
would be to wait for rcu_barrier only when the map has a uptr in
its map_value. That would require either yet another rcu callback
function or a bool in the selem to flag whether SDATA(selem)->smap
is still valid. This patch chooses to keep it simple and waits for
rcu_barrier for all maps that use bpf_mem_alloc.
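
The ordering requirement above can be illustrated with another userspace
sketch. It is a model under stated assumptions, not the kernel
implementation: struct model_map, model_queue_free(), model_barrier() and
model_map_free() are hypothetical, and the barrier is modeled as simply
running every queued callback before returning.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: these types and names are hypothetical. */
struct model_map {
	bool record_valid;	/* stands in for map->record being alive */
	int fields_freed;	/* callbacks that have used the record */
	int nr_pending;		/* callbacks queued but not yet run */
};

/* Queue a deferred free, as a call_rcu-style API would. */
static void model_queue_free(struct model_map *map)
{
	map->nr_pending++;
}

/* Run every queued callback before returning, like an rcu barrier. */
static void model_barrier(struct model_map *map)
{
	while (map->nr_pending > 0) {
		/* Each callback dereferences the record, so it must be alive. */
		assert(map->record_valid);
		map->fields_freed++;
		map->nr_pending--;
	}
}

/* The barrier must finish before the record is torn down. */
static void model_map_free(struct model_map *map)
{
	model_barrier(map);
	map->record_valid = false;
}
```

If model_map_free() invalidated the record before calling model_barrier(),
the assert in the callback loop would fire, which is the use-after-free the
barrier in bpf_local_storage_map_free() is meant to rule out.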
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
---
kernel/bpf/bpf_local_storage.c | 29 ++++++++++++++++++++++++-----
1 file changed, 24 insertions(+), 5 deletions(-)
diff --git a/kernel/bpf/bpf_local_storage.c b/kernel/bpf/bpf_local_storage.c
index 09a67dff2336..ca871be1c42d 100644
--- a/kernel/bpf/bpf_local_storage.c
+++ b/kernel/bpf/bpf_local_storage.c
@@ -209,8 +209,12 @@ static void __bpf_selem_free(struct bpf_local_storage_elem *selem,
static void bpf_selem_free_rcu(struct rcu_head *rcu)
{
struct bpf_local_storage_elem *selem;
+ struct bpf_local_storage_map *smap;
selem = container_of(rcu, struct bpf_local_storage_elem, rcu);
+ /* The bpf_local_storage_map_free will wait for rcu_barrier */
+ smap = rcu_dereference_check(SDATA(selem)->smap, 1);
+ bpf_obj_free_fields(smap->map.record, SDATA(selem)->data);
bpf_mem_cache_raw_free(selem);
}
@@ -226,16 +230,25 @@ void bpf_selem_free(struct bpf_local_storage_elem *selem,
struct bpf_local_storage_map *smap,
bool reuse_now)
{
- bpf_obj_free_fields(smap->map.record, SDATA(selem)->data);
-
if (!smap->bpf_ma) {
+ /* Only task storage has uptrs and task storage
+ * has moved to bpf_mem_alloc. Meaning smap->bpf_ma == true
+ * for task storage, so this bpf_obj_free_fields() won't unpin
+ * any uptr.
+ */
+ bpf_obj_free_fields(smap->map.record, SDATA(selem)->data);
__bpf_selem_free(selem, reuse_now);
return;
}
- if (!reuse_now) {
- call_rcu_tasks_trace(&selem->rcu, bpf_selem_free_trace_rcu);
- } else {
+ if (reuse_now) {
+ /* reuse_now == true only happens when the storage owner
+ * (e.g. task_struct) is being destructed or the map itself
+ * is being destructed (ie map_free). In both cases,
+ * no bpf prog can have a hold on the selem. It is
+ * safe to unpin the uptrs and free the selem now.
+ */
+ bpf_obj_free_fields(smap->map.record, SDATA(selem)->data);
/* Instead of using the vanilla call_rcu(),
* bpf_mem_cache_free will be able to reuse selem
* immediately.
@@ -243,7 +256,10 @@ void bpf_selem_free(struct bpf_local_storage_elem *selem,
migrate_disable();
bpf_mem_cache_free(&smap->selem_ma, selem);
migrate_enable();
+ return;
}
+
+ call_rcu_tasks_trace(&selem->rcu, bpf_selem_free_trace_rcu);
}
static void bpf_selem_free_list(struct hlist_head *list, bool reuse_now)
@@ -908,6 +924,9 @@ void bpf_local_storage_map_free(struct bpf_map *map,
synchronize_rcu();
if (smap->bpf_ma) {
+ rcu_barrier_tasks_trace();
+ if (!rcu_trace_implies_rcu_gp())
+ rcu_barrier();
bpf_mem_alloc_destroy(&smap->selem_ma);
bpf_mem_alloc_destroy(&smap->storage_ma);
}
--
2.43.5
Thread overview: 16+ messages
2024-10-15 0:49 [PATCH v5 bpf-next 00/12] Share user memory to BPF program through task storage map Martin KaFai Lau
2024-10-15 0:49 ` [PATCH v5 bpf-next 01/12] bpf: Support __uptr type tag in BTF Martin KaFai Lau
2024-10-15 0:49 ` [PATCH v5 bpf-next 02/12] bpf: Handle BPF_UPTR in verifier Martin KaFai Lau
2024-10-15 0:49 ` [PATCH v5 bpf-next 03/12] bpf: Add "bool swap_uptrs" arg to bpf_local_storage_update() and bpf_selem_alloc() Martin KaFai Lau
2024-10-15 0:49 ` [PATCH v5 bpf-next 04/12] bpf: Postpone bpf_selem_free() in bpf_selem_unlink_storage_nolock() Martin KaFai Lau
2024-10-15 0:49 ` Martin KaFai Lau [this message]
2024-10-15 0:49 ` [PATCH v5 bpf-next 06/12] bpf: Add uptr support in the map_value of the task local storage Martin KaFai Lau
2024-10-22 23:07 ` Shakeel Butt
2024-10-23 0:57 ` Shakeel Butt
2024-10-24 0:44 ` Martin KaFai Lau
2024-10-15 0:49 ` [PATCH v5 bpf-next 07/12] libbpf: define __uptr Martin KaFai Lau
2024-10-15 0:49 ` [PATCH v5 bpf-next 08/12] selftests/bpf: Some basic __uptr tests Martin KaFai Lau
2024-10-15 0:49 ` [PATCH v5 bpf-next 09/12] selftests/bpf: Test a uptr struct spanning across pages Martin KaFai Lau
2024-10-15 0:50 ` [PATCH v5 bpf-next 10/12] selftests/bpf: Add update_elem failure test for task storage uptr Martin KaFai Lau
2024-10-15 0:50 ` [PATCH v5 bpf-next 11/12] selftests/bpf: Add uptr failure verifier tests Martin KaFai Lau
2024-10-15 0:50 ` [PATCH v5 bpf-next 12/12] selftests/bpf: Create task_local_storage map with invalid uptr's struct Martin KaFai Lau