* [PATCH bpf v2] bpf: do not use kmalloc_nolock when !HAVE_CMPXCHG_DOUBLE
From: Levi Zim via B4 Relay @ 2026-03-14 15:31 UTC (permalink / raw)
To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt,
Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexandre Ghiti
Cc: Amery Hung, linux-riscv, stable, bpf, linux-kernel,
linux-rt-devel, Levi Zim
From: Levi Zim <rsworktech@outlook.com>
kmalloc_nolock() always fails on architectures that lack cmpxchg16b.
For example, this causes bpf_task_storage_get with the
BPF_LOCAL_STORAGE_GET_F_CREATE flag to fail on riscv64 with the 6.19 kernel.
Fix it by enabling use_kmalloc_nolock only when HAVE_CMPXCHG_DOUBLE.
But leave the PREEMPT_RT case as is because it requires kmalloc_nolock()
for correctness. Add a comment about this limitation: an architecture's
lack of CMPXCHG_DOUBLE combined with PREEMPT_RT makes
bpf_local_storage_alloc() always fail.
Fixes: f484f4a3e058 ("bpf: Replace bpf memory allocator with kmalloc_nolock() in local storage")
Cc: stable@vger.kernel.org
Signed-off-by: Levi Zim <rsworktech@outlook.com>
---
I found that bpf_task_storage_get with the BPF_LOCAL_STORAGE_GET_F_CREATE
flag always fails for me on the 6.19 kernel on riscv64, and bisected it.
In f484f4a3e058 ("bpf: Replace bpf memory allocator with kmalloc_nolock()
in local storage"), the bpf memory allocator was replaced with
kmalloc_nolock(). This is problematic for architectures that lack
CMPXCHG_DOUBLE, because kmalloc_nolock() always returns NULL there:
In kmalloc_nolock() (kmalloc_nolock_noprof):

	if (!(s->flags & __CMPXCHG_DOUBLE) && !kmem_cache_debug(s))
		/*
		 * kmalloc_nolock() is not supported on architectures that
		 * don't implement cmpxchg16b, but debug caches don't use
		 * per-cpu slab and per-cpu partial slabs. They rely on
		 * kmem_cache_node->list_lock, so kmalloc_nolock() can
		 * attempt to allocate from debug caches by
		 * spin_trylock_irqsave(&n->list_lock, ...)
		 */
		return NULL;
Fix it by enabling use_kmalloc_nolock only when HAVE_CMPXCHG_DOUBLE.
(But not for the PREEMPT_RT case, as explained in the comment and the
commit message.)
Note for stable: this only needs to be picked up for v6.19 if the patch
makes it into 7.0.
---
Changes in v2:
- Drop the modification to the PREEMPT_RT case as it requires
kmalloc_nolock for correctness.
- Add a comment to the PREEMPT_RT case about the limitation when
HAVE_CMPXCHG_DOUBLE is not set but PREEMPT_RT is enabled.
- Link to v1: https://lore.kernel.org/r/20260314-bpf-kmalloc-nolock-v1-1-24abf3f75a9f@outlook.com
---
include/linux/bpf_local_storage.h | 2 ++
kernel/bpf/bpf_cgrp_storage.c | 2 +-
kernel/bpf/bpf_local_storage.c | 4 ++++
kernel/bpf/bpf_task_storage.c | 2 +-
4 files changed, 8 insertions(+), 2 deletions(-)
diff --git a/include/linux/bpf_local_storage.h b/include/linux/bpf_local_storage.h
index 8157e8da61d40..a7ae5dde15bcb 100644
--- a/include/linux/bpf_local_storage.h
+++ b/include/linux/bpf_local_storage.h
@@ -19,6 +19,8 @@
#define BPF_LOCAL_STORAGE_CACHE_SIZE 16
+static const bool KMALLOC_NOLOCK_SUPPORTED = IS_ENABLED(CONFIG_HAVE_CMPXCHG_DOUBLE);
+
struct bpf_local_storage_map_bucket {
struct hlist_head list;
rqspinlock_t lock;
diff --git a/kernel/bpf/bpf_cgrp_storage.c b/kernel/bpf/bpf_cgrp_storage.c
index c2a2ead1f466d..09c557e426968 100644
--- a/kernel/bpf/bpf_cgrp_storage.c
+++ b/kernel/bpf/bpf_cgrp_storage.c
@@ -114,7 +114,7 @@ static int notsupp_get_next_key(struct bpf_map *map, void *key, void *next_key)
static struct bpf_map *cgroup_storage_map_alloc(union bpf_attr *attr)
{
- return bpf_local_storage_map_alloc(attr, &cgroup_cache, true);
+ return bpf_local_storage_map_alloc(attr, &cgroup_cache, KMALLOC_NOLOCK_SUPPORTED);
}
static void cgroup_storage_map_free(struct bpf_map *map)
diff --git a/kernel/bpf/bpf_local_storage.c b/kernel/bpf/bpf_local_storage.c
index 9c96a4477f81a..a6c240da87668 100644
--- a/kernel/bpf/bpf_local_storage.c
+++ b/kernel/bpf/bpf_local_storage.c
@@ -893,6 +893,10 @@ bpf_local_storage_map_alloc(union bpf_attr *attr,
/* In PREEMPT_RT, kmalloc(GFP_ATOMIC) is still not safe in non
* preemptible context. Thus, enforce all storages to use
* kmalloc_nolock() when CONFIG_PREEMPT_RT is enabled.
+ *
+ * However, kmalloc_nolock would fail on architectures that do not
+ * have CMPXCHG_DOUBLE. On such architectures with PREEMPT_RT,
+ * bpf_local_storage_alloc would always fail.
*/
smap->use_kmalloc_nolock = IS_ENABLED(CONFIG_PREEMPT_RT) ? true : use_kmalloc_nolock;
diff --git a/kernel/bpf/bpf_task_storage.c b/kernel/bpf/bpf_task_storage.c
index 605506792b5b4..9f74fd1ef7f46 100644
--- a/kernel/bpf/bpf_task_storage.c
+++ b/kernel/bpf/bpf_task_storage.c
@@ -212,7 +212,7 @@ static int notsupp_get_next_key(struct bpf_map *map, void *key, void *next_key)
static struct bpf_map *task_storage_map_alloc(union bpf_attr *attr)
{
- return bpf_local_storage_map_alloc(attr, &task_cache, true);
+ return bpf_local_storage_map_alloc(attr, &task_cache, KMALLOC_NOLOCK_SUPPORTED);
}
static void task_storage_map_free(struct bpf_map *map)
---
base-commit: e06e6b8001233241eb5b2e2791162f0585f50f4b
change-id: 20260314-bpf-kmalloc-nolock-60da80e613de
Best regards,
--
Levi Zim <rsworktech@outlook.com>