* [PATCH bpf 0/2] bpf: Copy per-CPU map value padding in copy_map_value_long()
From: Leon Hwang @ 2026-06-24 15:51 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Daniel Borkmann, John Fastabend,
Andrii Nakryiko, Eduard Zingerman, Kumar Kartikeya Dwivedi,
Martin KaFai Lau, Song Liu, Yonghong Song, Jiri Olsa,
Emil Tsalapatis, Shuah Khan, Leon Hwang, linux-kernel,
linux-kselftest, kernel-patches-bot
Sashiko reported [1]:
This is a pre-existing issue, but does iterating over per-CPU maps expose
uninitialized kernel heap memory?
When working with per-CPU maps, temporary buffers are allocated using kmalloc
without the __GFP_ZERO flag in functions like bpf_iter_init_array_map in
kernel/bpf/arraymap.c:
kernel/bpf/arraymap.c:bpf_iter_init_array_map() {
...
value_buf = kmalloc(buf_size, GFP_USER | __GFP_NOWARN);
...
}
This is also done in kernel/bpf/hashtab.c:bpf_iter_init_hash_map().
If the map contains a BTF record, bpf_obj_memcpy in include/linux/bpf.h
explicitly stops at map->value_size instead of filling the entire rounded-up
size:
include/linux/bpf.h:bpf_obj_memcpy() {
...
memcpy(dst + curr_off, src + curr_off, size - curr_off);
...
}
This fails to overwrite the padding bytes up to round_up(map->value_size, 8).
[1] https://lore.kernel.org/bpf/20260622150844.28C551F000E9@smtp.kernel.org/
===
For example,
struct map_uninit_value {
struct prog_test_ref_kfunc __kptr_untrusted *unref_ptr;
__u32 data;
} __attribute__((packed));
struct {
__uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
__type(key, int);
__type(value, struct map_uninit_value);
__uint(max_entries, 1);
} pcpu_array SEC(".maps");
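The packed value is 12 bytes, but each per-CPU slot in the kernel occupies
round_up(12, 8) = 16 bytes, so every percpu_array map element carries
4 padding bytes per CPU.
When looking up an element of the 'pcpu_array' map, the 4 padding bytes
per CPU in the temporary buffer allocated by kvmalloc() in
syscall.c:map_lookup_elem() could be exposed to user space.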
Without the fix, the selftest could fail with:
test_map_uninit_mem_exposure:FAIL:zeroed tail bytes unexpected memory mismatch
actual:
2B 2B 2B 2B
expected:
00 00 00 00
Leon Hwang (2):
bpf: Copy per-CPU map value padding in copy_map_value_long()
selftests/bpf: Verify no non-zeroed kernel heap memory exposure
include/linux/bpf.h | 4 +-
.../bpf/prog_tests/test_map_uninit.c | 68 +++++++++++++++++++
tools/testing/selftests/bpf/progs/map_kptr.c | 12 ++++
3 files changed, 82 insertions(+), 2 deletions(-)
create mode 100644 tools/testing/selftests/bpf/prog_tests/test_map_uninit.c
--
2.54.0
* [PATCH bpf 1/2] bpf: Copy per-CPU map value padding in copy_map_value_long()
From: Leon Hwang @ 2026-06-24 15:51 UTC (permalink / raw)
In the kernel, each per-CPU map element occupies
round_up(map->value_size, 8) bytes per CPU. The UAPI lookup paths copy that
rounded size for each CPU into a temporary buffer.
However, copy_map_value_long() passes 'map->value_size' to
bpf_obj_memcpy(). When the map has special fields, bpf_obj_memcpy() copies
around those fields with memcpy(), and does not copy the tail padding
between 'map->value_size' and round_up(map->value_size, 8).
The temporary UAPI lookup buffers are allocated without __GFP_ZERO. As a
result, when the per-CPU map's value size is not equal to
round_up(map->value_size, 8), UAPI LOOKUP_ELEM and its variants can return
stale heap contents from that padding to user space. The same issue
applies to bpf_iter for per-CPU maps.
Pass round_up(map->value_size, 8) to bpf_obj_memcpy() from
copy_map_value_long(), so per-CPU maps both with and without special
fields copy the entire per-CPU slot. Remove the now redundant round_up()
from bpf_obj_memcpy()'s long_memcpy path.
Fixes: 448325199f57 ("bpf: Add copy_map_value_long to copy to remote percpu memory")
Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
---
include/linux/bpf.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 7719f6528445..ba09795e0bfd 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -570,7 +570,7 @@ static inline void bpf_obj_memcpy(struct btf_record *rec,
if (IS_ERR_OR_NULL(rec)) {
if (long_memcpy)
- bpf_long_memcpy(dst, src, round_up(size, 8));
+ bpf_long_memcpy(dst, src, size);
else
memcpy(dst, src, size);
return;
@@ -593,7 +593,7 @@ static inline void copy_map_value(struct bpf_map *map, void *dst, void *src)
static inline void copy_map_value_long(struct bpf_map *map, void *dst, void *src)
{
- bpf_obj_memcpy(map->record, dst, src, map->value_size, true);
+ bpf_obj_memcpy(map->record, dst, src, round_up(map->value_size, 8), true);
}
static inline void bpf_obj_swap_uptrs(const struct btf_record *rec, void *dst, void *src)
--
2.54.0
* [PATCH bpf 2/2] selftests/bpf: Verify no non-zeroed kernel heap memory exposure
From: Leon Hwang @ 2026-06-24 15:51 UTC (permalink / raw)
When looking up an element of a per-CPU map whose value contains special
fields and whose value size is not equal to roundup(value_sz, 8), the
padding bytes of the temporary non-zeroed kernel heap buffer allocated by
kvmalloc() should not be exposed to user space.
Without the fix:
test_map_uninit_mem_exposure:FAIL:zeroed tail bytes unexpected memory mismatch
actual:
2B 2B 2B 2B
expected:
00 00 00 00
Assisted-by: Codex:gpt-5.5
Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
---
.../bpf/prog_tests/test_map_uninit.c | 68 +++++++++++++++++++
tools/testing/selftests/bpf/progs/map_kptr.c | 12 ++++
2 files changed, 80 insertions(+)
create mode 100644 tools/testing/selftests/bpf/prog_tests/test_map_uninit.c
diff --git a/tools/testing/selftests/bpf/prog_tests/test_map_uninit.c b/tools/testing/selftests/bpf/prog_tests/test_map_uninit.c
new file mode 100644
index 000000000000..d0ba2ca587b0
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/test_map_uninit.c
@@ -0,0 +1,68 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <test_progs.h>
+
+#include "map_kptr.skel.h"
+
+void test_map_uninit_mem_exposure(void)
+{
+ size_t value_sz, slot_sz, lookup_sz, tail_sz;
+ int err, key, nr_cpus, cpu, map_fd;
+ __u8 *value = NULL, *zero = NULL;
+ struct bpf_program *prog;
+ struct map_kptr *skel;
+
+ nr_cpus = libbpf_num_possible_cpus();
+ if (!ASSERT_GT(nr_cpus, 0, "libbpf_num_possible_cpus"))
+ return;
+
+ skel = map_kptr__open();
+ if (!ASSERT_OK_PTR(skel, "map_kptr__open"))
+ return;
+
+ bpf_object__for_each_program(prog, skel->obj) {
+ err = bpf_program__set_autoload(prog, false);
+ if (!ASSERT_OK(err, "bpf_program__set_autoload"))
+ goto out;
+ }
+
+ err = map_kptr__load(skel);
+ if (!ASSERT_OK(err, "map_kptr__load"))
+ goto out;
+
+ value_sz = bpf_map__value_size((skel)->maps.pcpu_array);
+ slot_sz = roundup(value_sz, 8);
+ tail_sz = slot_sz - value_sz;
+ if (!ASSERT_NEQ(tail_sz, 0, "tail_sz"))
+ goto out;
+
+ lookup_sz = slot_sz * nr_cpus;
+ map_fd = bpf_map__fd(skel->maps.pcpu_array);
+
+ value = malloc(lookup_sz);
+ zero = calloc(1, tail_sz);
+ if (!ASSERT_OK_PTR(value, "malloc value") || !ASSERT_OK_PTR(zero, "calloc zero"))
+ goto out;
+
+ key = 0;
+ memset(value, 0x2B, lookup_sz);
+ err = bpf_map_update_elem(map_fd, &key, value, BPF_ANY);
+ if (!ASSERT_OK(err, "bpf_map_update_elem"))
+ goto out;
+
+ memset(value, 0xFF, lookup_sz);
+ err = bpf_map_lookup_elem(map_fd, &key, value);
+ if (!ASSERT_OK(err, "bpf_map_lookup_elem"))
+ goto out;
+
+ for (cpu = 0; cpu < nr_cpus; cpu++) {
+ __u8 *tail = value + cpu * slot_sz + value_sz;
+
+ if (!ASSERT_MEMEQ(tail, zero, tail_sz, "zeroed tail bytes"))
+ goto out;
+ }
+
+out:
+ free(zero);
+ free(value);
+ map_kptr__destroy(skel);
+}
diff --git a/tools/testing/selftests/bpf/progs/map_kptr.c b/tools/testing/selftests/bpf/progs/map_kptr.c
index 3fbefc568e0a..0d87c97dac99 100644
--- a/tools/testing/selftests/bpf/progs/map_kptr.c
+++ b/tools/testing/selftests/bpf/progs/map_kptr.c
@@ -4,6 +4,18 @@
#include <bpf/bpf_helpers.h>
#include "../test_kmods/bpf_testmod_kfunc.h"
+struct map_uninit_value {
+ struct prog_test_ref_kfunc __kptr_untrusted *unref_ptr;
+ __u32 data;
+} __attribute__((packed));
+
+struct {
+ __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
+ __type(key, int);
+ __type(value, struct map_uninit_value);
+ __uint(max_entries, 1);
+} pcpu_array SEC(".maps");
+
struct map_value {
struct prog_test_ref_kfunc __kptr_untrusted *unref_ptr;
struct prog_test_ref_kfunc __kptr *ref_ptr;
--
2.54.0