[PATCH bpf-next v8 0/9] bpf: Introduce global percpu data

Linux Kernel Selftest development

* [PATCH bpf-next v8 0/9] bpf: Introduce global percpu data
@ 2026-06-29 15:23 Leon Hwang
  2026-06-29 15:23 ` [PATCH bpf-next v8 1/9] bpf: Drop duplicate blank lines in verifier Leon Hwang
                   ` (8 more replies)
  0 siblings, 9 replies; 19+ messages in thread
From: Leon Hwang @ 2026-06-29 15:23 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
	Song Liu, Yonghong Song, Jiri Olsa, John Fastabend,
	Quentin Monnet, Shuah Khan, Leon Hwang, linux-kernel,
	linux-kselftest, kernel-patches-bot

This patch set introduces global percpu data, similar to commit
6316f78306c1 ("Merge branch 'support-global-data'"), to reduce restrictions
in C for BPF programs.

With this enhancement, it becomes possible to define and use global percpu
variables, like the DEFINE_PER_CPU() macro in the kernel
include/linux/percpu-defs.h.

The section name for global percpu data is ".percpu". Although a one-byte
percpu variable (e.g., char run SEC(".percpu") = 0;) can trigger a crash
with Clang 17 [1], newer Clang versions do not have the issue, so users
are expected to use such small variables as global percpu data with them.

The idea stems from bpfsnoop [2], which itself was inspired by
retsnoop [3]. During testing of bpfsnoop on the v6.6 kernel, two LBR
(Last Branch Record) entries were observed related to the
bpf_get_smp_processor_id() helper.

Since commit 1ae6921009e5 ("bpf: inline bpf_get_smp_processor_id() helper"),
the bpf_get_smp_processor_id() helper has been inlined on x86_64, reducing
the overhead and consequently minimizing these two LBR records.

However, the introduction of global percpu data offers a more robust
solution. By leveraging the percpu_array map and percpu instruction,
global percpu data can be implemented intrinsically.
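
As a rough userspace model of that idea (not the kernel implementation):
the ld_imm64 immediate can be thought of as a CPU-independent offset into
the percpu area, and the percpu instruction as the step that turns it into
the address of the current CPU's copy. NR_CPUS_MODEL, VALUE_SIZE,
percpu_init_all() and percpu_resolve() are hypothetical names made up for
this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NR_CPUS_MODEL 4   /* hypothetical CPU count for this model */
#define VALUE_SIZE    16  /* hypothetical size of the .percpu value */

/* One copy of the value per CPU, like a 1-entry percpu_array map. */
static uint8_t percpu_area[NR_CPUS_MODEL][VALUE_SIZE];

/* Seed every CPU's copy from a single initial image. */
static void percpu_init_all(const uint8_t *image, size_t len)
{
	assert(len <= VALUE_SIZE);
	for (int cpu = 0; cpu < NR_CPUS_MODEL; cpu++)
		memcpy(percpu_area[cpu], image, len);
}

/*
 * Model of the percpu step: map a CPU-independent offset to the
 * address inside the given CPU's copy, or NULL if out of range.
 */
static void *percpu_resolve(int cpu, uint32_t off)
{
	if (cpu < 0 || cpu >= NR_CPUS_MODEL || off >= VALUE_SIZE)
		return NULL;
	return &percpu_area[cpu][off];
}
```

In this model, a write through percpu_resolve(1, off) changes only CPU 1's
copy, while every other CPU still sees the value from the initial image.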

This feature also facilitates sharing percpu information between tail
callers and callees or between freplace callers and callees through a
shared global percpu variable. Previously, this was achieved using a
1-entry percpu_array map, which this patch set aims to improve upon.

Links:
[1] https://lore.kernel.org/bpf/fd1b3f58-c27f-403d-ad99-644b7d06ecb3@linux.dev/
[2] https://github.com/bpfsnoop/bpfsnoop
[3] https://github.com/anakryiko/retsnoop

Changes:
v7 -> v8:
* Send patches #1 and #2, which fix interpreter fallback issues,
  separately. (Andrii)
* Use 'array->elem_size' to avoid 'range' local variable in
  percpu_array_map_direct_value_meta(). (Andrii)
* Keep original map name for percpu data's map in libbpf. (Andrii)
* Factor out helper bpf_map_is_skel_data() in bpftool. (Andrii)
* Update commit message of direct access read-only percpu_array map.
  (Andrii)
* Add test to verify that it is disallowed to directly write data of
  read-only percpu_array map. (Andrii)
* Drop unused 'num_cpus' in test. (bot+bpf-ci)
* Factor out helper test_percpu_data_on_cpus() in test. (bot+bpf-ci)
* v7: https://lore.kernel.org/bpf/20260622143557.22955-1-leon.hwang@linux.dev/

v6 -> v7:
* Use tgt_endian() in bpf_gen__map_update_elem() in patch #6. (Sashiko)
* Use sizeof(args) in verifier_snprintf test in patch #10. (Sashiko)
* Drop xlated test of v6. (Alexei)
* v6: https://lore.kernel.org/bpf/20260615152646.27639-1-leon.hwang@linux.dev/

v5 -> v6:
* Prevent running user addr_space_cast and addr_percpu insns in
  interpreter. (Sashiko)
* Cast __percpu pointer to u64 with (__force unsigned long). (lkp)
* Exclude BPF_MAP_TYPE_PERCPU_ARRAY in check_mem_access() before calling
  bpf_map_direct_read(), and add a test to verify it.
  (Sashiko, bot+bpf-ci)
* Skip percpu data variables for subskeleton in bpftool. (Sashiko)
* Protect skel->percpu using mprotect(..., PROT_READ) in light skeleton.
  (Sashiko, bot+bpf-ci)
* Drop roundup() in tests. (Sashiko)
* Call test_global_percpu_data_verifier_log() without
  test__start_subtest(). (Sashiko)
* Cast insn->imm to __u64 with (__u32) in xlated test. (Sashiko)
* Check cnt using the new idx in xlated test. (Sashiko)
* v5: https://lore.kernel.org/bpf/20260608145113.65857-1-leon.hwang@linux.dev/

v4 -> v5:
* Add prog->jit_requested check to prevent running percpu data in
  interpreter in patch #1.
* Factor out verifier log tests using its own patch.
* Address comments from Alexei:
  * Move map_type check from check_mem_access() to bpf_map_direct_read()
    in patch #2.
  * Move BPF_MAP_TYPE_INSN_ARRAY map_type check from const_reg_xfer() to
    bpf_map_direct_read() in patch #2.
  * Add a test to verify that the off of xlated ldimm64 insn matches the
    off encoded in the ELF ldimm64 insn.
  * Drop patch #5 of v4.
* Address reviews from Sashiko:
  * Update commit message of patch #6 to indicate that maps.percpu->mmaped
    has been marked as read-only in libbpf.
  * Lookup elem on specified CPU using BPF_F_CPU in tests.
  * Drop unnecessary err == -EOPNOTSUPP in test.
  * Locate target field using its offset in the iter test.
* v4: https://lore.kernel.org/bpf/20260414132421.63409-1-leon.hwang@linux.dev/

v3 -> v4:
* Drop duplicate blank lines in verifier.
* Add percpu data feature probe in libbpf.
* Update percpu_array map using BPF_F_ALL_CPUS flag for lskel, if no cpu flag
  is set.
* Add two tests to verify verifier log.
* Add a test to verify mov64_percpu_reg instruction.
* Add a test to verify bpf_iter for percpu data map.
* Update percpu_array map using BPF_F_ALL_CPUS flag in libbpf
  (per Alexei and Andrii).
* Address comments from Andrii:
  * Use .percpu as section identifier.
  * Use bpf_jit_supports_percpu_insn() instead of CONFIG_SMP.
  * Drop bpf_map__is_internal_percpu() API.
  * Drop unnecessary __aligned(8) in libbpf, verified by selftest.
  * Make mmap data read-only after loading prog.
v3: https://lore.kernel.org/bpf/20250526162146.24429-1-leon.hwang@linux.dev/

v2 -> v3:
  * Use ".data..percpu" as PERCPU_DATA_SEC.
  * Address comment from Alexei:
    * Add u8, array of ints and struct { .. } vars to selftest.
v2: https://lore.kernel.org/bpf/20250213161931.46399-1-leon.hwang@linux.dev/

v1 -> v2:
  * Address comments from Andrii:
    * Use LIBBPF_MAP_PERCPU and SEC_PERCPU.
    * Reuse mmaped of libbpf's struct bpf_map for .percpu map data.
    * Set .percpu struct pointer to NULL after loading skeleton.
    * Make sure value size of .percpu map is __aligned(8).
    * Use raw_tp and opts.cpu to test global percpu variables on all CPUs.
  * Address comments from Alexei:
    * Test non-zero offset of global percpu variable.
    * Test case about BPF_PSEUDO_MAP_IDX_VALUE.
v1: https://lore.kernel.org/bpf/20250127162158.84906-1-leon.hwang@linux.dev/

rfc -> v1:
  * Address comments from Andrii:
    * Keep one image of global percpu variable for all CPUs.
    * Reject non-ARRAY map in bpf_map_direct_read(), check_reg_const_str(),
      and check_bpf_snprintf_call() in verifier.
    * Split out libbpf changes from kernel-side changes.
    * Use ".percpu" as PERCPU_DATA_SEC.
    * Use enum libbpf_map_type to distinguish BSS, DATA, RODATA and
      PERCPU_DATA.
    * Avoid using errno for checking err from libbpf_num_possible_cpus().
    * Use "map '%s': " prefix for error message.
rfc: https://lore.kernel.org/bpf/20250113152437.67196-1-leon.hwang@linux.dev/

Leon Hwang (9):
  bpf: Drop duplicate blank lines in verifier
  bpf: Introduce global percpu data
  libbpf: Probe percpu data feature
  libbpf: Add support for global percpu data
  bpftool: Generate skeleton for global percpu data
  selftests/bpf: Add tests to verify global percpu data
  selftests/bpf: Test direct reading/writing read-only percpu_array map
  selftests/bpf: Test verifier log for global percpu data
  selftests/bpf: Verify bpf_iter for global percpu data

 kernel/bpf/arraymap.c                         |  38 ++-
 kernel/bpf/const_fold.c                       |   1 -
 kernel/bpf/fixups.c                           |  32 ++
 kernel/bpf/verifier.c                         |  30 +-
 tools/bpf/bpftool/gen.c                       |  49 ++-
 tools/lib/bpf/bpf_gen_internal.h              |   3 +-
 tools/lib/bpf/features.c                      |  35 +++
 tools/lib/bpf/gen_loader.c                    |   3 +-
 tools/lib/bpf/libbpf.c                        |  57 +++-
 tools/lib/bpf/libbpf_internal.h               |   2 +
 tools/lib/bpf/skel_internal.h                 |  24 +-
 tools/testing/selftests/bpf/Makefile          |   2 +-
 .../bpf/prog_tests/global_data_init.c         | 290 ++++++++++++++++++
 .../bpf/prog_tests/global_percpu_subskel.c    |  37 +++
 .../bpf/progs/test_global_percpu_data.c       |  79 +++++
 15 files changed, 646 insertions(+), 36 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/global_percpu_subskel.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_global_percpu_data.c

--
2.54.0


* [PATCH bpf-next v8 1/9] bpf: Drop duplicate blank lines in verifier
  2026-06-29 15:23 [PATCH bpf-next v8 0/9] bpf: Introduce global percpu data Leon Hwang
@ 2026-06-29 15:23 ` Leon Hwang
  2026-06-29 15:23 ` [PATCH bpf-next v8 2/9] bpf: Introduce global percpu data Leon Hwang
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 19+ messages in thread
From: Leon Hwang @ 2026-06-29 15:23 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
	Song Liu, Yonghong Song, Jiri Olsa, John Fastabend,
	Quentin Monnet, Shuah Khan, Leon Hwang, linux-kernel,
	linux-kselftest, kernel-patches-bot

There are many adjacent blank lines in the verifier that have accumulated
over time.

Drop them for cleanup.

No functional changes intended.

Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
---
 kernel/bpf/verifier.c | 15 ---------------
 1 file changed, 15 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 25aea4271cd0..49a331c27b43 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -652,7 +652,6 @@ static void __mark_dynptr_reg(struct bpf_reg_state *reg,
 			      enum bpf_dynptr_type type,
 			      bool first_slot, int id, int parent_id);
 
-
 static void mark_dynptr_stack_regs(struct bpf_verifier_env *env,
 				   struct bpf_reg_state *sreg1,
 				   struct bpf_reg_state *sreg2,
@@ -1689,7 +1688,6 @@ static bool same_callsites(struct bpf_verifier_state *a, struct bpf_verifier_sta
 	return true;
 }
 
-
 void bpf_free_backedges(struct bpf_scc_visit *visit)
 {
 	struct bpf_scc_backedge *backedge, *next;
@@ -2309,7 +2307,6 @@ static struct bpf_verifier_state *push_async_cb(struct bpf_verifier_env *env,
 	return &elem->st;
 }
 
-
 static int cmp_subprogs(const void *a, const void *b)
 {
 	return ((struct bpf_subprog_info *)a)->start -
@@ -3331,7 +3328,6 @@ static bool is_spillable_regtype(enum bpf_reg_type type)
 	}
 }
 
-
 /* check if register is a constant scalar value */
 static bool is_reg_const(struct bpf_reg_state *reg, bool subreg32)
 {
@@ -4025,7 +4021,6 @@ static int check_stack_read(struct bpf_verifier_env *env,
 	return err;
 }
 
-
 /* check_stack_write dispatches to check_stack_write_fixed_off or
  * check_stack_write_var_off.
  *
@@ -4820,7 +4815,6 @@ static int check_sock_access(struct bpf_verifier_env *env, int insn_idx,
 		valid = false;
 	}
 
-
 	if (valid) {
 		env->insn_aux_data[insn_idx].ctx_field_size =
 			info.ctx_field_size;
@@ -6645,7 +6639,6 @@ static int check_stack_range_initialized(
 	if (err)
 		return err;
 
-
 	if (tnum_is_const(reg->var_off)) {
 		min_off = max_off = reg->var_off.value + off;
 	} else {
@@ -7359,7 +7352,6 @@ static bool is_iter_new_kfunc(struct bpf_kfunc_call_arg_meta *meta)
 	return meta->kfunc_flags & KF_ITER_NEW;
 }
 
-
 static bool is_iter_destroy_kfunc(struct bpf_kfunc_call_arg_meta *meta)
 {
 	return meta->kfunc_flags & KF_ITER_DESTROY;
@@ -11502,7 +11494,6 @@ static int process_irq_flag(struct bpf_verifier_env *env, struct bpf_reg_state *
 	return 0;
 }
 
-
 static int ref_set_non_owning(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
 {
 	struct btf_record *rec = reg_btf_record(reg);
@@ -16394,7 +16385,6 @@ static int check_ld_abs(struct bpf_verifier_env *env, struct bpf_insn *insn)
 	return 0;
 }
 
-
 static bool return_retval_range(struct bpf_verifier_env *env, struct bpf_retval_range *range)
 {
 	enum bpf_prog_type prog_type = resolve_prog_type(env->prog);
@@ -18288,8 +18278,6 @@ static void release_insn_arrays(struct bpf_verifier_env *env)
 		bpf_insn_array_release(env->insn_array_maps[i]);
 }
 
-
-
 /* The verifier does more data flow analysis than llvm and will not
  * explore branches that are dead at run time. Malicious programs can
  * have dead code too. Therefore replace all dead at-run-time code
@@ -18317,8 +18305,6 @@ static void sanitize_dead_code(struct bpf_verifier_env *env)
 	}
 }
 
-
-
 static void free_states(struct bpf_verifier_env *env)
 {
 	struct bpf_verifier_state_list *sl;
@@ -18600,7 +18586,6 @@ static int do_check_main(struct bpf_verifier_env *env)
 	return ret;
 }
 
-
 static void print_verification_stats(struct bpf_verifier_env *env)
 {
 	/* Skip over hidden subprogs which are not verified. */
-- 
2.54.0



* [PATCH bpf-next v8 2/9] bpf: Introduce global percpu data
  2026-06-29 15:23 [PATCH bpf-next v8 0/9] bpf: Introduce global percpu data Leon Hwang
  2026-06-29 15:23 ` [PATCH bpf-next v8 1/9] bpf: Drop duplicate blank lines in verifier Leon Hwang
@ 2026-06-29 15:23 ` Leon Hwang
  2026-07-01 19:31   ` Andrii Nakryiko
  2026-06-29 15:24 ` [PATCH bpf-next v8 3/9] libbpf: Probe percpu data feature Leon Hwang
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 19+ messages in thread
From: Leon Hwang @ 2026-06-29 15:23 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
	Song Liu, Yonghong Song, Jiri Olsa, John Fastabend,
	Quentin Monnet, Shuah Khan, Leon Hwang, linux-kernel,
	linux-kselftest, kernel-patches-bot

Introduce global percpu data, inspired by the commit
6316f78306c1 ("Merge branch 'support-global-data'"). It enables the
definition of global percpu variables in BPF, similar to the
include/linux/percpu-defs.h::DEFINE_PER_CPU() macro.

For example, a BPF program can define a global percpu variable like:

int data SEC(".percpu");

With this patch, tools like retsnoop [1] and bpfsnoop [2] can simplify
their BPF code for handling LBRs. The code can be updated from

static struct perf_branch_entry lbrs[1][MAX_LBR_ENTRIES] SEC(".data.lbrs");

to

static struct perf_branch_entry lbrs[MAX_LBR_ENTRIES] SEC(".percpu.lbrs");

This eliminates the need to retrieve the CPU ID using the
bpf_get_smp_processor_id() helper.

Additionally, by reusing global percpu data map, sharing information
between tail callers and callees or freplace callers and callees becomes
simpler compared to reusing percpu_array maps.

Links:
[1] https://github.com/anakryiko/retsnoop
[2] https://github.com/bpfsnoop/bpfsnoop

Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
---
 kernel/bpf/arraymap.c   | 38 ++++++++++++++++++++++++++++++++++++--
 kernel/bpf/const_fold.c |  1 -
 kernel/bpf/fixups.c     | 32 ++++++++++++++++++++++++++++++++
 kernel/bpf/verifier.c   | 15 +++++++++++++++
 4 files changed, 83 insertions(+), 3 deletions(-)

diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
index 248b4818178c..c4e9430941e5 100644
--- a/kernel/bpf/arraymap.c
+++ b/kernel/bpf/arraymap.c
@@ -259,6 +259,37 @@ static void *percpu_array_map_lookup_elem(struct bpf_map *map, void *key)
 	return this_cpu_ptr(array->pptrs[index & array->index_mask]);
 }
 
+static int percpu_array_map_direct_value_addr(const struct bpf_map *map, u64 *imm, u32 off)
+{
+	struct bpf_array *array = container_of(map, struct bpf_array, map);
+
+	if (map->max_entries != 1)
+		return -EOPNOTSUPP;
+	if (off >= map->value_size)
+		return -EINVAL;
+	if (!bpf_jit_supports_percpu_insn())
+		return -EOPNOTSUPP;
+
+	*imm = (u64)(__force unsigned long) array->pptrs[0];
+	return 0;
+}
+
+static int percpu_array_map_direct_value_meta(const struct bpf_map *map, u64 imm, u32 *off)
+{
+	struct bpf_array *array = container_of(map, struct bpf_array, map);
+	u64 base = (u64)(__force unsigned long) array->pptrs[0];
+
+	if (map->max_entries != 1)
+		return -EOPNOTSUPP;
+	if (imm < base || imm >= base + array->elem_size)
+		return -ENOENT;
+	if (!bpf_jit_supports_percpu_insn())
+		return -EOPNOTSUPP;
+
+	*off = imm - base;
+	return 0;
+}
+
 /* emit BPF instructions equivalent to C code of percpu_array_map_lookup_elem() */
 static int percpu_array_map_gen_lookup(struct bpf_map *map, struct bpf_insn *insn_buf)
 {
@@ -551,9 +582,10 @@ static int array_map_check_btf(struct bpf_map *map,
 			       const struct btf_type *key_type,
 			       const struct btf_type *value_type)
 {
-	/* One exception for keyless BTF: .bss/.data/.rodata map */
+	/* One exception for keyless BTF: .bss/.data/.rodata/.percpu map */
 	if (btf_type_is_void(key_type)) {
-		if (map->map_type != BPF_MAP_TYPE_ARRAY ||
+		if ((map->map_type != BPF_MAP_TYPE_ARRAY &&
+		     map->map_type != BPF_MAP_TYPE_PERCPU_ARRAY) ||
 		    map->max_entries != 1)
 			return -EINVAL;
 
@@ -832,6 +864,8 @@ const struct bpf_map_ops percpu_array_map_ops = {
 	.map_get_next_key = bpf_array_get_next_key,
 	.map_lookup_elem = percpu_array_map_lookup_elem,
 	.map_gen_lookup = percpu_array_map_gen_lookup,
+	.map_direct_value_addr = percpu_array_map_direct_value_addr,
+	.map_direct_value_meta = percpu_array_map_direct_value_meta,
 	.map_update_elem = array_map_update_elem,
 	.map_delete_elem = array_map_delete_elem,
 	.map_lookup_percpu_elem = percpu_array_map_lookup_percpu_elem,
diff --git a/kernel/bpf/const_fold.c b/kernel/bpf/const_fold.c
index b2a19acadb91..5787246bef30 100644
--- a/kernel/bpf/const_fold.c
+++ b/kernel/bpf/const_fold.c
@@ -182,7 +182,6 @@ static void const_reg_xfer(struct bpf_verifier_env *env, struct const_arg_info *
 		u64 val = 0;
 
 		if (!bpf_map_is_rdonly(map) || !map->ops->map_direct_value_addr ||
-		    map->map_type == BPF_MAP_TYPE_INSN_ARRAY ||
 		    off < 0 || off + size > map->value_size ||
 		    bpf_map_direct_read(map, off, size, &val, is_ldsx)) {
 			*dst = unknown;
diff --git a/kernel/bpf/fixups.c b/kernel/bpf/fixups.c
index 3cf2cc6e3ab6..4f84d087ca69 100644
--- a/kernel/bpf/fixups.c
+++ b/kernel/bpf/fixups.c
@@ -1819,6 +1819,38 @@ int bpf_do_misc_fixups(struct bpf_verifier_env *env)
 			goto next_insn;
 		}
 
+		if (env->prog->jit_requested &&
+		    bpf_jit_supports_percpu_insn() &&
+		    insn->code == (BPF_LD | BPF_IMM | BPF_DW) &&
+		    (insn->src_reg == BPF_PSEUDO_MAP_VALUE ||
+		     insn->src_reg == BPF_PSEUDO_MAP_IDX_VALUE)) {
+			struct bpf_map *map;
+
+			aux = &env->insn_aux_data[i + delta];
+			map = env->used_maps[aux->map_index];
+			if (map->map_type != BPF_MAP_TYPE_PERCPU_ARRAY)
+				goto next_insn;
+
+			/*
+			 * Reuse the original ld_imm64 insn, and add one
+			 * mov64_percpu_reg insn.
+			 */
+
+			insn_buf[0] = insn[1];
+			insn_buf[1] = BPF_MOV64_PERCPU_REG(insn->dst_reg, insn->dst_reg);
+			cnt = 2;
+
+			i++;
+			new_prog = bpf_patch_insn_data(env, i + delta, insn_buf, cnt);
+			if (!new_prog)
+				return -ENOMEM;
+
+			delta    += cnt - 1;
+			env->prog = prog = new_prog;
+			insn      = new_prog->insnsi + i + delta;
+			goto next_insn;
+		}
+
 		if (insn->code != (BPF_JMP | BPF_CALL))
 			goto next_insn;
 		if (insn->src_reg == BPF_PSEUDO_CALL)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 49a331c27b43..dbf76fa9d43d 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -5594,6 +5594,8 @@ int bpf_map_direct_read(struct bpf_map *map, int off, int size, u64 *val,
 	u64 addr;
 	int err;
 
+	if (map->map_type == BPF_MAP_TYPE_INSN_ARRAY || map->map_type == BPF_MAP_TYPE_PERCPU_ARRAY)
+		return -EINVAL;
 	err = map->ops->map_direct_value_addr(map, &addr, off);
 	if (err)
 		return err;
@@ -6149,6 +6151,7 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, struct b
 			if (tnum_is_const(reg->var_off) &&
 			    bpf_map_is_rdonly(map) &&
 			    map->ops->map_direct_value_addr &&
+			    map->map_type != BPF_MAP_TYPE_PERCPU_ARRAY &&
 			    map->map_type != BPF_MAP_TYPE_INSN_ARRAY) {
 				int map_off = off + reg->var_off.value;
 				u64 val = 0;
@@ -8117,6 +8120,12 @@ static int check_arg_const_str(struct bpf_verifier_env *env,
 		return -EACCES;
 	}
 
+	if (map->map_type == BPF_MAP_TYPE_PERCPU_ARRAY) {
+		verbose(env, "%s points to percpu_array map which cannot be used as const string\n",
+			reg_arg_name(env, argno));
+		return -EACCES;
+	}
+
 	if (!bpf_map_is_rdonly(map)) {
 		verbose(env, "%s does not point to a readonly map'\n", reg_arg_name(env, argno));
 		return -EACCES;
@@ -18203,6 +18212,12 @@ static int check_and_resolve_insns(struct bpf_verifier_env *env)
 					return -EINVAL;
 				}
 
+				if (map->map_type == BPF_MAP_TYPE_PERCPU_ARRAY &&
+				    !env->prog->jit_requested) {
+					verbose(env, "JIT is required to use global percpu data\n");
+					return -EOPNOTSUPP;
+				}
+
 				err = map->ops->map_direct_value_addr(map, &addr, off);
 				if (err) {
 					verbose(env, "invalid access to map value pointer, value_size=%u off=%u\n",
-- 
2.54.0



* Re: [PATCH bpf-next v8 2/9] bpf: Introduce global percpu data
  2026-06-29 15:23 ` [PATCH bpf-next v8 2/9] bpf: Introduce global percpu data Leon Hwang
@ 2026-07-01 19:31   ` Andrii Nakryiko
  2026-07-02  6:15     ` Leon Hwang
  0 siblings, 1 reply; 19+ messages in thread
From: Andrii Nakryiko @ 2026-07-01 19:31 UTC (permalink / raw)
  To: Leon Hwang
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
	Song Liu, Yonghong Song, Jiri Olsa, John Fastabend,
	Quentin Monnet, Shuah Khan, linux-kernel, linux-kselftest,
	kernel-patches-bot

On Mon, Jun 29, 2026 at 8:24 AM Leon Hwang <leon.hwang@linux.dev> wrote:
>
> Introduce global percpu data, inspired by the commit
> 6316f78306c1 ("Merge branch 'support-global-data'"). It enables the
> definition of global percpu variables in BPF, similar to the
> include/linux/percpu-defs.h::DEFINE_PER_CPU() macro.
>
> For example, a BPF program can define a global percpu variable like:
>
> int data SEC(".percpu");
>
> With this patch, tools like retsnoop [1] and bpfsnoop [2] can simplify
> their BPF code for handling LBRs. The code can be updated from
>
> static struct perf_branch_entry lbrs[1][MAX_LBR_ENTRIES] SEC(".data.lbrs");
>
> to
>
> static struct perf_branch_entry lbrs[MAX_LBR_ENTRIES] SEC(".percpu.lbrs");
>
> This eliminates the need to retrieve the CPU ID using the
> bpf_get_smp_processor_id() helper.
>
> Additionally, by reusing global percpu data map, sharing information
> between tail callers and callees or freplace callers and callees becomes
> simpler compared to reusing percpu_array maps.
>
> Links:
> [1] https://github.com/anakryiko/retsnoop
> [2] https://github.com/bpfsnoop/bpfsnoop
>
> Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
> ---
>  kernel/bpf/arraymap.c   | 38 ++++++++++++++++++++++++++++++++++++--
>  kernel/bpf/const_fold.c |  1 -
>  kernel/bpf/fixups.c     | 32 ++++++++++++++++++++++++++++++++
>  kernel/bpf/verifier.c   | 15 +++++++++++++++
>  4 files changed, 83 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
> index 248b4818178c..c4e9430941e5 100644
> --- a/kernel/bpf/arraymap.c
> +++ b/kernel/bpf/arraymap.c
> @@ -259,6 +259,37 @@ static void *percpu_array_map_lookup_elem(struct bpf_map *map, void *key)
>         return this_cpu_ptr(array->pptrs[index & array->index_mask]);
>  }
>
> +static int percpu_array_map_direct_value_addr(const struct bpf_map *map, u64 *imm, u32 off)
> +{
> +       struct bpf_array *array = container_of(map, struct bpf_array, map);
> +
> +       if (map->max_entries != 1)
> +               return -EOPNOTSUPP;
> +       if (off >= map->value_size)
> +               return -EINVAL;
> +       if (!bpf_jit_supports_percpu_insn())
> +               return -EOPNOTSUPP;
> +
> +       *imm = (u64)(__force unsigned long) array->pptrs[0];
> +       return 0;
> +}
> +
> +static int percpu_array_map_direct_value_meta(const struct bpf_map *map, u64 imm, u32 *off)
> +{
> +       struct bpf_array *array = container_of(map, struct bpf_array, map);
> +       u64 base = (u64)(__force unsigned long) array->pptrs[0];
> +
> +       if (map->max_entries != 1)
> +               return -EOPNOTSUPP;
> +       if (imm < base || imm >= base + array->elem_size)
> +               return -ENOENT;
> +       if (!bpf_jit_supports_percpu_insn())
> +               return -EOPNOTSUPP;
> +
> +       *off = imm - base;
> +       return 0;
> +}
> +
>  /* emit BPF instructions equivalent to C code of percpu_array_map_lookup_elem() */
>  static int percpu_array_map_gen_lookup(struct bpf_map *map, struct bpf_insn *insn_buf)
>  {
> @@ -551,9 +582,10 @@ static int array_map_check_btf(struct bpf_map *map,
>                                const struct btf_type *key_type,
>                                const struct btf_type *value_type)
>  {
> -       /* One exception for keyless BTF: .bss/.data/.rodata map */
> +       /* One exception for keyless BTF: .bss/.data/.rodata/.percpu map */
>         if (btf_type_is_void(key_type)) {
> -               if (map->map_type != BPF_MAP_TYPE_ARRAY ||
> +               if ((map->map_type != BPF_MAP_TYPE_ARRAY &&
> +                    map->map_type != BPF_MAP_TYPE_PERCPU_ARRAY) ||
>                     map->max_entries != 1)
>                         return -EINVAL;
>
> @@ -832,6 +864,8 @@ const struct bpf_map_ops percpu_array_map_ops = {
>         .map_get_next_key = bpf_array_get_next_key,
>         .map_lookup_elem = percpu_array_map_lookup_elem,
>         .map_gen_lookup = percpu_array_map_gen_lookup,
> +       .map_direct_value_addr = percpu_array_map_direct_value_addr,
> +       .map_direct_value_meta = percpu_array_map_direct_value_meta,
>         .map_update_elem = array_map_update_elem,
>         .map_delete_elem = array_map_delete_elem,
>         .map_lookup_percpu_elem = percpu_array_map_lookup_percpu_elem,
> diff --git a/kernel/bpf/const_fold.c b/kernel/bpf/const_fold.c
> index b2a19acadb91..5787246bef30 100644
> --- a/kernel/bpf/const_fold.c
> +++ b/kernel/bpf/const_fold.c
> @@ -182,7 +182,6 @@ static void const_reg_xfer(struct bpf_verifier_env *env, struct const_arg_info *
>                 u64 val = 0;
>
>                 if (!bpf_map_is_rdonly(map) || !map->ops->map_direct_value_addr ||
> -                   map->map_type == BPF_MAP_TYPE_INSN_ARRAY ||
>                     off < 0 || off + size > map->value_size ||
>                     bpf_map_direct_read(map, off, size, &val, is_ldsx)) {
>                         *dst = unknown;
> diff --git a/kernel/bpf/fixups.c b/kernel/bpf/fixups.c
> index 3cf2cc6e3ab6..4f84d087ca69 100644
> --- a/kernel/bpf/fixups.c
> +++ b/kernel/bpf/fixups.c
> @@ -1819,6 +1819,38 @@ int bpf_do_misc_fixups(struct bpf_verifier_env *env)
>                         goto next_insn;
>                 }
>
> +               if (env->prog->jit_requested &&
> +                   bpf_jit_supports_percpu_insn() &&
> +                   insn->code == (BPF_LD | BPF_IMM | BPF_DW) &&
> +                   (insn->src_reg == BPF_PSEUDO_MAP_VALUE ||
> +                    insn->src_reg == BPF_PSEUDO_MAP_IDX_VALUE)) {
> +                       struct bpf_map *map;
> +
> +                       aux = &env->insn_aux_data[i + delta];
> +                       map = env->used_maps[aux->map_index];
> +                       if (map->map_type != BPF_MAP_TYPE_PERCPU_ARRAY)
> +                               goto next_insn;
> +
> +                       /*
> +                        * Reuse the original ld_imm64 insn, and add one
> +                        * mov64_percpu_reg insn.
> +                        */
> +
> +                       insn_buf[0] = insn[1];
> +                       insn_buf[1] = BPF_MOV64_PERCPU_REG(insn->dst_reg, insn->dst_reg);
> +                       cnt = 2;
> +
> +                       i++;

oof, this was a subtle head scratcher for me.. that i++ is easy to
miss. let's update the comment to be more explicit: we are *skipping*
first half of ld_imm64, patching over second half of it with that same
half + percpu mov. All because bpf_patch_insn_data() can only replace
one 8-byte instruction, which doesn't work well for ldimm64.

Anyways, this looks correct, it just took me a bit to figure this out
and while the above comment warned me about this, it didn't really
make it any easier to figure out what's going on.
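
A userspace model of that splice, as a sketch only: patch_one() stands in
for bpf_patch_insn_data() in that it replaces exactly one slot, and
add_percpu_mov() skips the first ld_imm64 half and patches over the second
half with that same half plus a percpu mov. The struct, enum, and function
names are hypothetical, not the kernel's.

```c
#include <assert.h>
#include <string.h>

#define MAX_INSNS 16

enum op { OP_OTHER, OP_LDIMM64_LO, OP_LDIMM64_HI, OP_MOV_PERCPU };

struct insn {
	enum op op;
	int dst;
};

/* Replace the single insn at idx with cnt insns; return new length or -1. */
static int patch_one(struct insn *prog, int len, int idx,
		     const struct insn *buf, int cnt)
{
	if (idx < 0 || idx >= len || cnt < 1 || len + cnt - 1 > MAX_INSNS)
		return -1;
	/* Shift the tail to make room, then copy the replacement in. */
	memmove(&prog[idx + cnt], &prog[idx + 1],
		(len - idx - 1) * sizeof(*prog));
	memcpy(&prog[idx], buf, cnt * sizeof(*prog));
	return len + cnt - 1;
}

/* i is the index of the first half of the two-slot ld_imm64. */
static int add_percpu_mov(struct insn *prog, int len, int i)
{
	struct insn buf[2];

	assert(i + 1 < len && prog[i].op == OP_LDIMM64_LO);
	buf[0] = prog[i + 1];                           /* same second half */
	buf[1] = (struct insn){ OP_MOV_PERCPU, prog[i].dst };
	/* i + 1: the first half is skipped and left untouched. */
	return patch_one(prog, len, i + 1, buf, 2);
}
```

The program grows by one slot (cnt - 1), which matches the delta update
in the hunk, and the first half of the ld_imm64 never moves.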

pw-bot: cr



> +                       new_prog = bpf_patch_insn_data(env, i + delta, insn_buf, cnt);
> +                       if (!new_prog)
> +                               return -ENOMEM;
> +
> +                       delta    += cnt - 1;
> +                       env->prog = prog = new_prog;
> +                       insn      = new_prog->insnsi + i + delta;
> +                       goto next_insn;
> +               }
> +
>                 if (insn->code != (BPF_JMP | BPF_CALL))
>                         goto next_insn;
>                 if (insn->src_reg == BPF_PSEUDO_CALL)
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 49a331c27b43..dbf76fa9d43d 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -5594,6 +5594,8 @@ int bpf_map_direct_read(struct bpf_map *map, int off, int size, u64 *val,
>         u64 addr;
>         int err;
>
> +       if (map->map_type == BPF_MAP_TYPE_INSN_ARRAY || map->map_type == BPF_MAP_TYPE_PERCPU_ARRAY)
> +               return -EINVAL;
>         err = map->ops->map_direct_value_addr(map, &addr, off);
>         if (err)
>                 return err;
> @@ -6149,6 +6151,7 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, struct b
>                         if (tnum_is_const(reg->var_off) &&
>                             bpf_map_is_rdonly(map) &&
>                             map->ops->map_direct_value_addr &&
> +                           map->map_type != BPF_MAP_TYPE_PERCPU_ARRAY &&
>                             map->map_type != BPF_MAP_TYPE_INSN_ARRAY) {
>                                 int map_off = off + reg->var_off.value;
>                                 u64 val = 0;
> @@ -8117,6 +8120,12 @@ static int check_arg_const_str(struct bpf_verifier_env *env,
>                 return -EACCES;
>         }
>
> +       if (map->map_type == BPF_MAP_TYPE_PERCPU_ARRAY) {
> +               verbose(env, "%s points to percpu_array map which cannot be used as const string\n",
> +                       reg_arg_name(env, argno));
> +               return -EACCES;
> +       }
> +
>         if (!bpf_map_is_rdonly(map)) {
>                 verbose(env, "%s does not point to a readonly map'\n", reg_arg_name(env, argno));
>                 return -EACCES;
> @@ -18203,6 +18212,12 @@ static int check_and_resolve_insns(struct bpf_verifier_env *env)
>                                         return -EINVAL;
>                                 }
>
> +                               if (map->map_type == BPF_MAP_TYPE_PERCPU_ARRAY &&
> +                                   !env->prog->jit_requested) {
> +                                       verbose(env, "JIT is required to use global percpu data\n");
> +                                       return -EOPNOTSUPP;
> +                               }
> +
>                                 err = map->ops->map_direct_value_addr(map, &addr, off);
>                                 if (err) {
>                                         verbose(env, "invalid access to map value pointer, value_size=%u off=%u\n",
> --
> 2.54.0
>


* Re: [PATCH bpf-next v8 2/9] bpf: Introduce global percpu data
  2026-07-01 19:31   ` Andrii Nakryiko
@ 2026-07-02  6:15     ` Leon Hwang
  0 siblings, 0 replies; 19+ messages in thread
From: Leon Hwang @ 2026-07-02  6:15 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
	Song Liu, Yonghong Song, Jiri Olsa, John Fastabend,
	Quentin Monnet, Shuah Khan, linux-kernel, linux-kselftest,
	kernel-patches-bot

On 2/7/26 03:31, Andrii Nakryiko wrote:
> On Mon, Jun 29, 2026 at 8:24 AM Leon Hwang <leon.hwang@linux.dev> wrote:
[...]
>> diff --git a/kernel/bpf/fixups.c b/kernel/bpf/fixups.c
>> index 3cf2cc6e3ab6..4f84d087ca69 100644
>> --- a/kernel/bpf/fixups.c
>> +++ b/kernel/bpf/fixups.c
>> @@ -1819,6 +1819,38 @@ int bpf_do_misc_fixups(struct bpf_verifier_env *env)
>>                         goto next_insn;
>>                 }
>>
>> +               if (env->prog->jit_requested &&
>> +                   bpf_jit_supports_percpu_insn() &&
>> +                   insn->code == (BPF_LD | BPF_IMM | BPF_DW) &&
>> +                   (insn->src_reg == BPF_PSEUDO_MAP_VALUE ||
>> +                    insn->src_reg == BPF_PSEUDO_MAP_IDX_VALUE)) {
>> +                       struct bpf_map *map;
>> +
>> +                       aux = &env->insn_aux_data[i + delta];
>> +                       map = env->used_maps[aux->map_index];
>> +                       if (map->map_type != BPF_MAP_TYPE_PERCPU_ARRAY)
>> +                               goto next_insn;
>> +
>> +                       /*
>> +                        * Reuse the original ld_imm64 insn, and add one
>> +                        * mov64_percpu_reg insn.
>> +                        */
>> +
>> +                       insn_buf[0] = insn[1];
>> +                       insn_buf[1] = BPF_MOV64_PERCPU_REG(insn->dst_reg, insn->dst_reg);
>> +                       cnt = 2;
>> +
>> +                       i++;
> 
> oof, this was a subtle head scratcher for me.. that i++ is easy to
> miss. let's update the comment to be more explicit: we are *skipping*
> first half of ld_imm64, patching over second half of it with that same
> half + percpu mov. All because bpf_patch_insn_data() can only replace
> one 8-byte instruction, which doesn't work well for ldimm64.
> 
> Anyways, this looks correct, it just took me a bit to figure this out
> and while the above comment warned me about this, it didn't really
> make it any easier to figure out what's going on.
> 

Ack. Will update the comment.

Thanks,
Leon
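
The "skip the first half, patch over the second half" step discussed above
can be sketched as a toy model in plain C. This is not the kernel code:
struct toy_insn, toy_patch_one() and toy_fixup_percpu() are hypothetical
stand-ins for struct bpf_insn, bpf_patch_insn_data() and the fixup branch,
and they only model the index arithmetic, assuming a patch helper that can
replace exactly one instruction slot.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical opcodes for the toy model; not real BPF encodings. */
enum { TOY_LD_IMM64_LO = 1, TOY_LD_IMM64_HI, TOY_MOV_PERCPU, TOY_EXIT };

/* Hypothetical, simplified instruction with only the fields the sketch needs. */
struct toy_insn {
	int opcode;
	int dst_reg;
	int imm;
};

/*
 * Replace the single slot at index 'at' with 'cnt' instructions from 'buf'.
 * Models the "one slot in, many slots out" restriction of the patch helper.
 * Returns the new program length, or -1 on bad arguments.
 */
static int toy_patch_one(struct toy_insn *prog, int len, int cap, int at,
			 const struct toy_insn *buf, int cnt)
{
	if (at < 0 || at >= len || cnt < 1 || len + cnt - 1 > cap)
		return -1;
	/* Shift the tail to make room, then copy the replacement in. */
	memmove(&prog[at + cnt], &prog[at + 1], (len - at - 1) * sizeof(*prog));
	memcpy(&prog[at], buf, cnt * sizeof(*prog));
	return len + cnt - 1;
}

/*
 * Append a percpu mov after the two-slot ld_imm64 that starts at index 'i'.
 * The first half (slot i) is skipped untouched; the second half (slot i + 1)
 * is replaced by "same second half + percpu mov".
 */
static int toy_fixup_percpu(struct toy_insn *prog, int len, int cap, int i)
{
	struct toy_insn buf[2];

	if (i < 0 || i + 1 >= len)
		return -1;
	assert(prog[i].opcode == TOY_LD_IMM64_LO);

	buf[0] = prog[i + 1];	/* keep the second half as-is */
	buf[1] = (struct toy_insn){ TOY_MOV_PERCPU, prog[i].dst_reg, 0 };

	/* Patching at i + 1 is what the easy-to-miss i++ achieves. */
	return toy_patch_one(prog, len, cap, i + 1, buf, 2);
}
```

Patching at slot i instead would split the two halves of the ld_imm64 apart,
which is why the model advances to i + 1 before calling the patch helper.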



* [PATCH bpf-next v8 3/9] libbpf: Probe percpu data feature
  2026-06-29 15:23 [PATCH bpf-next v8 0/9] bpf: Introduce global percpu data Leon Hwang
  2026-06-29 15:23 ` [PATCH bpf-next v8 1/9] bpf: Drop duplicate blank lines in verifier Leon Hwang
  2026-06-29 15:23 ` [PATCH bpf-next v8 2/9] bpf: Introduce global percpu data Leon Hwang
@ 2026-06-29 15:24 ` Leon Hwang
  2026-06-29 15:24 ` [PATCH bpf-next v8 4/9] libbpf: Add support for global percpu data Leon Hwang
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 19+ messages in thread
From: Leon Hwang @ 2026-06-29 15:24 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
	Song Liu, Yonghong Song, Jiri Olsa, John Fastabend,
	Quentin Monnet, Shuah Khan, Leon Hwang, linux-kernel,
	linux-kselftest, kernel-patches-bot

libbpf needs a reliable way to distinguish kernels that can support
global percpu data from those that cannot.

Add a dedicated feature probe, so libbpf can make capability decisions
early and fail predictably when global percpu data is unavailable.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
---
 tools/lib/bpf/features.c        | 35 +++++++++++++++++++++++++++++++++
 tools/lib/bpf/libbpf_internal.h |  2 ++
 2 files changed, 37 insertions(+)

diff --git a/tools/lib/bpf/features.c b/tools/lib/bpf/features.c
index b7e388f99d0b..ef9581c11303 100644
--- a/tools/lib/bpf/features.c
+++ b/tools/lib/bpf/features.c
@@ -620,6 +620,38 @@ static int probe_bpf_syscall_common_attrs(int token_fd)
 	return probe_sys_bpf_ext();
 }
 
+static int probe_kern_percpu_data(int token_fd)
+{
+	struct bpf_insn insns[] = {
+		BPF_LD_MAP_VALUE(BPF_REG_1, 0, 0),
+		BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_1, 0),
+		BPF_EXIT_INSN(),
+	};
+	LIBBPF_OPTS(bpf_map_create_opts, map_opts,
+		.token_fd = token_fd,
+		.map_flags = token_fd ? BPF_F_TOKEN_FD : 0,
+	);
+	LIBBPF_OPTS(bpf_prog_load_opts, prog_opts,
+		.token_fd = token_fd,
+		.prog_flags = token_fd ? BPF_F_TOKEN_FD : 0,
+	);
+	int ret, map, insn_cnt = ARRAY_SIZE(insns);
+
+	map = bpf_map_create(BPF_MAP_TYPE_PERCPU_ARRAY, "libbpf_percpu", sizeof(int), 8, 1,
+			     &map_opts);
+	if (map < 0) {
+		pr_warn("Error in %s(): %s. Couldn't create simple percpu_array map.\n",
+			__func__, errstr(map));
+		return map;
+	}
+
+	insns[0].imm = map;
+
+	ret = bpf_prog_load(BPF_PROG_TYPE_SOCKET_FILTER, NULL, "GPL", insns, insn_cnt, &prog_opts);
+	close(map);
+	return probe_fd(ret);
+}
+
 typedef int (*feature_probe_fn)(int /* token_fd */);
 
 static struct kern_feature_cache feature_cache;
@@ -707,6 +739,9 @@ static struct kern_feature_desc {
 	[FEAT_BPF_SYSCALL_COMMON_ATTRS] = {
 		"BPF syscall common attributes support", probe_bpf_syscall_common_attrs,
 	},
+	[FEAT_PERCPU_DATA] = {
+		"kernel supports percpu data", probe_kern_percpu_data,
+	},
 };
 
 bool feat_supported(struct kern_feature_cache *cache, enum kern_feature_id feat_id)
diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h
index 04cd303fb5a8..47ae39125f68 100644
--- a/tools/lib/bpf/libbpf_internal.h
+++ b/tools/lib/bpf/libbpf_internal.h
@@ -401,6 +401,8 @@ enum kern_feature_id {
 	FEAT_BTF_LAYOUT,
 	/* Kernel supports BPF syscall common attributes */
 	FEAT_BPF_SYSCALL_COMMON_ATTRS,
+	/* Kernel supports percpu data */
+	FEAT_PERCPU_DATA,
 	__FEAT_CNT,
 };
 
-- 
2.54.0



* [PATCH bpf-next v8 4/9] libbpf: Add support for global percpu data
  2026-06-29 15:23 [PATCH bpf-next v8 0/9] bpf: Introduce global percpu data Leon Hwang
                   ` (2 preceding siblings ...)
  2026-06-29 15:24 ` [PATCH bpf-next v8 3/9] libbpf: Probe percpu data feature Leon Hwang
@ 2026-06-29 15:24 ` Leon Hwang
  2026-07-01 19:32   ` Andrii Nakryiko
  2026-06-29 15:24 ` [PATCH bpf-next v8 5/9] bpftool: Generate skeleton " Leon Hwang
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 19+ messages in thread
From: Leon Hwang @ 2026-06-29 15:24 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
	Song Liu, Yonghong Song, Jiri Olsa, John Fastabend,
	Quentin Monnet, Shuah Khan, Leon Hwang, linux-kernel,
	linux-kselftest, kernel-patches-bot

Add support for global percpu data in libbpf by adding a new ".percpu"
section, similar to ".data". It enables efficient handling of percpu
global variables in bpf programs.

When generating the loader for a lightweight skeleton, update the
percpu_array map used for global percpu data with BPF_F_ALL_CPUS, so that
a single value slot updates the values on all CPUs.

Unlike global data, the mmaped data for global percpu data is marked
read-only after the percpu_array map is populated. Users can still read the
initial percpu data after loading the prog. To update the percpu data after
loading, they have to update the percpu_array map with key=0 instead.
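
A minimal sketch of what such a post-load update has to prepare in user
space, assuming the usual per-CPU map convention that the value buffer holds
one slot per possible CPU and each slot is rounded up to 8 bytes.
percpu_slot_size() and percpu_build_value() are hypothetical helper names,
not libbpf API, and the sketch stops short of the actual syscall.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Per-CPU slot stride in a user-space value buffer: value_size rounded up to 8. */
static size_t percpu_slot_size(size_t value_size)
{
	return (value_size + 7) & ~(size_t)7;
}

/*
 * Build a buffer holding one copy of 'value' per CPU, padded to the slot
 * stride. The caller frees the result. Returns NULL on bad arguments or
 * allocation failure; on success '*buf_size' (if non-NULL) is the total size.
 */
static void *percpu_build_value(const void *value, size_t value_size,
				int ncpus, size_t *buf_size)
{
	size_t stride = percpu_slot_size(value_size);
	unsigned char *buf;
	int cpu;

	if (!value || !value_size || ncpus <= 0)
		return NULL;

	buf = calloc((size_t)ncpus, stride);
	if (!buf)
		return NULL;

	for (cpu = 0; cpu < ncpus; cpu++)
		memcpy(buf + (size_t)cpu * stride, value, value_size);

	if (buf_size)
		*buf_size = (size_t)ncpus * stride;
	return buf;
}
```

In a real program, ncpus would typically come from
libbpf_num_possible_cpus() and the buffer would be passed to
bpf_map_update_elem() with key 0. With BPF_F_ALL_CPUS, as described above
for initial population, a single value slot is enough.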

Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
---
 tools/lib/bpf/bpf_gen_internal.h |  3 +-
 tools/lib/bpf/gen_loader.c       |  3 +-
 tools/lib/bpf/libbpf.c           | 57 +++++++++++++++++++++++++++-----
 3 files changed, 53 insertions(+), 10 deletions(-)

diff --git a/tools/lib/bpf/bpf_gen_internal.h b/tools/lib/bpf/bpf_gen_internal.h
index 49af4260b8e6..5ea8383805d3 100644
--- a/tools/lib/bpf/bpf_gen_internal.h
+++ b/tools/lib/bpf/bpf_gen_internal.h
@@ -66,7 +66,8 @@ void bpf_gen__prog_load(struct bpf_gen *gen,
 			enum bpf_prog_type prog_type, const char *prog_name,
 			const char *license, struct bpf_insn *insns, size_t insn_cnt,
 			struct bpf_prog_load_opts *load_attr, int prog_idx);
-void bpf_gen__map_update_elem(struct bpf_gen *gen, int map_idx, void *value, __u32 value_size);
+void bpf_gen__map_update_elem(struct bpf_gen *gen, int map_idx, void *value, __u32 value_size,
+			      __u64 flags);
 void bpf_gen__map_freeze(struct bpf_gen *gen, int map_idx);
 void bpf_gen__record_attach_target(struct bpf_gen *gen, const char *name, enum bpf_attach_type type);
 void bpf_gen__record_extern(struct bpf_gen *gen, const char *name, bool is_weak,
diff --git a/tools/lib/bpf/gen_loader.c b/tools/lib/bpf/gen_loader.c
index c7f2d2ac7bb3..60a1204e9a26 100644
--- a/tools/lib/bpf/gen_loader.c
+++ b/tools/lib/bpf/gen_loader.c
@@ -1190,7 +1190,7 @@ void bpf_gen__prog_load(struct bpf_gen *gen,
 }
 
 void bpf_gen__map_update_elem(struct bpf_gen *gen, int map_idx, void *pvalue,
-			      __u32 value_size)
+			      __u32 value_size, __u64 flags)
 {
 	int attr_size = offsetofend(union bpf_attr, flags);
 	int map_update_attr, value, key;
@@ -1198,6 +1198,7 @@ void bpf_gen__map_update_elem(struct bpf_gen *gen, int map_idx, void *pvalue,
 	int zero = 0;
 
 	memset(&attr, 0, attr_size);
+	attr.flags = tgt_endian(flags);
 
 	value = add_data(gen, pvalue, value_size);
 	key = add_data(gen, &zero, sizeof(zero));
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 7162146280a8..6e18a1628e13 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -541,6 +541,7 @@ struct bpf_struct_ops {
 };
 
 #define DATA_SEC ".data"
+#define PERCPU_SEC ".percpu"
 #define BSS_SEC ".bss"
 #define RODATA_SEC ".rodata"
 #define KCONFIG_SEC ".kconfig"
@@ -555,6 +556,7 @@ enum libbpf_map_type {
 	LIBBPF_MAP_BSS,
 	LIBBPF_MAP_RODATA,
 	LIBBPF_MAP_KCONFIG,
+	LIBBPF_MAP_PERCPU,
 };
 
 struct bpf_map_def {
@@ -666,6 +668,7 @@ enum sec_type {
 	SEC_DATA,
 	SEC_RODATA,
 	SEC_ST_OPS,
+	SEC_PERCPU,
 };
 
 struct elf_sec_desc {
@@ -1839,6 +1842,8 @@ static size_t bpf_map_mmap_sz(const struct bpf_map *map)
 	switch (map->def.type) {
 	case BPF_MAP_TYPE_ARRAY:
 		return array_map_mmap_sz(map->def.value_size, map->def.max_entries);
+	case BPF_MAP_TYPE_PERCPU_ARRAY:
+		return map->def.value_size;
 	case BPF_MAP_TYPE_ARENA:
 		return page_sz * map->def.max_entries;
 	default:
@@ -1938,7 +1943,7 @@ static bool map_is_mmapable(struct bpf_object *obj, struct bpf_map *map)
 	struct btf_var_secinfo *vsi;
 	int i, n;
 
-	if (!map->btf_value_type_id)
+	if (!map->btf_value_type_id || map->libbpf_type == LIBBPF_MAP_PERCPU)
 		return false;
 
 	t = btf__type_by_id(obj->btf, map->btf_value_type_id);
@@ -1962,6 +1967,7 @@ static int
 bpf_object__init_internal_map(struct bpf_object *obj, enum libbpf_map_type type,
 			      const char *real_name, int sec_idx, void *data, size_t data_sz)
 {
+	bool is_percpu = type == LIBBPF_MAP_PERCPU;
 	struct bpf_map_def *def;
 	struct bpf_map *map;
 	size_t mmap_sz;
@@ -1975,7 +1981,7 @@ bpf_object__init_internal_map(struct bpf_object *obj, enum libbpf_map_type type,
 	map->sec_idx = sec_idx;
 	map->sec_offset = 0;
 	map->real_name = strdup(real_name);
-	map->name = internal_map_name(obj, real_name);
+	map->name = is_percpu ? strdup(real_name) : internal_map_name(obj, real_name);
 	if (!map->real_name || !map->name) {
 		zfree(&map->real_name);
 		zfree(&map->name);
@@ -1983,7 +1989,7 @@ bpf_object__init_internal_map(struct bpf_object *obj, enum libbpf_map_type type,
 	}
 
 	def = &map->def;
-	def->type = BPF_MAP_TYPE_ARRAY;
+	def->type = is_percpu ? BPF_MAP_TYPE_PERCPU_ARRAY : BPF_MAP_TYPE_ARRAY;
 	def->key_size = sizeof(int);
 	def->value_size = data_sz;
 	def->max_entries = 1;
@@ -1996,8 +2002,9 @@ bpf_object__init_internal_map(struct bpf_object *obj, enum libbpf_map_type type,
 	if (map_is_mmapable(obj, map))
 		def->map_flags |= BPF_F_MMAPABLE;
 
-	pr_debug("map '%s' (global data): at sec_idx %d, offset %zu, flags %x.\n",
-		 map->name, map->sec_idx, map->sec_offset, def->map_flags);
+	pr_debug("map '%s' (global %sdata): at sec_idx %d, offset %zu, flags %x.\n",
+		 map->name, is_percpu ? "percpu " : "", map->sec_idx,
+		 map->sec_offset, def->map_flags);
 
 	mmap_sz = bpf_map_mmap_sz(map);
 	map->mmaped = mmap(NULL, mmap_sz, PROT_READ | PROT_WRITE,
@@ -2057,6 +2064,13 @@ static int bpf_object__init_global_data_maps(struct bpf_object *obj)
 							    NULL,
 							    sec_desc->data->d_size);
 			break;
+		case SEC_PERCPU:
+			sec_name = elf_sec_name(obj, elf_sec_by_idx(obj, sec_idx));
+			err = bpf_object__init_internal_map(obj, LIBBPF_MAP_PERCPU,
+							    sec_name, sec_idx,
+							    sec_desc->data->d_buf,
+							    sec_desc->data->d_size);
+			break;
 		default:
 			/* skip */
 			break;
@@ -4016,6 +4030,11 @@ static int bpf_object__elf_collect(struct bpf_object *obj)
 				sec_desc->sec_type = SEC_RODATA;
 				sec_desc->shdr = sh;
 				sec_desc->data = data;
+			} else if (strcmp(name, PERCPU_SEC) == 0 ||
+				   str_has_pfx(name, PERCPU_SEC ".")) {
+				sec_desc->sec_type = SEC_PERCPU;
+				sec_desc->shdr = sh;
+				sec_desc->data = data;
 			} else if (strcmp(name, STRUCT_OPS_SEC) == 0 ||
 				   strcmp(name, STRUCT_OPS_LINK_SEC) == 0 ||
 				   strcmp(name, "?" STRUCT_OPS_SEC) == 0 ||
@@ -4544,6 +4563,7 @@ static bool bpf_object__shndx_is_data(const struct bpf_object *obj,
 	case SEC_BSS:
 	case SEC_DATA:
 	case SEC_RODATA:
+	case SEC_PERCPU:
 		return true;
 	default:
 		return false;
@@ -4569,6 +4589,8 @@ bpf_object__section_to_libbpf_map_type(const struct bpf_object *obj, int shndx)
 		return LIBBPF_MAP_DATA;
 	case SEC_RODATA:
 		return LIBBPF_MAP_RODATA;
+	case SEC_PERCPU:
+		return LIBBPF_MAP_PERCPU;
 	default:
 		return LIBBPF_MAP_UNSPEC;
 	}
@@ -4944,7 +4966,7 @@ static int map_fill_btf_type_info(struct bpf_object *obj, struct bpf_map *map)
 
 	/*
 	 * LLVM annotates global data differently in BTF, that is,
-	 * only as '.data', '.bss' or '.rodata'.
+	 * only as '.data', '.bss', '.percpu' or '.rodata'.
 	 */
 	if (!bpf_map__is_internal(map))
 		return -ENOENT;
@@ -5297,18 +5319,30 @@ static int
 bpf_object__populate_internal_map(struct bpf_object *obj, struct bpf_map *map)
 {
 	enum libbpf_map_type map_type = map->libbpf_type;
+	bool is_percpu = map_type == LIBBPF_MAP_PERCPU;
+	__u64 update_flags = 0;
 	int err, zero = 0;
 	size_t mmap_sz;
 
+	if (is_percpu) {
+		if (!obj->gen_loader && !kernel_supports(obj, FEAT_PERCPU_DATA)) {
+			pr_warn("map '%s': kernel does not support percpu data.\n",
+				bpf_map__name(map));
+			return -EOPNOTSUPP;
+		}
+
+		update_flags = BPF_F_ALL_CPUS;
+	}
+
 	if (obj->gen_loader) {
 		bpf_gen__map_update_elem(obj->gen_loader, map - obj->maps,
-					 map->mmaped, map->def.value_size);
+					 map->mmaped, map->def.value_size, update_flags);
 		if (map_type == LIBBPF_MAP_RODATA || map_type == LIBBPF_MAP_KCONFIG)
 			bpf_gen__map_freeze(obj->gen_loader, map - obj->maps);
 		return 0;
 	}
 
-	err = bpf_map_update_elem(map->fd, &zero, map->mmaped, 0);
+	err = bpf_map_update_elem(map->fd, &zero, map->mmaped, update_flags);
 	if (err) {
 		err = -errno;
 		pr_warn("map '%s': failed to set initial contents: %s\n",
@@ -5353,6 +5387,13 @@ bpf_object__populate_internal_map(struct bpf_object *obj, struct bpf_map *map)
 			return err;
 		}
 		map->mmaped = mmaped;
+	} else if (is_percpu) {
+		if (mprotect(map->mmaped, mmap_sz, PROT_READ)) {
+			err = -errno;
+			pr_warn("map '%s': failed to mprotect() contents: %s\n",
+				bpf_map__name(map), errstr(err));
+			return err;
+		}
 	} else if (map->mmaped) {
 		munmap(map->mmaped, mmap_sz);
 		map->mmaped = NULL;
-- 
2.54.0



* Re: [PATCH bpf-next v8 4/9] libbpf: Add support for global percpu data
  2026-06-29 15:24 ` [PATCH bpf-next v8 4/9] libbpf: Add support for global percpu data Leon Hwang
@ 2026-07-01 19:32   ` Andrii Nakryiko
  2026-07-02  6:16     ` Leon Hwang
  0 siblings, 1 reply; 19+ messages in thread
From: Andrii Nakryiko @ 2026-07-01 19:32 UTC (permalink / raw)
  To: Leon Hwang
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
	Song Liu, Yonghong Song, Jiri Olsa, John Fastabend,
	Quentin Monnet, Shuah Khan, linux-kernel, linux-kselftest,
	kernel-patches-bot

On Mon, Jun 29, 2026 at 8:25 AM Leon Hwang <leon.hwang@linux.dev> wrote:
>
> Add support for global percpu data in libbpf by adding a new ".percpu"
> section, similar to ".data". It enables efficient handling of percpu
> global variables in bpf programs.
>
> When generating the loader for a lightweight skeleton, update the
> percpu_array map used for global percpu data with BPF_F_ALL_CPUS, so that
> a single value slot updates the values on all CPUs.
>
> Unlike global data, the mmaped data for global percpu data is marked
> read-only after the percpu_array map is populated. Users can still read the
> initial percpu data after loading the prog. To update the percpu data after
> loading, they have to update the percpu_array map with key=0 instead.
>
> Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
> ---
>  tools/lib/bpf/bpf_gen_internal.h |  3 +-
>  tools/lib/bpf/gen_loader.c       |  3 +-
>  tools/lib/bpf/libbpf.c           | 57 +++++++++++++++++++++++++++-----
>  3 files changed, 53 insertions(+), 10 deletions(-)
>

[...]

> @@ -1975,7 +1981,7 @@ bpf_object__init_internal_map(struct bpf_object *obj, enum libbpf_map_type type,
>         map->sec_idx = sec_idx;
>         map->sec_offset = 0;
>         map->real_name = strdup(real_name);
> -       map->name = internal_map_name(obj, real_name);
> +       map->name = is_percpu ? strdup(real_name) : internal_map_name(obj, real_name);

nit: I'd probably pass type into internal_map_name() and let it handle
all this in one place, consider that for follow up

>         if (!map->real_name || !map->name) {
>                 zfree(&map->real_name);
>                 zfree(&map->name);
> @@ -1983,7 +1989,7 @@ bpf_object__init_internal_map(struct bpf_object *obj, enum libbpf_map_type type,
>         }
>
>         def = &map->def;
> -       def->type = BPF_MAP_TYPE_ARRAY;
> +       def->type = is_percpu ? BPF_MAP_TYPE_PERCPU_ARRAY : BPF_MAP_TYPE_ARRAY;
>         def->key_size = sizeof(int);
>         def->value_size = data_sz;
>         def->max_entries = 1;

[...]


* Re: [PATCH bpf-next v8 4/9] libbpf: Add support for global percpu data
  2026-07-01 19:32   ` Andrii Nakryiko
@ 2026-07-02  6:16     ` Leon Hwang
  0 siblings, 0 replies; 19+ messages in thread
From: Leon Hwang @ 2026-07-02  6:16 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
	Song Liu, Yonghong Song, Jiri Olsa, John Fastabend,
	Quentin Monnet, Shuah Khan, linux-kernel, linux-kselftest,
	kernel-patches-bot

On 2/7/26 03:32, Andrii Nakryiko wrote:
> On Mon, Jun 29, 2026 at 8:25 AM Leon Hwang <leon.hwang@linux.dev> wrote:
[...]
>> @@ -1975,7 +1981,7 @@ bpf_object__init_internal_map(struct bpf_object *obj, enum libbpf_map_type type,
>>         map->sec_idx = sec_idx;
>>         map->sec_offset = 0;
>>         map->real_name = strdup(real_name);
>> -       map->name = internal_map_name(obj, real_name);
>> +       map->name = is_percpu ? strdup(real_name) : internal_map_name(obj, real_name);
> 
> nit: I'd probably pass type into internal_map_name() and let it handle
> all this in one place, consider that for follow up
> 
hmm, it is a true issue that 'map->name' must have the tail '\0'. And
strdup(real_name) does not guarantee the tail '\0'.

Will pass type into internal_map_name() instead.

Thanks,
Leon



* [PATCH bpf-next v8 5/9] bpftool: Generate skeleton for global percpu data
  2026-06-29 15:23 [PATCH bpf-next v8 0/9] bpf: Introduce global percpu data Leon Hwang
                   ` (3 preceding siblings ...)
  2026-06-29 15:24 ` [PATCH bpf-next v8 4/9] libbpf: Add support for global percpu data Leon Hwang
@ 2026-06-29 15:24 ` Leon Hwang
  2026-07-01 16:49   ` Quentin Monnet
  2026-06-29 15:24 ` [PATCH bpf-next v8 6/9] selftests/bpf: Add tests to verify " Leon Hwang
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 19+ messages in thread
From: Leon Hwang @ 2026-06-29 15:24 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
	Song Liu, Yonghong Song, Jiri Olsa, John Fastabend,
	Quentin Monnet, Shuah Khan, Leon Hwang, linux-kernel,
	linux-kselftest, kernel-patches-bot

Enhance bpftool to generate skeletons that properly handle global percpu
variables. The generated skeleton now includes a dedicated structure for
percpu data, allowing users to initialize and access percpu variables more
efficiently.

For global percpu variables, the skeleton now includes a nested
structure, e.g.:

struct test_global_percpu_data {
	struct bpf_object_skeleton *skeleton;
	struct bpf_object *obj;
	struct {
		struct bpf_map *percpu;
	} maps;
	// ...
	struct test_global_percpu_data__percpu {
		int data;
		char run;
		struct {
			char set;
			int i;
			int nums[7];
		} struct_data;
		int nums[7];
	} *percpu;

	// ...
};

  * The "struct test_global_percpu_data__percpu *percpu" points to
    initialized data, which is actually "maps.percpu->mmaped".
  * Before loading the skeleton, updating the
    "struct test_global_percpu_data__percpu *percpu" modifies the initial
    value of the corresponding global percpu variables.
  * After loading the skeleton, "maps.percpu->mmaped" has been marked as
    read-only in libbpf. If users want to update the global percpu
    variables, they have to update the "maps.percpu" map instead.
  * For a lightweight skeleton, "lskel->percpu" will be protected by
    "mprotect(p, sz, PROT_READ)".
  * For a subskeleton, global percpu variables will be skipped.
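
The write-before-load, read-only-after-load behaviour in the list above can
be illustrated with a small user-space sketch. It assumes an anonymous
mapping standing in for the skeleton's initial-value image;
init_image_alloc() and init_image_protect() are hypothetical names, not
libbpf or bpftool API.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Allocate a writable, page-aligned image for the initial values. */
static void *init_image_alloc(size_t sz)
{
	void *p = mmap(NULL, sz, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	return p == MAP_FAILED ? NULL : p;
}

/*
 * Drop write permission once the values have been handed to the map.
 * Reads keep working; a later write would fault.
 */
static int init_image_protect(void *p, size_t sz)
{
	if (mprotect(p, sz, PROT_READ))
		return -errno;
	return 0;
}
```

After init_image_protect() succeeds, the image still shows the initial
values, but any later change has to go through the map itself.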

Assisted-by: Codex:gpt-5.5-xhigh
Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
---
 tools/bpf/bpftool/gen.c       | 49 +++++++++++++++++++++++++++++++----
 tools/lib/bpf/skel_internal.h | 24 +++++++++++++++--
 2 files changed, 66 insertions(+), 7 deletions(-)

diff --git a/tools/bpf/bpftool/gen.c b/tools/bpf/bpftool/gen.c
index 6ae7262ebe0c..2e60296358db 100644
--- a/tools/bpf/bpftool/gen.c
+++ b/tools/bpf/bpftool/gen.c
@@ -92,7 +92,7 @@ static void get_header_guard(char *guard, const char *obj_name, const char *suff
 
 static bool get_map_ident(const struct bpf_map *map, char *buf, size_t buf_sz)
 {
-	static const char *sfxs[] = { ".data", ".rodata", ".bss", ".kconfig" };
+	static const char *sfxs[] = { ".data", ".rodata", ".bss", ".percpu", ".kconfig" };
 	const char *name = bpf_map__name(map);
 	int i, n;
 
@@ -117,7 +117,7 @@ static bool get_map_ident(const struct bpf_map *map, char *buf, size_t buf_sz)
 
 static bool get_datasec_ident(const char *sec_name, char *buf, size_t buf_sz)
 {
-	static const char *pfxs[] = { ".data", ".rodata", ".bss", ".kconfig" };
+	static const char *pfxs[] = { ".data", ".rodata", ".bss", ".percpu", ".kconfig" };
 	int i, n;
 
 	/* recognize hard coded LLVM section name */
@@ -254,6 +254,20 @@ static const struct btf_type *find_type_for_map(struct btf *btf, const char *map
 	return NULL;
 }
 
+static bool bpf_map_is_skel_data(const struct bpf_map *map)
+{
+	if (!bpf_map__is_internal(map))
+		return false;
+
+	if (bpf_map__map_flags(map) & BPF_F_MMAPABLE)
+		return true;
+
+	if (bpf_map__type(map) == BPF_MAP_TYPE_PERCPU_ARRAY)
+		return true;
+
+	return false;
+}
+
 static bool is_mmapable_map(const struct bpf_map *map, char *buf, size_t sz)
 {
 	size_t tmp_sz;
@@ -263,7 +277,7 @@ static bool is_mmapable_map(const struct bpf_map *map, char *buf, size_t sz)
 		return true;
 	}
 
-	if (!bpf_map__is_internal(map) || !(bpf_map__map_flags(map) & BPF_F_MMAPABLE))
+	if (!bpf_map_is_skel_data(map))
 		return false;
 
 	if (!get_map_ident(map, buf, sz))
@@ -321,6 +335,11 @@ static bool btf_is_ptr_to_func_proto(const struct btf *btf,
 	return btf_is_ptr(v) && btf_is_func_proto(btf__type_by_id(btf, v->type));
 }
 
+static bool bpf_map_is_percpu_data(const struct bpf_map *map)
+{
+	return bpf_map__is_internal(map) && bpf_map__type(map) == BPF_MAP_TYPE_PERCPU_ARRAY;
+}
+
 static int codegen_subskel_datasecs(struct bpf_object *obj, const char *obj_name)
 {
 	struct btf *btf = bpf_object__btf(obj);
@@ -343,6 +362,9 @@ static int codegen_subskel_datasecs(struct bpf_object *obj, const char *obj_name
 		if (!is_mmapable_map(map, map_ident, sizeof(map_ident)))
 			continue;
 
+		if (bpf_map_is_percpu_data(map))
+			continue;
+
 		sec = find_type_for_map(btf, map_ident);
 		if (!sec)
 			continue;
@@ -668,8 +690,7 @@ static void codegen_destroy(struct bpf_object *obj, const char *obj_name)
 	bpf_object__for_each_map(map, obj) {
 		if (!get_map_ident(map, ident, sizeof(ident)))
 			continue;
-		if (bpf_map__is_internal(map) &&
-		    (bpf_map__map_flags(map) & BPF_F_MMAPABLE))
+		if (bpf_map_is_skel_data(map))
 			printf("\tskel_free_map_data(skel->%1$s, skel->maps.%1$s.initial_value, %2$zu);\n",
 			       ident, bpf_map_mmap_sz(map));
 		codegen("\
@@ -850,6 +871,20 @@ static int gen_trace(struct bpf_object *obj, const char *obj_name, const char *h
 		if (!is_mmapable_map(map, ident, sizeof(ident)))
 			continue;
 
+		if (bpf_map_is_percpu_data(map)) {
+			codegen("\
+		\n\
+			err = skel_protect_map_data(skel->%1$s, &skel->maps.%1$s.initial_value, %2$zd);\n\
+			if (err)					    \n\
+				return err;				    \n\
+		#ifdef __KERNEL__					    \n\
+			skel->%1$s = NULL;				    \n\
+		#endif							    \n\
+			",
+			ident, bpf_map_mmap_sz(map));
+			continue;
+		}
+
 		if (bpf_map__map_flags(map) & BPF_F_RDONLY_PROG)
 			mmap_flags = "PROT_READ";
 		else
@@ -1740,6 +1775,8 @@ static int do_subskeleton(int argc, char **argv)
 
 		if (!is_mmapable_map(map, ident, sizeof(ident)))
 			continue;
+		if (bpf_map_is_percpu_data(map))
+			continue;
 
 		map_type_id = bpf_map__btf_value_type_id(map);
 		if (map_type_id <= 0) {
@@ -1863,6 +1900,8 @@ static int do_subskeleton(int argc, char **argv)
 	bpf_object__for_each_map(map, obj) {
 		if (!is_mmapable_map(map, ident, sizeof(ident)))
 			continue;
+		if (bpf_map_is_percpu_data(map))
+			continue;
 
 		map_type_id = bpf_map__btf_value_type_id(map);
 		if (map_type_id <= 0)
diff --git a/tools/lib/bpf/skel_internal.h b/tools/lib/bpf/skel_internal.h
index 74503d358bc8..485b0cd17017 100644
--- a/tools/lib/bpf/skel_internal.h
+++ b/tools/lib/bpf/skel_internal.h
@@ -135,8 +135,10 @@ static inline void skel_free_map_data(void *p, __u64 addr, size_t sz)
 {
 	if (addr != ~0ULL)
 		kvfree(p);
-	/* When addr == ~0ULL the 'p' points to
-	 * ((struct bpf_array *)map)->value. See skel_finalize_map_data.
+	/*
+	 * When addr == ~0ULL the init buffer has already been released.
+	 * For skel_finalize_map_data(), 'p' points to
+	 * ((struct bpf_array *)map)->value.
 	 */
 }
 
@@ -174,6 +176,15 @@ static inline void *skel_finalize_map_data(__u64 *init_val, size_t mmap_sz, int
 	return addr;
 }
 
+static inline int skel_protect_map_data(void *p, __u64 *init_val, size_t sz)
+{
+	(void)sz;
+
+	kvfree(p);
+	*init_val = ~0ULL;
+	return 0;
+}
+
 #else
 
 static inline void *skel_alloc(size_t size)
@@ -212,6 +223,15 @@ static inline void *skel_finalize_map_data(__u64 *init_val, size_t mmap_sz, int
 		return NULL;
 	return addr;
 }
+
+static inline int skel_protect_map_data(void *p, __u64 *init_val, size_t sz)
+{
+	(void)init_val;
+
+	if (mprotect(p, sz, PROT_READ))
+		return -errno;
+	return 0;
+}
 #endif
 
 static inline int skel_closenz(int fd)
-- 
2.54.0



* Re: [PATCH bpf-next v8 5/9] bpftool: Generate skeleton for global percpu data
  2026-06-29 15:24 ` [PATCH bpf-next v8 5/9] bpftool: Generate skeleton " Leon Hwang
@ 2026-07-01 16:49   ` Quentin Monnet
  2026-07-01 19:32     ` Andrii Nakryiko
  0 siblings, 1 reply; 19+ messages in thread
From: Quentin Monnet @ 2026-07-01 16:49 UTC (permalink / raw)
  To: Leon Hwang, bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
	Song Liu, Yonghong Song, Jiri Olsa, John Fastabend, Shuah Khan,
	linux-kernel, linux-kselftest, kernel-patches-bot

2026-06-29 23:24 UTC+0800 ~ Leon Hwang <leon.hwang@linux.dev>
> Enhance bpftool to generate skeletons that properly handle global percpu
> variables. The generated skeleton now includes a dedicated structure for
> percpu data, allowing users to initialize and access percpu variables more
> efficiently.
> 
> For global percpu variables, the skeleton now includes a nested
> structure, e.g.:
> 
> struct test_global_percpu_data {
> 	struct bpf_object_skeleton *skeleton;
> 	struct bpf_object *obj;
> 	struct {
> 		struct bpf_map *percpu;
> 	} maps;
> 	// ...
> 	struct test_global_percpu_data__percpu {
> 		int data;
> 		char run;
> 		struct {
> 			char set;
> 			int i;
> 			int nums[7];
> 		} struct_data;
> 		int nums[7];
> 	} *percpu;
> 
> 	// ...
> };
> 
>   * The "struct test_global_percpu_data__percpu *percpu" points to
>     initialized data, which is actually "maps.percpu->mmaped".
>   * Before loading the skeleton, updating the
>     "struct test_global_percpu_data__percpu *percpu" modifies the initial
>     value of the corresponding global percpu variables.
>   * After loading the skeleton, "maps.percpu->mmaped" has been marked as
>     read-only in libbpf. If users want to update the global percpu
>     variables, they have to update the "maps.percpu" map instead.
>   * For a lightweight skeleton, "lskel->percpu" will be protected by
>     "mprotect(p, sz, PROT_READ)".
>   * For a subskeleton, global percpu variables will be skipped.
> 
> Assisted-by: Codex:gpt-5.5-xhigh
> Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
> ---
>  tools/bpf/bpftool/gen.c       | 49 +++++++++++++++++++++++++++++++----
>  tools/lib/bpf/skel_internal.h | 24 +++++++++++++++--
>  2 files changed, 66 insertions(+), 7 deletions(-)
> 
> diff --git a/tools/bpf/bpftool/gen.c b/tools/bpf/bpftool/gen.c
> index 6ae7262ebe0c..2e60296358db 100644
> --- a/tools/bpf/bpftool/gen.c
> +++ b/tools/bpf/bpftool/gen.c
> @@ -92,7 +92,7 @@ static void get_header_guard(char *guard, const char *obj_name, const char *suff
>  
>  static bool get_map_ident(const struct bpf_map *map, char *buf, size_t buf_sz)
>  {
> -	static const char *sfxs[] = { ".data", ".rodata", ".bss", ".kconfig" };
> +	static const char *sfxs[] = { ".data", ".rodata", ".bss", ".percpu", ".kconfig" };
>  	const char *name = bpf_map__name(map);
>  	int i, n;
>  
> @@ -117,7 +117,7 @@ static bool get_map_ident(const struct bpf_map *map, char *buf, size_t buf_sz)
>  
>  static bool get_datasec_ident(const char *sec_name, char *buf, size_t buf_sz)
>  {
> -	static const char *pfxs[] = { ".data", ".rodata", ".bss", ".kconfig" };
> +	static const char *pfxs[] = { ".data", ".rodata", ".bss", ".percpu", ".kconfig" };
>  	int i, n;
>  
>  	/* recognize hard coded LLVM section name */
> @@ -254,6 +254,20 @@ static const struct btf_type *find_type_for_map(struct btf *btf, const char *map
>  	return NULL;
>  }
>  
> +static bool bpf_map_is_skel_data(const struct bpf_map *map)
> +{
> +	if (!bpf_map__is_internal(map))
> +		return false;
> +
> +	if (bpf_map__map_flags(map) & BPF_F_MMAPABLE)
> +		return true;
> +
> +	if (bpf_map__type(map) == BPF_MAP_TYPE_PERCPU_ARRAY)
> +		return true;
> +
> +	return false;
> +}
> +
>  static bool is_mmapable_map(const struct bpf_map *map, char *buf, size_t sz)
>  {
>  	size_t tmp_sz;
> @@ -263,7 +277,7 @@ static bool is_mmapable_map(const struct bpf_map *map, char *buf, size_t sz)
>  		return true;
>  	}
>  
> -	if (!bpf_map__is_internal(map) || !(bpf_map__map_flags(map) & BPF_F_MMAPABLE))
> +	if (!bpf_map_is_skel_data(map))
>  		return false;
>  
>  	if (!get_map_ident(map, buf, sz))


Thanks! The bpftool patch looks good, with one reservation: after this
patch, I believe "is_mmapable_map(map, ...)" will return true if map is
a percpu map, although percpu maps aren't mmap-able, so we should
probably update the name of that function to avoid any confusion?

Quentin


* Re: [PATCH bpf-next v8 5/9] bpftool: Generate skeleton for global percpu data
  2026-07-01 16:49   ` Quentin Monnet
@ 2026-07-01 19:32     ` Andrii Nakryiko
  2026-07-02  6:24       ` Leon Hwang
  0 siblings, 1 reply; 19+ messages in thread
From: Andrii Nakryiko @ 2026-07-01 19:32 UTC (permalink / raw)
  To: Quentin Monnet
  Cc: Leon Hwang, bpf, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman,
	Kumar Kartikeya Dwivedi, Song Liu, Yonghong Song, Jiri Olsa,
	John Fastabend, Shuah Khan, linux-kernel, linux-kselftest,
	kernel-patches-bot

On Wed, Jul 1, 2026 at 9:49 AM Quentin Monnet <qmo@kernel.org> wrote:
>
> 2026-06-29 23:24 UTC+0800 ~ Leon Hwang <leon.hwang@linux.dev>
> > Enhance bpftool to generate skeletons that properly handle global percpu
> > variables. The generated skeleton now includes a dedicated structure for
> > percpu data, allowing users to initialize and access percpu variables more
> > efficiently.
> >
> > For global percpu variables, the skeleton now includes a nested
> > structure, e.g.:
> >
> > struct test_global_percpu_data {
> >       struct bpf_object_skeleton *skeleton;
> >       struct bpf_object *obj;
> >       struct {
> >               struct bpf_map *percpu;
> >       } maps;
> >       // ...
> >       struct test_global_percpu_data__percpu {
> >               int data;
> >               char run;
> >               struct {
> >                       char set;
> >                       int i;
> >                       int nums[7];
> >               } struct_data;
> >               int nums[7];
> >       } *percpu;
> >
> >       // ...
> > };
> >
> >   * The "struct test_global_percpu_data__percpu *percpu" points to
> >     initialized data, which is actually "maps.percpu->mmaped".
> >   * Before loading the skeleton, updating the
> >     "struct test_global_percpu_data__percpu *percpu" modifies the initial
> >     value of the corresponding global percpu variables.
> >   * After loading the skeleton, "maps.percpu->mmaped" has been marked as
> >     read-only in libbpf. If users want to update the global percpu
> >     variables, they have to update the "maps.percpu" map instead.
> >   * For lightweight skeleton, "lskel->percpu" will be protected by
> >     "mprotect(p, sz, PROT_READ)".
> >   * For subskeleton, those variables of global percpu data will be
> >     skipped.
> >
> > Assisted-by: Codex:gpt-5.5-xhigh
> > Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
> > ---
> >  tools/bpf/bpftool/gen.c       | 49 +++++++++++++++++++++++++++++++----
> >  tools/lib/bpf/skel_internal.h | 24 +++++++++++++++--
> >  2 files changed, 66 insertions(+), 7 deletions(-)
> >
> > diff --git a/tools/bpf/bpftool/gen.c b/tools/bpf/bpftool/gen.c
> > index 6ae7262ebe0c..2e60296358db 100644
> > --- a/tools/bpf/bpftool/gen.c
> > +++ b/tools/bpf/bpftool/gen.c
> > @@ -92,7 +92,7 @@ static void get_header_guard(char *guard, const char *obj_name, const char *suff
> >
> >  static bool get_map_ident(const struct bpf_map *map, char *buf, size_t buf_sz)
> >  {
> > -     static const char *sfxs[] = { ".data", ".rodata", ".bss", ".kconfig" };
> > +     static const char *sfxs[] = { ".data", ".rodata", ".bss", ".percpu", ".kconfig" };
> >       const char *name = bpf_map__name(map);
> >       int i, n;
> >
> > @@ -117,7 +117,7 @@ static bool get_map_ident(const struct bpf_map *map, char *buf, size_t buf_sz)
> >
> >  static bool get_datasec_ident(const char *sec_name, char *buf, size_t buf_sz)
> >  {
> > -     static const char *pfxs[] = { ".data", ".rodata", ".bss", ".kconfig" };
> > +     static const char *pfxs[] = { ".data", ".rodata", ".bss", ".percpu", ".kconfig" };
> >       int i, n;
> >
> >       /* recognize hard coded LLVM section name */
> > @@ -254,6 +254,20 @@ static const struct btf_type *find_type_for_map(struct btf *btf, const char *map
> >       return NULL;
> >  }
> >
> > +static bool bpf_map_is_skel_data(const struct bpf_map *map)
> > +{
> > +     if (!bpf_map__is_internal(map))
> > +             return false;
> > +
> > +     if (bpf_map__map_flags(map) & BPF_F_MMAPABLE)
> > +             return true;
> > +
> > +     if (bpf_map__type(map) == BPF_MAP_TYPE_PERCPU_ARRAY)
> > +             return true;
> > +
> > +     return false;
> > +}
> > +
> >  static bool is_mmapable_map(const struct bpf_map *map, char *buf, size_t sz)
> >  {
> >       size_t tmp_sz;
> > @@ -263,7 +277,7 @@ static bool is_mmapable_map(const struct bpf_map *map, char *buf, size_t sz)
> >               return true;
> >       }
> >
> > -     if (!bpf_map__is_internal(map) || !(bpf_map__map_flags(map) & BPF_F_MMAPABLE))
> > +     if (!bpf_map_is_skel_data(map))
> >               return false;
> >
> >       if (!get_map_ident(map, buf, sz))
>
>
> Thanks! The bpftool patch looks good, with one reservation: after this
> patch, I believe "is_mmapable_map(map, ...)" will return true if map is
> a percpu map, although percpu maps aren't mmap-able, so we should
> probably update the name of that function to avoid any confusion?
>

Great observation, Quentin!

bpf_map_is_skel_data() I think was supposed to be exactly that generic
name. But it seems like Leon went half-way through with unification.
Unless there are some subtle situations where per-cpu array shouldn't
be handled where is_mmapable_map() is handled, we should rename
is_mmapable_map() and add BPF_MAP_TYPE_PERCPU_ARRAY check (assuming
it's internal map, of course) there.
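
To make the intended end state concrete, here is a minimal standalone
model of a single unified predicate. It is only a sketch under stated
assumptions: struct sketch_map, the SKETCH_* constants, and
sketch_is_skel_data_map() are hypothetical stand-ins (not bpftool or
libbpf names), the constant values are assumed to mirror the UAPI header,
and the arena and get_map_ident() handling of the real function is left
out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for BPF_F_MMAPABLE; value assumed to match linux/bpf.h. */
#define SKETCH_F_MMAPABLE (1U << 10)

/* Stand-ins for two bpf_map_type values; numbers assumed from the UAPI enum. */
enum sketch_map_type {
	SKETCH_MAP_ARRAY = 2,
	SKETCH_MAP_PERCPU_ARRAY = 6,
};

/* Hypothetical model of the three properties the predicate looks at. */
struct sketch_map {
	bool internal;			/* models bpf_map__is_internal() */
	uint32_t flags;			/* models bpf_map__map_flags() */
	enum sketch_map_type type;	/* models bpf_map__type() */
};

/*
 * One predicate for "this internal map backs a skeleton data struct":
 * either it is mmap-able, or it is the percpu_array used for global
 * percpu data (which is not mmap-able, hence the neutral name).
 */
static bool sketch_is_skel_data_map(const struct sketch_map *map)
{
	if (!map->internal)
		return false;

	if (map->flags & SKETCH_F_MMAPABLE)
		return true;

	return map->type == SKETCH_MAP_PERCPU_ARRAY;
}
```

With this shape, an internal percpu_array and an internal mmap-able array
both pass, while user-defined maps and internal maps with neither property
do not.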

Leon, can you please check?

pw-bot: cr


> Quentin


* Re: [PATCH bpf-next v8 5/9] bpftool: Generate skeleton for global percpu data
  2026-07-01 19:32     ` Andrii Nakryiko
@ 2026-07-02  6:24       ` Leon Hwang
  2026-07-02 10:14         ` Quentin Monnet
  0 siblings, 1 reply; 19+ messages in thread
From: Leon Hwang @ 2026-07-02  6:24 UTC (permalink / raw)
  To: Andrii Nakryiko, Quentin Monnet
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
	Song Liu, Yonghong Song, Jiri Olsa, John Fastabend, Shuah Khan,
	linux-kernel, linux-kselftest, kernel-patches-bot

On 2/7/26 03:32, Andrii Nakryiko wrote:
> On Wed, Jul 1, 2026 at 9:49 AM Quentin Monnet <qmo@kernel.org> wrote:
>>
>> 2026-06-29 23:24 UTC+0800 ~ Leon Hwang <leon.hwang@linux.dev>
[...]
>>> @@ -254,6 +254,20 @@ static const struct btf_type *find_type_for_map(struct btf *btf, const char *map
>>>       return NULL;
>>>  }
>>>
>>> +static bool bpf_map_is_skel_data(const struct bpf_map *map)
>>> +{
>>> +     if (!bpf_map__is_internal(map))
>>> +             return false;
>>> +
>>> +     if (bpf_map__map_flags(map) & BPF_F_MMAPABLE)
>>> +             return true;
>>> +
>>> +     if (bpf_map__type(map) == BPF_MAP_TYPE_PERCPU_ARRAY)
>>> +             return true;
>>> +
>>> +     return false;
>>> +}
>>> +
>>>  static bool is_mmapable_map(const struct bpf_map *map, char *buf, size_t sz)
>>>  {
>>>       size_t tmp_sz;
>>> @@ -263,7 +277,7 @@ static bool is_mmapable_map(const struct bpf_map *map, char *buf, size_t sz)
>>>               return true;
>>>       }
>>>
>>> -     if (!bpf_map__is_internal(map) || !(bpf_map__map_flags(map) & BPF_F_MMAPABLE))
>>> +     if (!bpf_map_is_skel_data(map))
>>>               return false;
>>>
>>>       if (!get_map_ident(map, buf, sz))
>>
>>
>> Thanks! The bpftool patch looks good, with one reservation: after this
>> patch, I believe "is_mmapable_map(map, ...)" will return true if map is
>> a percpu map, although percpu maps aren't mmap-able, so we should
>> probably update the name of that function to avoid any confusion?
>>
> 
> Great observation, Quentin!
> 
> bpf_map_is_skel_data() I think was supposed to be exactly that generic
> name. But it seems like Leon went half-way through with unification.
> Unless there are some subtle situations where per-cpu array shouldn't
> be handled where is_mmapable_map() is handled, we should rename
> is_mmapable_map() and add BPF_MAP_TYPE_PERCPU_ARRAY check (assuming
> it's internal map, of course) there.
> 
> Leon, can you please check?
> 
Aha, my bad.

I've renamed is_mmapable_map() to bpf_map_is_skel_data(). See the patch below.

Thanks,
Leon

---

From 654306cd9091a175883dd61c7b251996c42e7cae Mon Sep 17 00:00:00 2001
From: Leon Hwang <leon.hwang@linux.dev>
Date: Mon, 29 Jun 2026 23:24:02 +0800
Subject: [PATCH bpf-next v9 5/9] bpftool: Generate skeleton for global percpu data

Enhance bpftool to generate skeletons that properly handle global percpu
variables. The generated skeleton now includes a dedicated structure for
percpu data, allowing users to initialize and access percpu variables more
efficiently.

For global percpu variables, the skeleton now includes a nested
structure, e.g.:

struct test_global_percpu_data {
	struct bpf_object_skeleton *skeleton;
	struct bpf_object *obj;
	struct {
		struct bpf_map *percpu;
	} maps;
	// ...
	struct test_global_percpu_data__percpu {
		int data;
		char run;
		struct {
			char set;
			int i;
			int nums[7];
		} struct_data;
		int nums[7];
	} *percpu;

	// ...
};

  * The "struct test_global_percpu_data__percpu *percpu" points to
    initialized data, which is actually "maps.percpu->mmaped".
  * Before loading the skeleton, updating the
    "struct test_global_percpu_data__percpu *percpu" modifies the initial
    value of the corresponding global percpu variables.
  * After loading the skeleton, "maps.percpu->mmaped" has been marked as
    read-only in libbpf. If users want to update the global percpu
    variables, they have to update the "maps.percpu" map instead.
  * For lightweight skeleton, "lskel->percpu" will be protected by
    "mprotect(p, sz, PROT_READ)".
  * For subskeleton, those variables of global percpu data will be
    skipped.
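
The user-space side of the lightweight skeleton protection described above
can be sketched as follows. This is a standalone illustration, not the
skel_internal.h code: sketch_alloc_page() and sketch_protect() are
hypothetical helpers, and an anonymous private mapping is assumed as the
initial-value buffer. It shows that the buffer stays readable after
mprotect(p, sz, PROT_READ), while later writes would fault.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: map one writable, page-aligned anonymous page. */
static void *sketch_alloc_page(size_t *sz)
{
	long page = sysconf(_SC_PAGESIZE);
	void *p;

	if (page <= 0)
		return NULL;

	p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return NULL;

	*sz = (size_t)page;
	return p;
}

/*
 * Hypothetical helper mirroring the mprotect() step: drop write access
 * and report failure as a negative errno value.
 */
static int sketch_protect(void *p, size_t sz)
{
	if (mprotect(p, sz, PROT_READ))
		return -errno;
	return 0;
}
```

A caller would fill the page before calling sketch_protect(), read it
freely afterwards, and release it with munmap() when done.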

Assisted-by: Codex:gpt-5.5-xhigh
Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
---
 tools/bpf/bpftool/gen.c       | 72 ++++++++++++++++++++++++++---------
 tools/lib/bpf/skel_internal.h | 24 +++++++++++-
 2 files changed, 76 insertions(+), 20 deletions(-)

diff --git a/tools/bpf/bpftool/gen.c b/tools/bpf/bpftool/gen.c
index 6ae7262ebe0c..798a34366e08 100644
--- a/tools/bpf/bpftool/gen.c
+++ b/tools/bpf/bpftool/gen.c
@@ -92,7 +92,7 @@ static void get_header_guard(char *guard, const char *obj_name, const char *suff

 static bool get_map_ident(const struct bpf_map *map, char *buf, size_t buf_sz)
 {
-	static const char *sfxs[] = { ".data", ".rodata", ".bss", ".kconfig" };
+	static const char *sfxs[] = { ".data", ".rodata", ".bss", ".percpu", ".kconfig" };
 	const char *name = bpf_map__name(map);
 	int i, n;

@@ -117,7 +117,7 @@ static bool get_map_ident(const struct bpf_map *map, char *buf, size_t buf_sz)

 static bool get_datasec_ident(const char *sec_name, char *buf, size_t buf_sz)
 {
-	static const char *pfxs[] = { ".data", ".rodata", ".bss", ".kconfig" };
+	static const char *pfxs[] = { ".data", ".rodata", ".bss", ".percpu", ".kconfig" };
 	int i, n;

 	/* recognize hard coded LLVM section name */
@@ -254,7 +254,7 @@ static const struct btf_type *find_type_for_map(struct btf *btf, const char *map
 	return NULL;
 }

-static bool is_mmapable_map(const struct bpf_map *map, char *buf, size_t sz)
+static bool bpf_map_is_skel_data(const struct bpf_map *map, char *buf, size_t sz)
 {
 	size_t tmp_sz;

@@ -263,13 +263,19 @@ static bool is_mmapable_map(const struct bpf_map *map, char *buf, size_t sz)
 		return true;
 	}

-	if (!bpf_map__is_internal(map) || !(bpf_map__map_flags(map) & BPF_F_MMAPABLE))
+	if (!bpf_map__is_internal(map))
 		return false;

 	if (!get_map_ident(map, buf, sz))
 		return false;

-	return true;
+	if (bpf_map__map_flags(map) & BPF_F_MMAPABLE)
+		return true;
+
+	if (bpf_map__type(map) == BPF_MAP_TYPE_PERCPU_ARRAY)
+		return true;
+
+	return false;
 }

 static int codegen_datasecs(struct bpf_object *obj, const char *obj_name)
@@ -286,8 +292,11 @@ static int codegen_datasecs(struct bpf_object *obj, const char *obj_name)
 		return -errno;

 	bpf_object__for_each_map(map, obj) {
-		/* only generate definitions for memory-mapped internal maps */
-		if (!is_mmapable_map(map, map_ident, sizeof(map_ident)))
+		/*
+		 * Only generate definitions for internal maps that have
+		 * mmapped data.
+		 */
+		if (!bpf_map_is_skel_data(map, map_ident, sizeof(map_ident)))
 			continue;

 		sec = find_type_for_map(btf, map_ident);
@@ -339,8 +348,14 @@ static int codegen_subskel_datasecs(struct bpf_object *obj, const char *obj_name
 		return -errno;

 	bpf_object__for_each_map(map, obj) {
-		/* only generate definitions for memory-mapped internal maps */
-		if (!is_mmapable_map(map, map_ident, sizeof(map_ident)))
+		/*
+		 * Only generate definitions for internal maps that have
+		 * mmapped data.
+		 */
+		if (!bpf_map_is_skel_data(map, map_ident, sizeof(map_ident)))
+			continue;
+
+		if (bpf_map__type(map) == BPF_MAP_TYPE_PERCPU_ARRAY)
 			continue;

 		sec = find_type_for_map(btf, map_ident);
@@ -493,7 +508,10 @@ static size_t bpf_map_mmap_sz(const struct bpf_map *map)
 	return map_sz;
 }

-/* Emit type size asserts for all top-level fields in memory-mapped internal maps. */
+/*
+ * Emit type size asserts for all top-level fields in internal maps that
+ * have mmaped data.
+ */
 static void codegen_asserts(struct bpf_object *obj, const char *obj_name)
 {
 	struct btf *btf = bpf_object__btf(obj);
@@ -517,7 +535,7 @@ static void codegen_asserts(struct bpf_object *obj, const char *obj_name)
 		", obj_name);

 	bpf_object__for_each_map(map, obj) {
-		if (!is_mmapable_map(map, map_ident, sizeof(map_ident)))
+		if (!bpf_map_is_skel_data(map, map_ident, sizeof(map_ident)))
 			continue;

 		sec = find_type_for_map(btf, map_ident);
@@ -669,7 +687,8 @@ static void codegen_destroy(struct bpf_object *obj, const char *obj_name)
 		if (!get_map_ident(map, ident, sizeof(ident)))
 			continue;
 		if (bpf_map__is_internal(map) &&
-		    (bpf_map__map_flags(map) & BPF_F_MMAPABLE))
+		    ((bpf_map__map_flags(map) & BPF_F_MMAPABLE) ||
+		     bpf_map__type(map) == BPF_MAP_TYPE_PERCPU_ARRAY))
 			printf("\tskel_free_map_data(skel->%1$s, skel->maps.%1$s.initial_value, %2$zu);\n",
 			       ident, bpf_map_mmap_sz(map));
 		codegen("\
@@ -741,7 +760,7 @@ static int gen_trace(struct bpf_object *obj, const char *obj_name, const char *h
 		const void *mmap_data = NULL;
 		size_t mmap_size = 0;

-		if (!is_mmapable_map(map, ident, sizeof(ident)))
+		if (!bpf_map_is_skel_data(map, ident, sizeof(ident)))
 			continue;

 		codegen("\
@@ -847,9 +866,23 @@ static int gen_trace(struct bpf_object *obj, const char *obj_name, const char *h
 	bpf_object__for_each_map(map, obj) {
 		const char *mmap_flags;

-		if (!is_mmapable_map(map, ident, sizeof(ident)))
+		if (!bpf_map_is_skel_data(map, ident, sizeof(ident)))
 			continue;

+		if (bpf_map__type(map) == BPF_MAP_TYPE_PERCPU_ARRAY) {
+			codegen("\
+		\n\
+			err = skel_protect_map_data(skel->%1$s, &skel->maps.%1$s.initial_value, %2$zd);\n\
+			if (err)					    \n\
+				return err;				    \n\
+		#ifdef __KERNEL__					    \n\
+			skel->%1$s = NULL;				    \n\
+		#endif							    \n\
+			",
+			ident, bpf_map_mmap_sz(map));
+			continue;
+		}
+
 		if (bpf_map__map_flags(map) & BPF_F_RDONLY_PROG)
 			mmap_flags = "PROT_READ";
 		else
@@ -953,8 +986,7 @@ codegen_maps_skeleton(struct bpf_object *obj, size_t map_cnt, bool mmaped, bool
 				map->map = &obj->maps.%s;	    \n\
 			",
 			i, bpf_map__name(map), ident);
-		/* memory-mapped internal maps */
-		if (mmaped && is_mmapable_map(map, ident, sizeof(ident))) {
+		if (mmaped && bpf_map_is_skel_data(map, ident, sizeof(ident))) {
 			printf("\tmap->mmaped = (void **)&obj->%s;\n", ident);
 		}

@@ -1738,7 +1770,9 @@ static int do_subskeleton(int argc, char **argv)
 		/* Also count all maps that have a name */
 		map_cnt++;

-		if (!is_mmapable_map(map, ident, sizeof(ident)))
+		if (!bpf_map_is_skel_data(map, ident, sizeof(ident)))
+			continue;
+		if (bpf_map__type(map) == BPF_MAP_TYPE_PERCPU_ARRAY)
 			continue;

 		map_type_id = bpf_map__btf_value_type_id(map);
@@ -1861,7 +1895,9 @@ static int do_subskeleton(int argc, char **argv)

 	/* walk through each symbol and emit the runtime representation */
 	bpf_object__for_each_map(map, obj) {
-		if (!is_mmapable_map(map, ident, sizeof(ident)))
+		if (!bpf_map_is_skel_data(map, ident, sizeof(ident)))
+			continue;
+		if (bpf_map__type(map) == BPF_MAP_TYPE_PERCPU_ARRAY)
 			continue;

 		map_type_id = bpf_map__btf_value_type_id(map);
diff --git a/tools/lib/bpf/skel_internal.h b/tools/lib/bpf/skel_internal.h
index 74503d358bc8..485b0cd17017 100644
--- a/tools/lib/bpf/skel_internal.h
+++ b/tools/lib/bpf/skel_internal.h
@@ -135,8 +135,10 @@ static inline void skel_free_map_data(void *p, __u64 addr, size_t sz)
 {
 	if (addr != ~0ULL)
 		kvfree(p);
-	/* When addr == ~0ULL the 'p' points to
-	 * ((struct bpf_array *)map)->value. See skel_finalize_map_data.
+	/*
+	 * When addr == ~0ULL the init buffer has already been released.
+	 * For skel_finalize_map_data(), 'p' points to
+	 * ((struct bpf_array *)map)->value.
 	 */
 }

@@ -174,6 +176,15 @@ static inline void *skel_finalize_map_data(__u64 *init_val, size_t mmap_sz, int
 	return addr;
 }

+static inline int skel_protect_map_data(void *p, __u64 *init_val, size_t sz)
+{
+	(void)sz;
+
+	kvfree(p);
+	*init_val = ~0ULL;
+	return 0;
+}
+
 #else

 static inline void *skel_alloc(size_t size)
@@ -212,6 +223,15 @@ static inline void *skel_finalize_map_data(__u64 *init_val, size_t mmap_sz, int
 		return NULL;
 	return addr;
 }
+
+static inline int skel_protect_map_data(void *p, __u64 *init_val, size_t sz)
+{
+	(void)init_val;
+
+	if (mprotect(p, sz, PROT_READ))
+		return -errno;
+	return 0;
+}
 #endif

 static inline int skel_closenz(int fd)
-- 
2.54.0




* Re: [PATCH bpf-next v8 5/9] bpftool: Generate skeleton for global percpu data
  2026-07-02  6:24       ` Leon Hwang
@ 2026-07-02 10:14         ` Quentin Monnet
  2026-07-02 14:08           ` Leon Hwang
  0 siblings, 1 reply; 19+ messages in thread
From: Quentin Monnet @ 2026-07-02 10:14 UTC (permalink / raw)
  To: Leon Hwang, Andrii Nakryiko
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
	Song Liu, Yonghong Song, Jiri Olsa, John Fastabend, Shuah Khan,
	linux-kernel, linux-kselftest, kernel-patches-bot

2026-07-02 14:24 UTC+0800 ~ Leon Hwang <leon.hwang@linux.dev>
> On 2/7/26 03:32, Andrii Nakryiko wrote:
>> On Wed, Jul 1, 2026 at 9:49 AM Quentin Monnet <qmo@kernel.org> wrote:
>>>
>>> 2026-06-29 23:24 UTC+0800 ~ Leon Hwang <leon.hwang@linux.dev>
> [...]
>>>> @@ -254,6 +254,20 @@ static const struct btf_type *find_type_for_map(struct btf *btf, const char *map
>>>>       return NULL;
>>>>  }
>>>>
>>>> +static bool bpf_map_is_skel_data(const struct bpf_map *map)
>>>> +{
>>>> +     if (!bpf_map__is_internal(map))
>>>> +             return false;
>>>> +
>>>> +     if (bpf_map__map_flags(map) & BPF_F_MMAPABLE)
>>>> +             return true;
>>>> +
>>>> +     if (bpf_map__type(map) == BPF_MAP_TYPE_PERCPU_ARRAY)
>>>> +             return true;
>>>> +
>>>> +     return false;
>>>> +}
>>>> +
>>>>  static bool is_mmapable_map(const struct bpf_map *map, char *buf, size_t sz)
>>>>  {
>>>>       size_t tmp_sz;
>>>> @@ -263,7 +277,7 @@ static bool is_mmapable_map(const struct bpf_map *map, char *buf, size_t sz)
>>>>               return true;
>>>>       }
>>>>
>>>> -     if (!bpf_map__is_internal(map) || !(bpf_map__map_flags(map) & BPF_F_MMAPABLE))
>>>> +     if (!bpf_map_is_skel_data(map))
>>>>               return false;
>>>>
>>>>       if (!get_map_ident(map, buf, sz))
>>>
>>>
>>> Thanks! The bpftool patch looks good, with one reservation: after this
>>> patch, I believe "is_mmapable_map(map, ...)" will return true if map is
>>> a percpu map, although percpu maps aren't mmap-able, so we should
>>> probably update the name of that function to avoid any confusion?
>>>
>>
>> Great observation, Quentin!
>>
>> bpf_map_is_skel_data() I think was supposed to be exactly that generic
>> name. But it seems like Leon went half-way through with unification.
>> Unless there are some subtle situations where per-cpu array shouldn't
>> be handled where is_mmapable_map() is handled, we should rename
>> is_mmapable_map() and add BPF_MAP_TYPE_PERCPU_ARRAY check (assuming
>> it's internal map, of course) there.
>>
>> Leon, can you please check?
>>
> Aha, my bad.
> 
> Try to rename is_mmapable_map() to bpf_map_is_skel_data(). See below patch.
> 
> Thanks,
> Leon
> 
> ---
> 
> From 654306cd9091a175883dd61c7b251996c42e7cae Mon Sep 17 00:00:00 2001
> From: Leon Hwang <leon.hwang@linux.dev>
> Date: Mon, 29 Jun 2026 23:24:02 +0800
> Subject: [PATCH bpf-next v9 5/9] bpftool: Generate skeleton for global percpu data
> 

[...]

> diff --git a/tools/bpf/bpftool/gen.c b/tools/bpf/bpftool/gen.c
> index 6ae7262ebe0c..798a34366e08 100644
> --- a/tools/bpf/bpftool/gen.c
> +++ b/tools/bpf/bpftool/gen.c

[...]

> @@ -254,7 +254,7 @@ static const struct btf_type *find_type_for_map(struct btf *btf, const char *map
>  	return NULL;
>  }
> 
> -static bool is_mmapable_map(const struct bpf_map *map, char *buf, size_t sz)
> +static bool bpf_map_is_skel_data(const struct bpf_map *map, char *buf, size_t sz)
>  {


Yes, I think it addresses the issue, thanks! I'd maybe drop the
"bpf_map_" prefix in the function name ("is_skel_data_map()" instead?)
to remain closer to is_mmapable_map(), and to avoid creating confusion
with libbpf functions names, although I don't feel strongly about it.
You can add my ACK to this v9 for the bpftool patch when you repost the
series.

Thanks,
Quentin


* Re: [PATCH bpf-next v8 5/9] bpftool: Generate skeleton for global percpu data
  2026-07-02 10:14         ` Quentin Monnet
@ 2026-07-02 14:08           ` Leon Hwang
  0 siblings, 0 replies; 19+ messages in thread
From: Leon Hwang @ 2026-07-02 14:08 UTC (permalink / raw)
  To: Quentin Monnet, Andrii Nakryiko
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
	Song Liu, Yonghong Song, Jiri Olsa, John Fastabend, Shuah Khan,
	linux-kernel, linux-kselftest, kernel-patches-bot

On 2026/7/2 18:14, Quentin Monnet wrote:
> 2026-07-02 14:24 UTC+0800 ~ Leon Hwang <leon.hwang@linux.dev>
>> On 2/7/26 03:32, Andrii Nakryiko wrote:
[...]
>> diff --git a/tools/bpf/bpftool/gen.c b/tools/bpf/bpftool/gen.c
>> index 6ae7262ebe0c..798a34366e08 100644
>> --- a/tools/bpf/bpftool/gen.c
>> +++ b/tools/bpf/bpftool/gen.c
> 
> [...]
> 
>> @@ -254,7 +254,7 @@ static const struct btf_type *find_type_for_map(struct btf *btf, const char *map
>>  	return NULL;
>>  }
>>
>> -static bool is_mmapable_map(const struct bpf_map *map, char *buf, size_t sz)
>> +static bool bpf_map_is_skel_data(const struct bpf_map *map, char *buf, size_t sz)
>>  {
> 
> 
> Yes, I think it addresses the issue, thanks! I'd maybe drop the
> "bpf_map_" prefix in the function name ("is_skel_data_map()" instead?)
> to remain closer to is_mmapable_map(), and to avoid creating confusion
> with libbpf functions names, although I don't feel strongly about it.


Makes sense. Will drop the prefix.

> You can add my ACK to this v9 for the bpftool patch when you repost the
> series.


Thanks for your review.

Leon



* [PATCH bpf-next v8 6/9] selftests/bpf: Add tests to verify global percpu data
  2026-06-29 15:23 [PATCH bpf-next v8 0/9] bpf: Introduce global percpu data Leon Hwang
                   ` (4 preceding siblings ...)
  2026-06-29 15:24 ` [PATCH bpf-next v8 5/9] bpftool: Generate skeleton " Leon Hwang
@ 2026-06-29 15:24 ` Leon Hwang
  2026-06-29 15:24 ` [PATCH bpf-next v8 7/9] selftests/bpf: Test direct reading/writing read-only percpu_array map Leon Hwang
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 19+ messages in thread
From: Leon Hwang @ 2026-06-29 15:24 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
	Song Liu, Yonghong Song, Jiri Olsa, John Fastabend,
	Quentin Monnet, Shuah Khan, Leon Hwang, linux-kernel,
	linux-kselftest, kernel-patches-bot

On architectures that do not support the percpu instruction, such as
s390x, these cases are skipped: the test checks for FEAT_PERCPU_DATA
support before running.

The following APIs have been tested for global percpu data:

1. bpf_map__set_initial_value()
2. bpf_map__initial_value()
3. generated percpu struct pointer pointing to internal map's mmaped data
4. bpf_map__lookup_elem() for global percpu data map

At the same time, the case is also tested with 'bpftool gen skeleton -L'.

Add a test to verify that the live vars of subskel won't include the vars
for global percpu data.
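
The per-CPU lookups in this test select the target CPU through the flags
argument, with the CPU number in the upper 32 bits and BPF_F_CPU in the
lower bits. A minimal standalone sketch of that encoding follows.
SKETCH_BPF_F_CPU, sketch_cpu_flags(), and sketch_flags_cpu() are
hypothetical names, and the flag value 8 is an assumption that should be
checked against linux/bpf.h.

```c
#include <stdint.h>

/* Assumed value of BPF_F_CPU; verify against the UAPI header. */
#define SKETCH_BPF_F_CPU 8ULL

/* Pack the CPU number into the upper 32 bits next to the BPF_F_CPU bit. */
static uint64_t sketch_cpu_flags(uint32_t cpu)
{
	return ((uint64_t)cpu << 32) | SKETCH_BPF_F_CPU;
}

/* Recover the CPU number from a flags word built by sketch_cpu_flags(). */
static uint32_t sketch_flags_cpu(uint64_t flags)
{
	return (uint32_t)(flags >> 32);
}
```

The cast to a 64-bit type before the shift matters: shifting a 32-bit int
by 32 is undefined in C, which is why the test writes ((__u64) i << 32).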

Assisted-by: Codex:gpt-5.5-xhigh
Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
---
 tools/testing/selftests/bpf/Makefile          |   2 +-
 .../bpf/prog_tests/global_data_init.c         | 149 ++++++++++++++++++
 .../bpf/prog_tests/global_percpu_subskel.c    |  37 +++++
 .../bpf/progs/test_global_percpu_data.c       |  31 ++++
 4 files changed, 218 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/global_percpu_subskel.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_global_percpu_data.c

diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index b642ee489ea6..c37ed9e7b97c 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -533,7 +533,7 @@ LSKELS_SIGNED := fentry_test.c fexit_test.c atomics.c
 
 # Generate both light skeleton and libbpf skeleton for these
 LSKELS_EXTRA := test_ksyms_module.c test_ksyms_weak.c kfunc_call_test.c \
-	kfunc_call_test_subprog.c
+	kfunc_call_test_subprog.c test_global_percpu_data.c
 SKEL_BLACKLIST += $$(LSKELS) $$(LSKELS_SIGNED)
 
 test_static_linked.skel.h-deps := test_static_linked1.bpf.o test_static_linked2.bpf.o
diff --git a/tools/testing/selftests/bpf/prog_tests/global_data_init.c b/tools/testing/selftests/bpf/prog_tests/global_data_init.c
index 8466332d7406..59db2cc771e7 100644
--- a/tools/testing/selftests/bpf/prog_tests/global_data_init.c
+++ b/tools/testing/selftests/bpf/prog_tests/global_data_init.c
@@ -1,5 +1,8 @@
 // SPDX-License-Identifier: GPL-2.0
 #include <test_progs.h>
+#include "bpf/libbpf_internal.h"
+#include "test_global_percpu_data.skel.h"
+#include "test_global_percpu_data.lskel.h"
 
 void test_global_data_init(void)
 {
@@ -60,3 +63,149 @@ void test_global_data_init(void)
 	free(newval);
 	bpf_object__close(obj);
 }
+
+static void test_percpu_data_on_cpus(int map_fd, int prog_fd, int num_online, bool *online)
+{
+	__u64 args[2] = {0x1234ULL, 0x5678ULL};
+	LIBBPF_OPTS(bpf_test_run_opts, topts,
+		    .ctx_in = args,
+		    .ctx_size_in = sizeof(args),
+		    .flags = BPF_F_TEST_RUN_ON_CPU,
+	);
+	int i, err, key = 0;
+
+	/* run on every online CPU */
+	for (i = 0; i < num_online; i++) {
+		struct test_global_percpu_data__percpu data = {};
+		__u64 flags;
+
+		if (!online[i])
+			continue;
+
+		topts.cpu = i;
+		topts.retval = -1;
+		err = bpf_prog_test_run_opts(prog_fd, &topts);
+		ASSERT_OK(err, "bpf_prog_test_run_opts");
+		ASSERT_EQ(topts.retval, 0, "bpf_prog_test_run_opts retval");
+
+		flags = ((__u64) i << 32) | BPF_F_CPU;
+		err = bpf_map_lookup_elem_flags(map_fd, &key, &data, flags);
+		if (!ASSERT_OK(err, "bpf_map_lookup_elem_flags"))
+			return;
+
+		ASSERT_EQ(data.data, 1, "data.data");
+		ASSERT_TRUE(data.run, "data.run");
+		ASSERT_EQ(data.nums[6], 0xc0de, "data.nums[6]");
+		ASSERT_EQ(data.struct_data.i, 1, "struct_data.i");
+		ASSERT_TRUE(data.struct_data.set, "struct_data.set");
+		ASSERT_EQ(data.struct_data.nums[6], 0xc0de, "struct_data.nums[6]");
+	}
+}
+
+static void test_global_percpu_data_init(void)
+{
+	struct test_global_percpu_data__percpu init_value = {};
+	struct test_global_percpu_data__percpu *init_data;
+	struct test_global_percpu_data *skel = NULL;
+	int prog_fd, map_fd, err, num_online;
+	size_t init_data_sz;
+	struct bpf_map *map;
+	bool *online;
+
+	err = parse_cpu_mask_file("/sys/devices/system/cpu/online", &online, &num_online);
+	if (!ASSERT_OK(err, "parse_cpu_mask_file"))
+		return;
+
+	skel = test_global_percpu_data__open();
+	if (!ASSERT_OK_PTR(skel, "test_global_percpu_data__open"))
+		goto out;
+	if (!ASSERT_OK_PTR(skel->percpu, "skel->percpu"))
+		goto out;
+
+	ASSERT_EQ(skel->percpu->data, -1, "skel->percpu->data");
+	ASSERT_FALSE(skel->percpu->run, "skel->percpu->run");
+	ASSERT_EQ(skel->percpu->nums[6], 0, "skel->percpu->nums[6]");
+	ASSERT_EQ(skel->percpu->struct_data.i, -1, "struct_data.i");
+	ASSERT_FALSE(skel->percpu->struct_data.set, "struct_data.set");
+	ASSERT_EQ(skel->percpu->struct_data.nums[6], 0, "struct_data.nums[6]");
+
+	map = skel->maps.percpu;
+	if (!ASSERT_EQ(bpf_map__type(map), BPF_MAP_TYPE_PERCPU_ARRAY, "bpf_map__type"))
+		goto out;
+
+	init_value.data = 2;
+	init_value.nums[6] = -1;
+	init_value.struct_data.i = 2;
+	init_value.struct_data.nums[6] = -1;
+	err = bpf_map__set_initial_value(map, &init_value, sizeof(init_value));
+	if (!ASSERT_OK(err, "bpf_map__set_initial_value"))
+		goto out;
+
+	init_data = bpf_map__initial_value(map, &init_data_sz);
+	if (!ASSERT_OK_PTR(init_data, "bpf_map__initial_value"))
+		goto out;
+
+	ASSERT_EQ(init_data->data, init_value.data, "init_value data");
+	ASSERT_EQ(init_data->run, init_value.run, "init_value run");
+	ASSERT_EQ(init_data->struct_data.i, init_value.struct_data.i, "init_value struct_data.i");
+	ASSERT_EQ(init_data->struct_data.nums[6], init_value.struct_data.nums[6],
+		  "init_value struct_data.nums[6]");
+	ASSERT_EQ(init_data_sz, sizeof(init_value), "init_value size");
+	ASSERT_EQ((void *) init_data, (void *) skel->percpu, "skel->percpu eq init_data");
+	ASSERT_EQ(skel->percpu->data, init_value.data, "skel->percpu->data");
+	ASSERT_EQ(skel->percpu->run, init_value.run, "skel->percpu->run");
+	ASSERT_EQ(skel->percpu->struct_data.i, init_value.struct_data.i,
+		  "skel->percpu->struct_data.i");
+	ASSERT_EQ(skel->percpu->struct_data.nums[6], init_value.struct_data.nums[6],
+		  "skel->percpu->struct_data.nums[6]");
+
+	err = test_global_percpu_data__load(skel);
+	if (!ASSERT_OK(err, "test_global_percpu_data__load"))
+		goto out;
+
+	ASSERT_OK_PTR(skel->percpu, "skel->percpu");
+
+	map_fd = bpf_map__fd(map);
+	prog_fd = bpf_program__fd(skel->progs.update_percpu_data);
+	test_percpu_data_on_cpus(map_fd, prog_fd, num_online, online);
+
+out:
+	test_global_percpu_data__destroy(skel);
+	free(online);
+}
+
+static void test_global_percpu_data_lskel(void)
+{
+	struct test_global_percpu_data_lskel *lskel = NULL;
+	int prog_fd, map_fd, err, num_online;
+	bool *online;
+
+	err = parse_cpu_mask_file("/sys/devices/system/cpu/online", &online, &num_online);
+	if (!ASSERT_OK(err, "parse_cpu_mask_file"))
+		return;
+
+	lskel = test_global_percpu_data_lskel__open_and_load();
+	if (!ASSERT_OK_PTR(lskel, "test_global_percpu_data_lskel__open_and_load"))
+		goto out;
+
+	map_fd = lskel->maps.percpu.map_fd;
+	prog_fd = lskel->progs.update_percpu_data.prog_fd;
+	test_percpu_data_on_cpus(map_fd, prog_fd, num_online, online);
+
+out:
+	test_global_percpu_data_lskel__destroy(lskel);
+	free(online);
+}
+
+void test_global_percpu_data(void)
+{
+	if (!feat_supported(NULL, FEAT_PERCPU_DATA)) {
+		test__skip();
+		return;
+	}
+
+	if (test__start_subtest("init"))
+		test_global_percpu_data_init();
+	if (test__start_subtest("lskel"))
+		test_global_percpu_data_lskel();
+}
diff --git a/tools/testing/selftests/bpf/prog_tests/global_percpu_subskel.c b/tools/testing/selftests/bpf/prog_tests/global_percpu_subskel.c
new file mode 100644
index 000000000000..8aebd533d86b
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/global_percpu_subskel.c
@@ -0,0 +1,37 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <test_progs.h>
+#include "test_global_percpu_data.subskel.h"
+
+void test_global_percpu_data_subskel(void)
+{
+	struct test_global_percpu_data *subskel = NULL;
+	struct bpf_object *obj;
+	int i;
+
+	obj = bpf_object__open_file("./test_global_percpu_data.bpf.o", NULL);
+	if (!ASSERT_OK_PTR(obj, "bpf_object__open_file"))
+		return;
+
+	subskel = test_global_percpu_data__open(obj);
+	if (!ASSERT_OK_PTR(subskel, "test_global_percpu_data__open"))
+		goto out;
+
+	if (!ASSERT_OK_PTR(subskel->subskel, "subskel"))
+		goto out;
+	if (!ASSERT_OK_PTR(subskel->maps.percpu, "maps.percpu"))
+		goto out;
+	ASSERT_EQ(bpf_map__type(subskel->maps.percpu), BPF_MAP_TYPE_PERCPU_ARRAY,
+		  "percpu_map_type");
+	ASSERT_GT(subskel->subskel->var_cnt, 0, "var_cnt");
+
+	for (i = 0; i < subskel->subskel->var_cnt; i++) {
+		const struct bpf_var_skeleton *var;
+
+		var = (void *) subskel->subskel->vars + i * subskel->subskel->var_skel_sz;
+		ASSERT_NEQ(var->map, &subskel->maps.percpu, "var");
+	}
+
+out:
+	test_global_percpu_data__destroy(subskel);
+	bpf_object__close(obj);
+}
diff --git a/tools/testing/selftests/bpf/progs/test_global_percpu_data.c b/tools/testing/selftests/bpf/progs/test_global_percpu_data.c
new file mode 100644
index 000000000000..ba92ffb0ca49
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_global_percpu_data.c
@@ -0,0 +1,31 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <vmlinux.h>
+#include <bpf/bpf_helpers.h>
+#include "bpf_misc.h"
+
+int data SEC(".percpu") = -1;
+int nums[7] SEC(".percpu");
+char run SEC(".percpu") = 0;
+struct {
+	char set;
+	int i;
+	int nums[7];
+} struct_data SEC(".percpu") = {
+	.set = 0,
+	.i = -1,
+};
+
+SEC("raw_tp/task_rename")
+__auxiliary
+int update_percpu_data(void *ctx)
+{
+	struct_data.nums[6] = 0xc0de;
+	struct_data.set = 1;
+	struct_data.i = 1;
+	nums[6] = 0xc0de;
+	data = 1;
+	run = 1;
+	return 0;
+}
+
+char _license[] SEC("license") = "GPL";
-- 
2.54.0



* [PATCH bpf-next v8 7/9] selftests/bpf: Test direct reading/writing read-only percpu_array map
  2026-06-29 15:23 [PATCH bpf-next v8 0/9] bpf: Introduce global percpu data Leon Hwang
                   ` (5 preceding siblings ...)
  2026-06-29 15:24 ` [PATCH bpf-next v8 6/9] selftests/bpf: Add tests to verify " Leon Hwang
@ 2026-06-29 15:24 ` Leon Hwang
  2026-06-29 15:24 ` [PATCH bpf-next v8 8/9] selftests/bpf: Test verifier log for global percpu data Leon Hwang
  2026-06-29 15:24 ` [PATCH bpf-next v8 9/9] selftests/bpf: Verify bpf_iter " Leon Hwang
  8 siblings, 0 replies; 19+ messages in thread
From: Leon Hwang @ 2026-06-29 15:24 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
	Song Liu, Yonghong Song, Jiri Olsa, John Fastabend,
	Quentin Monnet, Shuah Khan, Leon Hwang, linux-kernel,
	linux-kselftest, kernel-patches-bot

Verify these two cases:

1. Directly reading the value of a read-only percpu_array map, as used for
   read-only percpu data, is allowed.
2. Directly writing to the value of such a read-only percpu_array map is
   disallowed.

Assisted-by: Codex:gpt-5.5-xhigh
Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
---
 .../bpf/prog_tests/global_data_init.c         | 83 +++++++++++++++++++
 1 file changed, 83 insertions(+)

diff --git a/tools/testing/selftests/bpf/prog_tests/global_data_init.c b/tools/testing/selftests/bpf/prog_tests/global_data_init.c
index 59db2cc771e7..4f9eff36d856 100644
--- a/tools/testing/selftests/bpf/prog_tests/global_data_init.c
+++ b/tools/testing/selftests/bpf/prog_tests/global_data_init.c
@@ -197,6 +197,85 @@ static void test_global_percpu_data_lskel(void)
 	free(online);
 }
 
+static int create_rdonly_percpu_array(void)
+{
+	LIBBPF_OPTS(bpf_map_create_opts, map_opts,
+		    .map_flags = BPF_F_RDONLY_PROG,
+	);
+	int key = 0, map_fd, err;
+	__u64 value = 0;
+
+	map_fd = bpf_map_create(BPF_MAP_TYPE_PERCPU_ARRAY, "percpu_ro_map", sizeof(int),
+				sizeof(__u64), 1, &map_opts);
+	if (!ASSERT_GE(map_fd, 0, "bpf_map_create"))
+		return -1;
+
+	err = bpf_map_update_elem(map_fd, &key, &value, BPF_F_ALL_CPUS);
+	if (!ASSERT_OK(err, "bpf_map_update_elem"))
+		goto out;
+
+	err = bpf_map_freeze(map_fd);
+	if (!ASSERT_OK(err, "bpf_map_freeze"))
+		goto out;
+
+	return map_fd;
+
+out:
+	close(map_fd);
+	return -1;
+}
+
+static void test_global_percpu_data_rdonly_direct_read(void)
+{
+	struct bpf_insn insns[] = {
+		BPF_LD_MAP_VALUE(BPF_REG_1, 0, 0),
+		BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_1, 0),
+		BPF_EXIT_INSN(),
+	};
+	int map_fd, prog_fd;
+
+	map_fd = create_rdonly_percpu_array();
+	if (map_fd < 0)
+		return;
+
+	insns[0].imm = map_fd;
+	prog_fd = bpf_prog_load(BPF_PROG_TYPE_SOCKET_FILTER, "percpu_ro_prog", "GPL", insns,
+				ARRAY_SIZE(insns), NULL);
+	if (ASSERT_GE(prog_fd, 0, "bpf_prog_load"))
+		close(prog_fd);
+	close(map_fd);
+}
+
+static void test_global_percpu_data_rdonly_direct_write(void)
+{
+	LIBBPF_OPTS(bpf_prog_load_opts, prog_opts);
+	struct bpf_insn insns[] = {
+		BPF_LD_MAP_VALUE(BPF_REG_1, 0, 0),
+		BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_1, 0),
+		BPF_ST_MEM(BPF_DW, BPF_REG_1, 0, 0),
+		BPF_EXIT_INSN(),
+	};
+	char log_buf[256] = {};
+	int map_fd, prog_fd;
+
+	prog_opts.log_buf = log_buf;
+	prog_opts.log_size = sizeof(log_buf);
+	prog_opts.log_level = 1;
+
+	map_fd = create_rdonly_percpu_array();
+	if (map_fd < 0)
+		return;
+
+	insns[0].imm = map_fd;
+	prog_fd = bpf_prog_load(BPF_PROG_TYPE_SOCKET_FILTER, "percpu_ro_prog", "GPL", insns,
+				ARRAY_SIZE(insns), &prog_opts);
+	if (!ASSERT_LT(prog_fd, 0, "bpf_prog_load"))
+		close(prog_fd);
+	else
+		ASSERT_HAS_SUBSTR(log_buf, "write into map forbidden", "verifier log");
+	close(map_fd);
+}
+
 void test_global_percpu_data(void)
 {
 	if (!feat_supported(NULL, FEAT_PERCPU_DATA)) {
@@ -208,4 +287,8 @@ void test_global_percpu_data(void)
 		test_global_percpu_data_init();
 	if (test__start_subtest("lskel"))
 		test_global_percpu_data_lskel();
+	if (test__start_subtest("rdonly_direct_read"))
+		test_global_percpu_data_rdonly_direct_read();
+	if (test__start_subtest("rdonly_direct_write"))
+		test_global_percpu_data_rdonly_direct_write();
 }
-- 
2.54.0



* [PATCH bpf-next v8 8/9] selftests/bpf: Test verifier log for global percpu data
  2026-06-29 15:23 [PATCH bpf-next v8 0/9] bpf: Introduce global percpu data Leon Hwang
                   ` (6 preceding siblings ...)
  2026-06-29 15:24 ` [PATCH bpf-next v8 7/9] selftests/bpf: Test direct reading/writing read-only percpu_array map Leon Hwang
@ 2026-06-29 15:24 ` Leon Hwang
  2026-06-29 15:24 ` [PATCH bpf-next v8 9/9] selftests/bpf: Verify bpf_iter " Leon Hwang
  8 siblings, 0 replies; 19+ messages in thread
From: Leon Hwang @ 2026-06-29 15:24 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
	Song Liu, Yonghong Song, Jiri Olsa, John Fastabend,
	Quentin Monnet, Shuah Khan, Leon Hwang, linux-kernel,
	linux-kselftest, kernel-patches-bot

Add two tests that check for the verifier log message
"R%d points to percpu_array map which cannot be used as const string\n".
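
As a rough userspace illustration (not part of this patch), the __msg()
pattern in the tests can be read as the format string above with "%d"
replaced by the regex "[0-9]+". The sketch below formats the message for an
arbitrary register number and matches it with a POSIX extended regex;
format_log() and log_matches() are hypothetical helper names, and the
selftest loader's own matching logic is not reproduced here.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stdio.h>

/* Format string quoted in the commit message, minus the trailing newline. */
#define LOG_FMT "R%d points to percpu_array map which cannot be used as const string"
/* Assumed POSIX ERE equivalent of the __msg() pattern: %d becomes [0-9]+. */
#define LOG_RE  "R[0-9]+ points to percpu_array map which cannot be used as const string"

/* Hypothetical helper: render the message for register number regno. */
static int format_log(char *buf, size_t sz, int regno)
{
	int n = snprintf(buf, sz, LOG_FMT, regno);

	/* The caller must supply a buffer large enough for the whole message. */
	assert(n > 0 && (size_t)n < sz);
	return n;
}

/* Hypothetical helper: true if log contains a match for LOG_RE. */
static bool log_matches(const char *log)
{
	regex_t re;
	bool ok;

	if (regcomp(&re, LOG_RE, REG_EXTENDED | REG_NOSUB))
		return false;
	ok = regexec(&re, log, 0, NULL, 0) == 0;
	regfree(&re);
	return ok;
}
```

Matching on a register-number regex rather than a fixed "R1" keeps the
expectation independent of which register the compiler happens to use for
the format argument.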

Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
---
 .../bpf/prog_tests/global_data_init.c         |  6 +++++
 .../bpf/progs/test_global_percpu_data.c       | 23 +++++++++++++++++++
 2 files changed, 29 insertions(+)

diff --git a/tools/testing/selftests/bpf/prog_tests/global_data_init.c b/tools/testing/selftests/bpf/prog_tests/global_data_init.c
index 4f9eff36d856..ed085e2eb956 100644
--- a/tools/testing/selftests/bpf/prog_tests/global_data_init.c
+++ b/tools/testing/selftests/bpf/prog_tests/global_data_init.c
@@ -276,6 +276,11 @@ static void test_global_percpu_data_rdonly_direct_write(void)
 	close(map_fd);
 }
 
+static void test_global_percpu_data_verifier_log(void)
+{
+	RUN_TESTS(test_global_percpu_data);
+}
+
 void test_global_percpu_data(void)
 {
 	if (!feat_supported(NULL, FEAT_PERCPU_DATA)) {
@@ -291,4 +296,5 @@ void test_global_percpu_data(void)
 		test_global_percpu_data_rdonly_direct_read();
 	if (test__start_subtest("rdonly_direct_write"))
 		test_global_percpu_data_rdonly_direct_write();
+	test_global_percpu_data_verifier_log();
 }
diff --git a/tools/testing/selftests/bpf/progs/test_global_percpu_data.c b/tools/testing/selftests/bpf/progs/test_global_percpu_data.c
index ba92ffb0ca49..a6109e835948 100644
--- a/tools/testing/selftests/bpf/progs/test_global_percpu_data.c
+++ b/tools/testing/selftests/bpf/progs/test_global_percpu_data.c
@@ -28,4 +28,27 @@ int update_percpu_data(void *ctx)
 	return 0;
 }
 
+static const char fmt[] SEC(".percpu.fmt") = "data %d\n";
+
+SEC("?kprobe")
+__failure __msg("R{{[0-9]+}} points to percpu_array map which cannot be used as const string")
+int verifier_strncmp(void *ctx)
+{
+	return bpf_strncmp("test", 5, fmt);
+}
+
+SEC("?kprobe")
+__failure __msg("R{{[0-9]+}} points to percpu_array map which cannot be used as const string")
+int verifier_snprintf(void *ctx)
+{
+	u64 args[] = { data };
+	char buf[128];
+	int len;
+
+	len = bpf_snprintf(buf, sizeof(buf), fmt, args, sizeof(args));
+	if (len > 0)
+		bpf_printk("snprintf: %s\n", buf);
+	return 0;
+}
+
 char _license[] SEC("license") = "GPL";
-- 
2.54.0



* [PATCH bpf-next v8 9/9] selftests/bpf: Verify bpf_iter for global percpu data
  2026-06-29 15:23 [PATCH bpf-next v8 0/9] bpf: Introduce global percpu data Leon Hwang
                   ` (7 preceding siblings ...)
  2026-06-29 15:24 ` [PATCH bpf-next v8 8/9] selftests/bpf: Test verifier log for global percpu data Leon Hwang
@ 2026-06-29 15:24 ` Leon Hwang
  8 siblings, 0 replies; 19+ messages in thread
From: Leon Hwang @ 2026-06-29 15:24 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
	Song Liu, Yonghong Song, Jiri Olsa, John Fastabend,
	Quentin Monnet, Shuah Khan, Leon Hwang, linux-kernel,
	linux-kselftest, kernel-patches-bot

Add a test to verify that iterating over the percpu_array map used for
global percpu data works.
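
The offset and stride arithmetic used by the test can be sketched in plain
userspace C. This is only an illustration under assumptions: struct
percpu_layout is a hand-written stand-in for the bpftool-generated
struct test_global_percpu_data__percpu (whose real layout may differ), and
last_num_offset(), elem_size() and sum_slots() are hypothetical helpers that
mirror what the test and dump_percpu_data() compute.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed mirror of the .percpu section layout; the real struct is
 * generated by bpftool and may differ. */
struct percpu_layout {
	int data;
	int nums[7];
	char run;
	struct {
		char set;
		int i;
		int nums[7];
	} struct_data;
};

#define ROUNDUP(x, y) ((((x) + (y) - 1) / (y)) * (y))

/* Offset of struct_data.nums[6], computed as "end of struct_data minus one int". */
static size_t last_num_offset(void)
{
	return offsetof(struct percpu_layout, struct_data) +
	       sizeof(((struct percpu_layout *)0)->struct_data) - sizeof(int);
}

/* Stride between per-CPU slots: the value size rounded up to 8 bytes. */
static size_t elem_size(void)
{
	return ROUNDUP(sizeof(struct percpu_layout), 8);
}

/* Walk num_cpus slots and sum the int found at last_num_offset() in each. */
static unsigned int sum_slots(const void *pptr, unsigned int num_cpus)
{
	const char *p = pptr;
	unsigned int i, sum = 0;

	assert(pptr);
	for (i = 0; i < num_cpus; i++) {
		int v;

		/* memcpy avoids any alignment assumptions about the buffer. */
		memcpy(&v, p + last_num_offset(), sizeof(v));
		sum += v;
		p += elem_size();
	}
	return sum;
}
```

With every slot's struct_data.nums[6] set to 0xc0de, sum_slots() returns
0xc0de times the number of slots, which is the value the test expects in
percpu_data_sum.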

Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
---
 .../bpf/prog_tests/global_data_init.c         | 52 +++++++++++++++++++
 .../bpf/progs/test_global_percpu_data.c       | 25 +++++++++
 2 files changed, 77 insertions(+)

diff --git a/tools/testing/selftests/bpf/prog_tests/global_data_init.c b/tools/testing/selftests/bpf/prog_tests/global_data_init.c
index ed085e2eb956..a5d87768f6ee 100644
--- a/tools/testing/selftests/bpf/prog_tests/global_data_init.c
+++ b/tools/testing/selftests/bpf/prog_tests/global_data_init.c
@@ -281,6 +281,56 @@ static void test_global_percpu_data_verifier_log(void)
 	RUN_TESTS(test_global_percpu_data);
 }
 
+static void test_global_percpu_data_iter(void)
+{
+	DECLARE_LIBBPF_OPTS(bpf_iter_attach_opts, opts);
+	struct test_global_percpu_data *skel;
+	union bpf_iter_link_info linfo = {};
+	struct bpf_link *link = NULL;
+	int fd, num_cpus, len, err;
+	char buf[16];
+
+	num_cpus = libbpf_num_possible_cpus();
+	if (!ASSERT_GT(num_cpus, 0, "libbpf_num_possible_cpus"))
+		return;
+
+	skel = test_global_percpu_data__open();
+	if (!ASSERT_OK_PTR(skel, "test_global_percpu_data__open"))
+		return;
+
+	skel->rodata->num_cpus = num_cpus;
+	skel->rodata->offsetof_num = offsetof(struct test_global_percpu_data__percpu, struct_data);
+	skel->rodata->offsetof_num += sizeof(skel->percpu->struct_data) - sizeof(int);
+	skel->rodata->elem_sz = roundup(sizeof(struct test_global_percpu_data__percpu), 8);
+	skel->percpu->struct_data.nums[6] = 0xc0de;
+
+	err = test_global_percpu_data__load(skel);
+	if (!ASSERT_OK(err, "test_global_percpu_data__load"))
+		goto out;
+
+	linfo.map.map_fd = bpf_map__fd(skel->maps.percpu);
+	opts.link_info = &linfo;
+	opts.link_info_len = sizeof(linfo);
+	link = bpf_program__attach_iter(skel->progs.dump_percpu_data, &opts);
+	if (!ASSERT_OK_PTR(link, "bpf_program__attach_iter"))
+		goto out;
+
+	fd = bpf_iter_create(bpf_link__fd(link));
+	if (!ASSERT_GE(fd, 0, "bpf_iter_create"))
+		goto out;
+
+	while ((len = read(fd, buf, sizeof(buf))) > 0)
+		do { } while (0);
+	ASSERT_EQ(len, 0, "read iter");
+	ASSERT_TRUE(skel->bss->run_iter, "run_iter");
+	ASSERT_EQ(skel->bss->percpu_data_sum, 0xc0de * num_cpus, "percpu_data_sum");
+
+	close(fd);
+out:
+	bpf_link__destroy(link);
+	test_global_percpu_data__destroy(skel);
+}
+
 void test_global_percpu_data(void)
 {
 	if (!feat_supported(NULL, FEAT_PERCPU_DATA)) {
@@ -297,4 +347,6 @@ void test_global_percpu_data(void)
 	if (test__start_subtest("rdonly_direct_write"))
 		test_global_percpu_data_rdonly_direct_write();
 	test_global_percpu_data_verifier_log();
+	if (test__start_subtest("iter"))
+		test_global_percpu_data_iter();
 }
diff --git a/tools/testing/selftests/bpf/progs/test_global_percpu_data.c b/tools/testing/selftests/bpf/progs/test_global_percpu_data.c
index a6109e835948..eef88fd61af5 100644
--- a/tools/testing/selftests/bpf/progs/test_global_percpu_data.c
+++ b/tools/testing/selftests/bpf/progs/test_global_percpu_data.c
@@ -51,4 +51,29 @@ int verifier_snprintf(void *ctx)
 	return 0;
 }
 
+volatile const __u32 num_cpus = 0;
+volatile const int offsetof_num;
+volatile const int elem_sz;
+__u32 percpu_data_sum = 0;
+bool run_iter = false;
+
+SEC("iter/bpf_map_elem")
+__auxiliary
+int dump_percpu_data(struct bpf_iter__bpf_map_elem *ctx)
+{
+	void *pptr = ctx->value;
+	int i;
+
+	if (!pptr)
+		return 0;
+
+	run_iter = true;
+
+	for (i = 0; i < num_cpus; i++) {
+		percpu_data_sum += *(int *) (pptr + offsetof_num);
+		pptr += elem_sz;
+	}
+	return 0;
+}
+
 char _license[] SEC("license") = "GPL";
-- 
2.54.0



Thread overview: 19+ messages
2026-06-29 15:23 [PATCH bpf-next v8 0/9] bpf: Introduce global percpu data Leon Hwang
2026-06-29 15:23 ` [PATCH bpf-next v8 1/9] bpf: Drop duplicate blank lines in verifier Leon Hwang
2026-06-29 15:23 ` [PATCH bpf-next v8 2/9] bpf: Introduce global percpu data Leon Hwang
2026-07-01 19:31   ` Andrii Nakryiko
2026-07-02  6:15     ` Leon Hwang
2026-06-29 15:24 ` [PATCH bpf-next v8 3/9] libbpf: Probe percpu data feature Leon Hwang
2026-06-29 15:24 ` [PATCH bpf-next v8 4/9] libbpf: Add support for global percpu data Leon Hwang
2026-07-01 19:32   ` Andrii Nakryiko
2026-07-02  6:16     ` Leon Hwang
2026-06-29 15:24 ` [PATCH bpf-next v8 5/9] bpftool: Generate skeleton " Leon Hwang
2026-07-01 16:49   ` Quentin Monnet
2026-07-01 19:32     ` Andrii Nakryiko
2026-07-02  6:24       ` Leon Hwang
2026-07-02 10:14         ` Quentin Monnet
2026-07-02 14:08           ` Leon Hwang
2026-06-29 15:24 ` [PATCH bpf-next v8 6/9] selftests/bpf: Add tests to verify " Leon Hwang
2026-06-29 15:24 ` [PATCH bpf-next v8 7/9] selftests/bpf: Test direct reading/writing read-only percpu_array map Leon Hwang
2026-06-29 15:24 ` [PATCH bpf-next v8 8/9] selftests/bpf: Test verifier log for global percpu data Leon Hwang
2026-06-29 15:24 ` [PATCH bpf-next v8 9/9] selftests/bpf: Verify bpf_iter " Leon Hwang
