* [PATCH bpf-next v2 0/3] bpf: Introduce BPF_F_CPU flag for percpu_array maps
@ 2025-08-05 16:30 Leon Hwang
2025-08-05 16:30 ` [PATCH bpf-next v2 1/3] " Leon Hwang
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: Leon Hwang @ 2025-08-05 16:30 UTC (permalink / raw)
To: bpf
Cc: ast, andrii, daniel, yonghong.song, song, eddyz87, dxu, deso,
leon.hwang, kernel-patches-bot
This patch set introduces the BPF_F_CPU flag for percpu_array maps, as
discussed in the thread
"[PATCH bpf-next v3 0/4] bpf: Introduce global percpu data" [1].
The goal is to reduce data caching overhead in light skeletons by allowing
a single value to be reused across all CPUs. This avoids the M:N problem
where M cached values are needed to update a map on a system with N CPUs.
The BPF_F_CPU flag is accompanied by cpu info embedded in the high 32 bits
of *flags*, which specifies the target CPU(s) for the operation:
* For lookup operations: the flag together with the cpu info enables querying
the value on the specified CPU.
* For update operations:
* If cpu == (u32)~0, the provided value is copied to all CPUs.
* Otherwise, the value is copied to the specified CPU only.
Currently, this functionality is only supported for percpu_array maps.
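For illustration, a minimal usage sketch with the low-level libbpf wrappers
(not part of the patches; map setup and error handling are omitted, and
'map_fd' is assumed to refer to a percpu_array map holding u64 values):

    __u64 val = 0xDEADC0DE, flags;
    int key = 0, cpu = 3;

    /* copy the same value into every CPU's slot of this element */
    flags = ((__u64)BPF_ALL_CPUS << 32) | BPF_F_CPU;
    bpf_map_update_elem(map_fd, &key, &val, flags);

    /* read back only CPU 3's slot; val holds a single value, not an array */
    flags = ((__u64)cpu << 32) | BPF_F_CPU;
    bpf_map_lookup_elem_flags(map_fd, &key, &val, flags);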
Links:
[1] https://lore.kernel.org/bpf/20250526162146.24429-1-leon.hwang@linux.dev/
Changes:
v1 -> v2:
* Address comments from Andrii:
* Embed the cpu info entirely in the high 32 bits of *flags*.
* Use ERANGE instead of E2BIG.
* Fix a few formatting issues.
RFC v2 -> v1:
* Address comments from Andrii:
* Use '&=' and '|='.
* Replace 'reuse_value' with simpler code that avoids duplication.
* Replace 'ASSERT_FALSE' with two 'ASSERT_OK_PTR's in self test.
RFC v1 -> RFC v2:
* Address comments from Andrii:
* Embed cpu into flags on the kernel side.
* Change BPF_ALL_CPU macro to BPF_ALL_CPUS enum.
* Copy/update element within RCU protection.
* Update bpf_map_value_size() to handle the BPF_F_CPU case.
* Use zero as the default value for the cpu option.
* Update the API documentation to be generic.
* Add size_t:0 to opts definitions.
* Update validate_map_op() to handle the BPF_F_CPU case.
* Use LIBBPF_OPTS instead of DECLARE_LIBBPF_OPTS.
Leon Hwang (3):
bpf: Introduce BPF_F_CPU flag for percpu_array maps
libbpf: Support BPF_F_CPU for percpu_array maps
selftests/bpf: Add case to test BPF_F_CPU
include/linux/bpf.h | 3 +-
include/uapi/linux/bpf.h | 6 +
kernel/bpf/arraymap.c | 54 +++++--
kernel/bpf/syscall.c | 77 +++++----
tools/include/uapi/linux/bpf.h | 6 +
tools/lib/bpf/bpf.h | 5 +
tools/lib/bpf/libbpf.c | 28 +++-
tools/lib/bpf/libbpf.h | 17 +-
.../selftests/bpf/prog_tests/percpu_alloc.c | 149 ++++++++++++++++++
.../selftests/bpf/progs/percpu_array_flag.c | 24 +++
10 files changed, 309 insertions(+), 60 deletions(-)
create mode 100644 tools/testing/selftests/bpf/progs/percpu_array_flag.c
--
2.50.1
* [PATCH bpf-next v2 1/3] bpf: Introduce BPF_F_CPU flag for percpu_array maps
2025-08-05 16:30 [PATCH bpf-next v2 0/3] bpf: Introduce BPF_F_CPU flag for percpu_array maps Leon Hwang
@ 2025-08-05 16:30 ` Leon Hwang
2025-08-07 8:34 ` Jiri Olsa
2025-08-07 17:20 ` Alexei Starovoitov
2025-08-05 16:30 ` [PATCH bpf-next v2 2/3] libbpf: Support BPF_F_CPU " Leon Hwang
2025-08-05 16:30 ` [PATCH bpf-next v2 3/3] selftests/bpf: Add case to test BPF_F_CPU Leon Hwang
2 siblings, 2 replies; 10+ messages in thread
From: Leon Hwang @ 2025-08-05 16:30 UTC (permalink / raw)
To: bpf
Cc: ast, andrii, daniel, yonghong.song, song, eddyz87, dxu, deso,
leon.hwang, kernel-patches-bot
Introduce support for the BPF_F_CPU flag in percpu_array maps to allow
updating the value for a specified CPU, or for all CPUs with a single value.
This enhancement enables:
* Efficient update of all CPUs using a single value when cpu == (u32)~0.
* Targeted update or lookup for a specified CPU otherwise.
The flag is passed via:
* map_flags in bpf_percpu_array_update() along with embedded cpu field.
* elem_flags in generic_map_update_batch() along with embedded cpu field.
Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
---
include/linux/bpf.h | 3 +-
include/uapi/linux/bpf.h | 6 +++
kernel/bpf/arraymap.c | 54 ++++++++++++++++++------
kernel/bpf/syscall.c | 77 +++++++++++++++++++++-------------
tools/include/uapi/linux/bpf.h | 6 +++
5 files changed, 103 insertions(+), 43 deletions(-)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index cc700925b802f..c17c45f797ed9 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -2691,7 +2691,8 @@ int map_set_for_each_callback_args(struct bpf_verifier_env *env,
struct bpf_func_state *callee);
int bpf_percpu_hash_copy(struct bpf_map *map, void *key, void *value);
-int bpf_percpu_array_copy(struct bpf_map *map, void *key, void *value);
+int bpf_percpu_array_copy(struct bpf_map *map, void *key, void *value,
+ u64 flags);
int bpf_percpu_hash_update(struct bpf_map *map, void *key, void *value,
u64 flags);
int bpf_percpu_array_update(struct bpf_map *map, void *key, void *value,
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 233de8677382e..67bc35e4d6a8d 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -1372,6 +1372,12 @@ enum {
BPF_NOEXIST = 1, /* create new element if it didn't exist */
BPF_EXIST = 2, /* update existing element */
BPF_F_LOCK = 4, /* spin_lock-ed map_lookup/map_update */
+ BPF_F_CPU = 8, /* map_update for percpu_array */
+};
+
+enum {
+ /* indicate updating value across all CPUs for percpu maps. */
+ BPF_ALL_CPUS = (__u32)~0,
};
/* flags for BPF_MAP_CREATE command */
diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
index 3d080916faf97..98759f0b22397 100644
--- a/kernel/bpf/arraymap.c
+++ b/kernel/bpf/arraymap.c
@@ -295,17 +295,24 @@ static void *percpu_array_map_lookup_percpu_elem(struct bpf_map *map, void *key,
return per_cpu_ptr(array->pptrs[index & array->index_mask], cpu);
}
-int bpf_percpu_array_copy(struct bpf_map *map, void *key, void *value)
+int bpf_percpu_array_copy(struct bpf_map *map, void *key, void *value, u64 flags)
{
struct bpf_array *array = container_of(map, struct bpf_array, map);
u32 index = *(u32 *)key;
void __percpu *pptr;
- int cpu, off = 0;
- u32 size;
+ u32 size, cpu;
+ int off = 0;
if (unlikely(index >= array->map.max_entries))
return -ENOENT;
+ cpu = flags >> 32;
+ flags &= (u32)~0;
+ if (unlikely(flags > BPF_F_CPU))
+ return -EINVAL;
+ if (unlikely((flags & BPF_F_CPU) && cpu >= num_possible_cpus()))
+ return -ERANGE;
+
/* per_cpu areas are zero-filled and bpf programs can only
* access 'value_size' of them, so copying rounded areas
* will not leak any kernel data
@@ -313,10 +320,15 @@ int bpf_percpu_array_copy(struct bpf_map *map, void *key, void *value)
size = array->elem_size;
rcu_read_lock();
pptr = array->pptrs[index & array->index_mask];
- for_each_possible_cpu(cpu) {
- copy_map_value_long(map, value + off, per_cpu_ptr(pptr, cpu));
- check_and_init_map_value(map, value + off);
- off += size;
+ if (flags & BPF_F_CPU) {
+ copy_map_value_long(map, value, per_cpu_ptr(pptr, cpu));
+ check_and_init_map_value(map, value);
+ } else {
+ for_each_possible_cpu(cpu) {
+ copy_map_value_long(map, value + off, per_cpu_ptr(pptr, cpu));
+ check_and_init_map_value(map, value + off);
+ off += size;
+ }
}
rcu_read_unlock();
return 0;
@@ -387,13 +399,20 @@ int bpf_percpu_array_update(struct bpf_map *map, void *key, void *value,
struct bpf_array *array = container_of(map, struct bpf_array, map);
u32 index = *(u32 *)key;
void __percpu *pptr;
- int cpu, off = 0;
- u32 size;
+ u32 size, cpu;
+ int off = 0;
- if (unlikely(map_flags > BPF_EXIST))
+ cpu = map_flags >> 32;
+ map_flags &= (u32)~0;
+ if (unlikely(map_flags > BPF_F_CPU))
/* unknown flags */
return -EINVAL;
+ if (unlikely((map_flags & BPF_F_CPU) && cpu != BPF_ALL_CPUS &&
+ cpu >= num_possible_cpus()))
+ /* invalid cpu */
+ return -ERANGE;
+
if (unlikely(index >= array->map.max_entries))
/* all elements were pre-allocated, cannot insert a new one */
return -E2BIG;
@@ -411,10 +430,19 @@ int bpf_percpu_array_update(struct bpf_map *map, void *key, void *value,
size = array->elem_size;
rcu_read_lock();
pptr = array->pptrs[index & array->index_mask];
- for_each_possible_cpu(cpu) {
- copy_map_value_long(map, per_cpu_ptr(pptr, cpu), value + off);
+ if ((map_flags & BPF_F_CPU) && cpu != BPF_ALL_CPUS) {
+ copy_map_value_long(map, per_cpu_ptr(pptr, cpu), value);
bpf_obj_free_fields(array->map.record, per_cpu_ptr(pptr, cpu));
- off += size;
+ } else {
+ for_each_possible_cpu(cpu) {
+ copy_map_value_long(map, per_cpu_ptr(pptr, cpu), value + off);
+ /* same user-provided value is used if BPF_F_CPU is specified,
+ * otherwise value is an array of per-cpu values.
+ */
+ if (!(map_flags & BPF_F_CPU))
+ off += size;
+ bpf_obj_free_fields(array->map.record, per_cpu_ptr(pptr, cpu));
+ }
}
rcu_read_unlock();
return 0;
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 0fbfa8532c392..43f19d02bc5ce 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -131,8 +131,11 @@ bool bpf_map_write_active(const struct bpf_map *map)
return atomic64_read(&map->writecnt) != 0;
}
-static u32 bpf_map_value_size(const struct bpf_map *map)
+static u32 bpf_map_value_size(const struct bpf_map *map, u64 flags)
{
+ if ((flags & BPF_F_CPU) && map->map_type == BPF_MAP_TYPE_PERCPU_ARRAY)
+ return round_up(map->value_size, 8);
+
if (map->map_type == BPF_MAP_TYPE_PERCPU_HASH ||
map->map_type == BPF_MAP_TYPE_LRU_PERCPU_HASH ||
map->map_type == BPF_MAP_TYPE_PERCPU_ARRAY ||
@@ -314,7 +317,7 @@ static int bpf_map_copy_value(struct bpf_map *map, void *key, void *value,
map->map_type == BPF_MAP_TYPE_LRU_PERCPU_HASH) {
err = bpf_percpu_hash_copy(map, key, value);
} else if (map->map_type == BPF_MAP_TYPE_PERCPU_ARRAY) {
- err = bpf_percpu_array_copy(map, key, value);
+ err = bpf_percpu_array_copy(map, key, value, flags);
} else if (map->map_type == BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE) {
err = bpf_percpu_cgroup_storage_copy(map, key, value);
} else if (map->map_type == BPF_MAP_TYPE_STACK_TRACE) {
@@ -1669,7 +1672,10 @@ static int map_lookup_elem(union bpf_attr *attr)
if (CHECK_ATTR(BPF_MAP_LOOKUP_ELEM))
return -EINVAL;
- if (attr->flags & ~BPF_F_LOCK)
+ if ((u32)attr->flags & ~(BPF_F_LOCK | BPF_F_CPU))
+ return -EINVAL;
+
+ if (!((u32)attr->flags & BPF_F_CPU) && attr->flags >> 32)
return -EINVAL;
CLASS(fd, f)(attr->map_fd);
@@ -1679,7 +1685,7 @@ static int map_lookup_elem(union bpf_attr *attr)
if (!(map_get_sys_perms(map, f) & FMODE_CAN_READ))
return -EPERM;
- if ((attr->flags & BPF_F_LOCK) &&
+ if (((u32)attr->flags & BPF_F_LOCK) &&
!btf_record_has_field(map->record, BPF_SPIN_LOCK))
return -EINVAL;
@@ -1687,7 +1693,7 @@ static int map_lookup_elem(union bpf_attr *attr)
if (IS_ERR(key))
return PTR_ERR(key);
- value_size = bpf_map_value_size(map);
+ value_size = bpf_map_value_size(map, attr->flags);
err = -ENOMEM;
value = kvmalloc(value_size, GFP_USER | __GFP_NOWARN);
@@ -1744,19 +1750,24 @@ static int map_update_elem(union bpf_attr *attr, bpfptr_t uattr)
goto err_put;
}
- if ((attr->flags & BPF_F_LOCK) &&
+ if (((u32)attr->flags & BPF_F_LOCK) &&
!btf_record_has_field(map->record, BPF_SPIN_LOCK)) {
err = -EINVAL;
goto err_put;
}
+ if (!((u32)attr->flags & BPF_F_CPU) && attr->flags >> 32) {
+ err = -EINVAL;
+ goto err_put;
+ }
+
key = ___bpf_copy_key(ukey, map->key_size);
if (IS_ERR(key)) {
err = PTR_ERR(key);
goto err_put;
}
- value_size = bpf_map_value_size(map);
+ value_size = bpf_map_value_size(map, attr->flags);
value = kvmemdup_bpfptr(uvalue, value_size);
if (IS_ERR(value)) {
err = PTR_ERR(value);
@@ -1942,6 +1953,25 @@ int generic_map_delete_batch(struct bpf_map *map,
return err;
}
+static int check_map_batch_elem_flags(struct bpf_map *map, u64 elem_flags)
+{
+ u32 flags = elem_flags;
+
+ if (flags & ~(BPF_F_LOCK | BPF_F_CPU))
+ return -EINVAL;
+
+ if ((flags & BPF_F_LOCK) && !btf_record_has_field(map->record, BPF_SPIN_LOCK))
+ return -EINVAL;
+
+ if (!(flags & BPF_F_CPU) && elem_flags >> 32)
+ return -EINVAL;
+
+ if ((flags & BPF_F_CPU) && map->map_type != BPF_MAP_TYPE_PERCPU_ARRAY)
+ return -EINVAL;
+
+ return 0;
+}
+
int generic_map_update_batch(struct bpf_map *map, struct file *map_file,
const union bpf_attr *attr,
union bpf_attr __user *uattr)
@@ -1952,15 +1982,11 @@ int generic_map_update_batch(struct bpf_map *map, struct file *map_file,
void *key, *value;
int err = 0;
- if (attr->batch.elem_flags & ~BPF_F_LOCK)
- return -EINVAL;
-
- if ((attr->batch.elem_flags & BPF_F_LOCK) &&
- !btf_record_has_field(map->record, BPF_SPIN_LOCK)) {
- return -EINVAL;
- }
+ err = check_map_batch_elem_flags(map, attr->batch.elem_flags);
+ if (err)
+ return err;
- value_size = bpf_map_value_size(map);
+ value_size = bpf_map_value_size(map, attr->batch.elem_flags);
max_count = attr->batch.count;
if (!max_count)
@@ -1986,9 +2012,7 @@ int generic_map_update_batch(struct bpf_map *map, struct file *map_file,
copy_from_user(value, values + cp * value_size, value_size))
break;
- err = bpf_map_update_value(map, map_file, key, value,
- attr->batch.elem_flags);
-
+ err = bpf_map_update_value(map, map_file, key, value, attr->batch.elem_flags);
if (err)
break;
cond_resched();
@@ -2015,14 +2039,11 @@ int generic_map_lookup_batch(struct bpf_map *map,
u32 value_size, cp, max_count;
int err;
- if (attr->batch.elem_flags & ~BPF_F_LOCK)
- return -EINVAL;
-
- if ((attr->batch.elem_flags & BPF_F_LOCK) &&
- !btf_record_has_field(map->record, BPF_SPIN_LOCK))
- return -EINVAL;
+ err = check_map_batch_elem_flags(map, attr->batch.elem_flags);
+ if (err)
+ return err;
- value_size = bpf_map_value_size(map);
+ value_size = bpf_map_value_size(map, attr->batch.elem_flags);
max_count = attr->batch.count;
if (!max_count)
@@ -2056,9 +2077,7 @@ int generic_map_lookup_batch(struct bpf_map *map,
rcu_read_unlock();
if (err)
break;
- err = bpf_map_copy_value(map, key, value,
- attr->batch.elem_flags);
-
+ err = bpf_map_copy_value(map, key, value, attr->batch.elem_flags);
if (err == -ENOENT)
goto next_key;
@@ -2144,7 +2163,7 @@ static int map_lookup_and_delete_elem(union bpf_attr *attr)
goto err_put;
}
- value_size = bpf_map_value_size(map);
+ value_size = bpf_map_value_size(map, 0);
err = -ENOMEM;
value = kvmalloc(value_size, GFP_USER | __GFP_NOWARN);
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 233de8677382e..67bc35e4d6a8d 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -1372,6 +1372,12 @@ enum {
BPF_NOEXIST = 1, /* create new element if it didn't exist */
BPF_EXIST = 2, /* update existing element */
BPF_F_LOCK = 4, /* spin_lock-ed map_lookup/map_update */
+ BPF_F_CPU = 8, /* map_update for percpu_array */
+};
+
+enum {
+ /* indicate updating value across all CPUs for percpu maps. */
+ BPF_ALL_CPUS = (__u32)~0,
};
/* flags for BPF_MAP_CREATE command */
--
2.50.1
* [PATCH bpf-next v2 2/3] libbpf: Support BPF_F_CPU for percpu_array maps
2025-08-05 16:30 [PATCH bpf-next v2 0/3] bpf: Introduce BPF_F_CPU flag for percpu_array maps Leon Hwang
2025-08-05 16:30 ` [PATCH bpf-next v2 1/3] " Leon Hwang
@ 2025-08-05 16:30 ` Leon Hwang
2025-08-05 16:30 ` [PATCH bpf-next v2 3/3] selftests/bpf: Add case to test BPF_F_CPU Leon Hwang
2 siblings, 0 replies; 10+ messages in thread
From: Leon Hwang @ 2025-08-05 16:30 UTC (permalink / raw)
To: bpf
Cc: ast, andrii, daniel, yonghong.song, song, eddyz87, dxu, deso,
leon.hwang, kernel-patches-bot
Add libbpf support for the BPF_F_CPU flag for percpu_array maps by
embedding the cpu info into the high 32 bits of:
1. **flags**: bpf_map_lookup_elem_flags(), bpf_map__lookup_elem(),
bpf_map_update_elem() and bpf_map__update_elem()
2. **opts->elem_flags**: bpf_map_lookup_batch() and
bpf_map_update_batch()
Behavior:
* If cpu is (u32)~0, the update is applied across all CPUs.
* Otherwise, only the value on the specified CPU is updated.
* If cpu is (u32)~0, values are looked up from all CPUs.
* Otherwise, only the value on the specified CPU is looked up.
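A minimal sketch of the intended usage (illustrative only; it assumes a
skeleton exposing a percpu_array map named 'percpu' with u64 values and
omits error handling):

    int key = 0;
    __u64 val = 1;
    /* target CPU 2; with BPF_F_CPU, value_sz is sizeof(val), not nr_cpus * sizeof(val) */
    __u64 flags = ((__u64)2 << 32) | BPF_F_CPU;

    bpf_map__update_elem(skel->maps.percpu, &key, sizeof(key), &val, sizeof(val), flags);
    bpf_map__lookup_elem(skel->maps.percpu, &key, sizeof(key), &val, sizeof(val), flags);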
Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
---
tools/lib/bpf/bpf.h | 5 +++++
tools/lib/bpf/libbpf.c | 28 ++++++++++++++++++++++------
tools/lib/bpf/libbpf.h | 17 ++++++-----------
3 files changed, 33 insertions(+), 17 deletions(-)
diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h
index 7252150e7ad35..8dea8216d5992 100644
--- a/tools/lib/bpf/bpf.h
+++ b/tools/lib/bpf/bpf.h
@@ -286,6 +286,11 @@ LIBBPF_API int bpf_map_lookup_and_delete_batch(int fd, void *in_batch,
* Update spin_lock-ed map elements. This must be
* specified if the map value contains a spinlock.
*
+ * **BPF_F_CPU**
+ * For percpu maps, the cpu info is embedded into the high 32 bits of
+ * **opts->elem_flags**. The value is updated across all CPUs if cpu is
+ * (__u32)~0, or on the specified CPU otherwise.
+ *
* @param fd BPF map file descriptor
* @param keys pointer to an array of *count* keys
* @param values pointer to an array of *count* values
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index fb4d92c5c3394..29ada5d62ba26 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -10593,8 +10593,10 @@ bpf_object__find_map_fd_by_name(const struct bpf_object *obj, const char *name)
}
static int validate_map_op(const struct bpf_map *map, size_t key_sz,
- size_t value_sz, bool check_value_sz)
+ size_t value_sz, bool check_value_sz, __u64 flags)
{
+ __u32 cpu;
+
if (!map_is_created(map)) /* map is not yet created */
return -ENOENT;
@@ -10612,6 +10614,20 @@ static int validate_map_op(const struct bpf_map *map, size_t key_sz,
if (!check_value_sz)
return 0;
+ if (flags & BPF_F_CPU) {
+ if (map->def.type != BPF_MAP_TYPE_PERCPU_ARRAY)
+ return -EINVAL;
+ cpu = flags >> 32;
+ if (cpu != BPF_ALL_CPUS && cpu >= libbpf_num_possible_cpus())
+ return -ERANGE;
+ if (map->def.value_size != value_sz) {
+ pr_warn("map '%s': unexpected value size %zu provided, expected %u\n",
+ map->name, value_sz, map->def.value_size);
+ return -EINVAL;
+ }
+ return 0;
+ }
+
switch (map->def.type) {
case BPF_MAP_TYPE_PERCPU_ARRAY:
case BPF_MAP_TYPE_PERCPU_HASH:
@@ -10644,7 +10660,7 @@ int bpf_map__lookup_elem(const struct bpf_map *map,
{
int err;
- err = validate_map_op(map, key_sz, value_sz, true);
+ err = validate_map_op(map, key_sz, value_sz, true, flags);
if (err)
return libbpf_err(err);
@@ -10657,7 +10673,7 @@ int bpf_map__update_elem(const struct bpf_map *map,
{
int err;
- err = validate_map_op(map, key_sz, value_sz, true);
+ err = validate_map_op(map, key_sz, value_sz, true, flags);
if (err)
return libbpf_err(err);
@@ -10669,7 +10685,7 @@ int bpf_map__delete_elem(const struct bpf_map *map,
{
int err;
- err = validate_map_op(map, key_sz, 0, false /* check_value_sz */);
+ err = validate_map_op(map, key_sz, 0, false /* check_value_sz */, 0);
if (err)
return libbpf_err(err);
@@ -10682,7 +10698,7 @@ int bpf_map__lookup_and_delete_elem(const struct bpf_map *map,
{
int err;
- err = validate_map_op(map, key_sz, value_sz, true);
+ err = validate_map_op(map, key_sz, value_sz, true, 0);
if (err)
return libbpf_err(err);
@@ -10694,7 +10710,7 @@ int bpf_map__get_next_key(const struct bpf_map *map,
{
int err;
- err = validate_map_op(map, key_sz, 0, false /* check_value_sz */);
+ err = validate_map_op(map, key_sz, 0, false /* check_value_sz */, 0);
if (err)
return libbpf_err(err);
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index d1cf813a057bc..bde22b017a3ce 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -1169,10 +1169,11 @@ LIBBPF_API struct bpf_map *bpf_map__inner_map(struct bpf_map *map);
* @param key_sz size in bytes of key data, needs to match BPF map definition's **key_size**
* @param value pointer to memory in which looked up value will be stored
* @param value_sz size in byte of value data memory; it has to match BPF map
- * definition's **value_size**. For per-CPU BPF maps value size has to be
- * a product of BPF map value size and number of possible CPUs in the system
- * (could be fetched with **libbpf_num_possible_cpus()**). Note also that for
- * per-CPU values value size has to be aligned up to closest 8 bytes for
+ * definition's **value_size**. For per-CPU BPF maps, value size can be
+ * definition's **value_size** if **BPF_F_CPU** is specified in **flags**,
+ * otherwise a product of BPF map value size and number of possible CPUs in the
+ * system (could be fetched with **libbpf_num_possible_cpus()**). Note also that
+ * for per-CPU values value size has to be aligned up to closest 8 bytes for
* alignment reasons, so expected size is: `round_up(value_size, 8)
* * libbpf_num_possible_cpus()`.
* @flags extra flags passed to kernel for this operation
@@ -1192,13 +1193,7 @@ LIBBPF_API int bpf_map__lookup_elem(const struct bpf_map *map,
* @param key pointer to memory containing bytes of the key
* @param key_sz size in bytes of key data, needs to match BPF map definition's **key_size**
* @param value pointer to memory containing bytes of the value
- * @param value_sz size in byte of value data memory; it has to match BPF map
- * definition's **value_size**. For per-CPU BPF maps value size has to be
- * a product of BPF map value size and number of possible CPUs in the system
- * (could be fetched with **libbpf_num_possible_cpus()**). Note also that for
- * per-CPU values value size has to be aligned up to closest 8 bytes for
- * alignment reasons, so expected size is: `round_up(value_size, 8)
- * * libbpf_num_possible_cpus()`.
+ * @param value_sz see **bpf_map__lookup_elem**'s description.
* @flags extra flags passed to kernel for this operation
* @return 0, on success; negative error, otherwise
*
--
2.50.1
* [PATCH bpf-next v2 3/3] selftests/bpf: Add case to test BPF_F_CPU
2025-08-05 16:30 [PATCH bpf-next v2 0/3] bpf: Introduce BPF_F_CPU flag for percpu_array maps Leon Hwang
2025-08-05 16:30 ` [PATCH bpf-next v2 1/3] " Leon Hwang
2025-08-05 16:30 ` [PATCH bpf-next v2 2/3] libbpf: Support BPF_F_CPU " Leon Hwang
@ 2025-08-05 16:30 ` Leon Hwang
2 siblings, 0 replies; 10+ messages in thread
From: Leon Hwang @ 2025-08-05 16:30 UTC (permalink / raw)
To: bpf
Cc: ast, andrii, daniel, yonghong.song, song, eddyz87, dxu, deso,
leon.hwang, kernel-patches-bot
Add test coverage for the new BPF_F_CPU flag support in
percpu_array maps. The following APIs are exercised:
* bpf_map_update_batch()
* bpf_map_lookup_batch()
* bpf_map_update_elem()
* bpf_map__update_elem()
* bpf_map_lookup_elem_flags()
* bpf_map__lookup_elem()
cd tools/testing/selftests/bpf/
./test_progs -t percpu_alloc
253/13 percpu_alloc/cpu_flag_tests:OK
253 percpu_alloc:OK
Summary: 1/13 PASSED, 0 SKIPPED, 0 FAILED
Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
---
.../selftests/bpf/prog_tests/percpu_alloc.c | 149 ++++++++++++++++++
.../selftests/bpf/progs/percpu_array_flag.c | 24 +++
2 files changed, 173 insertions(+)
create mode 100644 tools/testing/selftests/bpf/progs/percpu_array_flag.c
diff --git a/tools/testing/selftests/bpf/prog_tests/percpu_alloc.c b/tools/testing/selftests/bpf/prog_tests/percpu_alloc.c
index 343da65864d6d..2500675d489df 100644
--- a/tools/testing/selftests/bpf/prog_tests/percpu_alloc.c
+++ b/tools/testing/selftests/bpf/prog_tests/percpu_alloc.c
@@ -3,6 +3,7 @@
#include "percpu_alloc_array.skel.h"
#include "percpu_alloc_cgrp_local_storage.skel.h"
#include "percpu_alloc_fail.skel.h"
+#include "percpu_array_flag.skel.h"
static void test_array(void)
{
@@ -115,6 +116,152 @@ static void test_failure(void) {
RUN_TESTS(percpu_alloc_fail);
}
+static void test_cpu_flag(void)
+{
+ int map_fd, *keys = NULL, value_size, cpu, i, j, nr_cpus, err;
+ size_t key_sz = sizeof(int), value_sz = sizeof(u64);
+ u64 batch = 0, *values = NULL, flags;
+ struct percpu_array_flag *skel;
+ const u64 value = 0xDEADC0DE;
+ u32 count, max_entries;
+ struct bpf_map *map;
+ LIBBPF_OPTS(bpf_map_batch_opts, batch_opts);
+
+ nr_cpus = libbpf_num_possible_cpus();
+ if (!ASSERT_GT(nr_cpus, 0, "libbpf_num_possible_cpus"))
+ return;
+
+ skel = percpu_array_flag__open_and_load();
+ if (!ASSERT_OK_PTR(skel, "percpu_array_flag__open_and_load"))
+ return;
+
+ map = skel->maps.percpu;
+ map_fd = bpf_map__fd(map);
+ max_entries = bpf_map__max_entries(map);
+
+ value_size = value_sz * nr_cpus;
+ values = calloc(max_entries, value_size);
+ if (!ASSERT_OK_PTR(values, "calloc values"))
+ goto out;
+ keys = calloc(max_entries, key_sz);
+ if (!ASSERT_OK_PTR(keys, "calloc keys"))
+ goto out;
+
+ for (i = 0; i < max_entries; i++)
+ keys[i] = i;
+ memset(values, 0, max_entries * value_size);
+
+ batch_opts.elem_flags = (u64)nr_cpus << 32 | BPF_F_CPU;
+ err = bpf_map_update_batch(map_fd, keys, values, &max_entries, &batch_opts);
+ if (!ASSERT_EQ(err, -ERANGE, "bpf_map_update_batch -ERANGE"))
+ goto out;
+
+ for (cpu = 0; cpu < nr_cpus; cpu++) {
+ memset(values, 0, max_entries * value_size);
+
+ /* clear values across all CPUs */
+ batch_opts.elem_flags = (u64)BPF_ALL_CPUS << 32 | BPF_F_CPU;
+ err = bpf_map_update_batch(map_fd, keys, values, &max_entries, &batch_opts);
+ if (!ASSERT_OK(err, "bpf_map_update_batch all cpus"))
+ goto out;
+
+ /* update values on specified CPU */
+ for (i = 0; i < max_entries; i++)
+ values[i] = value;
+
+ batch_opts.elem_flags = (u64)cpu << 32 | BPF_F_CPU;
+ err = bpf_map_update_batch(map_fd, keys, values, &max_entries, &batch_opts);
+ if (!ASSERT_OK(err, "bpf_map_update_batch specified cpu"))
+ goto out;
+
+ /* lookup values on specified CPU */
+ memset(values, 0, max_entries * value_sz);
+ err = bpf_map_lookup_batch(map_fd, NULL, &batch, keys, values, &count, &batch_opts);
+ if (!ASSERT_TRUE(!err || err == -ENOENT, "bpf_map_lookup_batch specified cpu"))
+ goto out;
+
+ for (i = 0; i < max_entries; i++)
+ if (!ASSERT_EQ(values[i], value, "value on specified cpu"))
+ goto out;
+
+ /* lookup values from all CPUs */
+ batch_opts.elem_flags = 0;
+ memset(values, 0, max_entries * value_size);
+ err = bpf_map_lookup_batch(map_fd, NULL, &batch, keys, values, &count, &batch_opts);
+ if (!ASSERT_TRUE(!err || err == -ENOENT, "bpf_map_lookup_batch all cpus"))
+ goto out;
+
+ for (i = 0; i < max_entries; i++) {
+ for (j = 0; j < nr_cpus; j++) {
+ if (!ASSERT_EQ(values[i*nr_cpus + j], j != cpu ? 0 : value,
+ "value on specified cpu"))
+ goto out;
+ }
+ }
+ }
+
+ flags = (u64)nr_cpus << 32 | BPF_F_CPU;
+ err = bpf_map_update_elem(map_fd, keys, values, flags);
+ if (!ASSERT_EQ(err, -ERANGE, "bpf_map_update_elem -ERANGE"))
+ goto out;
+
+ err = bpf_map__update_elem(map, keys, key_sz, values, value_sz, flags);
+ if (!ASSERT_EQ(err, -ERANGE, "bpf_map__update_elem -ERANGE"))
+ goto out;
+
+ err = bpf_map_lookup_elem_flags(map_fd, keys, values, flags);
+ if (!ASSERT_EQ(err, -ERANGE, "bpf_map_lookup_elem_flags -ERANGE"))
+ goto out;
+
+ err = bpf_map__lookup_elem(map, keys, key_sz, values, value_sz, flags);
+ if (!ASSERT_EQ(err, -ERANGE, "bpf_map__lookup_elem -ERANGE"))
+ goto out;
+
+ /* clear value on all cpus */
+ batch_opts.elem_flags = (u64)BPF_ALL_CPUS << 32 | BPF_F_CPU;
+ memset(values, 0, max_entries * value_sz);
+ err = bpf_map_update_batch(map_fd, keys, values, &max_entries, &batch_opts);
+ if (!ASSERT_OK(err, "bpf_map_update_batch all cpus"))
+ goto out;
+
+ for (cpu = 0; cpu < nr_cpus; cpu++) {
+ /* update value on specified cpu */
+ values[0] = value;
+ flags = (u64)cpu << 32 | BPF_F_CPU;
+ for (i = 0; i < max_entries; i++) {
+ err = bpf_map__update_elem(map, keys + i, key_sz, values, value_sz, flags);
+ if (!ASSERT_OK(err, "bpf_map__update_elem specified cpu"))
+ goto out;
+
+ for (j = 0; j < nr_cpus; j++) {
+ /* lookup then check value on CPUs */
+ flags = (u64)j << 32 | BPF_F_CPU;
+ err = bpf_map__lookup_elem(map, keys + i, key_sz, values, value_sz,
+ flags);
+ if (!ASSERT_OK(err, "bpf_map__lookup_elem specified cpu"))
+ goto out;
+ if (!ASSERT_EQ(values[0], j != cpu ? 0 : value,
+ "bpf_map__lookup_elem value on specified cpu"))
+ goto out;
+ }
+ }
+
+ /* clear value on specified cpu */
+ values[0] = 0;
+ flags = (u64)cpu << 32 | BPF_F_CPU;
+ err = bpf_map__update_elem(map, keys, key_sz, values, value_sz, flags);
+ if (!ASSERT_OK(err, "bpf_map__update_elem specified cpu"))
+ goto out;
+ }
+
+out:
+ if (keys)
+ free(keys);
+ if (values)
+ free(values);
+ percpu_array_flag__destroy(skel);
+}
+
void test_percpu_alloc(void)
{
if (test__start_subtest("array"))
@@ -125,4 +272,6 @@ void test_percpu_alloc(void)
test_cgrp_local_storage();
if (test__start_subtest("failure_tests"))
test_failure();
+ if (test__start_subtest("cpu_flag_tests"))
+ test_cpu_flag();
}
diff --git a/tools/testing/selftests/bpf/progs/percpu_array_flag.c b/tools/testing/selftests/bpf/progs/percpu_array_flag.c
new file mode 100644
index 0000000000000..4d92e121958ee
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/percpu_array_flag.c
@@ -0,0 +1,24 @@
+// SPDX-License-Identifier: GPL-2.0
+#include "vmlinux.h"
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+struct {
+ __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
+ __uint(max_entries, 2);
+ __type(key, int);
+ __type(value, u64);
+} percpu SEC(".maps");
+
+SEC("fentry/bpf_fentry_test1")
+int BPF_PROG(test_percpu_array, int x)
+{
+ u64 value = 0xDEADC0DE;
+ int key = 0;
+
+ bpf_map_update_elem(&percpu, &key, &value, BPF_ANY);
+ return 0;
+}
+
+char _license[] SEC("license") = "GPL";
+
--
2.50.1
* Re: [PATCH bpf-next v2 1/3] bpf: Introduce BPF_F_CPU flag for percpu_array maps
2025-08-05 16:30 ` [PATCH bpf-next v2 1/3] " Leon Hwang
@ 2025-08-07 8:34 ` Jiri Olsa
2025-08-07 16:26 ` Leon Hwang
2025-08-07 17:20 ` Alexei Starovoitov
1 sibling, 1 reply; 10+ messages in thread
From: Jiri Olsa @ 2025-08-07 8:34 UTC (permalink / raw)
To: Leon Hwang
Cc: bpf, ast, andrii, daniel, yonghong.song, song, eddyz87, dxu, deso,
kernel-patches-bot
On Wed, Aug 06, 2025 at 12:30:15AM +0800, Leon Hwang wrote:
> Introduce support for the BPF_F_CPU flag in percpu_array maps to allow
> updating values for specified CPU or for all CPUs with a single value.
>
> This enhancement enables:
>
> * Efficient update of all CPUs using a single value when cpu == (u32)~0.
> * Targeted update or lookup for a specified CPU otherwise.
>
> The flag is passed via:
>
> * map_flags in bpf_percpu_array_update() along with embedded cpu field.
> * elem_flags in generic_map_update_batch() along with embedded cpu field.
>
> Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
> ---
> include/linux/bpf.h | 3 +-
> include/uapi/linux/bpf.h | 6 +++
> kernel/bpf/arraymap.c | 54 ++++++++++++++++++------
> kernel/bpf/syscall.c | 77 +++++++++++++++++++++-------------
> tools/include/uapi/linux/bpf.h | 6 +++
> 5 files changed, 103 insertions(+), 43 deletions(-)
>
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index cc700925b802f..c17c45f797ed9 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -2691,7 +2691,8 @@ int map_set_for_each_callback_args(struct bpf_verifier_env *env,
> struct bpf_func_state *callee);
>
> int bpf_percpu_hash_copy(struct bpf_map *map, void *key, void *value);
> -int bpf_percpu_array_copy(struct bpf_map *map, void *key, void *value);
> +int bpf_percpu_array_copy(struct bpf_map *map, void *key, void *value,
> + u64 flags);
> int bpf_percpu_hash_update(struct bpf_map *map, void *key, void *value,
> u64 flags);
> int bpf_percpu_array_update(struct bpf_map *map, void *key, void *value,
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 233de8677382e..67bc35e4d6a8d 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -1372,6 +1372,12 @@ enum {
> BPF_NOEXIST = 1, /* create new element if it didn't exist */
> BPF_EXIST = 2, /* update existing element */
> BPF_F_LOCK = 4, /* spin_lock-ed map_lookup/map_update */
> + BPF_F_CPU = 8, /* map_update for percpu_array */
> +};
> +
> +enum {
> + /* indicate updating value across all CPUs for percpu maps. */
> + BPF_ALL_CPUS = (__u32)~0,
> };
>
> /* flags for BPF_MAP_CREATE command */
> diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
> index 3d080916faf97..98759f0b22397 100644
> --- a/kernel/bpf/arraymap.c
> +++ b/kernel/bpf/arraymap.c
> @@ -295,17 +295,24 @@ static void *percpu_array_map_lookup_percpu_elem(struct bpf_map *map, void *key,
> return per_cpu_ptr(array->pptrs[index & array->index_mask], cpu);
> }
>
> -int bpf_percpu_array_copy(struct bpf_map *map, void *key, void *value)
> +int bpf_percpu_array_copy(struct bpf_map *map, void *key, void *value, u64 flags)
> {
> struct bpf_array *array = container_of(map, struct bpf_array, map);
> u32 index = *(u32 *)key;
> void __percpu *pptr;
> - int cpu, off = 0;
> - u32 size;
> + u32 size, cpu;
> + int off = 0;
>
> if (unlikely(index >= array->map.max_entries))
> return -ENOENT;
>
> + cpu = flags >> 32;
> + flags &= (u32)~0;
is this necessary?
> + if (unlikely(flags > BPF_F_CPU))
> + return -EINVAL;
> + if (unlikely((flags & BPF_F_CPU) && cpu >= num_possible_cpus()))
> + return -ERANGE;
should we check cpu != BPF_ALL_CPUS in here?
> +
> /* per_cpu areas are zero-filled and bpf programs can only
> * access 'value_size' of them, so copying rounded areas
> * will not leak any kernel data
> @@ -313,10 +320,15 @@ int bpf_percpu_array_copy(struct bpf_map *map, void *key, void *value)
> size = array->elem_size;
> rcu_read_lock();
> pptr = array->pptrs[index & array->index_mask];
> - for_each_possible_cpu(cpu) {
> - copy_map_value_long(map, value + off, per_cpu_ptr(pptr, cpu));
> - check_and_init_map_value(map, value + off);
> - off += size;
> + if (flags & BPF_F_CPU) {
> + copy_map_value_long(map, value, per_cpu_ptr(pptr, cpu));
> + check_and_init_map_value(map, value);
> + } else {
> + for_each_possible_cpu(cpu) {
> + copy_map_value_long(map, value + off, per_cpu_ptr(pptr, cpu));
> + check_and_init_map_value(map, value + off);
> + off += size;
> + }
> }
> rcu_read_unlock();
> return 0;
> @@ -387,13 +399,20 @@ int bpf_percpu_array_update(struct bpf_map *map, void *key, void *value,
> struct bpf_array *array = container_of(map, struct bpf_array, map);
> u32 index = *(u32 *)key;
> void __percpu *pptr;
> - int cpu, off = 0;
> - u32 size;
> + u32 size, cpu;
> + int off = 0;
>
> - if (unlikely(map_flags > BPF_EXIST))
> + cpu = map_flags >> 32;
> + map_flags &= (u32)~0;
> + if (unlikely(map_flags > BPF_F_CPU))
> /* unknown flags */
> return -EINVAL;
>
> + if (unlikely((map_flags & BPF_F_CPU) && cpu != BPF_ALL_CPUS &&
> + cpu >= num_possible_cpus()))
> + /* invalid cpu */
> + return -ERANGE;
looks like same check as in bpf_percpu_array_copy, maybe we could add
some helper function for that?
> +
> if (unlikely(index >= array->map.max_entries))
> /* all elements were pre-allocated, cannot insert a new one */
> return -E2BIG;
> @@ -411,10 +430,19 @@ int bpf_percpu_array_update(struct bpf_map *map, void *key, void *value,
> size = array->elem_size;
> rcu_read_lock();
> pptr = array->pptrs[index & array->index_mask];
> - for_each_possible_cpu(cpu) {
> - copy_map_value_long(map, per_cpu_ptr(pptr, cpu), value + off);
> + if ((map_flags & BPF_F_CPU) && cpu != BPF_ALL_CPUS) {
> + copy_map_value_long(map, per_cpu_ptr(pptr, cpu), value);
> bpf_obj_free_fields(array->map.record, per_cpu_ptr(pptr, cpu));
> - off += size;
> + } else {
> + for_each_possible_cpu(cpu) {
> + copy_map_value_long(map, per_cpu_ptr(pptr, cpu), value + off);
> + /* same user-provided value is used if BPF_F_CPU is specified,
> + * otherwise value is an array of per-cpu values.
> + */
> + if (!(map_flags & BPF_F_CPU))
> + off += size;
> + bpf_obj_free_fields(array->map.record, per_cpu_ptr(pptr, cpu));
> + }
> }
> rcu_read_unlock();
> return 0;
> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> index 0fbfa8532c392..43f19d02bc5ce 100644
> --- a/kernel/bpf/syscall.c
> +++ b/kernel/bpf/syscall.c
> @@ -131,8 +131,11 @@ bool bpf_map_write_active(const struct bpf_map *map)
> return atomic64_read(&map->writecnt) != 0;
> }
>
> -static u32 bpf_map_value_size(const struct bpf_map *map)
> +static u32 bpf_map_value_size(const struct bpf_map *map, u64 flags)
> {
> + if ((flags & BPF_F_CPU) && map->map_type == BPF_MAP_TYPE_PERCPU_ARRAY)
> + return round_up(map->value_size, 8);
> +
nit, maybe we could keep the same style like below and check the map
type first:
if (map->map_type == BPF_MAP_TYPE_PERCPU_ARRAY && (flags & BPF_F_CPU))
return round_up(map->value_size, 8);
else if (map->map_type == BPF_MAP_TYPE_PERCPU_HASH ||
> map->map_type == BPF_MAP_TYPE_LRU_PERCPU_HASH ||
> map->map_type == BPF_MAP_TYPE_PERCPU_ARRAY ||
> @@ -314,7 +317,7 @@ static int bpf_map_copy_value(struct bpf_map *map, void *key, void *value,
> map->map_type == BPF_MAP_TYPE_LRU_PERCPU_HASH) {
> err = bpf_percpu_hash_copy(map, key, value);
> } else if (map->map_type == BPF_MAP_TYPE_PERCPU_ARRAY) {
> - err = bpf_percpu_array_copy(map, key, value);
> + err = bpf_percpu_array_copy(map, key, value, flags);
> } else if (map->map_type == BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE) {
> err = bpf_percpu_cgroup_storage_copy(map, key, value);
> } else if (map->map_type == BPF_MAP_TYPE_STACK_TRACE) {
> @@ -1669,7 +1672,10 @@ static int map_lookup_elem(union bpf_attr *attr)
> if (CHECK_ATTR(BPF_MAP_LOOKUP_ELEM))
> return -EINVAL;
>
> - if (attr->flags & ~BPF_F_LOCK)
> + if ((u32)attr->flags & ~(BPF_F_LOCK | BPF_F_CPU))
> + return -EINVAL;
I understand the u32 cast in here..
> +
> + if (!((u32)attr->flags & BPF_F_CPU) && attr->flags >> 32)
> return -EINVAL;
.. but do we need it in here and other similar places below?
>
> CLASS(fd, f)(attr->map_fd);
> @@ -1679,7 +1685,7 @@ static int map_lookup_elem(union bpf_attr *attr)
> if (!(map_get_sys_perms(map, f) & FMODE_CAN_READ))
> return -EPERM;
>
> - if ((attr->flags & BPF_F_LOCK) &&
> + if (((u32)attr->flags & BPF_F_LOCK) &&
> !btf_record_has_field(map->record, BPF_SPIN_LOCK))
> return -EINVAL;
>
> @@ -1687,7 +1693,7 @@ static int map_lookup_elem(union bpf_attr *attr)
> if (IS_ERR(key))
> return PTR_ERR(key);
>
> - value_size = bpf_map_value_size(map);
> + value_size = bpf_map_value_size(map, attr->flags);
>
> err = -ENOMEM;
> value = kvmalloc(value_size, GFP_USER | __GFP_NOWARN);
> @@ -1744,19 +1750,24 @@ static int map_update_elem(union bpf_attr *attr, bpfptr_t uattr)
> goto err_put;
> }
>
> - if ((attr->flags & BPF_F_LOCK) &&
> + if (((u32)attr->flags & BPF_F_LOCK) &&
> !btf_record_has_field(map->record, BPF_SPIN_LOCK)) {
> err = -EINVAL;
> goto err_put;
> }
>
> + if (!((u32)attr->flags & BPF_F_CPU) && attr->flags >> 32) {
> + err = -EINVAL;
> + goto err_put;
> + }
> +
> key = ___bpf_copy_key(ukey, map->key_size);
> if (IS_ERR(key)) {
> err = PTR_ERR(key);
> goto err_put;
> }
>
> - value_size = bpf_map_value_size(map);
> + value_size = bpf_map_value_size(map, attr->flags);
> value = kvmemdup_bpfptr(uvalue, value_size);
> if (IS_ERR(value)) {
> err = PTR_ERR(value);
> @@ -1942,6 +1953,25 @@ int generic_map_delete_batch(struct bpf_map *map,
> return err;
> }
>
> +static int check_map_batch_elem_flags(struct bpf_map *map, u64 elem_flags)
> +{
> + u32 flags = elem_flags;
> +
> + if (flags & ~(BPF_F_LOCK | BPF_F_CPU))
> + return -EINVAL;
> +
> + if ((flags & BPF_F_LOCK) && !btf_record_has_field(map->record, BPF_SPIN_LOCK))
> + return -EINVAL;
> +
> + if (!(flags & BPF_F_CPU) && elem_flags >> 32)
> + return -EINVAL;
> +
> + if ((flags & BPF_F_CPU) && map->map_type != BPF_MAP_TYPE_PERCPU_ARRAY)
> + return -EINVAL;
> +
> + return 0;
> +}
it seems like this check could be used also for non-batch functions as well?
also it might be more readable if we factor some check_flags function in
separate patch and then add BPF_F_CPU support
> +
> int generic_map_update_batch(struct bpf_map *map, struct file *map_file,
> const union bpf_attr *attr,
> union bpf_attr __user *uattr)
> @@ -1952,15 +1982,11 @@ int generic_map_update_batch(struct bpf_map *map, struct file *map_file,
> void *key, *value;
> int err = 0;
>
> - if (attr->batch.elem_flags & ~BPF_F_LOCK)
> - return -EINVAL;
> -
> - if ((attr->batch.elem_flags & BPF_F_LOCK) &&
> - !btf_record_has_field(map->record, BPF_SPIN_LOCK)) {
> - return -EINVAL;
> - }
> + err = check_map_batch_elem_flags(map, attr->batch.elem_flags);
> + if (err)
> + return err;
>
> - value_size = bpf_map_value_size(map);
> + value_size = bpf_map_value_size(map, attr->batch.elem_flags);
>
> max_count = attr->batch.count;
> if (!max_count)
> @@ -1986,9 +2012,7 @@ int generic_map_update_batch(struct bpf_map *map, struct file *map_file,
> copy_from_user(value, values + cp * value_size, value_size))
> break;
>
> - err = bpf_map_update_value(map, map_file, key, value,
> - attr->batch.elem_flags);
> -
> + err = bpf_map_update_value(map, map_file, key, value, attr->batch.elem_flags);
there's no change in here right? I'd keep it as it is
> if (err)
> break;
> cond_resched();
> @@ -2015,14 +2039,11 @@ int generic_map_lookup_batch(struct bpf_map *map,
> u32 value_size, cp, max_count;
> int err;
>
> - if (attr->batch.elem_flags & ~BPF_F_LOCK)
> - return -EINVAL;
> -
> - if ((attr->batch.elem_flags & BPF_F_LOCK) &&
> - !btf_record_has_field(map->record, BPF_SPIN_LOCK))
> - return -EINVAL;
> + err = check_map_batch_elem_flags(map, attr->batch.elem_flags);
> + if (err)
> + return err;
>
> - value_size = bpf_map_value_size(map);
> + value_size = bpf_map_value_size(map, attr->batch.elem_flags);
>
> max_count = attr->batch.count;
> if (!max_count)
> @@ -2056,9 +2077,7 @@ int generic_map_lookup_batch(struct bpf_map *map,
> rcu_read_unlock();
> if (err)
> break;
> - err = bpf_map_copy_value(map, key, value,
> - attr->batch.elem_flags);
> -
> + err = bpf_map_copy_value(map, key, value, attr->batch.elem_flags);
ditto
thanks,
jirka
> if (err == -ENOENT)
> goto next_key;
>
> @@ -2144,7 +2163,7 @@ static int map_lookup_and_delete_elem(union bpf_attr *attr)
> goto err_put;
> }
>
> - value_size = bpf_map_value_size(map);
> + value_size = bpf_map_value_size(map, 0);
>
> err = -ENOMEM;
> value = kvmalloc(value_size, GFP_USER | __GFP_NOWARN);
> diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
> index 233de8677382e..67bc35e4d6a8d 100644
> --- a/tools/include/uapi/linux/bpf.h
> +++ b/tools/include/uapi/linux/bpf.h
> @@ -1372,6 +1372,12 @@ enum {
> BPF_NOEXIST = 1, /* create new element if it didn't exist */
> BPF_EXIST = 2, /* update existing element */
> BPF_F_LOCK = 4, /* spin_lock-ed map_lookup/map_update */
> + BPF_F_CPU = 8, /* map_update for percpu_array */
> +};
> +
> +enum {
> + /* indicate updating value across all CPUs for percpu maps. */
> + BPF_ALL_CPUS = (__u32)~0,
> };
>
> /* flags for BPF_MAP_CREATE command */
> --
> 2.50.1
>
>
* Re: [PATCH bpf-next v2 1/3] bpf: Introduce BPF_F_CPU flag for percpu_array maps
2025-08-07 8:34 ` Jiri Olsa
@ 2025-08-07 16:26 ` Leon Hwang
0 siblings, 0 replies; 10+ messages in thread
From: Leon Hwang @ 2025-08-07 16:26 UTC (permalink / raw)
To: Jiri Olsa
Cc: bpf, ast, andrii, daniel, yonghong.song, song, eddyz87, dxu, deso,
kernel-patches-bot
On Thu Aug 7, 2025 at 4:34 PM +08, Jiri Olsa wrote:
> On Wed, Aug 06, 2025 at 12:30:15AM +0800, Leon Hwang wrote:
>> Introduce support for the BPF_F_CPU flag in percpu_array maps to allow
>> updating values for specified CPU or for all CPUs with a single value.
>>
[...]
>> diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
>> index 3d080916faf97..98759f0b22397 100644
>> --- a/kernel/bpf/arraymap.c
>> +++ b/kernel/bpf/arraymap.c
>> @@ -295,17 +295,24 @@ static void *percpu_array_map_lookup_percpu_elem(struct bpf_map *map, void *key,
>> return per_cpu_ptr(array->pptrs[index & array->index_mask], cpu);
>> }
>>
>> -int bpf_percpu_array_copy(struct bpf_map *map, void *key, void *value)
>> +int bpf_percpu_array_copy(struct bpf_map *map, void *key, void *value, u64 flags)
>> {
>> struct bpf_array *array = container_of(map, struct bpf_array, map);
>> u32 index = *(u32 *)key;
>> void __percpu *pptr;
>> - int cpu, off = 0;
>> - u32 size;
>> + u32 size, cpu;
>> + int off = 0;
>>
>> if (unlikely(index >= array->map.max_entries))
>> return -ENOENT;
>>
>> + cpu = flags >> 32;
>> + flags &= (u32)~0;
>
> is this necessary?
>
It is unnecessary.
I'll remove it and update the check to:
if (unlikely((u32)flags > BPF_F_CPU))
return -EINVAL;
>> + if (unlikely(flags > BPF_F_CPU))
>> + return -EINVAL;
>> + if (unlikely((flags & BPF_F_CPU) && cpu >= num_possible_cpus()))
>> + return -ERANGE;
>
> should we check cpu != BPF_ALL_CPUS in here?
>
No. It is meaningless to support cpu == BPF_ALL_CPUS here, because
(flags & BPF_F_CPU) && cpu == BPF_ALL_CPUS is the same as not setting BPF_F_CPU.
>> +
>> /* per_cpu areas are zero-filled and bpf programs can only
>> * access 'value_size' of them, so copying rounded areas
>> * will not leak any kernel data
>> @@ -313,10 +320,15 @@ int bpf_percpu_array_copy(struct bpf_map *map, void *key, void *value)
>> size = array->elem_size;
>> rcu_read_lock();
>> pptr = array->pptrs[index & array->index_mask];
>> - for_each_possible_cpu(cpu) {
>> - copy_map_value_long(map, value + off, per_cpu_ptr(pptr, cpu));
>> - check_and_init_map_value(map, value + off);
>> - off += size;
>> + if (flags & BPF_F_CPU) {
>> + copy_map_value_long(map, value, per_cpu_ptr(pptr, cpu));
>> + check_and_init_map_value(map, value);
>> + } else {
>> + for_each_possible_cpu(cpu) {
>> + copy_map_value_long(map, value + off, per_cpu_ptr(pptr, cpu));
>> + check_and_init_map_value(map, value + off);
>> + off += size;
>> + }
>> }
>> rcu_read_unlock();
>> return 0;
>> @@ -387,13 +399,20 @@ int bpf_percpu_array_update(struct bpf_map *map, void *key, void *value,
>> struct bpf_array *array = container_of(map, struct bpf_array, map);
>> u32 index = *(u32 *)key;
>> void __percpu *pptr;
>> - int cpu, off = 0;
>> - u32 size;
>> + u32 size, cpu;
>> + int off = 0;
>>
>> - if (unlikely(map_flags > BPF_EXIST))
>> + cpu = map_flags >> 32;
>> + map_flags &= (u32)~0;
>> + if (unlikely(map_flags > BPF_F_CPU))
>> /* unknown flags */
>> return -EINVAL;
>>
>> + if (unlikely((map_flags & BPF_F_CPU) && cpu != BPF_ALL_CPUS &&
>> + cpu >= num_possible_cpus()))
>> + /* invalid cpu */
>> + return -ERANGE;
>
> looks like same check as in bpf_percpu_array_copy, maybe we could add
> some helper function for that?
>
If they are the same, I'd like to add a helper function.
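Something along these lines, perhaps (just an illustrative sketch, not from
the posted patch; the helper name is made up, and a parameter covers the
fact that the lookup path rejects cpu == BPF_ALL_CPUS while update allows it):

static int bpf_percpu_check_flags(u64 flags, bool allow_all_cpus)
{
        u32 cpu = flags >> 32;

        if (unlikely((u32)flags > BPF_F_CPU))
                return -EINVAL;
        if (!((u32)flags & BPF_F_CPU))
                return 0;
        if (allow_all_cpus && cpu == BPF_ALL_CPUS)
                return 0;
        return cpu < num_possible_cpus() ? 0 : -ERANGE;
}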
>> +
>> if (unlikely(index >= array->map.max_entries))
>> /* all elements were pre-allocated, cannot insert a new one */
>> return -E2BIG;
>> @@ -411,10 +430,19 @@ int bpf_percpu_array_update(struct bpf_map *map, void *key, void *value,
>> size = array->elem_size;
>> rcu_read_lock();
>> pptr = array->pptrs[index & array->index_mask];
>> - for_each_possible_cpu(cpu) {
>> - copy_map_value_long(map, per_cpu_ptr(pptr, cpu), value + off);
>> + if ((map_flags & BPF_F_CPU) && cpu != BPF_ALL_CPUS) {
>> + copy_map_value_long(map, per_cpu_ptr(pptr, cpu), value);
>> bpf_obj_free_fields(array->map.record, per_cpu_ptr(pptr, cpu));
>> - off += size;
>> + } else {
>> + for_each_possible_cpu(cpu) {
>> + copy_map_value_long(map, per_cpu_ptr(pptr, cpu), value + off);
>> + /* same user-provided value is used if BPF_F_CPU is specified,
>> + * otherwise value is an array of per-cpu values.
>> + */
>> + if (!(map_flags & BPF_F_CPU))
>> + off += size;
>> + bpf_obj_free_fields(array->map.record, per_cpu_ptr(pptr, cpu));
>> + }
>> }
>> rcu_read_unlock();
>> return 0;
>> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
>> index 0fbfa8532c392..43f19d02bc5ce 100644
>> --- a/kernel/bpf/syscall.c
>> +++ b/kernel/bpf/syscall.c
>> @@ -131,8 +131,11 @@ bool bpf_map_write_active(const struct bpf_map *map)
>> return atomic64_read(&map->writecnt) != 0;
>> }
>>
>> -static u32 bpf_map_value_size(const struct bpf_map *map)
>> +static u32 bpf_map_value_size(const struct bpf_map *map, u64 flags)
>> {
>> + if ((flags & BPF_F_CPU) && map->map_type == BPF_MAP_TYPE_PERCPU_ARRAY)
>> + return round_up(map->value_size, 8);
>> +
>
> nit, maybe we could keep the same style like below and check the map
> type first:
>
> if (map->map_type == BPF_MAP_TYPE_PERCPU_ARRAY && (flags & BPF_F_CPU))
> return round_up(map->value_size, 8);
> else if (map->map_type == BPF_MAP_TYPE_PERCPU_HASH ||
>
Ack.
>> map->map_type == BPF_MAP_TYPE_LRU_PERCPU_HASH ||
>> map->map_type == BPF_MAP_TYPE_PERCPU_ARRAY ||
>> @@ -314,7 +317,7 @@ static int bpf_map_copy_value(struct bpf_map *map, void *key, void *value,
>> map->map_type == BPF_MAP_TYPE_LRU_PERCPU_HASH) {
>> err = bpf_percpu_hash_copy(map, key, value);
>> } else if (map->map_type == BPF_MAP_TYPE_PERCPU_ARRAY) {
>> - err = bpf_percpu_array_copy(map, key, value);
>> + err = bpf_percpu_array_copy(map, key, value, flags);
>> } else if (map->map_type == BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE) {
>> err = bpf_percpu_cgroup_storage_copy(map, key, value);
>> } else if (map->map_type == BPF_MAP_TYPE_STACK_TRACE) {
>> @@ -1669,7 +1672,10 @@ static int map_lookup_elem(union bpf_attr *attr)
>> if (CHECK_ATTR(BPF_MAP_LOOKUP_ELEM))
>> return -EINVAL;
>>
>> - if (attr->flags & ~BPF_F_LOCK)
>> + if ((u32)attr->flags & ~(BPF_F_LOCK | BPF_F_CPU))
>> + return -EINVAL;
>
> I understand the u32 cast in here..
>
>> +
>> + if (!((u32)attr->flags & BPF_F_CPU) && attr->flags >> 32)
>> return -EINVAL;
>
> .. but do we need it in here and other similar places below?
>
You are right. They are unnecessary.
I will remove them in next revision.
>>
>> CLASS(fd, f)(attr->map_fd);
>> @@ -1679,7 +1685,7 @@ static int map_lookup_elem(union bpf_attr *attr)
>> if (!(map_get_sys_perms(map, f) & FMODE_CAN_READ))
>> return -EPERM;
>>
>> - if ((attr->flags & BPF_F_LOCK) &&
>> + if (((u32)attr->flags & BPF_F_LOCK) &&
>> !btf_record_has_field(map->record, BPF_SPIN_LOCK))
>> return -EINVAL;
>>
>> @@ -1687,7 +1693,7 @@ static int map_lookup_elem(union bpf_attr *attr)
>> if (IS_ERR(key))
>> return PTR_ERR(key);
>>
>> - value_size = bpf_map_value_size(map);
>> + value_size = bpf_map_value_size(map, attr->flags);
>>
>> err = -ENOMEM;
>> value = kvmalloc(value_size, GFP_USER | __GFP_NOWARN);
>> @@ -1744,19 +1750,24 @@ static int map_update_elem(union bpf_attr *attr, bpfptr_t uattr)
>> goto err_put;
>> }
>>
>> - if ((attr->flags & BPF_F_LOCK) &&
>> + if (((u32)attr->flags & BPF_F_LOCK) &&
>> !btf_record_has_field(map->record, BPF_SPIN_LOCK)) {
>> err = -EINVAL;
>> goto err_put;
>> }
>>
>> + if (!((u32)attr->flags & BPF_F_CPU) && attr->flags >> 32) {
>> + err = -EINVAL;
>> + goto err_put;
>> + }
>> +
>> key = ___bpf_copy_key(ukey, map->key_size);
>> if (IS_ERR(key)) {
>> err = PTR_ERR(key);
>> goto err_put;
>> }
>>
>> - value_size = bpf_map_value_size(map);
>> + value_size = bpf_map_value_size(map, attr->flags);
>> value = kvmemdup_bpfptr(uvalue, value_size);
>> if (IS_ERR(value)) {
>> err = PTR_ERR(value);
>> @@ -1942,6 +1953,25 @@ int generic_map_delete_batch(struct bpf_map *map,
>> return err;
>> }
>>
>> +static int check_map_batch_elem_flags(struct bpf_map *map, u64 elem_flags)
>> +{
>> + u32 flags = elem_flags;
>> +
>> + if (flags & ~(BPF_F_LOCK | BPF_F_CPU))
>> + return -EINVAL;
>> +
>> + if ((flags & BPF_F_LOCK) && !btf_record_has_field(map->record, BPF_SPIN_LOCK))
>> + return -EINVAL;
>> +
>> + if (!(flags & BPF_F_CPU) && elem_flags >> 32)
>> + return -EINVAL;
>> +
>> + if ((flags & BPF_F_CPU) && map->map_type != BPF_MAP_TYPE_PERCPU_ARRAY)
>> + return -EINVAL;
>> +
>> + return 0;
>> +}
>
> it seems like this check could be used also for non-batch functions as well?
>
> also it might be more readable if we factor some check_flags function in
> separate patch and then add BPF_F_CPU support
>
Sure. After doing a PoC with a check_flags helper function, this check can
also be used for non-batch functions.
>
>> +
>> int generic_map_update_batch(struct bpf_map *map, struct file *map_file,
>> const union bpf_attr *attr,
>> union bpf_attr __user *uattr)
>> @@ -1952,15 +1982,11 @@ int generic_map_update_batch(struct bpf_map *map, struct file *map_file,
>> void *key, *value;
>> int err = 0;
>>
>> - if (attr->batch.elem_flags & ~BPF_F_LOCK)
>> - return -EINVAL;
>> -
>> - if ((attr->batch.elem_flags & BPF_F_LOCK) &&
>> - !btf_record_has_field(map->record, BPF_SPIN_LOCK)) {
>> - return -EINVAL;
>> - }
>> + err = check_map_batch_elem_flags(map, attr->batch.elem_flags);
>> + if (err)
>> + return err;
>>
>> - value_size = bpf_map_value_size(map);
>> + value_size = bpf_map_value_size(map, attr->batch.elem_flags);
>>
>> max_count = attr->batch.count;
>> if (!max_count)
>> @@ -1986,9 +2012,7 @@ int generic_map_update_batch(struct bpf_map *map, struct file *map_file,
>> copy_from_user(value, values + cp * value_size, value_size))
>> break;
>>
>> - err = bpf_map_update_value(map, map_file, key, value,
>> - attr->batch.elem_flags);
>> -
>> + err = bpf_map_update_value(map, map_file, key, value, attr->batch.elem_flags);
>
> there's no change in here right? I'd keep it as it is
>
Ack.
>> if (err)
>> break;
>> cond_resched();
>> @@ -2015,14 +2039,11 @@ int generic_map_lookup_batch(struct bpf_map *map,
>> u32 value_size, cp, max_count;
>> int err;
>>
>> - if (attr->batch.elem_flags & ~BPF_F_LOCK)
>> - return -EINVAL;
>> -
>> - if ((attr->batch.elem_flags & BPF_F_LOCK) &&
>> - !btf_record_has_field(map->record, BPF_SPIN_LOCK))
>> - return -EINVAL;
>> + err = check_map_batch_elem_flags(map, attr->batch.elem_flags);
>> + if (err)
>> + return err;
>>
>> - value_size = bpf_map_value_size(map);
>> + value_size = bpf_map_value_size(map, attr->batch.elem_flags);
>>
>> max_count = attr->batch.count;
>> if (!max_count)
>> @@ -2056,9 +2077,7 @@ int generic_map_lookup_batch(struct bpf_map *map,
>> rcu_read_unlock();
>> if (err)
>> break;
>> - err = bpf_map_copy_value(map, key, value,
>> - attr->batch.elem_flags);
>> -
>> + err = bpf_map_copy_value(map, key, value, attr->batch.elem_flags);
>
> ditto
>
Ack.
>
> thanks,
> jirka
>
Thanks,
Leon
[...]
* Re: [PATCH bpf-next v2 1/3] bpf: Introduce BPF_F_CPU flag for percpu_array maps
2025-08-05 16:30 ` [PATCH bpf-next v2 1/3] " Leon Hwang
2025-08-07 8:34 ` Jiri Olsa
@ 2025-08-07 17:20 ` Alexei Starovoitov
2025-08-08 16:11 ` Leon Hwang
1 sibling, 1 reply; 10+ messages in thread
From: Alexei Starovoitov @ 2025-08-07 17:20 UTC (permalink / raw)
To: Leon Hwang
Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Yonghong Song, Song Liu, Eduard, Daniel Xu, Daniel Müller,
kernel-patches-bot
On Tue, Aug 5, 2025 at 9:30 AM Leon Hwang <leon.hwang@linux.dev> wrote:
>
> Introduce support for the BPF_F_CPU flag in percpu_array maps to allow
> updating values for specified CPU or for all CPUs with a single value.
>
> This enhancement enables:
>
> * Efficient update of all CPUs using a single value when cpu == (u32)~0.
> * Targeted update or lookup for a specified CPU otherwise.
>
> The flag is passed via:
>
> * map_flags in bpf_percpu_array_update() along with embedded cpu field.
> * elem_flags in generic_map_update_batch() along with embedded cpu field.
>
> Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
> ---
> include/linux/bpf.h | 3 +-
> include/uapi/linux/bpf.h | 6 +++
> kernel/bpf/arraymap.c | 54 ++++++++++++++++++------
> kernel/bpf/syscall.c | 77 +++++++++++++++++++++-------------
> tools/include/uapi/linux/bpf.h | 6 +++
> 5 files changed, 103 insertions(+), 43 deletions(-)
>
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index cc700925b802f..c17c45f797ed9 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -2691,7 +2691,8 @@ int map_set_for_each_callback_args(struct bpf_verifier_env *env,
> struct bpf_func_state *callee);
>
> int bpf_percpu_hash_copy(struct bpf_map *map, void *key, void *value);
> -int bpf_percpu_array_copy(struct bpf_map *map, void *key, void *value);
> +int bpf_percpu_array_copy(struct bpf_map *map, void *key, void *value,
> + u64 flags);
> int bpf_percpu_hash_update(struct bpf_map *map, void *key, void *value,
> u64 flags);
> int bpf_percpu_array_update(struct bpf_map *map, void *key, void *value,
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 233de8677382e..67bc35e4d6a8d 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -1372,6 +1372,12 @@ enum {
> BPF_NOEXIST = 1, /* create new element if it didn't exist */
> BPF_EXIST = 2, /* update existing element */
> BPF_F_LOCK = 4, /* spin_lock-ed map_lookup/map_update */
> + BPF_F_CPU = 8, /* map_update for percpu_array */
only percpu_array?!
Aren't you doing it for percpu_hash too?
The comment should also say that upper 32-bit of flags is a cpu number.
> +};
> +
> +enum {
> + /* indicate updating value across all CPUs for percpu maps. */
> + BPF_ALL_CPUS = (__u32)~0,
> };
The name is inconsistent with BPF_F_ that was adopted long ago.
Also looking at the implementation that ~0 looks too magical.
imo it's cleaner to add another BPF_F_ALL_CPUS flag.
BPF_F_CPU = 8 and upper 32-bit select a cpu.
BPF_F_ALL_CPUS = 16 -> all cpus.
* Re: [PATCH bpf-next v2 1/3] bpf: Introduce BPF_F_CPU flag for percpu_array maps
2025-08-07 17:20 ` Alexei Starovoitov
@ 2025-08-08 16:11 ` Leon Hwang
2025-08-08 16:23 ` Alexei Starovoitov
0 siblings, 1 reply; 10+ messages in thread
From: Leon Hwang @ 2025-08-08 16:11 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Yonghong Song, Song Liu, Eduard, Daniel Xu, Daniel Müller,
kernel-patches-bot
On Fri Aug 8, 2025 at 1:20 AM +08, Alexei Starovoitov wrote:
> On Tue, Aug 5, 2025 at 9:30 AM Leon Hwang <leon.hwang@linux.dev> wrote:
>>
>> Introduce support for the BPF_F_CPU flag in percpu_array maps to allow
>> updating values for specified CPU or for all CPUs with a single value.
>>
>> This enhancement enables:
>>
>> * Efficient update of all CPUs using a single value when cpu == (u32)~0.
>> * Targeted update or lookup for a specified CPU otherwise.
>>
>> The flag is passed via:
>>
>> * map_flags in bpf_percpu_array_update() along with embedded cpu field.
>> * elem_flags in generic_map_update_batch() along with embedded cpu field.
>>
>> Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
>> ---
>> include/linux/bpf.h | 3 +-
>> include/uapi/linux/bpf.h | 6 +++
>> kernel/bpf/arraymap.c | 54 ++++++++++++++++++------
>> kernel/bpf/syscall.c | 77 +++++++++++++++++++++-------------
>> tools/include/uapi/linux/bpf.h | 6 +++
>> 5 files changed, 103 insertions(+), 43 deletions(-)
>>
>> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
>> index cc700925b802f..c17c45f797ed9 100644
>> --- a/include/linux/bpf.h
>> +++ b/include/linux/bpf.h
>> @@ -2691,7 +2691,8 @@ int map_set_for_each_callback_args(struct bpf_verifier_env *env,
>> struct bpf_func_state *callee);
>>
>> int bpf_percpu_hash_copy(struct bpf_map *map, void *key, void *value);
>> -int bpf_percpu_array_copy(struct bpf_map *map, void *key, void *value);
>> +int bpf_percpu_array_copy(struct bpf_map *map, void *key, void *value,
>> + u64 flags);
>> int bpf_percpu_hash_update(struct bpf_map *map, void *key, void *value,
>> u64 flags);
>> int bpf_percpu_array_update(struct bpf_map *map, void *key, void *value,
>> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
>> index 233de8677382e..67bc35e4d6a8d 100644
>> --- a/include/uapi/linux/bpf.h
>> +++ b/include/uapi/linux/bpf.h
>> @@ -1372,6 +1372,12 @@ enum {
>> BPF_NOEXIST = 1, /* create new element if it didn't exist */
>> BPF_EXIST = 2, /* update existing element */
>> BPF_F_LOCK = 4, /* spin_lock-ed map_lookup/map_update */
>> + BPF_F_CPU = 8, /* map_update for percpu_array */
>
> only percpu_array?!
> Aren't you doing it for percpu_hash too?
>
Only percpu_array in this patchset.
I have no need to do it for percpu_hash.
> The comment should also say that the upper 32 bits of flags hold the cpu number.
>
>> +};
>> +
>> +enum {
>> + /* indicate updating value across all CPUs for percpu maps. */
>> + BPF_ALL_CPUS = (__u32)~0,
>> };
>
> The name is inconsistent with the BPF_F_ prefix that was adopted long ago.
>
> Also, looking at the implementation, that ~0 looks too magical.
> imo it's cleaner to add another BPF_F_ALL_CPUS flag.
> BPF_F_CPU = 8 and the upper 32 bits select a cpu.
> BPF_F_ALL_CPUS = 16 -> all cpus.
Sure, let us add these two flags:
BPF_F_CPU = 8, /* cpu flag for percpu maps, upper 32-bit of flags is a cpu number */
BPF_F_ALL_CPUS = 16, /* update value across all CPUs for percpu maps */
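With those in place, a userspace usage sketch (the helper below is
hypothetical and assumes the flags are accepted by percpu_array updates):

  #include <bpf/bpf.h>

  /* update one cpu's copy, then copy a single value to all cpus */
  static int update_sketch(int map_fd, __u32 key, __u64 *val, __u32 cpu)
  {
          int err;

          /* with BPF_F_CPU, val points at value_size bytes for one cpu */
          err = bpf_map_update_elem(map_fd, &key, val,
                                    BPF_F_CPU | ((__u64)cpu << 32));
          if (err)
                  return err;

          /* the same single value is copied to every cpu */
          return bpf_map_update_elem(map_fd, &key, val, BPF_F_ALL_CPUS);
  }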
Thanks,
Leon
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH bpf-next v2 1/3] bpf: Introduce BPF_F_CPU flag for percpu_array maps
2025-08-08 16:11 ` Leon Hwang
@ 2025-08-08 16:23 ` Alexei Starovoitov
2025-08-11 16:34 ` Leon Hwang
0 siblings, 1 reply; 10+ messages in thread
From: Alexei Starovoitov @ 2025-08-08 16:23 UTC (permalink / raw)
To: Leon Hwang
Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Yonghong Song, Song Liu, Eduard, Daniel Xu, Daniel Müller,
kernel-patches-bot
On Fri, Aug 8, 2025 at 9:11 AM Leon Hwang <leon.hwang@linux.dev> wrote:
>
> On Fri Aug 8, 2025 at 1:20 AM +08, Alexei Starovoitov wrote:
> > On Tue, Aug 5, 2025 at 9:30 AM Leon Hwang <leon.hwang@linux.dev> wrote:
> >>
> >> Introduce support for the BPF_F_CPU flag in percpu_array maps to allow
> >> updating values for specified CPU or for all CPUs with a single value.
> >>
> >> This enhancement enables:
> >>
> >> * Efficient update of all CPUs using a single value when cpu == (u32)~0.
> >> * Targeted update or lookup for a specified CPU otherwise.
> >>
> >> The flag is passed via:
> >>
> >> * map_flags in bpf_percpu_array_update() along with embedded cpu field.
> >> * elem_flags in generic_map_update_batch() along with embedded cpu field.
> >>
> >> Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
> >> ---
> >> include/linux/bpf.h | 3 +-
> >> include/uapi/linux/bpf.h | 6 +++
> >> kernel/bpf/arraymap.c | 54 ++++++++++++++++++------
> >> kernel/bpf/syscall.c | 77 +++++++++++++++++++++-------------
> >> tools/include/uapi/linux/bpf.h | 6 +++
> >> 5 files changed, 103 insertions(+), 43 deletions(-)
> >>
> >> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> >> index cc700925b802f..c17c45f797ed9 100644
> >> --- a/include/linux/bpf.h
> >> +++ b/include/linux/bpf.h
> >> @@ -2691,7 +2691,8 @@ int map_set_for_each_callback_args(struct bpf_verifier_env *env,
> >> struct bpf_func_state *callee);
> >>
> >> int bpf_percpu_hash_copy(struct bpf_map *map, void *key, void *value);
> >> -int bpf_percpu_array_copy(struct bpf_map *map, void *key, void *value);
> >> +int bpf_percpu_array_copy(struct bpf_map *map, void *key, void *value,
> >> + u64 flags);
> >> int bpf_percpu_hash_update(struct bpf_map *map, void *key, void *value,
> >> u64 flags);
> >> int bpf_percpu_array_update(struct bpf_map *map, void *key, void *value,
> >> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> >> index 233de8677382e..67bc35e4d6a8d 100644
> >> --- a/include/uapi/linux/bpf.h
> >> +++ b/include/uapi/linux/bpf.h
> >> @@ -1372,6 +1372,12 @@ enum {
> >> BPF_NOEXIST = 1, /* create new element if it didn't exist */
> >> BPF_EXIST = 2, /* update existing element */
> >> BPF_F_LOCK = 4, /* spin_lock-ed map_lookup/map_update */
> >> + BPF_F_CPU = 8, /* map_update for percpu_array */
> >
> > only percpu_array?!
> > Aren't you doing it for percpu_hash too?
> >
>
> Only percpu_array in this patchset.
>
> I have no need to do it for percpu_hash.
You're missing the point. If we're adding the flag, it should
work for all per-cpu maps, both array and hash.
Same issue as with your other patch with common_attr.
We're not adding a feature that works for 1 out of 10
commands/map types/whatever and doesn't work for the rest.
Flags/features have to be generic and consistent.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH bpf-next v2 1/3] bpf: Introduce BPF_F_CPU flag for percpu_array maps
2025-08-08 16:23 ` Alexei Starovoitov
@ 2025-08-11 16:34 ` Leon Hwang
0 siblings, 0 replies; 10+ messages in thread
From: Leon Hwang @ 2025-08-11 16:34 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Yonghong Song, Song Liu, Eduard, Daniel Xu, Daniel Müller,
kernel-patches-bot
On Sat Aug 9, 2025 at 12:23 AM +08, Alexei Starovoitov wrote:
> On Fri, Aug 8, 2025 at 9:11 AM Leon Hwang <leon.hwang@linux.dev> wrote:
>>
>> On Fri Aug 8, 2025 at 1:20 AM +08, Alexei Starovoitov wrote:
>> > On Tue, Aug 5, 2025 at 9:30 AM Leon Hwang <leon.hwang@linux.dev> wrote:
>> >>
[...]
>> >> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
>> >> index 233de8677382e..67bc35e4d6a8d 100644
>> >> --- a/include/uapi/linux/bpf.h
>> >> +++ b/include/uapi/linux/bpf.h
>> >> @@ -1372,6 +1372,12 @@ enum {
>> >> BPF_NOEXIST = 1, /* create new element if it didn't exist */
>> >> BPF_EXIST = 2, /* update existing element */
>> >> BPF_F_LOCK = 4, /* spin_lock-ed map_lookup/map_update */
>> >> + BPF_F_CPU = 8, /* map_update for percpu_array */
>> >
>> > only percpu_array?!
>> > Aren't you doing it for percpu_hash too?
>> >
>>
>> Only percpu_array in this patchset.
>>
>> I have no need to do it for percpu_hash.
>
> You're missing the point. If we're adding the flag, it should
> work for all per-cpu maps, both array and hash.
>
> Same issue as with your other patch with common_attr.
> We're not adding a feature that works for 1 out of 10
> commands/map types/whatever and doesn't work for the rest.
> Flags/features have to be generic and consistent.
Got it. I'll do it for the other percpu maps in the next revision.
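Conceptually, the shared piece both percpu_array and percpu_hash could use
might look like the sketch below; the helper name and exact checks are only
this example's assumptions, not the next revision:

  /* kernel-side sketch, assuming BPF_F_CPU/BPF_F_ALL_CPUS as proposed */
  static int bpf_map_check_cpu_flags(u64 flags, u32 *cpu)
  {
          if (flags & BPF_F_ALL_CPUS) {
                  if (flags & BPF_F_CPU)
                          return -EINVAL; /* the two flags are exclusive */
                  return 0;
          }

          if (!(flags & BPF_F_CPU))
                  return 0;       /* legacy layout: one value per possible cpu */

          *cpu = flags >> 32;     /* cpu number lives in the upper 32 bits */
          if (*cpu >= num_possible_cpus())
                  return -ERANGE;

          return 0;
  }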
Thanks,
Leon
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread
Thread overview: 10+ messages
2025-08-05 16:30 [PATCH bpf-next v2 0/3] bpf: Introduce BPF_F_CPU flag for percpu_array maps Leon Hwang
2025-08-05 16:30 ` [PATCH bpf-next v2 1/3] " Leon Hwang
2025-08-07 8:34 ` Jiri Olsa
2025-08-07 16:26 ` Leon Hwang
2025-08-07 17:20 ` Alexei Starovoitov
2025-08-08 16:11 ` Leon Hwang
2025-08-08 16:23 ` Alexei Starovoitov
2025-08-11 16:34 ` Leon Hwang
2025-08-05 16:30 ` [PATCH bpf-next v2 2/3] libbpf: Support BPF_F_CPU " Leon Hwang
2025-08-05 16:30 ` [PATCH bpf-next v2 3/3] selftests/bpf: Add case to test BPF_F_CPU Leon Hwang