From: Leon Hwang <leon.hwang@linux.dev>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: bpf@vger.kernel.org, ast@kernel.org, andrii@kernel.org,
daniel@iogearbox.net, yonghong.song@linux.dev, song@kernel.org,
eddyz87@gmail.com, qmo@kernel.org, dxu@dxuuu.xyz,
kernel-patches-bot@fb.com
Subject: Re: [PATCH bpf-next v3 2/4] bpf, libbpf: Support global percpu data
Date: Thu, 29 May 2025 10:24:56 +0800 [thread overview]
Message-ID: <f722152f-b364-4e36-bd50-7dcc8a8279df@linux.dev> (raw)
In-Reply-To: <CAEf4BzY7MB9h-xAnPbheUgBhcqOMNaf1=HH=8V-HmC8k4VPgwQ@mail.gmail.com>
On 28/5/25 06:31, Andrii Nakryiko wrote:
> On Mon, May 26, 2025 at 9:22 AM Leon Hwang <leon.hwang@linux.dev> wrote:
>>
>> This patch introduces support for global percpu data in libbpf by adding a
>> new ".data..percpu" section, similar to ".data". It enables efficient
>> handling of percpu global variables in bpf programs.
>>
>> This enhancement improves performance for workloads that benefit from
>> percpu storage.
>>
>> Meanwhile, add bpf_map__is_internal_percpu() API to check whether the map
>> is an internal map used for global percpu variables.
>
> I'm not a big fan of this super specific API. We do have
> bpf_map__is_internal() to let customer know that map is special in
> some way, but I'd like to avoid making this fine distinction between
> per-CPU internal map vs non-per-CPU (and then why stop there, why not
> have kconfig-specific API, ksym-specific check, etc)?
>
> All this is mostly useful just for bpftool for skeleton codegen, and
> bpftool already has to know about .percpu prefix, so it doesn't need
> this API to make all these decisions. Let's try to drop this
> bpf_map__is_internal_percpu() API?
>
To remove bpf_map__is_internal_percpu(), it can be replaced with:
static bool bpf_map_is_internal_percpu(const struct bpf_map *map)
{
return bpf_map__is_internal(map) &&
bpf_map__type(map) == BPF_MAP_TYPE_PERCPU_ARRAY;
}
This should be functionally equivalent to checking:
map->libbpf_type == LIBBPF_MAP_PERCPU;
>>
>> Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
>> ---
>> tools/lib/bpf/libbpf.c | 102 +++++++++++++++++++++++++++++++--------
>> tools/lib/bpf/libbpf.h | 9 ++++
>> tools/lib/bpf/libbpf.map | 1 +
>> 3 files changed, 91 insertions(+), 21 deletions(-)
>>
[...]
>> struct elf_sec_desc {
>> @@ -1902,7 +1905,7 @@ static bool map_is_mmapable(struct bpf_object *obj, struct bpf_map *map)
>> struct btf_var_secinfo *vsi;
>> int i, n;
>>
>> - if (!map->btf_value_type_id)
>> + if (!map->btf_value_type_id || map->libbpf_type == LIBBPF_MAP_PERCPU_DATA)
>
> Not sure this is correct. We should have btf_value_type_id for PERCPU
> global data array, no?
>
Yes, a PERCPU global data array should indeed have a valid
btf_value_type_id, which is required for seq_show.
This is evident in percpu_array_map_seq_show_elem(), where
btf_value_type_id is used to display the per-CPU values:
static void percpu_array_map_seq_show_elem(struct bpf_map *map, void *key,
struct seq_file *m)
{
// ...
rcu_read_lock();
seq_printf(m, "%u: {\n", *(u32 *)key);
pptr = array->pptrs[index & array->index_mask];
for_each_possible_cpu(cpu) {
seq_printf(m, "\tcpu%d: ", cpu);
btf_type_seq_show(map->btf, map->btf_value_type_id,
per_cpu_ptr(pptr, cpu), m);
seq_putc(m, '\n');
}
seq_puts(m, "}\n");
rcu_read_unlock();
}
So the check for !map->btf_value_type_id in combination with
LIBBPF_MAP_PERCPU_DATA needs to be considered.
>> return false;
>>
>> t = btf__type_by_id(obj->btf, map->btf_value_type_id);
>> @@ -1926,6 +1929,7 @@ static int
>> bpf_object__init_internal_map(struct bpf_object *obj, enum libbpf_map_type type,
>> const char *real_name, int sec_idx, void *data, size_t data_sz)
[...]
>>
>> + /* .data..percpu DATASEC must have __aligned(8) size. */
>
> please remind me why? similarly for def->value_size special casing?
> What will happen if we don't explicitly roundup() on libbpf side
> (kernel always does roundup(8) for ARRAY value_size anyways, which is
> why I am asking)
>
t->size must match def->value_size.
That said, I believe it's acceptable to avoid using roundup() for both
values. I'll test this using seq_show to confirm.
>> + if (strcmp(sec_name, PERCPU_DATA_SEC) == 0 || str_has_pfx(sec_name, PERCPU_DATA_SEC))
>> + t->size = roundup(t->size, 8);
>> +
>> for (i = 0, vsi = btf_var_secinfos(t); i < vars; i++, vsi++) {
>> const struct btf_type *t_var;
>> struct btf_var *var;
>> @@ -3923,6 +3939,11 @@ static int bpf_object__elf_collect(struct bpf_object *obj)
[...]
>> + data_sz = map->def.value_size;
>> + if (is_percpu) {
>> + num_cpus = libbpf_num_possible_cpus();
>> + if (num_cpus < 0) {
>> + err = num_cpus;
>> + return err;
>
> hm... why not `return num_cpus;`?
>
Ack.
>> + }
>> +
>> + data_sz = data_sz * num_cpus;
>> + data = malloc(data_sz);
>> + if (!data) {
>> + err = -ENOMEM;
>> + return err;
>> + }
>
> [...]
>
>> /**
>> * @brief **bpf_map__set_pin_path()** sets the path attribute that tells where the
>> * BPF map should be pinned. This does not actually create the 'pin'.
>> diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
>> index 1205f9a4fe048..1c239ac88c699 100644
>> --- a/tools/lib/bpf/libbpf.map
>> +++ b/tools/lib/bpf/libbpf.map
>> @@ -443,4 +443,5 @@ LIBBPF_1.6.0 {
>> bpf_program__line_info_cnt;
>> btf__add_decl_attr;
>> btf__add_type_attr;
>> + bpf_map__is_internal_percpu;
>
> alphabetically sorted
>
bpf_map__is_internal_percpu will be dropped.
Thanks,
Leon
next prev parent reply other threads:[~2025-05-29 2:25 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-05-26 16:21 [PATCH bpf-next v3 0/4] bpf: Introduce global percpu data Leon Hwang
2025-05-26 16:21 ` [PATCH bpf-next v3 1/4] " Leon Hwang
2025-05-27 22:31 ` Andrii Nakryiko
2025-05-29 2:03 ` Leon Hwang
2025-05-26 16:21 ` [PATCH bpf-next v3 2/4] bpf, libbpf: Support " Leon Hwang
2025-05-27 22:31 ` Andrii Nakryiko
2025-05-29 2:24 ` Leon Hwang [this message]
2025-05-27 22:40 ` Alexei Starovoitov
2025-05-27 23:25 ` Andrii Nakryiko
2025-05-28 2:35 ` Alexei Starovoitov
2025-05-28 16:05 ` Andrii Nakryiko
2025-05-29 2:43 ` Leon Hwang
2025-06-02 23:50 ` Andrii Nakryiko
2025-06-03 2:45 ` Leon Hwang
2025-06-05 16:29 ` Andrii Nakryiko
2025-05-26 16:21 ` [PATCH bpf-next v3 3/4] bpf, bpftool: Generate skeleton for " Leon Hwang
2025-05-27 22:31 ` Andrii Nakryiko
2025-05-29 2:56 ` Leon Hwang
2025-06-02 23:50 ` Andrii Nakryiko
2025-06-03 2:47 ` Leon Hwang
2025-05-26 16:21 ` [PATCH bpf-next v3 4/4] selftests/bpf: Add cases to test " Leon Hwang
2025-05-27 22:31 ` [PATCH bpf-next v3 0/4] bpf: Introduce " Andrii Nakryiko
2025-05-28 17:10 ` Yonghong Song
2025-05-29 1:59 ` Leon Hwang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=f722152f-b364-4e36-bd50-7dcc8a8279df@linux.dev \
--to=leon.hwang@linux.dev \
--cc=andrii.nakryiko@gmail.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=dxu@dxuuu.xyz \
--cc=eddyz87@gmail.com \
--cc=kernel-patches-bot@fb.com \
--cc=qmo@kernel.org \
--cc=song@kernel.org \
--cc=yonghong.song@linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox