Re: [PATCH bpf-next v3 2/4] bpf, libbpf: Support global percpu data

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Leon Hwang <leon.hwang@linux.dev>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: bpf@vger.kernel.org, ast@kernel.org, andrii@kernel.org,
	daniel@iogearbox.net, yonghong.song@linux.dev, song@kernel.org,
	eddyz87@gmail.com, qmo@kernel.org, dxu@dxuuu.xyz,
	kernel-patches-bot@fb.com
Subject: Re: [PATCH bpf-next v3 2/4] bpf, libbpf: Support global percpu data
Date: Thu, 29 May 2025 10:24:56 +0800	[thread overview]
Message-ID: <f722152f-b364-4e36-bd50-7dcc8a8279df@linux.dev> (raw)
In-Reply-To: <CAEf4BzY7MB9h-xAnPbheUgBhcqOMNaf1=HH=8V-HmC8k4VPgwQ@mail.gmail.com>



On 28/5/25 06:31, Andrii Nakryiko wrote:
> On Mon, May 26, 2025 at 9:22 AM Leon Hwang <leon.hwang@linux.dev> wrote:
>>
>> This patch introduces support for global percpu data in libbpf by adding a
>> new ".data..percpu" section, similar to ".data". It enables efficient
>> handling of percpu global variables in bpf programs.
>>
>> This enhancement improves performance for workloads that benefit from
>> percpu storage.
>>
>> Meanwhile, add bpf_map__is_internal_percpu() API to check whether the map
>> is an internal map used for global percpu variables.
> 
> I'm not a big fan of this super specific API. We do have
> bpf_map__is_internal() to let customer know that map is special in
> some way, but I'd like to avoid making this fine distinction between
> per-CPU internal map vs non-per-CPU (and then why stop there, why not
> have kconfig-specific API, ksym-specific check, etc)?
> 
> All this is mostly useful just for bpftool for skeleton codegen, and
> bpftool already has to know about .percpu prefix, so it doesn't need
> this API to make all these decisions. Let's try to drop this
> bpf_map__is_internal_percpu() API?
> 

To remove bpf_map__is_internal_percpu(), it can be replaced with:

static bool bpf_map_is_internal_percpu(const struct bpf_map *map)
{
	return bpf_map__is_internal(map) &&
	       bpf_map__type(map) == BPF_MAP_TYPE_PERCPU_ARRAY;
}

This should be functionally equivalent to checking:

map->libbpf_type == LIBBPF_MAP_PERCPU;

>>
>> Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
>> ---
>>  tools/lib/bpf/libbpf.c   | 102 +++++++++++++++++++++++++++++++--------
>>  tools/lib/bpf/libbpf.h   |   9 ++++
>>  tools/lib/bpf/libbpf.map |   1 +
>>  3 files changed, 91 insertions(+), 21 deletions(-)
>>

[...]

>>  struct elf_sec_desc {
>> @@ -1902,7 +1905,7 @@ static bool map_is_mmapable(struct bpf_object *obj, struct bpf_map *map)
>>         struct btf_var_secinfo *vsi;
>>         int i, n;
>>
>> -       if (!map->btf_value_type_id)
>> +       if (!map->btf_value_type_id || map->libbpf_type == LIBBPF_MAP_PERCPU_DATA)
> 
> Not sure this is correct. We should have btf_value_type_id for PERCPU
> global data array, no?
> 

Yes, a PERCPU global data array should indeed have a valid
btf_value_type_id, which is required for seq_show.

This is evident in percpu_array_map_seq_show_elem(), where
btf_value_type_id is used to display the per-CPU values:

static void percpu_array_map_seq_show_elem(struct bpf_map *map, void *key,
					   struct seq_file *m)
{
	// ...

	rcu_read_lock();

	seq_printf(m, "%u: {\n", *(u32 *)key);
	pptr = array->pptrs[index & array->index_mask];
	for_each_possible_cpu(cpu) {
		seq_printf(m, "\tcpu%d: ", cpu);
		btf_type_seq_show(map->btf, map->btf_value_type_id,
				  per_cpu_ptr(pptr, cpu), m);
		seq_putc(m, '\n');
	}
	seq_puts(m, "}\n");

	rcu_read_unlock();
}

So the check for !map->btf_value_type_id in combination with
LIBBPF_MAP_PERCPU_DATA needs to be considered.

>>                 return false;
>>
>>         t = btf__type_by_id(obj->btf, map->btf_value_type_id);
>> @@ -1926,6 +1929,7 @@ static int
>>  bpf_object__init_internal_map(struct bpf_object *obj, enum libbpf_map_type type,
>>                               const char *real_name, int sec_idx, void *data, size_t data_sz)

[...]

>>
>> +       /* .data..percpu DATASEC must have __aligned(8) size. */
> 
> please remind me why? similarly for def->value_size special casing?
> What will happen if we don't explicitly roundup() on libbpf side
> (kernel always does roundup(8) for ARRAY value_size anyways, which is
> why I am asking)
> 

t->size must match def->value_size.

That said, I believe it's acceptable to avoid using roundup() for both
values. I'll test this using seq_show to confirm.

>> +       if (strcmp(sec_name, PERCPU_DATA_SEC) == 0 || str_has_pfx(sec_name, PERCPU_DATA_SEC))
>> +               t->size = roundup(t->size, 8);
>> +
>>         for (i = 0, vsi = btf_var_secinfos(t); i < vars; i++, vsi++) {
>>                 const struct btf_type *t_var;
>>                 struct btf_var *var;
>> @@ -3923,6 +3939,11 @@ static int bpf_object__elf_collect(struct bpf_object *obj)

[...]

>> +       data_sz = map->def.value_size;
>> +       if (is_percpu) {
>> +               num_cpus = libbpf_num_possible_cpus();
>> +               if (num_cpus < 0) {
>> +                       err = num_cpus;
>> +                       return err;
> 
> hm... why not `return num_cpus;`?
> 

Ack.

>> +               }
>> +
>> +               data_sz = data_sz * num_cpus;
>> +               data = malloc(data_sz);
>> +               if (!data) {
>> +                       err = -ENOMEM;
>> +                       return err;
>> +               }
> 
> [...]
> 
>>  /**
>>   * @brief **bpf_map__set_pin_path()** sets the path attribute that tells where the
>>   * BPF map should be pinned. This does not actually create the 'pin'.
>> diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
>> index 1205f9a4fe048..1c239ac88c699 100644
>> --- a/tools/lib/bpf/libbpf.map
>> +++ b/tools/lib/bpf/libbpf.map
>> @@ -443,4 +443,5 @@ LIBBPF_1.6.0 {
>>                 bpf_program__line_info_cnt;
>>                 btf__add_decl_attr;
>>                 btf__add_type_attr;
>> +               bpf_map__is_internal_percpu;
> 
> alphabetically sorted
> 

bpf_map__is_internal_percpu will be dropped.

Thanks,
Leon

next prev parent reply	other threads:[~2025-05-29  2:25 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-05-26 16:21 [PATCH bpf-next v3 0/4] bpf: Introduce global percpu data Leon Hwang
2025-05-26 16:21 ` [PATCH bpf-next v3 1/4] " Leon Hwang
2025-05-27 22:31   ` Andrii Nakryiko
2025-05-29  2:03     ` Leon Hwang
2025-05-26 16:21 ` [PATCH bpf-next v3 2/4] bpf, libbpf: Support " Leon Hwang
2025-05-27 22:31   ` Andrii Nakryiko
2025-05-29  2:24     ` Leon Hwang [this message]
2025-05-27 22:40   ` Alexei Starovoitov
2025-05-27 23:25     ` Andrii Nakryiko
2025-05-28  2:35       ` Alexei Starovoitov
2025-05-28 16:05         ` Andrii Nakryiko
2025-05-29  2:43           ` Leon Hwang
2025-06-02 23:50             ` Andrii Nakryiko
2025-06-03  2:45               ` Leon Hwang
2025-06-05 16:29                 ` Andrii Nakryiko
2025-05-26 16:21 ` [PATCH bpf-next v3 3/4] bpf, bpftool: Generate skeleton for " Leon Hwang
2025-05-27 22:31   ` Andrii Nakryiko
2025-05-29  2:56     ` Leon Hwang
2025-06-02 23:50       ` Andrii Nakryiko
2025-06-03  2:47         ` Leon Hwang
2025-05-26 16:21 ` [PATCH bpf-next v3 4/4] selftests/bpf: Add cases to test " Leon Hwang
2025-05-27 22:31 ` [PATCH bpf-next v3 0/4] bpf: Introduce " Andrii Nakryiko
2025-05-28 17:10   ` Yonghong Song
2025-05-29  1:59     ` Leon Hwang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f722152f-b364-4e36-bd50-7dcc8a8279df@linux.dev \
    --to=leon.hwang@linux.dev \
    --cc=andrii.nakryiko@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=dxu@dxuuu.xyz \
    --cc=eddyz87@gmail.com \
    --cc=kernel-patches-bot@fb.com \
    --cc=qmo@kernel.org \
    --cc=song@kernel.org \
    --cc=yonghong.song@linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.