From: Eduard Zingerman <eddyz87@gmail.com>
To: Uros Bizjak <ubizjak@gmail.com>, bpf@vger.kernel.org
Cc: Alexei Starovoitov <ast@kernel.org>
Subject: Re: [PATCH] bpf: Fix percpu address space issues
Date: Fri, 09 Aug 2024 01:28:26 -0700
Message-ID: <cbdf9051a35e8aa16478a2adc821403f53b4f4c0.camel@gmail.com>
In-Reply-To: <20240804185604.54770-1-ubizjak@gmail.com>
On Sun, 2024-08-04 at 20:55 +0200, Uros Bizjak wrote:
[...]
> Found by GCC's named address space checks.
Please provide some additional details.
I assume that the definition of __percpu was changed from
__attribute__((btf_type_tag(percpu))) to
__attribute__((address_space(??))), is that correct?
What is the motivation for this patch?
Currently __percpu is defined as a type tag and is used only by the BPF verifier,
where it seems to be relevant only for structure fields and function parameters.
This patch only changes local variables.
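For reference, the two styles of definition I have in mind could be
sketched roughly as below. This is a userspace approximation, not the
kernel header: PERCPU_SKETCH and read_tagged() are hypothetical names,
and the branch conditions are simplified assumptions.

```c
#include <assert.h>
#include <stddef.h>

#ifndef __has_attribute
# define __has_attribute(x) 0
#endif

/* Hypothetical macro name for illustration; the kernel's macro is __percpu. */
#if defined(__CHECKER__)
/* Checker-style: a distinct, non-dereferenceable address space. */
# define PERCPU_SKETCH __attribute__((noderef, address_space(__percpu)))
#elif __has_attribute(btf_type_tag)
/* Type-tag style: the tag is only recorded in type information. */
# define PERCPU_SKETCH __attribute__((btf_type_tag("percpu")))
#else
/* Compiler knows neither: the annotation expands to nothing. */
# define PERCPU_SKETCH
#endif

/* With a plain type tag (or no tag) the pointer is an ordinary pointer. */
static int read_tagged(int PERCPU_SKETCH *p)
{
	assert(p != NULL);
	return *p;
}
```

With a type tag the compiler does not reject mixing tagged and untagged
pointers, while an address-space attribute makes such mixing an error,
which is presumably where the new diagnostics come from.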
> There were no changes in the resulting object files.
>
> [1] https://sparse.docs.kernel.org/en/latest/annotations.html#address-space-name
>
> Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
> Cc: Alexei Starovoitov <ast@kernel.org>
> Cc: Daniel Borkmann <daniel@iogearbox.net>
> Cc: Andrii Nakryiko <andrii@kernel.org>
> Cc: Martin KaFai Lau <martin.lau@linux.dev>
> Cc: Eduard Zingerman <eddyz87@gmail.com>
> Cc: Song Liu <song@kernel.org>
> Cc: Yonghong Song <yonghong.song@linux.dev>
> Cc: John Fastabend <john.fastabend@gmail.com>
> Cc: KP Singh <kpsingh@kernel.org>
> Cc: Stanislav Fomichev <sdf@fomichev.me>
> Cc: Hao Luo <haoluo@google.com>
> Cc: Jiri Olsa <jolsa@kernel.org>
> ---
> kernel/bpf/arraymap.c | 8 ++++----
> kernel/bpf/hashtab.c | 8 ++++----
> kernel/bpf/helpers.c | 4 ++--
> kernel/bpf/memalloc.c | 12 ++++++------
> 4 files changed, 16 insertions(+), 16 deletions(-)
>
> diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
> index 188e3c2effb2..544ca433275e 100644
> --- a/kernel/bpf/arraymap.c
> +++ b/kernel/bpf/arraymap.c
> @@ -600,7 +600,7 @@ static void *bpf_array_map_seq_start(struct seq_file *seq, loff_t *pos)
> array = container_of(map, struct bpf_array, map);
> index = info->index & array->index_mask;
> if (info->percpu_value_buf)
> - return array->pptrs[index];
> + return array->ptrs[index];
I disagree with this change.
One might say that the address space is indeed cast away here.
However, the value returned by this function is only used in
bpf_array_map_seq_{next,show,stop}(), where it is guarded by the same
'if (info->percpu_value_buf)' condition to decide whether per_cpu_ptr()
is necessary.
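To illustrate the pattern I am describing, here is a minimal userspace
sketch (hypothetical names, not the arraymap code): the producer hands
back an opaque 'void *', and the consumer checks the same flag to decide
how to interpret it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_ELEMS_SKETCH 4
#define NR_CPUS_SKETCH  2

/* Hypothetical stand-in for the iterator state plus the map. */
struct iter_sketch {
	int is_percpu;	/* plays the role of info->percpu_value_buf */
	int plain[NR_ELEMS_SKETCH];
	int percpu[NR_CPUS_SKETCH][NR_ELEMS_SKETCH];
};

/* Producer: returns either a real pointer or an opaque per-CPU cookie. */
static void *seq_start_sketch(struct iter_sketch *it, size_t index)
{
	assert(index < NR_ELEMS_SKETCH);
	if (it->is_percpu)
		return (void *)(uintptr_t)(index + 1);	/* never dereferenced */
	return &it->plain[index];
}

/* Consumer: the same flag selects how the opaque value is decoded. */
static int seq_show_sketch(struct iter_sketch *it, void *v, int cpu)
{
	if (it->is_percpu) {
		size_t index = (size_t)(uintptr_t)v - 1;

		return it->percpu[cpu][index];
	}
	return *(int *)v;
}
```

Because producer and consumer agree on the flag, the cookie is never
dereferenced as a normal pointer, which is why the lost annotation is
harmless in practice.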
> return array_map_elem_ptr(array, index);
> }
>
> @@ -619,7 +619,7 @@ static void *bpf_array_map_seq_next(struct seq_file *seq, void *v, loff_t *pos)
> array = container_of(map, struct bpf_array, map);
> index = info->index & array->index_mask;
> if (info->percpu_value_buf)
> - return array->pptrs[index];
> + return array->ptrs[index];
Same as above.
> return array_map_elem_ptr(array, index);
> }
>
> @@ -632,7 +632,7 @@ static int __bpf_array_map_seq_show(struct seq_file *seq, void *v)
> struct bpf_iter_meta meta;
> struct bpf_prog *prog;
> int off = 0, cpu = 0;
> - void __percpu **pptr;
> + void * __percpu *pptr;
Should this be 'void __percpu *pptr;'?
The value comes from the array->pptrs[*] field,
whose elements have that type.
> u32 size;
>
> meta.seq = seq;
> @@ -648,7 +648,7 @@ static int __bpf_array_map_seq_show(struct seq_file *seq, void *v)
> if (!info->percpu_value_buf) {
> ctx.value = v;
> } else {
> - pptr = v;
> + pptr = (void __percpu *)(uintptr_t)v;
> size = array->elem_size;
> for_each_possible_cpu(cpu) {
> copy_map_value_long(map, info->percpu_value_buf + off,
> diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
> index be1f64c20125..a49212bbda09 100644
> --- a/kernel/bpf/hashtab.c
> +++ b/kernel/bpf/hashtab.c
> @@ -1049,14 +1049,14 @@ static struct htab_elem *alloc_htab_elem(struct bpf_htab *htab, void *key,
> pptr = htab_elem_get_ptr(l_new, key_size);
> } else {
> /* alloc_percpu zero-fills */
> - pptr = bpf_mem_cache_alloc(&htab->pcpu_ma);
> - if (!pptr) {
> + void *ptr = bpf_mem_cache_alloc(&htab->pcpu_ma);
> + if (!ptr) {
Why add an intermediate variable here?
Is casting the bpf_mem_cache_alloc() result to percpu not sufficient?
It looks like bpf_mem_cache_alloc() returns a percpu pointer;
should it be declared as such?
> bpf_mem_cache_free(&htab->ma, l_new);
> l_new = ERR_PTR(-ENOMEM);
> goto dec_count;
> }
> - l_new->ptr_to_pptr = pptr;
> - pptr = *(void **)pptr;
> + l_new->ptr_to_pptr = ptr;
> + pptr = *(void __percpu **)ptr;
> }
>
> pcpu_init_value(htab, pptr, value, onallcpus);
[...]
> diff --git a/kernel/bpf/memalloc.c b/kernel/bpf/memalloc.c
> index dec892ded031..b3858a76e0b3 100644
> --- a/kernel/bpf/memalloc.c
> +++ b/kernel/bpf/memalloc.c
> @@ -138,8 +138,8 @@ static struct llist_node notrace *__llist_del_first(struct llist_head *head)
> static void *__alloc(struct bpf_mem_cache *c, int node, gfp_t flags)
> {
> if (c->percpu_size) {
> - void **obj = kmalloc_node(c->percpu_size, flags, node);
> - void *pptr = __alloc_percpu_gfp(c->unit_size, 8, flags);
> + void __percpu **obj = kmalloc_node(c->percpu_size, flags, node);
Why is __percpu needed for obj?
kmalloc_node is defined as 'alloc_hooks(kmalloc_node_noprof(__VA_ARGS__))';
alloc_hooks(X) is a macro that produces a result of type typeof(X),
and kmalloc_node_noprof() returns void *, not void __percpu *.
Am I missing something?
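The typeof() reasoning can be checked with a small userspace sketch.
Everything here is a hypothetical stand-in (ALLOC_HOOKS_SKETCH is a
simplified approximation of alloc_hooks(), alloc_noprof_sketch() stands
in for kmalloc_node_noprof()), and it relies on GNU statement
expressions:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for kmalloc_node_noprof(): returns plain void *. */
static void *alloc_noprof_sketch(size_t size)
{
	return malloc(size);
}

/* Simplified stand-in for alloc_hooks(): the result has type typeof(X). */
#define ALLOC_HOOKS_SKETCH(do_alloc)			\
({							\
	__typeof__(do_alloc) res_ = (do_alloc);		\
	res_;						\
})

/* Evaluates to 1 only if the expression has type plain 'void *'. */
#define IS_PLAIN_VOID_PTR(expr) _Generic((expr), void *: 1, default: 0)

static int hooks_result_is_void_ptr(void)
{
	void *p = ALLOC_HOOKS_SKETCH(alloc_noprof_sketch(16));
	/* _Generic does not evaluate its operand, so nothing is allocated here. */
	int ok = IS_PLAIN_VOID_PTR(ALLOC_HOOKS_SKETCH(alloc_noprof_sketch(16)));

	assert(p != NULL);	/* sketch only: abort on allocation failure */
	free(p);
	return ok;
}
```

If the wrapper only forwards typeof(X), the result stays a plain
'void *', so nothing in the allocation path would make obj a percpu
pointer.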
> + void __percpu *pptr = __alloc_percpu_gfp(c->unit_size, 8, flags);
>
> if (!obj || !pptr) {
> free_percpu(pptr);
[...]