Re: [PATCH bpf-next v9 05/10] bpf: Allocate private stack for eligible main prog or subprogs

From: Yonghong Song <yonghong.song@linux.dev>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: bpf <bpf@vger.kernel.org>, Alexei Starovoitov <ast@kernel.org>,
	Andrii Nakryiko <andrii@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Kernel Team <kernel-team@fb.com>,
	Martin KaFai Lau <martin.lau@kernel.org>,
	Tejun Heo <tj@kernel.org>
Subject: Re: [PATCH bpf-next v9 05/10] bpf: Allocate private stack for eligible main prog or subprogs
Date: Mon, 4 Nov 2024 19:44:07 -0800
Message-ID: <06f43c37-a789-49cb-a4b0-bc2c45ae9485@linux.dev>
In-Reply-To: <34a35dce-fd05-4353-8eaa-0dc87a78dceb@linux.dev>

On 11/4/24 7:07 PM, Yonghong Song wrote:
>
> On 11/4/24 5:38 PM, Alexei Starovoitov wrote:
>> On Mon, Nov 4, 2024 at 11:38 AM Yonghong Song <yonghong.song@linux.dev> wrote:
>>> For any main prog or subprogs, allocate private stack space if requested
>>> by subprog info or main prog. The alignment for private stack is 16
>>> since maximum stack alignment is 16 for bpf-enabled archs.
>>>
>>> If jit failed, the allocated private stack will be freed in the same
>>> function where the allocation happens. If jit succeeded, e.g., for
>>> x86_64 arch, the allocated private stack is freed in arch specific
>>> implementation of bpf_jit_free().
>>>
>>> Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
>>> ---
>>>   arch/x86/net/bpf_jit_comp.c |  1 +
>>>   include/linux/bpf.h         |  1 +
>>>   kernel/bpf/core.c           | 19 ++++++++++++++++---
>>>   kernel/bpf/verifier.c       | 13 +++++++++++++
>>>   4 files changed, 31 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
>>> index 06b080b61aa5..59d294b8dd67 100644
>>> --- a/arch/x86/net/bpf_jit_comp.c
>>> +++ b/arch/x86/net/bpf_jit_comp.c
>>> @@ -3544,6 +3544,7 @@ void bpf_jit_free(struct bpf_prog *prog)
>>>                  prog->bpf_func = (void *)prog->bpf_func - cfi_get_offset();
>>>                  hdr = bpf_jit_binary_pack_hdr(prog);
>>>                  bpf_jit_binary_pack_free(hdr, NULL);
>>> +               free_percpu(prog->aux->priv_stack_ptr);
>>>                  WARN_ON_ONCE(!bpf_prog_kallsyms_verify_off(prog));
>>>          }
>>>
>>> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
>>> index 8db3c5d7404b..8a3ea7440a4a 100644
>>> --- a/include/linux/bpf.h
>>> +++ b/include/linux/bpf.h
>>> @@ -1507,6 +1507,7 @@ struct bpf_prog_aux {
>>>          u32 max_rdwr_access;
>>>          struct btf *attach_btf;
>>>          const struct bpf_ctx_arg_aux *ctx_arg_info;
>>> +       void __percpu *priv_stack_ptr;
>>>          struct mutex dst_mutex; /* protects dst_* pointers below, *after* prog becomes visible */
>>>          struct bpf_prog *dst_prog;
>>>          struct bpf_trampoline *dst_trampoline;
>>> diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
>>> index 14d9288441f2..f7a3e93c41e1 100644
>>> --- a/kernel/bpf/core.c
>>> +++ b/kernel/bpf/core.c
>>> @@ -2396,6 +2396,7 @@ static void bpf_prog_select_func(struct bpf_prog *fp)
>>>    */
>>>   struct bpf_prog *bpf_prog_select_runtime(struct bpf_prog *fp, int *err)
>>>   {
>>> +       void __percpu *priv_stack_ptr = NULL;
>>>          /* In case of BPF to BPF calls, verifier did all the prep
>>>           * work with regards to JITing, etc.
>>>           */
>>> @@ -2421,11 +2422,23 @@ struct bpf_prog *bpf_prog_select_runtime(struct bpf_prog *fp, int *err)
>>>                  if (*err)
>>>                          return fp;
>>>
>>> +               if (fp->aux->use_priv_stack && fp->aux->stack_depth) {
>>> +                       priv_stack_ptr = __alloc_percpu_gfp(fp->aux->stack_depth, 16, GFP_KERNEL);
>>> +                       if (!priv_stack_ptr) {
>>> +                               *err = -ENOMEM;
>>> +                               return fp;
>>> +                       }
>>> +                       fp->aux->priv_stack_ptr = priv_stack_ptr;
>>> +               }
>>> +
>>>                  fp = bpf_int_jit_compile(fp);
>>>                  bpf_prog_jit_attempt_done(fp);
>>> -               if (!fp->jited && jit_needed) {
>>> -                       *err = -ENOTSUPP;
>>> -                       return fp;
>>> +               if (!fp->jited) {
>>> +                       free_percpu(priv_stack_ptr);
>>> +                       if (jit_needed) {
>>> +                               *err = -ENOTSUPP;
>>> +                               return fp;
>>> +                       }
>>>                  }
>>>          } else {
>>>                  *err = bpf_prog_offload_compile(fp);
>>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>>> index e01b3f0fd314..03ae76d57076 100644
>>> --- a/kernel/bpf/verifier.c
>>> +++ b/kernel/bpf/verifier.c
>>> @@ -20073,6 +20073,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
>>>   {
>>>          struct bpf_prog *prog = env->prog, **func, *tmp;
>>>          int i, j, subprog_start, subprog_end = 0, len, subprog;
>>> +       void __percpu *priv_stack_ptr;
>>>          struct bpf_map *map_ptr;
>>>          struct bpf_insn *insn;
>>>          void *old_bpf_func;
>>> @@ -20169,6 +20170,17 @@ static int jit_subprogs(struct bpf_verifier_env *env)
>>>
>>>                  func[i]->aux->name[0] = 'F';
>>>                  func[i]->aux->stack_depth = env->subprog_info[i].stack_depth;
>>> +
>>> +               if (env->subprog_info[i].use_priv_stack && func[i]->aux->stack_depth) {
>>> +                       priv_stack_ptr = __alloc_percpu_gfp(func[i]->aux->stack_depth, 16,
>>> +                                                           GFP_KERNEL);
>>> +                       if (!priv_stack_ptr) {
>>> +                               err = -ENOMEM;
>>> +                               goto out_free;
>>> +                       }
>>> +                       func[i]->aux->priv_stack_ptr = priv_stack_ptr;
>>> +               }
>>> +
>>>                  func[i]->jit_requested = 1;
>>>                  func[i]->blinding_requested = prog->blinding_requested;
>>>                  func[i]->aux->kfunc_tab = prog->aux->kfunc_tab;
>>> @@ -20201,6 +20213,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
>>>                  func[i]->aux->exception_boundary = env->seen_exception;
>>>                  func[i] = bpf_int_jit_compile(func[i]);
>>>                  if (!func[i]->jited) {
>>> +                       free_percpu(func[i]->aux->priv_stack_ptr);
>>>                          err = -ENOTSUPP;
>>>                          goto out_free;
>>>                  }
>> Looks correct from leaks pov, but this is so hard to follow.
>> I still don't like this imbalanced alloc/free.
>> Either both need to be done by core or both by JIT.
>>
>> And JIT is probably better, since in:
>> __alloc_percpu_gfp(func[i]->aux->stack_depth, 16
>>
>> 16 alignment is x86 specific.
>
Sorry, my previous reply was badly formatted. Here it is again, reformatted.

Agreed. I used alignment 16 to cover all architectures; for x86_64,
alignment 8 is used. I did some checking in the arch/ directory.

[~/work/bpf-next/arch (master)]$ find . -name 'net'
./arm/net
./mips/net
./parisc/net
./powerpc/net
./s390/net
./sparc/net
./x86/net
./arc/net
./arm64/net
./loongarch/net
./riscv/net

[~/work/bpf-next/arch (master)]$ egrep -r bpf_jit_free (output trimmed to function definitions only)
powerpc/net/bpf_jit_comp.c:void bpf_jit_free(struct bpf_prog *fp)
sparc/net/bpf_jit_comp_32.c:void bpf_jit_free(struct bpf_prog *fp)
x86/net/bpf_jit_comp.c:void bpf_jit_free(struct bpf_prog *prog)
arm64/net/bpf_jit_comp.c:void bpf_jit_free(struct bpf_prog *prog)
riscv/net/bpf_jit_core.c:void bpf_jit_free(struct bpf_prog *prog)
  
It looks like all the important architectures, such as x86_64, arm64 and
riscv, have their own bpf_jit_free(). Some others, such as s390, do not.
I think we can do the allocation in the JIT. If s390 later implements
private stack support, it can add an arch-specific bpf_jit_free() then.
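
To make the "both alloc and free in the JIT" idea concrete, here is a
rough user-space sketch, not kernel code. Everything in it is a
hypothetical stand-in: struct prog, arch_jit_compile(), arch_jit_free()
and ARCH_PRIV_STACK_ALIGN are invented for illustration, and
aligned_alloc()/free() merely play the role of
__alloc_percpu_gfp()/free_percpu(). The point is only the ownership
rule: the arch code picks the alignment, allocates the private stack,
frees it if JIT fails, and frees it again in its own free hook, so the
core never touches the pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical user-space stand-ins for bpf_prog_aux / bpf_prog. */
struct prog_aux {
	size_t stack_depth;
	int use_priv_stack;
	void *priv_stack_ptr;
};

struct prog {
	struct prog_aux aux;
	int jited;
};

/* Alignment is an arch decision in this model; 16 is an assumed value. */
#define ARCH_PRIV_STACK_ALIGN 16

/*
 * Model of an arch JIT compile hook. 'jit_ok' simulates whether code
 * generation succeeds. Returns 0 on success, -1 on failure.
 */
static int arch_jit_compile(struct prog *p, int jit_ok)
{
	void *stack = NULL;

	assert(p != NULL);

	if (p->aux.use_priv_stack && p->aux.stack_depth) {
		/* aligned_alloc() wants a size that is a multiple of the alignment. */
		size_t sz = (p->aux.stack_depth + ARCH_PRIV_STACK_ALIGN - 1) &
			    ~(size_t)(ARCH_PRIV_STACK_ALIGN - 1);

		stack = aligned_alloc(ARCH_PRIV_STACK_ALIGN, sz);
		if (!stack)
			return -1;
	}

	if (!jit_ok) {
		/* JIT failed: release what this function allocated. */
		free(stack);
		return -1;
	}

	p->aux.priv_stack_ptr = stack;
	p->jited = 1;
	return 0;
}

/* Model of the arch-specific free hook: the only other place that frees. */
static void arch_jit_free(struct prog *p)
{
	if (!p->jited)
		return;
	free(p->aux.priv_stack_ptr);
	p->aux.priv_stack_ptr = NULL;
	p->jited = 0;
}
```

With this shape the caller just invokes the compile hook and, later,
the free hook. It needs no priv_stack_ptr local variable and no
free-on-failure branch of its own.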
