Re: [PATCH bpf-next v9 05/10] bpf: Allocate private stack for eligible main prog or subprogs

From: Yonghong Song <yonghong.song@linux.dev>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: bpf <bpf@vger.kernel.org>, Alexei Starovoitov <ast@kernel.org>,
	Andrii Nakryiko <andrii@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Kernel Team <kernel-team@fb.com>,
	Martin KaFai Lau <martin.lau@kernel.org>,
	Tejun Heo <tj@kernel.org>
Subject: Re: [PATCH bpf-next v9 05/10] bpf: Allocate private stack for eligible main prog or subprogs
Date: Mon, 4 Nov 2024 19:44:07 -0800
Message-ID: <06f43c37-a789-49cb-a4b0-bc2c45ae9485@linux.dev>
In-Reply-To: <34a35dce-fd05-4353-8eaa-0dc87a78dceb@linux.dev>

On 11/4/24 7:07 PM, Yonghong Song wrote:
>
> On 11/4/24 5:38 PM, Alexei Starovoitov wrote:
>> On Mon, Nov 4, 2024 at 11:38 AM Yonghong Song <yonghong.song@linux.dev> wrote:
>>> For any main prog or subprogs, allocate private stack space if requested
>>> by subprog info or main prog. The alignment for private stack is 16
>>> since maximum stack alignment is 16 for bpf-enabled archs.
>>>
>>> If jit failed, the allocated private stack will be freed in the same
>>> function where the allocation happens. If jit succeeded, e.g., for
>>> x86_64 arch, the allocated private stack is freed in arch specific
>>> implementation of bpf_jit_free().
>>>
>>> Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
>>> ---
>>>   arch/x86/net/bpf_jit_comp.c |  1 +
>>>   include/linux/bpf.h         |  1 +
>>>   kernel/bpf/core.c           | 19 ++++++++++++++++---
>>>   kernel/bpf/verifier.c       | 13 +++++++++++++
>>>   4 files changed, 31 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
>>> index 06b080b61aa5..59d294b8dd67 100644
>>> --- a/arch/x86/net/bpf_jit_comp.c
>>> +++ b/arch/x86/net/bpf_jit_comp.c
>>> @@ -3544,6 +3544,7 @@ void bpf_jit_free(struct bpf_prog *prog)
>>>                  prog->bpf_func = (void *)prog->bpf_func - cfi_get_offset();
>>>                  hdr = bpf_jit_binary_pack_hdr(prog);
>>>                  bpf_jit_binary_pack_free(hdr, NULL);
>>> +               free_percpu(prog->aux->priv_stack_ptr);
>>>                  WARN_ON_ONCE(!bpf_prog_kallsyms_verify_off(prog));
>>>          }
>>>
>>> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
>>> index 8db3c5d7404b..8a3ea7440a4a 100644
>>> --- a/include/linux/bpf.h
>>> +++ b/include/linux/bpf.h
>>> @@ -1507,6 +1507,7 @@ struct bpf_prog_aux {
>>>          u32 max_rdwr_access;
>>>          struct btf *attach_btf;
>>>          const struct bpf_ctx_arg_aux *ctx_arg_info;
>>> +       void __percpu *priv_stack_ptr;
>>>          struct mutex dst_mutex; /* protects dst_* pointers below, *after* prog becomes visible */
>>>          struct bpf_prog *dst_prog;
>>>          struct bpf_trampoline *dst_trampoline;
>>> diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
>>> index 14d9288441f2..f7a3e93c41e1 100644
>>> --- a/kernel/bpf/core.c
>>> +++ b/kernel/bpf/core.c
>>> @@ -2396,6 +2396,7 @@ static void bpf_prog_select_func(struct bpf_prog *fp)
>>>    */
>>>   struct bpf_prog *bpf_prog_select_runtime(struct bpf_prog *fp, int *err)
>>>   {
>>> +       void __percpu *priv_stack_ptr = NULL;
>>>          /* In case of BPF to BPF calls, verifier did all the prep
>>>           * work with regards to JITing, etc.
>>>           */
>>> @@ -2421,11 +2422,23 @@ struct bpf_prog *bpf_prog_select_runtime(struct bpf_prog *fp, int *err)
>>>                  if (*err)
>>>                          return fp;
>>>
>>> +               if (fp->aux->use_priv_stack && fp->aux->stack_depth) {
>>> +                       priv_stack_ptr = __alloc_percpu_gfp(fp->aux->stack_depth, 16, GFP_KERNEL);
>>> +                       if (!priv_stack_ptr) {
>>> +                               *err = -ENOMEM;
>>> +                               return fp;
>>> +                       }
>>> +                       fp->aux->priv_stack_ptr = priv_stack_ptr;
>>> +               }
>>> +
>>>                  fp = bpf_int_jit_compile(fp);
>>>                  bpf_prog_jit_attempt_done(fp);
>>> -               if (!fp->jited && jit_needed) {
>>> -                       *err = -ENOTSUPP;
>>> -                       return fp;
>>> +               if (!fp->jited) {
>>> +                       free_percpu(priv_stack_ptr);
>>> +                       if (jit_needed) {
>>> +                               *err = -ENOTSUPP;
>>> +                               return fp;
>>> +                       }
>>>                  }
>>>          } else {
>>>                  *err = bpf_prog_offload_compile(fp);
>>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>>> index e01b3f0fd314..03ae76d57076 100644
>>> --- a/kernel/bpf/verifier.c
>>> +++ b/kernel/bpf/verifier.c
>>> @@ -20073,6 +20073,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
>>>   {
>>>          struct bpf_prog *prog = env->prog, **func, *tmp;
>>>          int i, j, subprog_start, subprog_end = 0, len, subprog;
>>> +       void __percpu *priv_stack_ptr;
>>>          struct bpf_map *map_ptr;
>>>          struct bpf_insn *insn;
>>>          void *old_bpf_func;
>>> @@ -20169,6 +20170,17 @@ static int jit_subprogs(struct bpf_verifier_env *env)
>>>
>>>                  func[i]->aux->name[0] = 'F';
>>>                  func[i]->aux->stack_depth = env->subprog_info[i].stack_depth;
>>> +
>>> +               if (env->subprog_info[i].use_priv_stack && func[i]->aux->stack_depth) {
>>> +                       priv_stack_ptr = __alloc_percpu_gfp(func[i]->aux->stack_depth, 16,
>>> +                                                           GFP_KERNEL);
>>> +                       if (!priv_stack_ptr) {
>>> +                               err = -ENOMEM;
>>> +                               goto out_free;
>>> +                       }
>>> +                       func[i]->aux->priv_stack_ptr = priv_stack_ptr;
>>> +               }
>>> +
>>>                  func[i]->jit_requested = 1;
>>>                  func[i]->blinding_requested = prog->blinding_requested;
>>>                  func[i]->aux->kfunc_tab = prog->aux->kfunc_tab;
>>> @@ -20201,6 +20213,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
>>>                  func[i]->aux->exception_boundary = env->seen_exception;
>>>                  func[i] = bpf_int_jit_compile(func[i]);
>>>                  if (!func[i]->jited) {
>>> +                       free_percpu(func[i]->aux->priv_stack_ptr);
>>>                          err = -ENOTSUPP;
>>>                          goto out_free;
>>>                  }
>> Looks correct from leaks pov, but this is so hard to follow.
>> I still don't like this imbalanced alloc/free.
>> Either both need to be done by core or both by JIT.
>>
>> And JIT is probably better, since in:
>> __alloc_percpu_gfp(func[i]->aux->stack_depth, 16
>>
>> 16 alignment is x86 specific.
>
Sorry, I need to fix my format. The following is a reformat.

Agreed. I used alignment 16 to cover all architectures; for x86_64,
alignment 8 is used. I did some checking in the arch/ directory.

[~/work/bpf-next/arch (master)]$ find . -name 'net'
./arm/net
./mips/net
./parisc/net
./powerpc/net
./s390/net
./sparc/net
./x86/net
./arc/net
./arm64/net
./loongarch/net
./riscv/net

[~/work/bpf-next/arch (master)]$ egrep -r bpf_jit_free    (output trimmed to function definitions only)
powerpc/net/bpf_jit_comp.c:void bpf_jit_free(struct bpf_prog *fp)
sparc/net/bpf_jit_comp_32.c:void bpf_jit_free(struct bpf_prog *fp)
x86/net/bpf_jit_comp.c:void bpf_jit_free(struct bpf_prog *prog)
arm64/net/bpf_jit_comp.c:void bpf_jit_free(struct bpf_prog *prog)
riscv/net/bpf_jit_core.c:void bpf_jit_free(struct bpf_prog *prog)

Looks like all the important archs (x86_64, arm64, riscv) have their own
bpf_jit_free(). Some others, like s390, do not. I think we can do the
allocation in the JIT. If s390 starts to implement private stack, it
can implement an arch-specific version of bpf_jit_free() at that time.
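
To make the "both alloc and free in the JIT" shape concrete, here is a
rough userspace C sketch. It is only an illustration, not the kernel
code: struct prog_sketch, NR_CPUS_SKETCH and the *_sketch() helpers are
hypothetical names, aligned_alloc()/free() stand in for
__alloc_percpu_gfp()/free_percpu(), and the emit_ok flag just simulates
whether code emission succeeds. The point is that the arch picks its own
alignment and that every exit path from the JIT pairs the allocation
with a free.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

#define NR_CPUS_SKETCH 4 /* hypothetical; stands in for the real per-CPU area */

/* Hypothetical stand-in for the few bpf_prog fields this sketch needs. */
struct prog_sketch {
	unsigned int stack_depth;         /* bytes of stack the prog needs */
	int use_priv_stack;               /* decided before the JIT runs */
	int jited;                        /* set once the JIT has succeeded */
	void *priv_stack[NR_CPUS_SKETCH]; /* stands in for a void __percpu * */
};

/* Round n up to a multiple of align; align must be a power of two. */
static size_t round_up_pow2(size_t n, size_t align)
{
	return (n + align - 1) & ~(align - 1);
}

/* Free every per-CPU slot; free(NULL) is a no-op, so this is idempotent. */
static void jit_free_priv_stack_sketch(struct prog_sketch *p)
{
	for (int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++) {
		free(p->priv_stack[cpu]);
		p->priv_stack[cpu] = NULL;
	}
}

/* Allocate one stack per CPU with the alignment the arch asks for.
 * Assumes the priv_stack[] slots start out NULL.
 */
static int jit_alloc_priv_stack_sketch(struct prog_sketch *p, size_t arch_align)
{
	size_t sz;

	if (!p->use_priv_stack || !p->stack_depth)
		return 0;

	/* aligned_alloc() wants the size to be a multiple of the alignment. */
	sz = round_up_pow2(p->stack_depth, arch_align);
	for (int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++) {
		p->priv_stack[cpu] = aligned_alloc(arch_align, sz);
		if (!p->priv_stack[cpu]) {
			jit_free_priv_stack_sketch(p);
			return -ENOMEM;
		}
	}
	return 0;
}

/* JIT entry point: the allocation and the failure-path free live together. */
static int jit_compile_sketch(struct prog_sketch *p, size_t arch_align, int emit_ok)
{
	int err = jit_alloc_priv_stack_sketch(p, arch_align);

	if (err)
		return err;
	if (!emit_ok) { /* simulated JIT failure */
		jit_free_priv_stack_sketch(p);
		return -ENOTSUP;
	}
	p->jited = 1;
	return 0;
}

/* Counterpart of an arch-specific bpf_jit_free(): success-path free. */
static void jit_free_sketch(struct prog_sketch *p)
{
	jit_free_priv_stack_sketch(p);
	p->jited = 0;
}
```

With this layout the core code never touches the private stack pointer:
an arch that wants alignment 8 passes 8, one that wants 16 passes 16,
and an arch without private stack support simply never allocates.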
