From: Yonghong Song <yonghong.song@linux.dev>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: bpf <bpf@vger.kernel.org>, Alexei Starovoitov <ast@kernel.org>,
Andrii Nakryiko <andrii@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Kernel Team <kernel-team@fb.com>,
Martin KaFai Lau <martin.lau@kernel.org>,
Tejun Heo <tj@kernel.org>
Subject: Re: [PATCH bpf-next v9 04/10] bpf: Check potential private stack recursion for progs with async callback
Date: Mon, 4 Nov 2024 19:37:28 -0800
Message-ID: <a34f5be8-8cf9-4659-badd-32c387cefe29@linux.dev>
In-Reply-To: <CAADnVQL3MkDgZykq1H3NhJio8gZDnf3+kXXw7AQ36uT8yw5UfQ@mail.gmail.com>
On 11/4/24 6:51 PM, Alexei Starovoitov wrote:
> On Mon, Nov 4, 2024 at 11:38 AM Yonghong Song <yonghong.song@linux.dev> wrote:
>> In previous patch, tracing progs are enabled for private stack since
>> recursion checking ensures there exists no nested same bpf prog run on
>> the same cpu.
>>
>> But it is still possible for nested bpf subprog run on the same cpu
>> if the same subprog is called in both main prog and async callback,
>> or in different async callbacks. For example,
>>   main_prog
>>     bpf_timer_set_callback(timer, timer_cb);
>>     call sub1
>>   sub1
>>     ...
>>   timer_cb
>>     call sub1
>>
>> In the above case, nested subprog run for sub1 is possible with one in
>> process context and the other in softirq context. If this is the case,
>> the verifier will disable private stack for this bpf prog.
>>
>> Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
>> ---
>> include/linux/bpf_verifier.h | 2 ++
>> kernel/bpf/verifier.c | 42 +++++++++++++++++++++++++++++++-----
>> 2 files changed, 39 insertions(+), 5 deletions(-)
>>
>> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
>> index 0622c11a7e19..e921589abc72 100644
>> --- a/include/linux/bpf_verifier.h
>> +++ b/include/linux/bpf_verifier.h
>> @@ -669,6 +669,8 @@ struct bpf_subprog_info {
>> /* true if bpf_fastcall stack region is used by functions that can't be inlined */
>> bool keep_fastcall_stack: 1;
>> bool use_priv_stack: 1;
>> + bool visited_with_priv_stack_accum: 1;
>> + bool visited_with_priv_stack: 1;
>>
>> u8 arg_cnt;
>> struct bpf_subprog_arg_info args[MAX_BPF_FUNC_REG_ARGS];
>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>> index 406195c433ea..e01b3f0fd314 100644
>> --- a/kernel/bpf/verifier.c
>> +++ b/kernel/bpf/verifier.c
>> @@ -6118,8 +6118,12 @@ static int check_max_stack_depth_subprog(struct bpf_verifier_env *env, int idx,
>> idx, subprog_depth);
>> return -EACCES;
>> }
>> - if (subprog_depth >= BPF_PRIV_STACK_MIN_SIZE)
>> + if (subprog_depth >= BPF_PRIV_STACK_MIN_SIZE) {
>> subprog[idx].use_priv_stack = true;
>> + subprog[idx].visited_with_priv_stack = true;
>> + }
>> + } else {
>> + subprog[idx].visited_with_priv_stack = true;
> See suggestion for patch 3.
> It's cleaner to rewrite with a single visited_with_priv_stack = true; statement.
Ack.
>
>> }
>> }
>> continue_func:
>> @@ -6220,10 +6224,12 @@ static int check_max_stack_depth_subprog(struct bpf_verifier_env *env, int idx,
>> static int check_max_stack_depth(struct bpf_verifier_env *env)
>> {
>> struct bpf_subprog_info *si = env->subprog_info;
>> + enum priv_stack_mode orig_priv_stack_supported;
>> enum priv_stack_mode priv_stack_supported;
>> int ret, subtree_depth = 0, depth_frame;
>>
>> priv_stack_supported = bpf_enable_priv_stack(env->prog);
>> + orig_priv_stack_supported = priv_stack_supported;
>>
>> if (priv_stack_supported != NO_PRIV_STACK) {
>> for (int i = 0; i < env->subprog_cnt; i++) {
>> @@ -6240,13 +6246,39 @@ static int check_max_stack_depth(struct bpf_verifier_env *env)
>> priv_stack_supported);
>> if (ret < 0)
>> return ret;
>> +
>> + if (priv_stack_supported != NO_PRIV_STACK) {
>> + for (int j = 0; j < env->subprog_cnt; j++) {
>> + if (si[j].visited_with_priv_stack_accum &&
>> + si[j].visited_with_priv_stack) {
>> + /* si[j] is visited by both main/async subprog
>> + * and another async subprog.
>> + */
>> + priv_stack_supported = NO_PRIV_STACK;
>> + break;
>> + }
>> + if (!si[j].visited_with_priv_stack_accum)
>> + si[j].visited_with_priv_stack_accum =
>> + si[j].visited_with_priv_stack;
>> + }
>> + }
>> + if (priv_stack_supported != NO_PRIV_STACK) {
>> + for (int j = 0; j < env->subprog_cnt; j++)
>> + si[j].visited_with_priv_stack = false;
>> + }
> I cannot understand what this algorithm is doing.
> What is the meaning of visited_with_priv_stack_accum ?
The following example shows how the algorithm works.
Let us say we have a prog like this:
  main_prog0    si[0]
    sub1        si[1]
    sub2        si[2]
  async1        si[3]
    sub4        si[4]
    sub2        si[2]
  async2        si[5]
    sub4        si[4]
    sub5        si[6]
That is 7 subprograms in total (si[0] ... si[6]); sub2 and sub4 each
appear under two roots.
After iteration 1 (main_prog0):
  visited_with_priv_stack_accum: si[i] = false for i = 0 ... 6
  visited_with_priv_stack: si[0] = si[1] = si[2] = true, others false
  There is no i for which si[i] has both visited_with_priv_stack_accum
  and visited_with_priv_stack set, so main_prog0 can use priv stack.
  visited_with_priv_stack_accum: si[0] = si[1] = si[2] = true; others false
  visited_with_priv_stack is cleared to false.
After iteration 2 (async1):
  visited_with_priv_stack_accum: si[0] = si[1] = si[2] = true; others false
  visited_with_priv_stack: si[2] = si[3] = si[4] = true, others false
  Here, si[2] is set in both visited_with_priv_stack_accum and
  visited_with_priv_stack, so async1 cannot have priv stack.
  In my algorithm, I flipped the whole thing to no_priv_stack, which is
  too conservative. We should just skip async1 and continue.
  Let us say async1 does not get priv stack while main_prog0 does.
  /* the same as end of iteration 1 */
  visited_with_priv_stack_accum: si[0] = si[1] = si[2] = true; others false
  visited_with_priv_stack is cleared to false.
After iteration 3 (async2):
  visited_with_priv_stack_accum: si[0] = si[1] = si[2] = true; others false
  visited_with_priv_stack: si[4] = si[5] = si[6] = true, others false
  There is no conflict, so async2 can use private stack.
If we only had one bit in bpf_subprog_info, then for an async tree we
might mark a subprog as true, later find a conflict in the same async tree,
and need to make all subprogs in that async tree ineligible for priv stack.
It would then be hard to undo the earlier markings.
So visited_with_priv_stack_accum accumulates the "true" results from
the main prog and the async callbacks.
Maybe we could change the two bit names to
  visited_with_priv_stack
  visited_with_priv_stack_tmp
?
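As a rough illustration of the walk-through above (not the verifier code
itself), the two-bit bookkeeping can be sketched in standalone C. The
struct subprog_bits type, the reach[][] table and check_root() are
hypothetical names made up for this sketch; the table simply hard-codes
which si[] entries each root reaches in the example, and a conflicting
async tree is skipped rather than disabling priv stack for the whole prog.

```c
#include <assert.h>
#include <stdbool.h>

#define NSUBPROGS 7	/* si[0] ... si[6] from the example */
#define NROOTS    3	/* main_prog0, async1, async2 */

/* Hypothetical stand-in for the two bits in struct bpf_subprog_info. */
struct subprog_bits {
	bool visited_accum;	/* reached by an earlier root that kept priv stack */
	bool visited;		/* reached by the root currently being checked */
};

/* reach[r][j] is true if root r reaches si[j]; copied from the example. */
static const bool reach[NROOTS][NSUBPROGS] = {
	/* main_prog0: si[0], si[1], si[2] */
	{ true,  true,  true,  false, false, false, false },
	/* async1:     si[2], si[3], si[4] */
	{ false, false, true,  true,  true,  false, false },
	/* async2:     si[4], si[5], si[6] */
	{ false, false, false, false, true,  true,  true  },
};

/* Returns true if root r may keep its private stack. */
bool check_root(struct subprog_bits *si, int r)
{
	bool conflict = false;
	int j;

	assert(r >= 0 && r < NROOTS);

	/* Mark what this root reaches and look for overlap with earlier roots. */
	for (j = 0; j < NSUBPROGS; j++) {
		si[j].visited = reach[r][j];
		if (si[j].visited && si[j].visited_accum)
			conflict = true;
	}

	/* Merge into the accumulated bits only if the whole tree is clean,
	 * then reset the per-root bits for the next iteration.
	 */
	for (j = 0; j < NSUBPROGS; j++) {
		if (!conflict && si[j].visited)
			si[j].visited_accum = true;
		si[j].visited = false;
	}
	return !conflict;
}
```

Starting from zero-initialized bits and calling check_root() for roots 0, 1
and 2 in that order gives true, false, true, which matches the three
iterations described above. Because a conflicting tree is never merged, there
is nothing to undo, which is the point of keeping the second bit.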
>
>> }
>> }
>>
>> - if (priv_stack_supported == NO_PRIV_STACK && subtree_depth > MAX_BPF_STACK) {
>> - verbose(env, "combined stack size of %d calls is %d. Too large\n",
>> - depth_frame, subtree_depth);
>> - return -EACCES;
>> + if (priv_stack_supported == NO_PRIV_STACK) {
>> + if (subtree_depth > MAX_BPF_STACK) {
>> + verbose(env, "combined stack size of %d calls is %d. Too large\n",
>> + depth_frame, subtree_depth);
>> + return -EACCES;
>> + }
>> + if (orig_priv_stack_supported == PRIV_STACK_ADAPTIVE) {
>> + for (int i = 0; i < env->subprog_cnt; i++)
>> + si[i].use_priv_stack = false;
>> + }
> why? This patch is supposed to clear use_priv_stack from subprogs
> that are dual called and only from those subprogs.
> All other subprogs are fine.
>
> But it seems the algo attempts to detect one such calling scenario
> and disables priv_stack everywhere?
Sorry about this. Will fix in the next revision.
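One possible reading of the review comment above, sketched as standalone C
and not taken from any posted revision: clear use_priv_stack only for
subprogs that are reachable from more than one root. The reach[][] table
and clear_dual_called() are hypothetical names for this sketch, and the
table reuses the call trees from the example earlier in this mail. Treating
every subprog with more than one root as dual called is an assumption and
may be more conservative than what is actually needed.

```c
#include <stdbool.h>

#define NSUBPROGS 7	/* si[0] ... si[6] */
#define NROOTS    3	/* main_prog0, async1, async2 */

/* reach[r][j] is true if root r reaches si[j] (hypothetical example data). */
static const bool reach[NROOTS][NSUBPROGS] = {
	{ true,  true,  true,  false, false, false, false },	/* main_prog0 */
	{ false, false, true,  true,  true,  false, false },	/* async1 */
	{ false, false, false, false, true,  true,  true  },	/* async2 */
};

/* Clear use_priv_stack[j] only when si[j] is reached from two or more roots. */
void clear_dual_called(bool use_priv_stack[NSUBPROGS])
{
	int r, j;

	for (j = 0; j < NSUBPROGS; j++) {
		int roots = 0;

		for (r = 0; r < NROOTS; r++)
			roots += reach[r][j];
		if (roots > 1)
			use_priv_stack[j] = false;
	}
}
```

With all entries initially true, only si[2] (sub2) and si[4] (sub4) end up
cleared in this sketch; every other subprog keeps its private stack.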