BPF List
 help / color / mirror / Atom feed
From: Yonghong Song <yonghong.song@linux.dev>
To: Chuyi Zhou <zhouchuyi@bytedance.com>,
	Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: bpf <bpf@vger.kernel.org>, Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>,
	Martin KaFai Lau <martin.lau@kernel.org>
Subject: Re: [PATCH bpf-next v2 1/2] bpf: Relax allowlist for css_task iter
Date: Mon, 23 Oct 2023 23:08:54 -0700	[thread overview]
Message-ID: <c923dba3-f638-410f-ae65-0cfa1962a6f2@linux.dev> (raw)
In-Reply-To: <1f4f9308-26b4-4cc5-99eb-88851fe2f3f9@bytedance.com>


On 10/23/23 10:52 PM, Chuyi Zhou wrote:
> Hello,
>
> 在 2023/10/24 12:57, Alexei Starovoitov 写道:
>> On Mon, Oct 23, 2023 at 7:42 PM Chuyi Zhou <zhouchuyi@bytedance.com> 
>> wrote:
>>>
>>> The newly added open-coded css_task iter would try to hold the global
>>> css_set_lock in bpf_iter_css_task_new, so the bpf side has to be 
>>> careful in
>>> where it allows to use this iter. The mainly concern is dead locking on
>>> css_set_lock. check_css_task_iter_allowlist() in verifier enforced 
>>> css_task
>>> can only be used in bpf_lsm hooks and sleepable bpf_iter.
>>>
>>> This patch relax the allowlist for css_task iter. Any lsm and any iter
>>> (even non-sleepable) and any sleepable are safe since they would not 
>>> hold
>>> the css_set_lock before entering BPF progs context.
>>>
>>> This patch also fixes the misused BPF_TRACE_ITER in
>>> check_css_task_iter_allowlist which compared bpf_prog_type with
>>> bpf_attach_type.
>>>
>>> Fixes: 9c66dc94b62ae ("bpf: Introduce css_task open-coded iterator 
>>> kfuncs")
>>> Signed-off-by: Chuyi Zhou <zhouchuyi@bytedance.com>
>>> ---
>>>   kernel/bpf/verifier.c                         | 21 
>>> ++++++++++++-------
>>>   .../selftests/bpf/progs/iters_task_failure.c  |  4 ++--
>>>   2 files changed, 15 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>>> index e9bc5d4a25a1..9f209adc4ccb 100644
>>> --- a/kernel/bpf/verifier.c
>>> +++ b/kernel/bpf/verifier.c
>>> @@ -11088,18 +11088,23 @@ static int 
>>> process_kf_arg_ptr_to_rbtree_node(struct bpf_verifier_env *env,
>>> &meta->arg_rbtree_root.field);
>>>   }
>>>
>>> +/*
>>> + * css_task iter allowlist is needed to avoid dead locking on 
>>> css_set_lock.
>>> + * LSM hooks and iters (both sleepable and non-sleepable) are safe.
>>> + * Any sleepable progs are also safe since 
>>> bpf_check_attach_target() enforce
>>> + * them can only be attached to some specific hook points.
>>> + */
>>>   static bool check_css_task_iter_allowlist(struct bpf_verifier_env 
>>> *env)
>>>   {
>>>          enum bpf_prog_type prog_type = resolve_prog_type(env->prog);
>>>
>>> -       switch (prog_type) {
>>> -       case BPF_PROG_TYPE_LSM:
>>> +       if (prog_type == BPF_PROG_TYPE_LSM)
>>>                  return true;
>>> -       case BPF_TRACE_ITER:
>>> -               return env->prog->aux->sleepable;
>>> -       default:
>>> -               return false;
>>> -       }
>>> +
>>> +       if (env->prog->expected_attach_type == BPF_TRACE_ITER)
>>> +               return true;
>>
>> I think the switch by prog_type has to stay.
>> Checking attach_type == BPF_TRACE_ITER without considering prog_type
>> is fragile. It likely works, but we don't do it anywhere else.
>> Let's stick to what is known to work.
>>
>
> IIUC, do you mean:
>
> static bool check_css_task_iter_allowlist(struct bpf_verifier_env *env)
> {
>     enum bpf_prog_type prog_type = resolve_prog_type(env->prog);
>
>      switch (prog_type) {
>      case BPF_PROG_TYPE_LSM:
>          return true;
>     case BPF_PROG_TYPE_TRACING:
>         if (env->prog->expected_attach_type == BPF_TRACE_ITER)
>             return true;
>         return env->prog->aux->sleepable;


The above can be a fullthrough instead.


> default:
>         return env->prog->aux->sleepable;
>      }
> }
>
>>> -SEC("?fentry.s/" SYS_PREFIX "sys_getpgid")
>>> -__failure __msg("css_task_iter is only allowed in bpf_lsm and bpf 
>>> iter-s")
>>> +SEC("?fentry/" SYS_PREFIX "sys_getpgid")
>>> +__failure __msg("css_task_iter is only allowed in bpf_lsm, bpf_iter 
>>> and sleepable progs")
>>
>> Please add both. fentry that is rejected and fentry.s that is accepted.
>
> Sure.
>

  reply	other threads:[~2023-10-24  6:09 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-24  2:42 [PATCH bpf-next v2 0/2] Relax allowlist for open-coded css_task iter Chuyi Zhou
2023-10-24  2:42 ` [PATCH bpf-next v2 1/2] bpf: Relax allowlist for " Chuyi Zhou
2023-10-24  4:57   ` Alexei Starovoitov
2023-10-24  5:52     ` Chuyi Zhou
2023-10-24  6:08       ` Yonghong Song [this message]
2023-10-24  6:23         ` Chuyi Zhou
2023-10-24 11:43           ` Chuyi Zhou
2023-10-24 15:28             ` Yonghong Song
2023-10-24  2:42 ` [PATCH bpf-next v2 2/2] selftests/bpf: Add test for css_task iter combining with cgroup iter Chuyi Zhou

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=c923dba3-f638-410f-ae65-0cfa1962a6f2@linux.dev \
    --to=yonghong.song@linux.dev \
    --cc=alexei.starovoitov@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=martin.lau@kernel.org \
    --cc=zhouchuyi@bytedance.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox