Re: [PATCH v1 bpf-next 1/7] bpf: Ensure kptr_struct_meta is non-NULL for collection insert and refcount_acquire

public inbox for bpf@vger.kernel.org

From: Yonghong Song <yonghong.song@linux.dev>
To: Dave Marchevsky <davemarchevsky@gmail.com>,
	Dave Marchevsky <davemarchevsky@fb.com>,
	bpf@vger.kernel.org
Cc: Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>,
	Martin KaFai Lau <martin.lau@kernel.org>,
	Kernel Team <kernel-team@fb.com>
Subject: Re: [PATCH v1 bpf-next 1/7] bpf: Ensure kptr_struct_meta is non-NULL for collection insert and refcount_acquire
Date: Wed, 2 Aug 2023 14:41:17 -0700
Message-ID: <269b5c18-69ac-a9c4-3596-84ddc82b8877@linux.dev>
In-Reply-To: <a139645d-7949-30c7-5a6d-00f288babd81@gmail.com>



On 8/2/23 12:23 PM, Dave Marchevsky wrote:
> 
> 
> On 8/1/23 11:57 PM, Yonghong Song wrote:
>>
>>
>> On 8/1/23 1:36 PM, Dave Marchevsky wrote:
>>> It's straightforward to prove that kptr_struct_meta must be non-NULL for
>>> any valid call to these kfuncs:
>>>
>>>     * btf_parse_struct_metas in btf.c creates a btf_struct_meta for any
>>>       struct in user BTF with a special field (e.g. bpf_refcount,
>>>       {rb,list}_node). These are stored in that BTF's struct_meta_tab.
>>>
>>>     * __process_kf_arg_ptr_to_graph_node in verifier.c ensures that nodes
>>>       have {rb,list}_node field and that it's at the correct offset.
>>>       Similarly, check_kfunc_args ensures bpf_refcount field existence for
>>>       node param to bpf_refcount_acquire.
>>>
>>>     * So a btf_struct_meta must have been created for the struct type of
>>>       node param to these kfuncs
>>>
>>>     * That BTF and its struct_meta_tab are guaranteed to still be around.
>>>       Any arbitrary {rb,list} node the BPF program interacts with either:
>>>       came from bpf_obj_new or a collection removal kfunc in the same
>>>       program, in which case the BTF is associated with the program and
>>>       still around; or came from bpf_kptr_xchg, in which case the BTF was
>>>       associated with the map and is still around
>>>
>>> Instead of silently continuing with NULL struct_meta, which caused
>>> confusing bugs such as those addressed by commit 2140a6e3422d ("bpf: Set
>>> kptr_struct_meta for node param to list and rbtree insert funcs"), let's
>>> error out. Then, at runtime, we can confidently say that the
>>> implementations of these kfuncs were given a non-NULL kptr_struct_meta,
>>> meaning that special-field-specific functionality like
>>> bpf_obj_free_fields and the bpf_obj_drop change introduced later in this
>>> series are guaranteed to execute.
>>
>> The subject says '... for collection insert and refcount_acquire'.
>> Why pick these? Could we check all kptr_struct_meta use cases?
>>
> 
> fixup_kfunc_call sets kptr_struct_meta arg for the following kfuncs:
> 
>    - bpf_obj_new_impl
>    - bpf_obj_drop_impl
>    - collection insert kfuncs
>      - bpf_rbtree_add_impl
>      - bpf_list_push_{front,back}_impl
>    - bpf_refcount_acquire_impl
> 
> A btf_struct_meta is only created for a struct if it has a non-null btf_record,
> which in turn only happens if the struct has any special fields (spin_lock,
> refcount, {rb,list}_node, etc.). Since it's valid to call bpf_obj_new on a
> struct type without any special fields, the kptr_struct_meta arg can be
> NULL. The result of such bpf_obj_new allocation must be bpf_obj_drop-able, so
> the same holds for that kfunc.
> 
> By definition rbtree and list nodes must be some struct type w/
> struct bpf_{rb,list}_node field, and similar logic for refcounted, so if there's
> no kptr_struct_meta for their node arg, there was some verifier-internal issue.
> 
> 
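For reference, the rule described above can be modeled in a few lines of userspace C. This is only an illustrative sketch, not kernel code: enum field_kind, struct struct_desc and gets_struct_meta() are hypothetical names standing in for the btf_record / btf_struct_meta machinery, and the real field set is larger than what is listed here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical field kinds; the real btf_record tracks more than these. */
enum field_kind {
	FIELD_PLAIN,
	FIELD_SPIN_LOCK,
	FIELD_REFCOUNT,
	FIELD_RB_NODE,
	FIELD_LIST_NODE,
};

/* Hypothetical description of one struct type from the program's BTF. */
struct struct_desc {
	const enum field_kind *fields;
	size_t nr_fields;
};

/* A struct gets a struct_meta only if at least one field is special. */
bool gets_struct_meta(const struct struct_desc *s)
{
	assert(s != NULL);
	for (size_t i = 0; i < s->nr_fields; i++) {
		if (s->fields[i] != FIELD_PLAIN)
			return true;
	}
	return false;
}
```

In this model a struct with only plain fields yields no meta, which is why a NULL kptr_struct_meta is legal for bpf_obj_new and bpf_obj_drop, while any struct usable as a list/rbtree node or refcount_acquire argument necessarily yields one.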
>>>
>>> This patch doesn't change functionality, just makes it easier to reason
>>> about existing functionality.
>>>
>>> Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
>>> ---
>>>    kernel/bpf/verifier.c | 14 ++++++++++++++
>>>    1 file changed, 14 insertions(+)
>>>
>>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>>> index e7b1af016841..ec37e84a11c6 100644
>>> --- a/kernel/bpf/verifier.c
>>> +++ b/kernel/bpf/verifier.c
>>> @@ -18271,6 +18271,13 @@ static int fixup_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
>>>            struct btf_struct_meta *kptr_struct_meta = env->insn_aux_data[insn_idx].kptr_struct_meta;
>>>            struct bpf_insn addr[2] = { BPF_LD_IMM64(BPF_REG_2, (long)kptr_struct_meta) };
>>> +        if (desc->func_id == special_kfunc_list[KF_bpf_refcount_acquire_impl] &&
>>
>> Why check for KF_bpf_refcount_acquire_impl? We can cover all cases in this 'if' branch, right?
>>
> 
> The body of this 'else if' also handles kptr_struct_meta setup for bpf_obj_drop,
> for which NULL kptr_struct_meta is valid.
> 
>>> +            !kptr_struct_meta) {
>>> +            verbose(env, "verifier internal error: kptr_struct_meta expected at insn_idx %d\n",
>>> +                insn_idx);
>>> +            return -EFAULT;
>>> +        }
>>> +
>>>            insn_buf[0] = addr[0];
>>>            insn_buf[1] = addr[1];
>>>            insn_buf[2] = *insn;
>>> @@ -18278,6 +18285,7 @@ static int fixup_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
>>>        } else if (desc->func_id == special_kfunc_list[KF_bpf_list_push_back_impl] ||
>>>               desc->func_id == special_kfunc_list[KF_bpf_list_push_front_impl] ||
>>>               desc->func_id == special_kfunc_list[KF_bpf_rbtree_add_impl]) {
>>> +        struct btf_struct_meta *kptr_struct_meta = env->insn_aux_data[insn_idx].kptr_struct_meta;
>>>            int struct_meta_reg = BPF_REG_3;
>>>            int node_offset_reg = BPF_REG_4;
>>> @@ -18287,6 +18295,12 @@ static int fixup_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
>>>                node_offset_reg = BPF_REG_5;
>>>            }
>>> +        if (!kptr_struct_meta) {
>>> +            verbose(env, "verifier internal error: kptr_struct_meta expected at insn_idx %d\n",
>>> +                insn_idx);
>>> +            return -EFAULT;
>>> +        }
>>> +
>>>            __fixup_collection_insert_kfunc(&env->insn_aux_data[insn_idx], struct_meta_reg,
>>>                            node_offset_reg, insn, insn_buf, cnt);
>>>        } else if (desc->func_id == special_kfunc_list[KF_bpf_cast_to_kern_ctx] ||
>>
>> In my opinion, such selective defensive programming is not necessary.
>> By searching for kptr_struct_meta in the code, it is reasonably easy to
>> find whether we have any mismatch or not. Also, selftest coverage should
>> cover these cases (and probably already does), right?
>>
>> If the defensive programming here is still desirable in order to warn at
>> verification time, I think we should just check all uses of
>> kptr_struct_meta.
> 
> Something like this patch probably should've been included with the series
> containing 2140a6e3422d ("bpf: Set kptr_struct_meta for node param to list and rbtree insert funcs"),
> since that commit found that kptr_struct_meta wasn't being set for collection
> insert kfuncs and fixed the issue. It was annoyingly hard to root-cause
> because, among other things, many of these kfunc impls check that
> the btf_struct_meta is non-NULL before using it, with some fallback logic.
> I don't like those unnecessary NULL checks either, and considered removing
> them in this patch, but decided to leave them in since we already had
> a case where struct_meta wasn't being set.
> 
> On second thought, maybe it's better to take the unnecessary runtime checks
> out and leave these verification-time checks in. If, at runtime, those kfuncs
> see a NULL btf_struct_meta, I'd rather they fail loudly in the future
> with a NULL deref splat, than potentially leaking memory or similarly
> subtle failures. WDYT?

Certainly I agree with you that a verification failure is much better
than debugging at runtime.

Here, we covered a few kfuncs which always require a non-NULL
kptr_struct_meta. But as you mentioned above, we also have kfuncs
for which kptr_struct_meta can be either NULL or non-NULL.

Take the kfunc bpf_obj_new_impl, for example. In some cases the
kptr_struct_meta cannot be NULL given the bpf prog, but suppose the
verifier somehow passes a NULL ptr anyway. Should we check for this
in fixup_kfunc_call() as well?
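
To make the question concrete, here is a rough userspace sketch of a check that covers every kfunc taking a struct_meta arg, not only the insert and refcount_acquire ones. It is not verifier code: enum kfunc_id, always_needs_meta() and check_struct_meta() are hypothetical names, and the type_has_special_fields flag is assumed to be available to the caller; how fixup_kfunc_call() would actually derive it is left open.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kfuncs that receive a struct_meta arg. */
enum kfunc_id {
	KF_OBJ_NEW,
	KF_OBJ_DROP,
	KF_LIST_PUSH_FRONT,
	KF_LIST_PUSH_BACK,
	KF_RBTREE_ADD,
	KF_REFCOUNT_ACQUIRE,
};

/* True for kfuncs whose argument always has a special field. */
bool always_needs_meta(enum kfunc_id id)
{
	switch (id) {
	case KF_LIST_PUSH_FRONT:
	case KF_LIST_PUSH_BACK:
	case KF_RBTREE_ADD:
	case KF_REFCOUNT_ACQUIRE:
		return true;
	default:
		return false;
	}
}

/*
 * Returns 0 when the meta pointer is consistent with what the program's
 * types imply, -EFAULT otherwise. For obj_new/obj_drop a NULL meta is
 * only an error if the struct type has special fields (assumed input).
 */
int check_struct_meta(enum kfunc_id id, bool type_has_special_fields,
		      const void *meta)
{
	bool expect_meta = always_needs_meta(id) || type_has_special_fields;

	if (expect_meta && !meta)
		return -EFAULT;
	return 0;
}

/* Small usage example: the cases discussed in this thread. */
void demo_checks(void)
{
	assert(check_struct_meta(KF_RBTREE_ADD, true, NULL) == -EFAULT);
	assert(check_struct_meta(KF_OBJ_NEW, false, NULL) == 0);
	assert(check_struct_meta(KF_OBJ_NEW, true, NULL) == -EFAULT);
}
```

The third case in demo_checks() is the one the patch does not cover today: bpf_obj_new on a type with special fields but a NULL meta.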

> 
> I don't feel particularly strongly about these verification-time checks,
> but the level of 'selective defensive programming' here feels similar to
> other 'verifier internal error' checks sprinkled throughout verifier.c,
> so that argument doesn't feel very persuasive to me.

I am okay with this patch but I wonder whether we can cover more
cases.
