From: Yonghong Song <yonghong.song@linux.dev>
To: Dave Marchevsky <davemarchevsky@gmail.com>,
Dave Marchevsky <davemarchevsky@fb.com>,
bpf@vger.kernel.org
Cc: Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Andrii Nakryiko <andrii@kernel.org>,
Martin KaFai Lau <martin.lau@kernel.org>,
Kernel Team <kernel-team@fb.com>
Subject: Re: [PATCH v1 bpf-next 1/7] bpf: Ensure kptr_struct_meta is non-NULL for collection insert and refcount_acquire
Date: Wed, 2 Aug 2023 14:41:17 -0700 [thread overview]
Message-ID: <269b5c18-69ac-a9c4-3596-84ddc82b8877@linux.dev> (raw)
In-Reply-To: <a139645d-7949-30c7-5a6d-00f288babd81@gmail.com>
On 8/2/23 12:23 PM, Dave Marchevsky wrote:
>
>
> On 8/1/23 11:57 PM, Yonghong Song wrote:
>>
>>
>> On 8/1/23 1:36 PM, Dave Marchevsky wrote:
>>> It's straightforward to prove that kptr_struct_meta must be non-NULL for
>>> any valid call to these kfuncs:
>>>
>>> * btf_parse_struct_metas in btf.c creates a btf_struct_meta for any
>>> struct in user BTF with a special field (e.g. bpf_refcount,
>>> {rb,list}_node). These are stored in that BTF's struct_meta_tab.
>>>
>>> * __process_kf_arg_ptr_to_graph_node in verifier.c ensures that nodes
>>> have {rb,list}_node field and that it's at the correct offset.
>>> Similarly, check_kfunc_args ensures bpf_refcount field existence for
>>> node param to bpf_refcount_acquire.
>>>
>>> * So a btf_struct_meta must have been created for the struct type of
>>> node param to these kfuncs
>>>
>>> * That BTF and its struct_meta_tab are guaranteed to still be around.
>>> Any arbitrary {rb,list} node the BPF program interacts with either:
>>> came from bpf_obj_new or a collection removal kfunc in the same
>>> program, in which case the BTF is associated with the program and
>>> still around; or came from bpf_kptr_xchg, in which case the BTF was
>>> associated with the map and is still around
>>>
>>> Instead of silently continuing with NULL struct_meta, which caused
>>> confusing bugs such as those addressed by commit 2140a6e3422d ("bpf: Set
>>> kptr_struct_meta for node param to list and rbtree insert funcs"), let's
>>> error out. Then, at runtime, we can confidently say that the
>>> implementations of these kfuncs were given a non-NULL kptr_struct_meta,
>>> meaning that special-field-specific functionality like
>>> bpf_obj_free_fields and the bpf_obj_drop change introduced later in this
>>> series are guaranteed to execute.
>>
>> The subject says '... for collection insert and refcount_acquire'.
>> Why picks these? We could check for all kptr_struct_meta use cases?
>>
>
> fixup_kfunc_call sets kptr_struct_meta arg for the following kfuncs:
>
> - bpf_obj_new_impl
> - bpf_obj_drop_impl
> - collection insert kfuncs
> - bpf_rbtree_add_impl
> - bpf_list_push_{front,back}_impl
> - bpf_refcount_acquire_impl
>
> A btf_struct_meta is only created for a struct if it has a non-null btf_record,
> which in turn only happens if the struct has any special fields (spin_lock,
> refcount, {rb,list}_node, etc.). Since it's valid to call bpf_obj_new on a
> struct type without any special fields, the kptr_struct_meta arg can be
> NULL. The result of such bpf_obj_new allocation must be bpf_obj_drop-able, so
> the same holds for that kfunc.
>
> By definition rbtree and list nodes must be some struct type w/
> struct bpf_{rb,list}_node field, and similar logic for refcounted, so if there's
> no kptr_struct_meta for their node arg, there was some verifier-internal issue.
>
>
>>>
>>> This patch doesn't change functionality, just makes it easier to reason
>>> about existing functionality.
>>>
>>> Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
>>> ---
>>> kernel/bpf/verifier.c | 14 ++++++++++++++
>>> 1 file changed, 14 insertions(+)
>>>
>>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>>> index e7b1af016841..ec37e84a11c6 100644
>>> --- a/kernel/bpf/verifier.c
>>> +++ b/kernel/bpf/verifier.c
>>> @@ -18271,6 +18271,13 @@ static int fixup_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
>>> struct btf_struct_meta *kptr_struct_meta = env->insn_aux_data[insn_idx].kptr_struct_meta;
>>> struct bpf_insn addr[2] = { BPF_LD_IMM64(BPF_REG_2, (long)kptr_struct_meta) };
>>> + if (desc->func_id == special_kfunc_list[KF_bpf_refcount_acquire_impl] &&
>>
>> Why check for KF_bpf_refcount_acquire_impl? We can cover all cases in this 'if' branch, right?
>>
>
> The body of this 'else if' also handles kptr_struct_meta setup for bpf_obj_drop,
> for which NULL kptr_struct_meta is valid.
>
>>> + !kptr_struct_meta) {
>>> + verbose(env, "verifier internal error: kptr_struct_meta expected at insn_idx %d\n",
>>> + insn_idx);
>>> + return -EFAULT;
>>> + }
>>> +
>>> insn_buf[0] = addr[0];
>>> insn_buf[1] = addr[1];
>>> insn_buf[2] = *insn;
>>> @@ -18278,6 +18285,7 @@ static int fixup_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
>>> } else if (desc->func_id == special_kfunc_list[KF_bpf_list_push_back_impl] ||
>>> desc->func_id == special_kfunc_list[KF_bpf_list_push_front_impl] ||
>>> desc->func_id == special_kfunc_list[KF_bpf_rbtree_add_impl]) {
>>> + struct btf_struct_meta *kptr_struct_meta = env->insn_aux_data[insn_idx].kptr_struct_meta;
>>> int struct_meta_reg = BPF_REG_3;
>>> int node_offset_reg = BPF_REG_4;
>>> @@ -18287,6 +18295,12 @@ static int fixup_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
>>> node_offset_reg = BPF_REG_5;
>>> }
>>> + if (!kptr_struct_meta) {
>>> + verbose(env, "verifier internal error: kptr_struct_meta expected at insn_idx %d\n",
>>> + insn_idx);
>>> + return -EFAULT;
>>> + }
>>> +
>>> __fixup_collection_insert_kfunc(&env->insn_aux_data[insn_idx], struct_meta_reg,
>>> node_offset_reg, insn, insn_buf, cnt);
>>> } else if (desc->func_id == special_kfunc_list[KF_bpf_cast_to_kern_ctx] ||
>>
>> In my opinion, such selective defensive programming is not necessary. By searching kptr_struct_meta in the code, it is reasonably easy to find
>> whether we have any mismatch or not. Also self test coverage should
>> cover these cases (probably already) right?
>>
>> If the defensive programming here is still desirable to warn at verification time, I think we should just check all of uses for kptr_struct_meta.
>
> Something like this patch probably should've been included with the series
> containing 2140a6e3422d ("bpf: Set kptr_struct_meta for node param to list and rbtree insert funcs"),
> since that commit found that kptr_struct_meta wasn't being set for collection
> insert kfuncs and fixed the issue. It was annoyingly hard to root-cause
> because, among other things, many of these kfunc impls check that
> the btf_struct_meta is non-NULL before using it, with some fallback logic.
> I don't like those unnecessary NULL checks either, and considered removing
> them in this patch, but decided to leave them in since we already had
> a case where struct_meta wasn't being set.
>
> On second thought, maybe it's better to take the unnecessary runtime checks
> out and leave these verification-time checks in. If, at runtime, those kfuncs
> see a NULL btf_struct_meta, I'd rather they fail loudly in the future
> with a NULL deref splat, than potentially leaking memory or similarly
> subtle failures. WDYT?
Certainly I agree with you that verification failure is much better than
debugging runtime.
Here, we covered a few kfunc which always requires non-NULL
kptr_struct_meta. But as you mentioned in the above, we also have
cases where for a kfunc, the kptr_struct_meta could be NULL or non-NULL.
Let us say, kfunc bpf_obj_new_impl, for some cases, the kptr_struct_meta
cannot be NULL based on bpf prog, but somehow, the verifier passes
a NULL ptr to the program. Should we check this at fixup_kfunc_call()
as well?
>
> I don't feel particularly strongly about these verification-time checks,
> but the level of 'selective defensive programming' here feels similar to
> other 'verifier internal error' checks sprinkled throughout verifier.c,
> so that argument doesn't feel very persuasive to me.
I am okay with this patch but I wonder whether we can cover more
cases.
next prev parent reply other threads:[~2023-08-02 21:41 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-08-01 20:36 [PATCH v1 bpf-next 0/7] BPF Refcount followups 3: bpf_mem_free_rcu refcounted nodes Dave Marchevsky
2023-08-01 20:36 ` [PATCH v1 bpf-next 1/7] bpf: Ensure kptr_struct_meta is non-NULL for collection insert and refcount_acquire Dave Marchevsky
2023-08-02 3:57 ` Yonghong Song
2023-08-02 19:23 ` Dave Marchevsky
2023-08-02 21:41 ` Yonghong Song [this message]
2023-08-04 6:17 ` David Marchevsky
2023-08-04 15:37 ` Yonghong Song
2023-08-01 20:36 ` [PATCH v1 bpf-next 2/7] bpf: Consider non-owning refs trusted Dave Marchevsky
2023-08-02 4:11 ` Yonghong Song
2023-08-01 20:36 ` [PATCH v1 bpf-next 3/7] bpf: Use bpf_mem_free_rcu when bpf_obj_dropping refcounted nodes Dave Marchevsky
2023-08-02 4:15 ` Yonghong Song
2023-08-01 20:36 ` [PATCH v1 bpf-next 4/7] bpf: Reenable bpf_refcount_acquire Dave Marchevsky
2023-08-02 5:21 ` Yonghong Song
2023-08-01 20:36 ` [PATCH v1 bpf-next 5/7] bpf: Consider non-owning refs to refcounted nodes RCU protected Dave Marchevsky
2023-08-02 5:59 ` Yonghong Song
2023-08-04 6:47 ` David Marchevsky
2023-08-04 15:43 ` Yonghong Song
2023-08-02 22:50 ` Alexei Starovoitov
2023-08-04 6:55 ` David Marchevsky
2023-08-01 20:36 ` [PATCH v1 bpf-next 6/7] [RFC] bpf: Allow bpf_spin_{lock,unlock} in sleepable prog's RCU CS Dave Marchevsky
2023-08-02 6:33 ` Yonghong Song
2023-08-02 22:55 ` Alexei Starovoitov
2023-08-01 20:36 ` [PATCH v1 bpf-next 7/7] selftests/bpf: Add tests for rbtree API interaction in sleepable progs Dave Marchevsky
2023-08-02 23:07 ` Alexei Starovoitov
2023-08-02 3:07 ` [PATCH v1 bpf-next 0/7] BPF Refcount followups 3: bpf_mem_free_rcu refcounted nodes Yonghong Song
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=269b5c18-69ac-a9c4-3596-84ddc82b8877@linux.dev \
--to=yonghong.song@linux.dev \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=davemarchevsky@fb.com \
--cc=davemarchevsky@gmail.com \
--cc=kernel-team@fb.com \
--cc=martin.lau@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox