From: Yonghong Song <yonghong.song@linux.dev>
To: Dave Marchevsky <davemarchevsky@fb.com>, bpf@vger.kernel.org
Cc: Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Andrii Nakryiko <andrii@kernel.org>,
Martin KaFai Lau <martin.lau@kernel.org>,
Kernel Team <kernel-team@fb.com>
Subject: Re: [PATCH v1 bpf-next 5/7] bpf: Consider non-owning refs to refcounted nodes RCU protected
Date: Tue, 1 Aug 2023 22:59:24 -0700 [thread overview]
Message-ID: <eda219c8-88ef-0907-377c-eb965c3f1008@linux.dev> (raw)
In-Reply-To: <20230801203630.3581291-6-davemarchevsky@fb.com>
On 8/1/23 1:36 PM, Dave Marchevsky wrote:
> The previous patch in the series ensures that the underlying memory of
> nodes with bpf_refcount - which can have multiple owners - is not reused
> until RCU Tasks Trace grace period has elapsed. This prevents
> use-after-free with non-owning references that may point to
> recently-freed memory. While RCU read lock is held, it's safe to
> dereference such a non-owning ref, as by definition RCU GP couldn't have
> elapsed and therefore underlying memory couldn't have been reused.
>
> From the perspective of verifier "trustedness" non-owning refs to
> refcounted nodes are now trusted only in RCU CS and therefore should no
> longer pass is_trusted_reg, but rather is_rcu_reg. Let's mark them
> MEM_RCU in order to reflect this new state.
>
> Similarly to bpf_spin_unlock being a non-owning ref invalidation point,
> where non-owning ref reg states are clobbered so that they cannot be
> used outside of the critical section, currently all MEM_RCU regs are
> marked untrusted after bpf_rcu_read_unlock. This patch makes
> bpf_rcu_read_unlock a non-owning ref invalidation point as well,
> clobbering the non-owning refs instead of marking untrusted. In the
> future we may want to allow untrusted non-owning refs in which case we
> can remove this custom logic without breaking BPF programs as it's more
> restrictive than the default. That's a big change in semantics, though,
> and this series is focused on fixing the use-after-free in most
> straightforward way.
>
> Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
> ---
> include/linux/bpf.h | 3 ++-
> kernel/bpf/verifier.c | 17 +++++++++++++++--
> 2 files changed, 17 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index ceaa8c23287f..37fba01b061a 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -653,7 +653,8 @@ enum bpf_type_flag {
> MEM_RCU = BIT(13 + BPF_BASE_TYPE_BITS),
>
> /* Used to tag PTR_TO_BTF_ID | MEM_ALLOC references which are non-owning.
> - * Currently only valid for linked-list and rbtree nodes.
> + * Currently only valid for linked-list and rbtree nodes. If the nodes
> + * have a bpf_refcount_field, they must be tagged MEM_RCU as well.
What does 'must' here mean?
> */
> NON_OWN_REF = BIT(14 + BPF_BASE_TYPE_BITS),
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 9014b469dd9d..4bda365000d3 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -469,7 +469,8 @@ static bool type_is_ptr_alloc_obj(u32 type)
>
> static bool type_is_non_owning_ref(u32 type)
> {
> - return type_is_ptr_alloc_obj(type) && type_flag(type) & NON_OWN_REF;
> + return type_is_ptr_alloc_obj(type) &&
> + type_flag(type) & NON_OWN_REF;
There is no code change here.
> }
>
> static struct btf_record *reg_btf_record(const struct bpf_reg_state *reg)
> @@ -8012,6 +8013,7 @@ int check_func_arg_reg_off(struct bpf_verifier_env *env,
> case PTR_TO_BTF_ID | PTR_TRUSTED:
> case PTR_TO_BTF_ID | MEM_RCU:
> case PTR_TO_BTF_ID | MEM_ALLOC | NON_OWN_REF:
> + case PTR_TO_BTF_ID | MEM_ALLOC | NON_OWN_REF | MEM_RCU:
> /* When referenced PTR_TO_BTF_ID is passed to release function,
> * its fixed offset must be 0. In the other cases, fixed offset
> * can be non-zero. This was already checked above. So pass
> @@ -10478,6 +10480,7 @@ static int process_kf_arg_ptr_to_btf_id(struct bpf_verifier_env *env,
> static int ref_set_non_owning(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
> {
> struct bpf_verifier_state *state = env->cur_state;
> + struct btf_record *rec = reg_btf_record(reg);
>
> if (!state->active_lock.ptr) {
> verbose(env, "verifier internal error: ref_set_non_owning w/o active lock\n");
> @@ -10490,6 +10493,9 @@ static int ref_set_non_owning(struct bpf_verifier_env *env, struct bpf_reg_state
> }
>
> reg->type |= NON_OWN_REF;
> + if (rec->refcount_off >= 0)
> + reg->type |= MEM_RCU;
Should we check whether the state is in rcu cs before marking MEM_RCU?
> +
> return 0;
> }
>
> @@ -11327,10 +11333,16 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
> struct bpf_func_state *state;
> struct bpf_reg_state *reg;
>
> + if (in_rbtree_lock_required_cb(env) && (rcu_lock || rcu_unlock)) {
> + verbose(env, "can't rcu read {lock,unlock} in rbtree cb\n");
> + return -EACCES;
> + }
> +
> if (rcu_lock) {
> verbose(env, "nested rcu read lock (kernel function %s)\n", func_name);
> return -EINVAL;
> } else if (rcu_unlock) {
> + invalidate_non_owning_refs(env);
If we have both spin lock and rcu like
bpf_rcu_read_lock()
...
bpf_spin_lock()
...
bpf_spin_unlock() <=== invalidate all non_owning_refs
... <=== MEM_RCU type is gone
bpf_rcu_read_unlock()
Maybe we could fine tune here to preserve MEM_RCU after bpf_spin_unlock()?
> bpf_for_each_reg_in_vstate(env->cur_state, state, reg, ({
> if (reg->type & MEM_RCU) {
> reg->type &= ~(MEM_RCU | PTR_MAYBE_NULL);
> @@ -16679,7 +16691,8 @@ static int do_check(struct bpf_verifier_env *env)
> return -EINVAL;
> }
>
> - if (env->cur_state->active_rcu_lock) {
> + if (env->cur_state->active_rcu_lock &&
> + !in_rbtree_lock_required_cb(env)) {
> verbose(env, "bpf_rcu_read_unlock is missing\n");
> return -EINVAL;
> }
next prev parent reply other threads:[~2023-08-02 5:59 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-08-01 20:36 [PATCH v1 bpf-next 0/7] BPF Refcount followups 3: bpf_mem_free_rcu refcounted nodes Dave Marchevsky
2023-08-01 20:36 ` [PATCH v1 bpf-next 1/7] bpf: Ensure kptr_struct_meta is non-NULL for collection insert and refcount_acquire Dave Marchevsky
2023-08-02 3:57 ` Yonghong Song
2023-08-02 19:23 ` Dave Marchevsky
2023-08-02 21:41 ` Yonghong Song
2023-08-04 6:17 ` David Marchevsky
2023-08-04 15:37 ` Yonghong Song
2023-08-01 20:36 ` [PATCH v1 bpf-next 2/7] bpf: Consider non-owning refs trusted Dave Marchevsky
2023-08-02 4:11 ` Yonghong Song
2023-08-01 20:36 ` [PATCH v1 bpf-next 3/7] bpf: Use bpf_mem_free_rcu when bpf_obj_dropping refcounted nodes Dave Marchevsky
2023-08-02 4:15 ` Yonghong Song
2023-08-01 20:36 ` [PATCH v1 bpf-next 4/7] bpf: Reenable bpf_refcount_acquire Dave Marchevsky
2023-08-02 5:21 ` Yonghong Song
2023-08-01 20:36 ` [PATCH v1 bpf-next 5/7] bpf: Consider non-owning refs to refcounted nodes RCU protected Dave Marchevsky
2023-08-02 5:59 ` Yonghong Song [this message]
2023-08-04 6:47 ` David Marchevsky
2023-08-04 15:43 ` Yonghong Song
2023-08-02 22:50 ` Alexei Starovoitov
2023-08-04 6:55 ` David Marchevsky
2023-08-01 20:36 ` [PATCH v1 bpf-next 6/7] [RFC] bpf: Allow bpf_spin_{lock,unlock} in sleepable prog's RCU CS Dave Marchevsky
2023-08-02 6:33 ` Yonghong Song
2023-08-02 22:55 ` Alexei Starovoitov
2023-08-01 20:36 ` [PATCH v1 bpf-next 7/7] selftests/bpf: Add tests for rbtree API interaction in sleepable progs Dave Marchevsky
2023-08-02 23:07 ` Alexei Starovoitov
2023-08-02 3:07 ` [PATCH v1 bpf-next 0/7] BPF Refcount followups 3: bpf_mem_free_rcu refcounted nodes Yonghong Song
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=eda219c8-88ef-0907-377c-eb965c3f1008@linux.dev \
--to=yonghong.song@linux.dev \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=davemarchevsky@fb.com \
--cc=kernel-team@fb.com \
--cc=martin.lau@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox