public inbox for bpf@vger.kernel.org
 help / color / mirror / Atom feed
From: Yonghong Song <yonghong.song@linux.dev>
To: Dave Marchevsky <davemarchevsky@fb.com>, bpf@vger.kernel.org
Cc: Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>,
	Martin KaFai Lau <martin.lau@kernel.org>,
	Kernel Team <kernel-team@fb.com>
Subject: Re: [PATCH v1 bpf-next 5/6] bpf: Mark direct ld of stashed bpf_{rb,list}_node as non-owning ref
Date: Mon, 30 Oct 2023 19:36:41 -0700	[thread overview]
Message-ID: <73604c03-75ea-4126-9d8c-38d9581b6d9f@linux.dev> (raw)
In-Reply-To: <20231025214007.2920506-6-davemarchevsky@fb.com>


On 10/25/23 2:40 PM, Dave Marchevsky wrote:
> This patch enables the following pattern:
>
>    /* mapval contains a __kptr pointing to refcounted local kptr */
>    mapval = bpf_map_lookup_elem(&map, &idx);
>    if (!mapval || !mapval->some_kptr) { /* omitted */ }
>
>    p = bpf_refcount_acquire(&mapval->some_kptr);
>
> Currently this doesn't work because bpf_refcount_acquire expects an
> owning or non-owning ref. The verifier defines non-owning ref as a type:
>
>    PTR_TO_BTF_ID | MEM_ALLOC | NON_OWN_REF
>
> while mapval->some_kptr is PTR_TO_BTF_ID | PTR_UNTRUSTED. It's possible
> to do the refcount_acquire by first bpf_kptr_xchg'ing mapval->some_kptr
> into a temp kptr, refcount_acquiring that, and xchg'ing back into
> mapval, but this is unwieldy and shouldn't be necessary.
>
> This patch modifies btf_ld_kptr_type such that user-allocated types are
> marked MEM_ALLOC and if those types have a bpf_{rb,list}_node they're
> marked NON_OWN_REF as well. Additionally, due to changes to
> bpf_obj_drop_impl earlier in this series, rcu_protected_object now
> returns true for all user-allocated types, resulting in
> mapval->some_kptr being marked MEM_RCU.
>
> After this patch's changes, mapval->some_kptr is now:
>
>    PTR_TO_BTF_ID | MEM_ALLOC | NON_OWN_REF | MEM_RCU
>
> which results in it passing the non-owning ref test, and the motivating
> example passing verification.
>
> Future work will likely get rid of special non-owning ref lifetime logic
> in the verifier, at which point we'll be able to delete the NON_OWN_REF
> flag entirely.
>
> Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
> ---
>   kernel/bpf/verifier.c | 36 +++++++++++++++++++++++++++++++-----
>   1 file changed, 31 insertions(+), 5 deletions(-)
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 857d76694517..bb098a4c8fd5 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -5396,10 +5396,23 @@ BTF_SET_END(rcu_protected_types)
>   static bool rcu_protected_object(const struct btf *btf, u32 btf_id)
>   {
>   	if (!btf_is_kernel(btf))
> -		return false;
> +		return true;
>   	return btf_id_set_contains(&rcu_protected_types, btf_id);
>   }
>   
> +static struct btf_record *kptr_pointee_btf_record(struct btf_field *kptr_field)
> +{
> +	struct btf_struct_meta *meta;
> +
> +	if (btf_is_kernel(kptr_field->kptr.btf))
> +		return NULL;
> +
> +	meta = btf_find_struct_meta(kptr_field->kptr.btf,
> +				    kptr_field->kptr.btf_id);
> +
> +	return meta ? meta->record : NULL;
> +}
> +
>   static bool rcu_safe_kptr(const struct btf_field *field)
>   {
>   	const struct btf_field_kptr *kptr = &field->kptr;
> @@ -5410,12 +5423,25 @@ static bool rcu_safe_kptr(const struct btf_field *field)
>   
>   static u32 btf_ld_kptr_type(struct bpf_verifier_env *env, struct btf_field *kptr_field)
>   {
> +	struct btf_record *rec;
> +	u32 ret;
> +
> +	ret = PTR_MAYBE_NULL;
>   	if (rcu_safe_kptr(kptr_field) && in_rcu_cs(env)) {
> -		if (kptr_field->type != BPF_KPTR_PERCPU)
> -			return PTR_MAYBE_NULL | MEM_RCU;
> -		return PTR_MAYBE_NULL | MEM_RCU | MEM_PERCPU;
> +		ret |= MEM_RCU;
> +		if (kptr_field->type == BPF_KPTR_PERCPU)
> +			ret |= MEM_PERCPU;
> +		if (!btf_is_kernel(kptr_field->kptr.btf))
> +			ret |= MEM_ALLOC;
> +
> +		rec = kptr_pointee_btf_record(kptr_field);
> +		if (rec && btf_record_has_field(rec, BPF_GRAPH_NODE))
> +			ret |= NON_OWN_REF;
> +	} else {
> +		ret |= PTR_UNTRUSTED;
>   	}
> -	return PTR_MAYBE_NULL | PTR_UNTRUSTED;
> +
> +	return ret;
>   }

The CI reported a failure.
   https://github.com/kernel-patches/bpf/actions/runs/6675467065/job/18143577936?pr=5886

Error: #162 percpu_alloc
Error:#162/1 percpu_alloc/array
Error:#162/2 percpu_alloc/array_sleepable
Error:#162/3 percpu_alloc/cgrp_local_storage

Error: #162 percpu_alloc
Error:#162/1 percpu_alloc/array
Error: #162/1 percpu_alloc/array
test_array:PASS:percpu_alloc_array__open 0 nsec
libbpf: prog 'test_array_map_2': BPF program load failed: Permission denied
libbpf: prog 'test_array_map_2': -- BEGIN PROG LOAD LOG --
reg type unsupported for arg#0 function test_array_map_2#25
0: R1=ctx(off=0,imm=0) R10=fp0
; int BPF_PROG(test_array_map_2)
0: (b4) w1 = 0 ; R1_w=0
; int index = 0;
1: (63) *(u32 *)(r10 -4) = r1 ; R1_w=0 R10=fp0 fp-8=0000????
2: (bf) r2 = r10 ; R2_w=fp0 R10=fp0
;
3: (07) r2 += -4 ; R2_w=fp-4
; e = bpf_map_lookup_elem(&array, &index);
4: (18) r1 = 0xffffa3bd813c5400 ; R1_w=map_ptr(off=0,ks=4,vs=16,imm=0)
6: (85) call bpf_map_lookup_elem#1 ; 
R0_w=map_value_or_null(id=1,off=0,ks=4,vs=16,imm=0)
; if (!e)
7: (15) if r0 == 0x0 goto pc+9 ; R0_w=map_value(off=0,ks=4,vs=16,imm=0)
; p = e->pc;
8: (79) r1 = *(u64 *)(r0 +8) ; R0=map_value(off=0,ks=4,vs=16,imm=0) 
R1=percpu_rcu_ptr_or_null_val_t(id=2,off=0,imm=0)
; if (!p)
9: (15) if r1 == 0x0 goto pc+7 ; R1=percpu_rcu_ptr_val_t(off=0,imm=0)
; v = bpf_per_cpu_ptr(p, 0);
10: (b4) w2 = 0 ; R2_w=0
11: (85) call bpf_per_cpu_ptr#153
R1 type=percpu_rcu_ptr_ expected=percpu_ptr_, percpu_rcu_ptr_, 
percpu_trusted_ptr_
processed 11 insns (limit 1000000) max_states_per_insn 0 total_states 1 
peak_states 1 mark_read 1
-- END PROG LOAD LOG --
libbpf: prog 'test_array_map_2': failed to load: -13
libbpf: failed to load object 'percpu_alloc_array'
libbpf: failed to load BPF skeleton 'percpu_alloc_array': -13
test_array:FAIL:percpu_alloc_array__load unexpected error: -13 (errno 13)

The following hack can fix the issue.

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index bb098a4c8fd5..2bbda1f5e858 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -5431,7 +5431,7 @@ static u32 btf_ld_kptr_type(struct 
bpf_verifier_env *env, struct btf_field *kptr
                 ret |= MEM_RCU;
                 if (kptr_field->type == BPF_KPTR_PERCPU)
                         ret |= MEM_PERCPU;
-               if (!btf_is_kernel(kptr_field->kptr.btf))
+               else if (!btf_is_kernel(kptr_field->kptr.btf))
                         ret |= MEM_ALLOC;

                 rec = kptr_pointee_btf_record(kptr_field);


Note in the current kernel, MEM_RCU | MEM_PERCPU implies non-kernel 
kptr. The kernel PERCPU kptr has PTR_TRUSTED | MEM_PERCPU. So there is 
no MEM_ALLOC. Adding MEM_ALLOC might need changes in other places 
w.r.t., percpu non-kernel kptr.

Please take a look.

>   
>   static int check_map_kptr_access(struct bpf_verifier_env *env, u32 regno,

  reply	other threads:[~2023-10-31  2:36 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-25 21:40 [PATCH v1 bpf-next 0/6] Allow bpf_refcount_acquire of mapval Dave Marchevsky
2023-10-25 21:40 ` [PATCH v1 bpf-next 1/6] bpf: Add KF_RCU flag to bpf_refcount_acquire_impl Dave Marchevsky
2023-10-25 21:40 ` [PATCH v1 bpf-next 2/6] selftests/bpf: Add test passing MAYBE_NULL reg to bpf_refcount_acquire Dave Marchevsky
2023-10-25 21:40 ` [PATCH v1 bpf-next 3/6] bpf: Use bpf_mem_free_rcu when bpf_obj_dropping non-refcounted nodes Dave Marchevsky
2023-10-25 21:40 ` [PATCH v1 bpf-next 4/6] bpf: Move GRAPH_{ROOT,NODE}_MASK macros into btf_field_type enum Dave Marchevsky
2023-10-25 21:40 ` [PATCH v1 bpf-next 5/6] bpf: Mark direct ld of stashed bpf_{rb,list}_node as non-owning ref Dave Marchevsky
2023-10-31  2:36   ` Yonghong Song [this message]
2023-10-25 21:40 ` [PATCH v1 bpf-next 6/6] selftests/bpf: Test bpf_refcount_acquire of node obtained via direct ld Dave Marchevsky
2023-10-25 21:48 ` [PATCH v1 bpf-next 0/6] Allow bpf_refcount_acquire of mapval David Marchevsky
2023-10-31  2:38 ` Yonghong Song

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=73604c03-75ea-4126-9d8c-38d9581b6d9f@linux.dev \
    --to=yonghong.song@linux.dev \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=davemarchevsky@fb.com \
    --cc=kernel-team@fb.com \
    --cc=martin.lau@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox