netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Yonghong Song <yhs@meta.com>
To: "Björn Töpel" <bjorn@kernel.org>,
	"Ilya Leoshkevich" <iii@linux.ibm.com>,
	"Alexei Starovoitov" <ast@kernel.org>,
	"Daniel Borkmann" <daniel@iogearbox.net>,
	"John Fastabend" <john.fastabend@gmail.com>,
	bpf@vger.kernel.org, netdev@vger.kernel.org
Cc: "Björn Töpel" <bjorn@rivosinc.com>,
	"Brendan Jackman" <jackmanb@google.com>
Subject: Re: [PATCH bpf] bpf: Proper R0 zero-extension for BPF_CALL instructions
Date: Tue, 6 Dec 2022 17:01:17 -0800	[thread overview]
Message-ID: <cd7a6e8d-2de1-d5a0-cf4a-09188f01fa7e@meta.com> (raw)
In-Reply-To: <87v8mos7gw.fsf@all.your.base.are.belong.to.us>



On 12/6/22 10:38 AM, Björn Töpel wrote:
> Yonghong Song <yhs@meta.com> writes:
> 
>> On 12/6/22 9:47 AM, Yonghong Song wrote:
>>>
>>>
>>> On 12/6/22 5:21 AM, Ilya Leoshkevich wrote:
>>>> On Fri, 2022-12-02 at 11:36 +0100, Björn Töpel wrote:
>>>>> From: Björn Töpel <bjorn@rivosinc.com>
>>>>>
>>>>> A BPF call instruction can be, correctly, marked with zext_dst set to
>>>>> true. An example of this can be found in the BPF selftests
>>>>> progs/bpf_cubic.c:
>>>>>
>>>>>     ...
>>>>>     extern __u32 tcp_reno_undo_cwnd(struct sock *sk) __ksym;
>>>>>
>>>>>     __u32 BPF_STRUCT_OPS(bpf_cubic_undo_cwnd, struct sock *sk)
>>>>>     {
>>>>>             return tcp_reno_undo_cwnd(sk);
>>>>>     }
>>>>>     ...
>>>>>
>>>>> which compiles to:
>>>>>     0:  r1 = *(u64 *)(r1 + 0x0)
>>>>>     1:  call -0x1
>>>>>     2:  exit
>>>>>
>>>>> The call will be marked as zext_dst set to true, and for some
>>>>> backends
>>>>> (bpf_jit_needs_zext() returns true) expanded to:
>>>>>     0:  r1 = *(u64 *)(r1 + 0x0)
>>>>>     1:  call -0x1
>>>>>     2:  w0 = w0
>>>>>     3:  exit
>>>>
>>>> In the verifier, the marking is done by check_kfunc_call() (added in
>>>> e6ac2450d6de), right? So the problem occurs only for kfuncs?
>>>>
>>>>           /* Check return type */
>>>>           t = btf_type_skip_modifiers(desc_btf, func_proto->type, NULL);
>>>>
>>>>           ...
>>>>
>>>>           if (btf_type_is_scalar(t)) {
>>>>                   mark_reg_unknown(env, regs, BPF_REG_0);
>>>>                   mark_btf_func_reg_size(env, BPF_REG_0, t->size);
>>>>
>>>> I tried to find some official information whether the eBPF calling
>>>> convention requires sign- or zero- extending return values and
>>>> arguments, but unfortunately [1] doesn't mention this.
>>>>
>>>> LLVM's lib/Target/BPF/BPFCallingConv.td mentions both R* and W*
>>>> registers, but since assigning to W* leads to zero-extension, it seems
>>>> to me that this is the case.
>>>
>>> We actually follow the clang convention, the zero-extension is either
>>> done in caller or callee, but not both. See
>>> https://reviews.llvm.org/D131598   how the convention could be changed.
>>>
>>> The following is an example.
>>>
>>> $ cat t.c
>>> extern unsigned foo(void);
>>> unsigned bar1(void) {
>>>       return foo();
>>> }
>>> unsigned bar2(void) {
>>>       if (foo()) return 10; else return 20;
>>> }
>>> $ clang -target bpf -mcpu=v3 -O2 -c t.c && llvm-objdump -d t.o
>>>
>>> t.o:    file format elf64-bpf
>>>
>>> Disassembly of section .text:
>>>
>>> 0000000000000000 <bar1>:
>>>          0:       85 10 00 00 ff ff ff ff call -0x1
>>>          1:       95 00 00 00 00 00 00 00 exit
>>>
>>> 0000000000000010 <bar2>:
>>>          2:       85 10 00 00 ff ff ff ff call -0x1
>>>          3:       bc 01 00 00 00 00 00 00 w1 = w0
>>>          4:       b4 00 00 00 14 00 00 00 w0 = 0x14
>>>          5:       16 01 01 00 00 00 00 00 if w1 == 0x0 goto +0x1 <LBB1_2>
>>>          6:       b4 00 00 00 0a 00 00 00 w0 = 0xa
>>>
>>> 0000000000000038 <LBB1_2>:
>>>          7:       95 00 00 00 00 00 00 00 exit
>>> $
>>>
>>> If the return value of 'foo()' is actually used in the bpf program, the
>>> proper zero extension will be done. Otherwise, it is not done.
>>>
>>> This is with latest llvm16. I guess we need to check llvm whether
>>> we could enforce to add a w0 = w0 in bar1().
>>>
>>> Otherwise, with this patch, it will add w0 = w0 in all cases which
>>> is not necessary in most of practical cases.
>>>
>>>>
>>>> If the above is correct, then shouldn't we rather use sizeof(void *) in
>>>> the mark_btf_func_reg_size() call above?
>>>>
>>>>> The opt_subreg_zext_lo32_rnd_hi32() function which is responsible for
>>>>> the zext patching, relies on insn_def_regno() to fetch the register
>>>>> to
>>>>> zero-extend. However, this function does not handle call instructions
>>>>> correctly, and opt_subreg_zext_lo32_rnd_hi32() fails the
>>>>> verification.
>>>>>
>>>>> Make sure that R0 is correctly resolved for (BPF_JMP | BPF_CALL)
>>>>> instructions.
>>>>>
>>>>> Fixes: 83a2881903f3 ("bpf: Account for BPF_FETCH in
>>>>> insn_has_def32()")
>>>>> Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
>>>>> ---
>>>>> I'm not super happy about the additional special case -- first
>>>>> cmpxchg, and now call. :-( A more elegant/generic solution is
>>>>> welcome!
>>>>> ---
>>>>>    kernel/bpf/verifier.c | 3 +++
>>>>>    1 file changed, 3 insertions(+)
>>>>>
>>>>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>>>>> index 264b3dc714cc..4f9660eafc72 100644
>>>>> --- a/kernel/bpf/verifier.c
>>>>> +++ b/kernel/bpf/verifier.c
>>>>> @@ -13386,6 +13386,9 @@ static int
>>>>> opt_subreg_zext_lo32_rnd_hi32(struct bpf_verifier_env *env,
>>>>>                   if (!bpf_jit_needs_zext() && !is_cmpxchg_insn(&insn))
>>>>>                           continue;
>>>>> +               if (insn.code == (BPF_JMP | BPF_CALL))
>>>>> +                       load_reg = BPF_REG_0;
>>
>> Want to double check. Do we actually have a problem here?
>> For example, on x64, we probably won't have this issue.
> 
> The "problem" is that I hit this:
> 		if (WARN_ON(load_reg == -1)) {
> 			verbose(env, "verifier bug. zext_dst is set, but no reg is defined\n");
> 			return -EFAULT;
> 		}
> 
> This path is only taken for archs which have bpf_jit_needs_zext() ==
> true. In my case it's riscv64, but it should hit i386, sparc, s390, ppc,
> mips, and arm.
> 
> My reading of this thread has been that "marking the call has
> zext_dst=true, is incorrect", i.e. that LLVM will insert the correct
> zext instructions.

Your interpretation is correct. Yes, for func return values, the
llvm will insert correct zext/sext instructions if the return
value is used. Otherwise, if the return value simply passes
through, the caller call site should handle that properly.

So, yes changing t->size to sizeof(u64) in below code in
check_kfunc_call() should work. But the fix sounds like a hack
and we might have some side effect during verification, now
or future.

Maybe we could check BPF_PSEUDO_KFUNC_CALL in appropriate place to 
prevent zext.

> 
> So, on way of not hitting this path, is what Ilya suggest -- in
> check_kfunc_call():
> 
>    if (btf_type_is_scalar(t)) {
>    	mark_reg_unknown(env, regs, BPF_REG_0);
>    	mark_btf_func_reg_size(env, BPF_REG_0, t->size);
>    }
> 
> change t->size to sizeof(u64). Then the call wont be marked.
> 
>>   >>>    ...
>>   >>>    extern __u32 tcp_reno_undo_cwnd(struct sock *sk) __ksym;
>>   >>>
>>   >>>    __u32 BPF_STRUCT_OPS(bpf_cubic_undo_cwnd, struct sock *sk)
>>   >>>    {
>>   >>>            return tcp_reno_undo_cwnd(sk);
>>   >>>    }
>>
>> The native code will return a 32-bit subreg to bpf program,
>> and bpf didn't do anything and return r0 to the kernel func.
>> In the kernel func, the kernel will take 32-bit subreg by
>> x86_64 convention. This applies to some other return types
>> like u8/s8/u16/s16/u32/s32.
>>
>> Which architecture you actually see the issue?
> 
> This is riscv64, but the nature of the problem is more of an assertion
> failure, than codegen AFAIK.
> 
> I hit is when I load progs/bpf_cubic.o from the selftest. Nightly clang
> from apt.llvm.org: clang version 16.0.0
> (++20221204034339+7a194cfb327a-1~exp1~20221204154444.167)
> 
> 
> Björn

  reply	other threads:[~2022-12-07  1:01 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-12-02 10:36 [PATCH bpf] bpf: Proper R0 zero-extension for BPF_CALL instructions Björn Töpel
2022-12-06 13:21 ` Ilya Leoshkevich
2022-12-06 13:49   ` Björn Töpel
2022-12-06 13:51     ` Ilya Leoshkevich
2022-12-06 17:47   ` Yonghong Song
2022-12-06 18:02     ` Björn Töpel
2022-12-06 18:09     ` Yonghong Song
2022-12-06 18:38       ` Björn Töpel
2022-12-07  1:01         ` Yonghong Song [this message]
2022-12-07  7:10           ` Björn Töpel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=cd7a6e8d-2de1-d5a0-cf4a-09188f01fa7e@meta.com \
    --to=yhs@meta.com \
    --cc=ast@kernel.org \
    --cc=bjorn@kernel.org \
    --cc=bjorn@rivosinc.com \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=iii@linux.ibm.com \
    --cc=jackmanb@google.com \
    --cc=john.fastabend@gmail.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).