Re: WARNING in __mark_chain_precision

BPF List
 help / color / mirror / Atom feed

From: Yonghong Song <yhs@meta.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>,
	Hao Sun <sunhao.th@gmail.com>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>,
	Stanislav Fomichev <sdf@google.com>,
	Andrii Nakryiko <andrii@kernel.org>, bpf <bpf@vger.kernel.org>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	John Fastabend <john.fastabend@gmail.com>,
	Martin KaFai Lau <martin.lau@linux.dev>,
	Song Liu <song@kernel.org>, Yonghong Song <yhs@fb.com>,
	KP Singh <kpsingh@kernel.org>, Hao Luo <haoluo@google.com>,
	Jiri Olsa <jolsa@kernel.org>, David Miller <davem@davemloft.net>,
	Jakub Kicinski <kuba@kernel.org>,
	Jesper Dangaard Brouer <hawk@kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: WARNING in __mark_chain_precision
Date: Tue, 3 Jan 2023 21:47:10 -0800	[thread overview]
Message-ID: <44be9647-0208-beaf-130d-92a036c95d04@meta.com> (raw)
In-Reply-To: <CAADnVQKy48LPz=f6XSHv84tZb7MFp8PrWrk3AxJLKwJR0tL=nA@mail.gmail.com>



On 1/3/23 10:27 AM, Alexei Starovoitov wrote:
> On Mon, Jan 2, 2023 at 1:42 AM Hao Sun <sunhao.th@gmail.com> wrote:
>>
>>
>>
>> Yonghong Song <yhs@meta.com> 于2023年1月2日周一 03:20写道：
>>>
>>>
>>>
>>> On 12/30/22 1:44 AM, Hao Sun wrote:
>>>>
>>>>
>>>> Andrii Nakryiko <andrii.nakryiko@gmail.com> 于2022年12月30日周五 06:16写道：
>>>>>
>>>>> On Tue, Dec 27, 2022 at 9:24 PM Yonghong Song <yhs@meta.com> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 12/20/22 4:30 PM, Andrii Nakryiko wrote:
>>>>>>> On Mon, Dec 19, 2022 at 11:13 AM <sdf@google.com> wrote:
>>>>>>>>
>>>>>>>> On 12/19, Hao Sun wrote:
>>>>>>>>> Hi,
>>>>>>>>
>>>>>>>>> The following backtracking bug can be triggered on the latest bpf-next and
>>>>>>>>> Linux 6.1 with the C prog provided. I don't have enough knowledge about
>>>>>>>>> this part in the verifier, don't know how to fix this.
>>>>>>>>
>>>>>>>> Maybe something related to commit be2ef8161572 ("bpf: allow precision
>>>>>>>> tracking
>>>>>>>> for programs with subprogs") and/or the related ones?
>>>>>>>>
>>>>>>>>
>>>>>>>>> This can be reproduced on:
>>>>>>>>
>>>>>>>>> HEAD commit: 0e43662e61f2 tools/resolve_btfids: Use pkg-config to locate
>>>>>>>>> libelf
>>>>>>>>> git tree: bpf-next
>>>>>>>>> console log: https://pastebin.com/raw/45hZ7iqm
>>>>>>>>> kernel config: https://pastebin.com/raw/0pu1CHRm
>>>>>>>>> C reproducer: https://pastebin.com/raw/tqsiezvT
>>>>>>>>
>>>>>>>>> func#0 @0
>>>>>>>>> 0: R1=ctx(off=0,imm=0) R10=fp0
>>>>>>>>> 0: (18) r2 = 0x8000000000000          ; R2_w=2251799813685248
>>>>>>>>> 2: (18) r6 = 0xffff888027358000       ;
>>>>>>>>> R6_w=map_ptr(off=0,ks=3032,vs=3664,imm=0)
>>>>>>>>> 4: (18) r7 = 0xffff88802735a000       ;
>>>>>>>>> R7_w=map_ptr(off=0,ks=156,vs=2624,imm=0)
>>>>>>>>> 6: (18) r8 = 0xffff88802735e000       ;
>>>>>>>>> R8_w=map_ptr(off=0,ks=2396,vs=76,imm=0)
>>>>>>>>> 8: (18) r9 = 0x8e9700000000           ; R9_w=156779191205888
>>>>>>>>> 10: (36) if w9 >= 0xffffffe3 goto pc+1
>>>>>>>>> last_idx 10 first_idx 0
>>>>>>>>> regs=200 stack=0 before 8: (18) r9 = 0x8e9700000000
>>>>>>>>> 11: R9_w=156779191205888
>>>>>>>>> 11: (85) call #0
>>>>>>>>> 12: (cc) w2 s>>= w7
>>>>>>>
>>>>>>> w2 should have been set to NOT_INIT (because r1-r5 are clobbered by
>>>>>>> calls) and rejected here as !read_ok (see check_reg_arg()) before
>>>>>>> attempting to mark precision for r2. Can you please try to debug and
>>>>>>> understand why that didn't happen here?
>>>>>>
>>>>>> The verifier is doing the right thing here and the 'call #0' does
>>>>>> implicitly cleared r1-r5.
>>>>>>
>>>>>> So for 'w2 s>>= w7', since w2 is used, the verifier tries to find
>>>>>> its definition by backtracing. It encountered 'call #0', which clears
>>>>>
>>>>> and that's what I'm saying is incorrect. Normally we'd get !read_ok
>>>>> error because s>>= is both READ and WRITE on w2, which is
>>>>> uninitialized after call instruction according to BPF ABI. And that's
>>>>> what actually seems to happen correctly in my (simpler) tests locally.
>>>>> But something is special about this specific repro that somehow either
>>>>> bypasses this logic, or attempts to mark precision before we get to
>>>>> that test. That's what we should investigate. I haven't tried to run
>>>>> this specific repro locally yet, so can't tell for sure.
>>>>>
>>>>
>>>> So, the reason why w2 is not marked as uninit is that the kfunc call in
>>>> the BPF program is invalid, "call #0", imm is zero, right?
>>>
>>> Yes, "call #0" is invalid. As the code below
>>>
>>>> /* skip for now, but return error when we find this in
>>> fixup_kfunc_call */
>>>>   if (!insn->imm)
>>>>   return 0;
>>>
>>> The error report will be delayed later in fixup_kfunc_call().
>>>
>>> static int fixup_kfunc_call(struct bpf_verifier_env *env, struct
>>> bpf_insn *insn,
>>>                              struct bpf_insn *insn_buf, int insn_idx,
>>> int *cnt)
>>> {
>>>          const struct bpf_kfunc_desc *desc;
>>>
>>>          if (!insn->imm) {
>>>                  verbose(env, "invalid kernel function call not
>>> eliminated in verifier pass\n");
>>>                  return -EINVAL;
>>>          }
>>>
>>>
>>>> In check_kfunc_call(), it skips this error temporarily:
>>>>
>>>> /* skip for now, but return error when we find this in fixup_kfunc_call */
>>>>   if (!insn->imm)
>>>>   return 0;
>>>>
>>>> So the kfunc call is the previous instruction before "w2 s>>= w7", this
>>>> leads to the warning in backtrack_insn():
>>>>
>>>> /* regular helper call sets R0 */
>>>> *reg_mask &= ~1;
>>>> if (*reg_mask & 0x3f) {
>>>>       /* if backtracing was looking for registers R1-R5
>>>>       * they should have been found already.
>>>>       */
>>>>       verbose(env, "BUG regs %x\n", *reg_mask);
>>>>       WARN_ONCE(1, "verifier backtracking bug”);
>>>>       return -EFAULT;
>>>> }
>>>
>>> The main triggering the backtrack_insn() is due to
>>>
>>>                          } else {
>>>                                  /* scalar += pointer
>>>                                   * This is legal, but we have to
>>> reverse our
>>>                                   * src/dest handling in computing the range
>>>                                   */
>>>                                  err = mark_chain_precision(env,
>>> insn->dst_reg);
>>>                                  if (err)
>>>                                          return err;
>>>                                  return adjust_ptr_min_max_vals(env, insn,
>>>                                                                 src_reg,
>>> dst_reg);
>>>                          }
>>>
>>>
>>> unc#0 @0
>>> 0: R1=ctx(off=0,imm=0) R10=fp0
>>> 0: (18) r2 = 0x8000000000000          ; R2_w=2251799813685248
>>> 2: (18) r6 = 0xffff888100d29000       ;
>>> R6_w=map_ptr(off=0,ks=3032,vs=3664,imm=0)
>>> 4: (18) r7 = 0xffff888100d2a000       ;
>>> R7_w=map_ptr(off=0,ks=156,vs=2624,imm=0)
>>> 6: (18) r8 = 0xffff888100d2ac00       ;
>>> R8_w=map_ptr(off=0,ks=2396,vs=76,imm=0)
>>> 8: (18) r9 = 0x8e9700000000           ; R9_w=156779191205888
>>> 10: (36) if w9 >= 0xffffffe3 goto pc+1
>>> last_idx 10 first_idx 0
>>> regs=200 stack=0 before 8: (18) r9 = 0x8e9700000000
>>> 11: R9_w=156779191205888
>>> 11: (85) call #0
>>> 12: (cc) w2 s>>= w7
>>> last_idx 12 first_idx 12
>>> parent didn't have regs=4 stack=0 marks: R1=ctx(off=0,imm=0)
>>> R2_rw=P2251799813685248 R6_w=map_ptr(off=0,ks=3032,vs=3664,imm=0)
>>> R7_rw=map_ptr(off=0,ks=156,vs=2624,imm=0) R8_w=map_ptr(off=0,ks=2396,v0
>>> last_idx 11 first_idx 0
>>> regs=4 stack=0 before 11: (85) call #0
>>> BUG regs 4
>>>
>>> For insn 12, 'w2 s>>= w7', w2 is a scalar and w7 is a map_ptr. Hence,
>>> based on the above verifier code, mark_chain_precision() is triggered.
>>>
>>> Not sure what is the purpose of this test. But to make it succeed,
>>> first "call #0" need to change to a valid kfunc call, and second, you
>>> might want to change 'w2 s>>= w7' to e.g., 'w9 s>>= w7' to avoid
>>> precision tracking.
>>>
>>
>> The purpose is not to make the test "succeed", the verifier temporarily
>> skips the invalid kfunc insn "call #0", but this insn triggered a warning
>> in backtrack_insn(), while it is supposed to reject the program either
>> due to insn#12 32bit ptr alu or insn#11 invalid kfunc.
>>
>> Maybe something like the bellow, after applying the patch, the reproducer
>> is rejected:
>>
>> func#0 @0
>> 0: R1=ctx(off=0,imm=0) R10=fp0
>> 0: (18) r2 = 0x8000000000000          ; R2_w=2251799813685248
>> 2: (18) r6 = 0xffff88817d563000       ; R6_w=map_ptr(off=0,ks=3032,vs=3664,imm=0)
>> 4: (18) r7 = 0xffff888171ee9000       ; R7_w=map_ptr(off=0,ks=156,vs=2624,imm=0)
>> 6: (18) r8 = 0xffff888171ee8000       ; R8_w=map_ptr(off=0,ks=2396,vs=76,imm=0)
>> 8: (18) r9 = 0x8e9700000000           ; R9_w=156779191205888
>> 10: (36) if w9 >= 0xffffffe3 goto pc+1
>> last_idx 10 first_idx 0
>> regs=200 stack=0 before 8: (18) r9 = 0x8e9700000000
>> 11: R9_w=156779191205888
>> 11: (85) call #0
>> 12: (cc) w2 s>>= w7
>> last_idx 12 first_idx 12
>> parent didn't have regs=4 stack=0 marks: R1=ctx(off=0,imm=0) R2_rw=P2251799813685248 R6_w=map_ptr(off=0,ks=3032,vs=3664,imm=0) R7_rw=map_ptr(off=0,ks=156,vs=2624,imm=0) R8_w=map_ptr(off=0,ks=2396,vs=76,imm=0) R9_w=156779191205888 R10=fp0
>> last_idx 11 first_idx 0
>> regs=4 stack=0 before 11: (85) call #0
>> regs=4 stack=0 before 10: (36) if w9 >= 0xffffffe3 goto pc+1
>> regs=4 stack=0 before 8: (18) r9 = 0x8e9700000000
>> regs=4 stack=0 before 6: (18) r8 = 0xffff888171ee8000
>> regs=4 stack=0 before 4: (18) r7 = 0xffff888171ee9000
>> regs=4 stack=0 before 2: (18) r6 = 0xffff88817d563000
>> regs=4 stack=0 before 0: (18) r2 = 0x8000000000000
>> R2 32-bit pointer arithmetic prohibited
>> processed 8 insns (limit 1000000) max_states_per_insn 0 total_states 1 peak_states 1 mark_read 1
>>
>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>> index 4a25375ebb0d..abc7e96d826f 100644
>> --- a/kernel/bpf/verifier.c
>> +++ b/kernel/bpf/verifier.c
>> @@ -2743,6 +2743,9 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx,
>>                          *reg_mask |= sreg;
>>          } else if (class == BPF_JMP || class == BPF_JMP32) {
>>                  if (opcode == BPF_CALL) {
>> +                       /* skip for now, should return error when we find this in fixup_kfunc_call */
>> +                       if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL && insn->imm == 0)
>> +                               return 0;
> 
> 
> Makes sense to me. Please submit as an official patch
> with s/return 0/return -ENOTSUPP/
> Also 'skip for now' isn't quite correct here.
> In check_kfunc_call() it's correct, since invalid kfunc with imm==0
> could be eliminated during dead code elimination,
> but since we're walking this insn here in backtrack_insn
> the dead code elimination is not going to kick in.
> So it's surely invalid kfunc call if we see it in backtrack_insn.
> The comment should probably be something like:
> /* kfunc with imm==0 is invalid and fixup_kfunc_call will catch
> this error later. Make backtracking conservative with ENOTSUPP. */

Do we have the same issue if we have
    call #1 <or some valid kfunc>
instead of
    call #0
?

next prev parent reply	other threads:[~2023-01-04  5:47 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-12-19  7:50 WARNING in __mark_chain_precision Hao Sun
2022-12-19 19:13 ` sdf
2022-12-21  0:30   ` Andrii Nakryiko
2022-12-28  5:23     ` Yonghong Song
2022-12-29 22:16       ` Andrii Nakryiko
2022-12-30  9:44         ` Hao Sun
2023-01-01 19:19           ` Yonghong Song
2023-01-02  9:42             ` Hao Sun
2023-01-03 18:27               ` Alexei Starovoitov
2023-01-04  5:47                 ` Yonghong Song [this message]
2023-01-04  8:52                   ` Hao Sun
2023-01-04 16:51                     ` Yonghong Song
  -- strict thread matches above, loose matches on Subject: below --
2019-07-08 15:27 syzbot
2019-07-09  3:49 ` Andrii Nakryiko
2019-07-09  4:08   ` syzbot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=44be9647-0208-beaf-130d-92a036c95d04@meta.com \
    --to=yhs@meta.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=andrii.nakryiko@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=haoluo@google.com \
    --cc=hawk@kernel.org \
    --cc=john.fastabend@gmail.com \
    --cc=jolsa@kernel.org \
    --cc=kpsingh@kernel.org \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=martin.lau@linux.dev \
    --cc=sdf@google.com \
    --cc=song@kernel.org \
    --cc=sunhao.th@gmail.com \
    --cc=yhs@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox