public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Yonghong Song <yhs@fb.com>
To: Brendan Jackman <jackmanb@google.com>, <bpf@vger.kernel.org>
Cc: <ast@kernel.org>, <daniel@iogearbox.net>, <andrii@kernel.org>,
	<linux-kernel@vger.kernel.org>
Subject: Re: Help with verifier failure
Date: Wed, 21 Apr 2021 08:06:35 -0700	[thread overview]
Message-ID: <94c4f7b0-c64e-e580-7d9b-a0a65e2fe33d@fb.com> (raw)
In-Reply-To: <20210421122348.547922-1-jackmanb@google.com>



On 4/21/21 5:23 AM, Brendan Jackman wrote:
> Hi,
> 
> Recently when our internal Clang build was updated to 0e92cbd6a652 we started
> hitting a verifier issue that I can't see an easy fix for. I've narrowed it down
> to a minimal reproducer - this email is a patch to add that repro as a prog
> test (./test_progs -t example).
> 
> Here's the BPF code I get from the attached source:
> 
> 0000000000000000 <exec>:
> ; int BPF_PROG(exec, struct linux_binprm *bprm) {
>         0:       79 11 00 00 00 00 00 00 r1 = *(u64 *)(r1 + 0)
>         1:       7b 1a e8 ff 00 00 00 00 *(u64 *)(r10 - 24) = r1
> ;   uint64_t args_size = bprm->argc & 0xFFFFFFF;
>         2:       61 17 58 00 00 00 00 00 r7 = *(u32 *)(r1 + 88)
>         3:       b4 01 00 00 00 00 00 00 w1 = 0
> ;   int map_key = 0;
>         4:       63 1a fc ff 00 00 00 00 *(u32 *)(r10 - 4) = r1
>         5:       bf a2 00 00 00 00 00 00 r2 = r10
>         6:       07 02 00 00 fc ff ff ff r2 += -4
> ;   void *buf = bpf_map_lookup_elem(&buf_map, &map_key);
>         7:       18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
>         9:       85 00 00 00 01 00 00 00 call 1
>        10:       7b 0a f0 ff 00 00 00 00 *(u64 *)(r10 - 16) = r0
>        11:       57 07 00 00 ff ff ff 0f r7 &= 268435455
>        12:       bf 76 00 00 00 00 00 00 r6 = r7
> ;   if (!buf)
>        13:       16 07 12 00 00 00 00 00 if w7 == 0 goto +18 <LBB0_7>
>        14:       79 a1 f0 ff 00 00 00 00 r1 = *(u64 *)(r10 - 16)
>        15:       15 01 10 00 00 00 00 00 if r1 == 0 goto +16 <LBB0_7>
>        16:       b4 09 00 00 00 00 00 00 w9 = 0
>        17:       b7 01 00 00 00 10 00 00 r1 = 4096
>        18:       bf 68 00 00 00 00 00 00 r8 = r6
>        19:       05 00 0e 00 00 00 00 00 goto +14 <LBB0_3>
> 
> 00000000000000a0 <LBB0_5>:
> ;     void *src = (void *)(char *)bprm->p + offset;
>        20:       79 a1 e8 ff 00 00 00 00 r1 = *(u64 *)(r10 - 24)
>        21:       79 13 18 00 00 00 00 00 r3 = *(u64 *)(r1 + 24)
> ;     uint64_t read_size = args_size - offset;
>        22:       0f 73 00 00 00 00 00 00 r3 += r7
>        23:       07 03 00 00 00 f0 ff ff r3 += -4096
> ;     (void) bpf_probe_read_user(buf, read_size, src);
>        24:       79 a1 f0 ff 00 00 00 00 r1 = *(u64 *)(r10 - 16)
>        25:       85 00 00 00 70 00 00 00 call 112
> ;   for (int i = 0; i < 512 && offset < args_size; i++) {
>        26:       26 09 05 00 fe 01 00 00 if w9 > 510 goto +5 <LBB0_7>
>        27:       07 08 00 00 00 f0 ff ff r8 += -4096
>        28:       bf 71 00 00 00 00 00 00 r1 = r7
>        29:       07 01 00 00 00 10 00 00 r1 += 4096
>        30:       04 09 00 00 01 00 00 00 w9 += 1
> ;   for (int i = 0; i < 512 && offset < args_size; i++) {
>        31:       ad 67 02 00 00 00 00 00 if r7 < r6 goto +2 <LBB0_3>
> 
> 0000000000000100 <LBB0_7>:
> ; int BPF_PROG(exec, struct linux_binprm *bprm) {
>        32:       b4 00 00 00 00 00 00 00 w0 = 0
>        33:       95 00 00 00 00 00 00 00 exit
> 
> 0000000000000110 <LBB0_3>:
>        34:       bf 17 00 00 00 00 00 00 r7 = r1
> ;     (void) bpf_probe_read_user(buf, read_size, src);
>        35:       bc 82 00 00 00 00 00 00 w2 = w8
>        36:       a5 08 ef ff 00 10 00 00 if r8 < 4096 goto -17 <LBB0_5>
>        37:       b4 02 00 00 00 10 00 00 w2 = 4096
>        38:       05 00 ed ff 00 00 00 00 goto -19 <LBB0_5>
> 
> 
> The full log I get is at
> https://gist.githubusercontent.com/bjackman/2928c4ff4cc89545f3993bddd9d5edb2/raw/feda6d7c165d24be3ea72c3cf7045c50246abd83/gistfile1.txt ,
> but basically the verifier runs through the loop a large number of times, going
> down the true path of the `if (read_size > CHUNK_LEN)` every time. Then
> eventually it takes the false path.
> 
> In the disassembly this is basically instructions 35-37 - pseudocode:
>    w2 = w8
>    if (r8 < 4096) {
>      w2 = 4096
>    }
> 
> w2 can't exceed 4096 but the verifier doesn't seem to "backpropagate" those
> bounds from r8 (note the umax_value for R8 goes to 4095 after the branch from 36
> to 20, but R2's umax_value is still 266342399)
> 
> from 31 to 34: R0_w=inv(id=0) R1_w=inv2097152 R6=inv(id=2,umin_value=2093057,umax_value=268435455,var_off=(0x0; 0xfffffff)) R7_w=inv2093056 R8_w=inv(id=0,umax_value=266342399,var_off=(0x0; 0xfffffff)) R9_w=invP511 R10=fp0 fp-8=mmmm???? fp-16=map_value fp-24=ptr_
> ; int BPF_PROG(exec, struct linux_binprm *bprm) {
> 34: (bf) r7 = r1
> ; (void) bpf_probe_read_user(buf, read_size, src);
> 35: (bc) w2 = w8
> 36: (a5) if r8 < 0x1000 goto pc-17
> 
> from 36 to 20: R0_w=inv(id=0) R1_w=inv2097152 R2_w=inv(id=0,umax_value=266342399,var_off=(0x0; 0xfffffff)) R6=inv(id=2,umin_value=2093057,umax_value=268435455,var_off=(0x0; 0xfffffff)) R7_w=inv2097152 R8_w=inv(id=0,umax_value=4095,var_off=(0x0; 0xfff)) R9_w=invP511 R10=fp0 fp-8=mmmm???? fp-16=map_value fp-24=ptr_
> ; void *src = (void *)(char *)bprm->p + offset;
> 20: (79) r1 = *(u64 *)(r10 -24)
> 21: (79) r3 = *(u64 *)(r1 +24)
> ; uint64_t read_size = args_size - offset;
> 22: (0f) r3 += r7
> 23: (07) r3 += -4096
> ; (void) bpf_probe_read_user(buf, read_size, src);
> 24: (79) r1 = *(u64 *)(r10 -16)
> 25: (85) call bpf_probe_read_user#112
>   R0_w=inv(id=0) R1_w=map_value(id=0,off=0,ks=4,vs=4096,imm=0) R2_w=inv(id=0,umax_value=266342399,var_off=(0x0; 0xfffffff)) R3_w=inv(id=0) R6=inv(id=2,umin_value=2093057,umax_value=268435455,var_off=(0x0; 0xfffffff)) R7_w=inv2097152 R8_w=inv(id=0,umax_value=4095,var_off=(0x0; 0xfff)) R9_w=invP511 R10=fp0 fp-8=mmmm???? fp-16=map_value fp-24=ptr_
>   R0_w=inv(id=0) R1_w=map_value(id=0,off=0,ks=4,vs=4096,imm=0) R2_w=inv(id=0,umax_value=266342399,var_off=(0x0; 0xfffffff)) R3_w=inv(id=0) R6=inv(id=2,umin_value=2093057,umax_value=268435455,var_off=(0x0; 0xfffffff)) R7_w=inv2097152 R8_w=inv(id=0,umax_value=4095,var_off=(0x0; 0xfff)) R9_w=invP511 R10=fp0 fp-8=mmmm???? fp-16=map_value fp-24=ptr_
> invalid access to map value, value_size=4096 off=0 size=266342399
> R1 min value is outside of the allowed memory range
> processed 9239 insns (limit 1000000) max_states_per_insn 4 total_states 133 peak_states 133 mark_read 2

Thanks, Brendan. Looks at least the verifier failure is triggered
by recent clang changes. I will take a look whether we could
improve verifier for such a case and whether we could improve
clang to avoid generate such codes the verifier doesn't like.
Will get back to you once I had concrete analysis.

> 
> This seems like it must be a common pitfall, any idea what we can do to fix it
> and avoid it in future? Am I misunderstanding the issue?
> 
> Cheers,
> Brendan
> 
[...]

  reply	other threads:[~2021-04-21 15:07 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-21 12:23 Help with verifier failure Brendan Jackman
2021-04-21 15:06 ` Yonghong Song [this message]
2021-04-21 16:59   ` Yonghong Song
2021-04-22 13:55     ` Brendan Jackman
2021-04-22 14:35       ` Yonghong Song
2021-05-07 15:05         ` Brendan Jackman
2021-05-07 18:32           ` Yonghong Song
2021-04-21 16:41 ` John Fastabend

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=94c4f7b0-c64e-e580-7d9b-a0a65e2fe33d@fb.com \
    --to=yhs@fb.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=jackmanb@google.com \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox