* [Bug Report] bpf: incorrectly pruning runtime execution path
@ 2023-12-11 15:31 Hao Sun
2023-12-13 0:51 ` Andrii Nakryiko
0 siblings, 1 reply; 24+ messages in thread
From: Hao Sun @ 2023-12-11 15:31 UTC (permalink / raw)
To: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, bpf
Cc: Linux Kernel Mailing List
Hi,
The verifier incorrectly prunes a path expected to be executed at
runtime. In the following program, the execution path is:
from 6 to 8 (taken) -> from 11 to 15 (taken) -> from 18 to 22
(taken) -> from 26 to 27 (fall-through) -> from 29 to 30
(fall-through)
The verifier prunes the checking path at #26, skipping the actual
execution path.
0: (18) r2 = 0x1a000000be
2: (bf) r5 = r1
3: (bf) r8 = r2
4: (bc) w4 = w5
5: (85) call bpf_get_current_cgroup_id#680112
6: (36) if w8 >= 0x69 goto pc+1
7: (95) exit
8: (18) r4 = 0x52
10: (84) w4 = -w4
11: (45) if r0 & 0xfffffffe goto pc+3
12: (1f) r8 -= r4
13: (0f) r0 += r0
14: (2f) r4 *= r4
15: (18) r3 = 0x1f00000034
17: (c4) w4 s>>= 29
18: (56) if w8 != 0xf goto pc+3
19: (d7) r3 = bswap32 r3
20: (18) r2 = 0x1c
22: (67) r4 <<= 2
23: (bf) r5 = r8
24: (18) r2 = 0x4
26: (7e) if w8 s>= w0 goto pc+5
27: (4f) r8 |= r8
28: (0f) r8 += r8
29: (d6) if w5 s<= 0x1d goto pc+2
30: (18) r0 = 0x4 ; incorrectly pruned here
32: (95) exit
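The claimed runtime path can be checked concretely with a small Python sketch of the listing above (a hand-written model, not kernel code; the value returned by bpf_get_current_cgroup_id() is the only free input):

```python
def s32(x):
    """Interpret the low 32 bits of x as a signed 32-bit value."""
    x &= 0xffffffff
    return x - (1 << 32) if x >= (1 << 31) else x

def trace(cgroup_id):
    """Concretely execute the listing above for a given
    bpf_get_current_cgroup_id() return value; only the registers that
    decide branches are tracked."""
    r8 = 0x1a000000be              # 0,3: r2 = 0x1a000000be; r8 = r2
    r0 = cgroup_id                 # 5: call bpf_get_current_cgroup_id
    if (r8 & 0xffffffff) < 0x69:   # 6: if w8 >= 0x69 (0xbe >= 0x69, taken)
        return "exit at 7"
    if not r0 & 0xfffffffe:        # 11: taken branch needed to reach 15
        return "fall-through at 11"
    if (r8 & 0xffffffff) == 0xf:   # 18: if w8 != 0xf (0xbe != 0xf, taken)
        return "fall-through at 18"
    r5 = r8                        # 23: r5 = r8
    if s32(r8) >= s32(r0):         # 26: if w8 s>= w0, taken -> 32
        return "exit at 32"
    if s32(r5) <= 0x1d:            # 29: if w5 s<= 0x1d, taken -> 32
        return "exit at 32"
    return "reaches 30"            # 30: r0 = 0x4; 32: exit

# e.g. a cgroup id of 0x100: w0 = 256 s> w8 = 0xbe = 190, and
# w5 = 0xbe s> 0x1d, so instruction 30 is reached at runtime.
print(trace(0x100))                # reaches 30
```

Any cgroup id whose low 32 bits, read as signed, exceed 0xbe drives execution to instruction 30, matching the path described above.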
-------- Verifier Log --------
func#0 @0
0: R1=ctx() R10=fp0
0: (18) r2 = 0x1a000000be ; R2_w=0x1a000000be
2: (bf) r5 = r1 ; R1=ctx() R5_w=ctx()
3: (bf) r8 = r2 ; R2_w=0x1a000000be R8_w=0x1a000000be
4: (bc) w4 = w5 ;
R4_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff))
R5_w=ctx()
5: (85) call bpf_get_current_cgroup_id#80 ; R0_w=scalar()
6: (36) if w8 >= 0x69 goto pc+1
mark_precise: frame0: last_idx 6 first_idx 0 subseq_idx -1
mark_precise: frame0: regs=r8 stack= before 5: (85) call
bpf_get_current_cgroup_id#80
mark_precise: frame0: regs=r8 stack= before 4: (bc) w4 = w5
mark_precise: frame0: regs=r8 stack= before 3: (bf) r8 = r2
mark_precise: frame0: regs=r2 stack= before 2: (bf) r5 = r1
mark_precise: frame0: regs=r2 stack= before 0: (18) r2 = 0x1a000000be
6: R8_w=0x1a000000be
8: (18) r4 = 0x52 ; R4_w=82
10: (84) w4 = -w4 ; R4=scalar()
11: (45) if r0 & 0xfffffffe goto pc+3 ;
R0=scalar(smin=smin32=0,smax=umax=smax32=umax32=1,var_off=(0x0; 0x1))
12: (1f) r8 -= r4 ; R4=scalar() R8_w=scalar()
13: (0f) r0 += r0 ;
R0_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=2,var_off=(0x0;
0x3))
14: (2f) r4 *= r4 ; R4_w=scalar()
15: (18) r3 = 0x1f00000034 ; R3_w=0x1f00000034
17: (c4) w4 s>>= 29 ;
R4_w=scalar(smin=0,smax=umax=0xffffffff,smin32=-4,smax32=3,var_off=(0x0;
0xffffffff))
18: (56) if w8 != 0xf goto pc+3 ;
R8_w=scalar(smin=0x800000000000000f,smax=0x7fffffff0000000f,umin=smin32=umin32=15,umax=0xffffffff0000000f,smax32=umax32=15,var_off=(0xf;
0xffffffff00000000))
19: (d7) r3 = bswap32 r3 ; R3_w=scalar()
20: (18) r2 = 0x1c ; R2=28
22: (67) r4 <<= 2 ;
R4_w=scalar(smin=0,smax=umax=0x3fffffffc,smax32=0x7ffffffc,umax32=0xfffffffc,var_off=(0x0;
0x3fffffffc))
23: (bf) r5 = r8 ;
R5_w=scalar(id=1,smin=0x800000000000000f,smax=0x7fffffff0000000f,umin=smin32=umin32=15,umax=0xffffffff0000000f,smax32=umax32=15,var_off=(0xf;
0xffffffff00000000))
R8=scalar(id=1,smin=0x800000000000000f,smax=0x7fffffff0000000f,umin=smin32=umin32=15,umax=0xffffffff0000000f,smax32=umax32=15,var_off=(0xf;
0xffffffff00000000))
24: (18) r2 = 0x4 ; R2_w=4
26: (7e) if w8 s>= w0 goto pc+5
mark_precise: frame0: last_idx 26 first_idx 22 subseq_idx -1
mark_precise: frame0: regs=r5,r8 stack= before 24: (18) r2 = 0x4
mark_precise: frame0: regs=r5,r8 stack= before 23: (bf) r5 = r8
mark_precise: frame0: regs=r8 stack= before 22: (67) r4 <<= 2
mark_precise: frame0: parent state regs=r8 stack=:
R0_rw=scalar(smin=smin32=0,smax=umax=smax32=umax32=2,var_off=(0x0;
0x3)) R2_w=28 R3_w=scalar()
R4_rw=scalar(smin=0,smax=umax=0xffffffff,smin32=-4,smax32=3,var_off=(0x0;
0xffffffff)) R8_rw=Pscalar(smin=0x800000000000000f,smax=0x7fffffff0000000f,umin=smin32=umin32=15,umax=0xffffffff0000000f,smax32=umax32=15,var_off=(0xf;
0xffffffff00000000)) R10=fp0
mark_precise: frame0: last_idx 20 first_idx 11 subseq_idx 22
mark_precise: frame0: regs=r8 stack= before 20: (18) r2 = 0x1c
mark_precise: frame0: regs=r8 stack= before 19: (d7) r3 = bswap32 r3
mark_precise: frame0: regs=r8 stack= before 18: (56) if w8 != 0xf goto pc+3
mark_precise: frame0: regs=r8 stack= before 17: (c4) w4 s>>= 29
mark_precise: frame0: regs=r8 stack= before 15: (18) r3 = 0x1f00000034
mark_precise: frame0: regs=r8 stack= before 14: (2f) r4 *= r4
mark_precise: frame0: regs=r8 stack= before 13: (0f) r0 += r0
mark_precise: frame0: regs=r8 stack= before 12: (1f) r8 -= r4
mark_precise: frame0: regs=r4,r8 stack= before 11: (45) if r0 &
0xfffffffe goto pc+3
mark_precise: frame0: parent state regs=r4,r8 stack=: R0_rw=scalar()
R4_rw=Pscalar() R8_rw=P0x1a000000be R10=fp0
mark_precise: frame0: last_idx 10 first_idx 0 subseq_idx 11
mark_precise: frame0: regs=r4,r8 stack= before 10: (84) w4 = -w4
mark_precise: frame0: regs=r4,r8 stack= before 8: (18) r4 = 0x52
mark_precise: frame0: regs=r8 stack= before 6: (36) if w8 >= 0x69 goto pc+1
mark_precise: frame0: regs=r8 stack= before 5: (85) call
bpf_get_current_cgroup_id#80
mark_precise: frame0: regs=r8 stack= before 4: (bc) w4 = w5
mark_precise: frame0: regs=r8 stack= before 3: (bf) r8 = r2
mark_precise: frame0: regs=r2 stack= before 2: (bf) r5 = r1
mark_precise: frame0: regs=r2 stack= before 0: (18) r2 = 0x1a000000be
mark_precise: frame0: last_idx 26 first_idx 22 subseq_idx -1
mark_precise: frame0: regs=r0 stack= before 24: (18) r2 = 0x4
mark_precise: frame0: regs=r0 stack= before 23: (bf) r5 = r8
mark_precise: frame0: regs=r0 stack= before 22: (67) r4 <<= 2
mark_precise: frame0: parent state regs=r0 stack=:
R0_rw=Pscalar(smin=smin32=0,smax=umax=smax32=umax32=2,var_off=(0x0;
0x3)) R2_w=28 R3_w=scalar()
R4_rw=scalar(smin=0,smax=umax=0xffffffff,smin32=-4,smax32=3,var_off=(0x0;
0xffffffff)) R8_rw=Pscalar(smin=0x800000000000000f,smax=0x7fffffff0000000f,umin=smin32=umin32=15,umax=0xffffffff0000000f,smax32=umax32=15,var_off=(0xf;
0xffffffff00000000)) R10=fp0
mark_precise: frame0: last_idx 20 first_idx 11 subseq_idx 22
mark_precise: frame0: regs=r0 stack= before 20: (18) r2 = 0x1c
mark_precise: frame0: regs=r0 stack= before 19: (d7) r3 = bswap32 r3
mark_precise: frame0: regs=r0 stack= before 18: (56) if w8 != 0xf goto pc+3
mark_precise: frame0: regs=r0 stack= before 17: (c4) w4 s>>= 29
mark_precise: frame0: regs=r0 stack= before 15: (18) r3 = 0x1f00000034
mark_precise: frame0: regs=r0 stack= before 14: (2f) r4 *= r4
mark_precise: frame0: regs=r0 stack= before 13: (0f) r0 += r0
mark_precise: frame0: regs=r0 stack= before 12: (1f) r8 -= r4
mark_precise: frame0: regs=r0 stack= before 11: (45) if r0 &
0xfffffffe goto pc+3
mark_precise: frame0: parent state regs=r0 stack=: R0_rw=Pscalar()
R4_rw=Pscalar() R8_rw=P0x1a000000be R10=fp0
mark_precise: frame0: last_idx 10 first_idx 0 subseq_idx 11
mark_precise: frame0: regs=r0 stack= before 10: (84) w4 = -w4
mark_precise: frame0: regs=r0 stack= before 8: (18) r4 = 0x52
mark_precise: frame0: regs=r0 stack= before 6: (36) if w8 >= 0x69 goto pc+1
mark_precise: frame0: regs=r0 stack= before 5: (85) call
bpf_get_current_cgroup_id#80
26: R0=scalar(smin=smin32=0,smax=umax=smax32=umax32=2,var_off=(0x0;
0x3)) R8=scalar(id=1,smin=0x800000000000000f,smax=0x7fffffff0000000f,umin=smin32=umin32=15,umax=0xffffffff0000000f,smax32=umax32=15,var_off=(0xf;
0xffffffff00000000))
32: (95) exit
from 18 to 22: R0_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=2,var_off=(0x0;
0x3)) R3_w=0x1f00000034
R4_w=scalar(smin=0,smax=umax=0xffffffff,smin32=-4,smax32=3,var_off=(0x0;
0xffffffff)) R8_w=scalar() R10=fp0
22: R0_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=2,var_off=(0x0;
0x3)) R3_w=0x1f00000034
R4_w=scalar(smin=0,smax=umax=0xffffffff,smin32=-4,smax32=3,var_off=(0x0;
0xffffffff)) R8_w=scalar() R10=fp0
22: (67) r4 <<= 2 ;
R4_w=scalar(smin=0,smax=umax=0x3fffffffc,smax32=0x7ffffffc,umax32=0xfffffffc,var_off=(0x0;
0x3fffffffc))
23: (bf) r5 = r8 ; R5_w=scalar(id=2) R8_w=scalar(id=2)
24: (18) r2 = 0x4 ; R2=4
26: (7e) if w8 s>= w0 goto pc+5 ;
R0=scalar(smin=smin32=0,smax=umax=smax32=umax32=2,var_off=(0x0; 0x3))
R8=scalar(id=2,smax32=1)
27: (4f) r8 |= r8 ; R8_w=scalar()
28: (0f) r8 += r8 ; R8_w=scalar()
29: (d6) if w5 s<= 0x1d goto pc+2
mark_precise: frame0: last_idx 29 first_idx 26 subseq_idx -1
mark_precise: frame0: regs=r5 stack= before 28: (0f) r8 += r8
mark_precise: frame0: regs=r5 stack= before 27: (4f) r8 |= r8
mark_precise: frame0: regs=r5 stack= before 26: (7e) if w8 s>= w0 goto pc+5
mark_precise: frame0: parent state regs=r5 stack=:
R0_rw=scalar(smin=smin32=0,smax=umax=smax32=umax32=2,var_off=(0x0;
0x3)) R2_w=4 R3_w=0x1f00000034
R4_w=scalar(smin=0,smax=umax=0x3fffffffc,smax32=0x7ffffffc,umax32=0xfffffffc,var_off=(0x0;
0x3fffffffc)) R5_rw=Pscalar(id=2) R8_rw=scalar(id=2) R10=fp0
mark_precise: frame0: last_idx 24 first_idx 11 subseq_idx 26
mark_precise: frame0: regs=r5,r8 stack= before 24: (18) r2 = 0x4
mark_precise: frame0: regs=r5,r8 stack= before 23: (bf) r5 = r8
mark_precise: frame0: regs=r8 stack= before 22: (67) r4 <<= 2
mark_precise: frame0: regs=r8 stack= before 18: (56) if w8 != 0xf goto pc+3
mark_precise: frame0: regs=r8 stack= before 17: (c4) w4 s>>= 29
mark_precise: frame0: regs=r8 stack= before 15: (18) r3 = 0x1f00000034
mark_precise: frame0: regs=r8 stack= before 14: (2f) r4 *= r4
mark_precise: frame0: regs=r8 stack= before 13: (0f) r0 += r0
mark_precise: frame0: regs=r8 stack= before 12: (1f) r8 -= r4
mark_precise: frame0: regs=r4,r8 stack= before 11: (45) if r0 &
0xfffffffe goto pc+3
mark_precise: frame0: parent state regs= stack=: R0_rw=Pscalar()
R4_rw=Pscalar() R8_rw=P0x1a000000be R10=fp0
29: R5=scalar(id=2,smax32=1)
32: (95) exit
from 26 to 32: R0=scalar(smin=smin32=0,smax=umax=smax32=umax32=2,var_off=(0x0;
0x3)) R2=4 R3=0x1f00000034
R4=scalar(smin=0,smax=umax=0x3fffffffc,smax32=0x7ffffffc,umax32=0xfffffffc,var_off=(0x0;
0x3fffffffc)) R5=scalar(id=2,smax=0x7fffffff7fffffff,umax=0xffffffff7fffffff,smin32=0,umax32=0x7fffffff,var_off=(0x0;
0xffffffff7fffffff))
R8=scalar(id=2,smax=0x7fffffff7fffffff,umax=0xffffffff7fffffff,smin32=0,umax32=0x7fffffff,var_off=(0x0;
0xffffffff7fffffff)) R10=fp0
32: R0=scalar(smin=smin32=0,smax=umax=smax32=umax32=2,var_off=(0x0;
0x3)) R2=4 R3=0x1f00000034
R4=scalar(smin=0,smax=umax=0x3fffffffc,smax32=0x7ffffffc,umax32=0xfffffffc,var_off=(0x0;
0x3fffffffc)) R5=scalar(id=2,smax=0x7fffffff7fffffff,umax=0xffffffff7fffffff,smin32=0,umax32=0x7fffffff,var_off=(0x0;
0xffffffff7fffffff))
R8=scalar(id=2,smax=0x7fffffff7fffffff,umax=0xffffffff7fffffff,smin32=0,umax32=0x7fffffff,var_off=(0x0;
0xffffffff7fffffff)) R10=fp0
32: (95) exit
from 11 to 15: R0=scalar() R4=scalar() R8=0x1a000000be R10=fp0
15: R0=scalar() R4=scalar() R8=0x1a000000be R10=fp0
15: (18) r3 = 0x1f00000034 ; R3_w=0x1f00000034
17: (c4) w4 s>>= 29 ;
R4=scalar(smin=0,smax=umax=0xffffffff,smin32=-4,smax32=3,var_off=(0x0;
0xffffffff))
18: (56) if w8 != 0xf goto pc+3
mark_precise: frame0: last_idx 18 first_idx 18 subseq_idx -1
mark_precise: frame0: parent state regs=r8 stack=: R0=scalar()
R3_w=0x1f00000034
R4_w=scalar(smin=0,smax=umax=0xffffffff,smin32=-4,smax32=3,var_off=(0x0;
0xffffffff)) R8_r=P0x1a000000be R10=fp0
mark_precise: frame0: last_idx 17 first_idx 11 subseq_idx 18
mark_precise: frame0: regs=r8 stack= before 17: (c4) w4 s>>= 29
mark_precise: frame0: regs=r8 stack= before 15: (18) r3 = 0x1f00000034
mark_precise: frame0: regs=r8 stack= before 11: (45) if r0 &
0xfffffffe goto pc+3
mark_precise: frame0: parent state regs= stack=: R0_rw=Pscalar()
R4_rw=Pscalar() R8_rw=P0x1a000000be R10=fp0
18: R8=0x1a000000be
22: (67) r4 <<= 2 ;
R4_w=scalar(smin=0,smax=umax=0x3fffffffc,smax32=0x7ffffffc,umax32=0xfffffffc,var_off=(0x0;
0x3fffffffc))
23: (bf) r5 = r8 ; R5_w=0x1a000000be R8=0x1a000000be
24: (18) r2 = 0x4
frame 0: propagating r5
mark_precise: frame0: last_idx 26 first_idx 18 subseq_idx -1
mark_precise: frame0: regs=r5 stack= before 24: (18) r2 = 0x4
mark_precise: frame0: regs=r5 stack= before 23: (bf) r5 = r8
mark_precise: frame0: regs=r8 stack= before 22: (67) r4 <<= 2
mark_precise: frame0: regs=r8 stack= before 18: (56) if w8 != 0xf goto pc+3
mark_precise: frame0: parent state regs= stack=: R0_r=scalar()
R3_w=0x1f00000034
R4_rw=scalar(smin=0,smax=umax=0xffffffff,smin32=-4,smax32=3,var_off=(0x0;
0xffffffff)) R8_r=P0x1a000000be R10=fp0
26: safe
processed 38 insns (limit 1000000) max_states_per_insn 1 total_states
4 peak_states 4 mark_read 2
-------- End of Verifier Log --------
When the verifier backtracks from #29, I expected w0 at #26 (if w8 s>=
w0 goto pc+5) to be marked as precise since R8 and R5 share the same
id:
29: (d6) if w5 s<= 0x1d goto pc+2
mark_precise: frame0: last_idx 29 first_idx 26 subseq_idx -1
mark_precise: frame0: regs=r5 stack= before 28: (0f) r8 += r8
mark_precise: frame0: regs=r5 stack= before 27: (4f) r8 |= r8
mark_precise: frame0: regs=r5 stack= before 26: (7e) if w8 s>= w0 goto pc+5
mark_precise: frame0: parent state regs=r5 stack=:
R0_rw=scalar(smin=smin32=0,smax=umax=smax32=umax32=2,var_off=(0x0;
0x3)) R2_w=4 R3_w=0x1f00000034
R4_w=scalar(smin=0,smax=umax=0x3fffffffc,smax32=0x7ffffffc,umax32=0xfffffffc,var_off=(0x0;
0x3fffffffc)) R5_rw=Pscalar(id=2) R8_rw=scalar(id=2) R10=fp0
mark_precise: frame0: last_idx 24 first_idx 11 subseq_idx 26
mark_precise: frame0: regs=r5,r8 stack= before 24: (18) r2 = 0x4
mark_precise: frame0: regs=r5,r8 stack= before 23: (bf) r5 = r8
mark_precise: frame0: regs=r8 stack= before 22: (67) r4 <<= 2
mark_precise: frame0: regs=r8 stack= before 18: (56) if w8 != 0xf goto pc+3
mark_precise: frame0: regs=r8 stack= before 17: (c4) w4 s>>= 29
mark_precise: frame0: regs=r8 stack= before 15: (18) r3 = 0x1f00000034
mark_precise: frame0: regs=r8 stack= before 14: (2f) r4 *= r4
mark_precise: frame0: regs=r8 stack= before 13: (0f) r0 += r0
mark_precise: frame0: regs=r8 stack= before 12: (1f) r8 -= r4
mark_precise: frame0: regs=r4,r8 stack= before 11: (45) if r0 &
0xfffffffe goto pc+3
mark_precise: frame0: parent state regs= stack=: R0_rw=Pscalar()
R4_rw=Pscalar() R8_rw=P0x1a000000be R10=fp0
29: R5=scalar(id=2,smax32=1)
However, it seems it is not, so the next time the verifier checks
#26, R0 is incorrectly ignored.
We have mark_precise_scalar_ids(), but it is called only once, before
the precision mask is calculated.
I investigated for quite a while, but mark_chain_precision() is really
hard to follow.
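The expectation about linked ids can be illustrated with a toy model (hypothetical Python, not the verifier's actual data structures):

```python
# Toy model (not verifier code) of precision marking for scalar
# registers linked by an id after a register-to-register move.
def mark_linked_precise(regs, target):
    """Mark `target` precise; every register sharing its id (linked by
    a move such as `23: r5 = r8`) is marked precise as well."""
    rid = regs[target].get("id")
    linked = {r for r, st in regs.items()
              if rid is not None and st.get("id") == rid}
    linked.add(target)
    for r in linked:
        regs[r]["precise"] = True
    return linked

# State at insn 26 on the first verification pass:
regs = {
    "r0": {"precise": False},            # other operand of `if w8 s>= w0`
    "r5": {"id": 2, "precise": False},
    "r8": {"id": 2, "precise": False},
}
# Backtracking from `29: if w5 s<= 0x1d` requires r5 to be precise;
# id linking also marks r8, exactly as the log shows (regs=r5,r8) --
# but nothing in this step marks r0, which is the point at issue here.
print(sorted(mark_linked_precise(regs, "r5")))  # ['r5', 'r8']
```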
Here is a reduced C repro; maybe someone else can shed some light on this.
C repro: https://pastebin.com/raw/chrshhGQ
Thanks
Hao
^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Bug Report] bpf: incorrectly pruning runtime execution path
  2023-12-11 15:31 [Bug Report] bpf: incorrectly pruning runtime execution path Hao Sun
@ 2023-12-13  0:51 ` Andrii Nakryiko
  2023-12-13 10:25   ` Hao Sun
  0 siblings, 1 reply; 24+ messages in thread
From: Andrii Nakryiko @ 2023-12-13 0:51 UTC (permalink / raw)
To: Hao Sun
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, bpf,
    Linux Kernel Mailing List

On Mon, Dec 11, 2023 at 7:31 AM Hao Sun <sunhao.th@gmail.com> wrote:
>
> Hi,
>
> The verifier incorrectly prunes a path expected to be executed at
> runtime. In the following program, the execution path is:
> from 6 to 8 (taken) -> from 11 to 15 (taken) -> from 18 to 22
> (taken) -> from 26 to 27 (fall-through) -> from 29 to 30
> (fall-through)
> The verifier prunes the checking path at #26, skipping the actual
> execution path.
>
> [program listing and start of verifier log trimmed; see above]
>
> 24: (18) r2 = 0x4              ; R2_w=4
> 26: (7e) if w8 s>= w0 goto pc+5

So here w8=15 and w0=[0,2], always taken, right?

> [mark_precise backtracking log trimmed]
>
> 32: (95) exit
>
> from 18 to 22: [...]
> 22: (67) r4 <<= 2 ; [...]
> 23: (bf) r5 = r8               ; R5_w=scalar(id=2) R8_w=scalar(id=2)
> 24: (18) r2 = 0x4              ; R2=4
> 26: (7e) if w8 s>= w0 goto pc+5 ;
> R0=scalar(smin=smin32=0,smax=umax=smax32=umax32=2,var_off=(0x0; 0x3))
> R8=scalar(id=2,smax32=1)

We didn't prune here; assuming w8 < w0, w8=w5 is at most 1 (because r0
is [0, 2]).

> 27: (4f) r8 |= r8              ; R8_w=scalar()

Here r5 and r8 are disassociated.

> 28: (0f) r8 += r8              ; R8_w=scalar()
> 29: (d6) if w5 s<= 0x1d goto pc+2

w5 is at most 1 (signed), so this is always true and we just go to
exit; 30: is still never visited.

> [mark_precise backtracking log trimmed]
>
> 29: R5=scalar(id=2,smax32=1)
> 32: (95) exit
>
> from 26 to 32: [...]
> 32: (95) exit

Here we also skipped 30:, and w8 was in the [0, 0x7fffffff] range, r0
is [0, 2], but its precision doesn't matter as we didn't do any
pruning. NOTE this one.

> from 11 to 15: R0=scalar() R4=scalar() R8=0x1a000000be R10=fp0
> 15: R0=scalar() R4=scalar() R8=0x1a000000be R10=fp0
> [...]
> 18: (56) if w8 != 0xf goto pc+3

Known true, always taken.

> [mark_precise backtracking log trimmed]
>
> 18: R8=0x1a000000be
> 22: (67) r4 <<= 2 ; [...]
> 23: (bf) r5 = r8               ; R5_w=0x1a000000be R8=0x1a000000be
> 24: (18) r2 = 0x4
> frame 0: propagating r5
> [...]
> 26: safe

And here we basically need to evaluate `if w8 s>= w0 goto pc+5`. w8 is
precisely known to be 0x000000be, while w0 is unknown. Now go back to
the "NOTE this one" mark above. w8 is inside the [0, 0xffffffff] range,
right? And w0 is unknown, while up in "NOTE this one" w0 didn't matter,
so it stayed imprecise. This is a match. It seems correct.

> processed 38 insns (limit 1000000) max_states_per_insn 1 total_states
> 4 peak_states 4 mark_read 2
>
> -------- End of Verifier Log --------
>
> When the verifier backtracks from #29, I expected w0 at #26 (if w8 s>=
> w0 goto pc+5) to be marked as precise since R8 and R5 share the same
> id:

r0 is marked precise at 26:

mark_precise: frame0: last_idx 26 first_idx 22 subseq_idx -1
mark_precise: frame0: regs=r0 stack= before 24: (18) r2 = 0x4
mark_precise: frame0: regs=r0 stack= before 23: (bf) r5 = r8
mark_precise: frame0: regs=r0 stack= before 22: (67) r4 <<= 2
mark_precise: frame0: parent state regs=r0 stack=:
R0_rw=Pscalar(smin=smin32=0,smax=umax=smax32=umax32=2,var_off=(0x0;
0x3)) R2_w=28 R3_w=scalar()
R4_rw=scalar(smin=0,smax=umax=0xffffffff,smin32=-4,smax32=3,var_off=(0x0;
0xffffffff)) R8_rw=Pscalar(smin=0x800000000000000f,smax=0x7fffffff0000000f,umin=smin32=umin32=15,umax=0xffffffff0000000f,smax32=umax32=15,var_off=(0xf;
0xffffffff00000000)) R10=fp0

> [backtracking log for insn 29 trimmed]
>
> 29: R5=scalar(id=2,smax32=1)
>
> However, seems it's not, so the next time when the verifier checks
> #26, R0 is incorrectly ignored.
> We have mark_precise_scalar_ids(), but it's called before calculating
> the mask once.

I'm not following the remark about mark_precise_scalar_ids(). That
works fine, but has nothing to do with r0. mark_precise_scalar_ids()
identifies that r8 and r5 are linked together, and you can see from
the log that we mark both r5 and r8 as precise.

> I investigated for quite a while, but mark_chain_precision() is really
> hard to follow.
>
> Here is a reduced C repro, maybe someone else can shed some light on this.
> C repro: https://pastebin.com/raw/chrshhGQ

So your claim is that

30: (18) r0 = 0x4 ; incorrectly pruned here

is incorrectly pruned. Can you please show a detailed code path in
which we actually do reach 30? I might have missed it, but so far it
looks like the verifier is doing everything right.

> Thanks
> Hao

^ permalink raw reply	[flat|nested] 24+ messages in thread
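The matching argument above can be restated as a toy state-comparison check (a sketch of the idea, not the verifier's actual regsafe() implementation): an imprecise scalar in the cached state matches any current value, while a precise one requires range inclusion.

```python
# Toy model (not verifier code) of the scalar part of state matching:
# an imprecise scalar in the cached (old) state matches anything; a
# precise one requires the current range to fit inside the old one.
def range_within(old, cur):
    return old[0] <= cur[0] and cur[1] <= old[1]

def scalar_safe(old, cur):
    if not old["precise"]:
        return True           # imprecise old scalar matches any value
    return range_within(old["range"], cur["range"])

# Cached state at insn 26 (the "NOTE this one" path):
old_w0 = {"precise": False, "range": (0, 2)}           # stayed imprecise
old_w8 = {"precise": True,  "range": (0, 0x7fffffff)}
# Current state at insn 26 (the path where r8 = 0x1a000000be):
cur_w0 = {"range": (0, 2**32 - 1)}   # unknown
cur_w8 = {"range": (0xbe, 0xbe)}     # precisely 0xbe

safe = scalar_safe(old_w0, cur_w0) and scalar_safe(old_w8, cur_w8)
print(safe)  # True -> "26: safe", the current state is pruned
```

Had w0 been marked precise in the cached state, the same check would fail (the unknown current w0 does not fit inside [0, 2]) and the state would not have been pruned, which is the crux of the disagreement.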
* Re: [Bug Report] bpf: incorrectly pruning runtime execution path
  2023-12-13  0:51 ` Andrii Nakryiko
@ 2023-12-13 10:25   ` Hao Sun
  2023-12-13 23:30     ` Andrii Nakryiko
  2023-12-13 23:35     ` Eduard Zingerman
  0 siblings, 2 replies; 24+ messages in thread
From: Hao Sun @ 2023-12-13 10:25 UTC (permalink / raw)
To: Andrii Nakryiko
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, bpf,
    Linux Kernel Mailing List

On Wed, Dec 13, 2023 at 1:51 AM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Mon, Dec 11, 2023 at 7:31 AM Hao Sun <sunhao.th@gmail.com> wrote:
> >
> > Hi,
> >
> > The verifier incorrectly prunes a path expected to be executed at
> > runtime. [program listing and verifier log trimmed; see the
> > original report above]
> >
> > 26: (7e) if w8 s>= w0 goto pc+5
>
> so here w8=15 and w0=[0,2], always taken, right?
>
> [...]
w4 = w5 > > mark_precise: frame0: regs=r8 stack= before 3: (bf) r8 = r2 > > mark_precise: frame0: regs=r2 stack= before 2: (bf) r5 = r1 > > mark_precise: frame0: regs=r2 stack= before 0: (18) r2 = 0x1a000000be > > mark_precise: frame0: last_idx 26 first_idx 22 subseq_idx -1 > > mark_precise: frame0: regs=r0 stack= before 24: (18) r2 = 0x4 > > mark_precise: frame0: regs=r0 stack= before 23: (bf) r5 = r8 > > mark_precise: frame0: regs=r0 stack= before 22: (67) r4 <<= 2 > > mark_precise: frame0: parent state regs=r0 stack=: > > R0_rw=Pscalar(smin=smin32=0,smax=umax=smax32=umax32=2,var_off=(0x0; > > 0x3)) R2_w=28 R3_w=scalar() > > R4_rw=scalar(smin=0,smax=umax=0xffffffff,smin32=-4,smax32=3,var_off=(0x0; > > 0xffffffff)) R8_rw=Pscalar(smin=0x800000000000000f,smax=0x7fffffff0000000f,umin=smin32=umin32=15,umax=0xffffffff0000000f,smax32=umax32=15,var_off=(0xf; > > 0xffffffff00000000)) R10=fp0 > > mark_precise: frame0: last_idx 20 first_idx 11 subseq_idx 22 > > mark_precise: frame0: regs=r0 stack= before 20: (18) r2 = 0x1c > > mark_precise: frame0: regs=r0 stack= before 19: (d7) r3 = bswap32 r3 > > mark_precise: frame0: regs=r0 stack= before 18: (56) if w8 != 0xf goto pc+3 > > mark_precise: frame0: regs=r0 stack= before 17: (c4) w4 s>>= 29 > > mark_precise: frame0: regs=r0 stack= before 15: (18) r3 = 0x1f00000034 > > mark_precise: frame0: regs=r0 stack= before 14: (2f) r4 *= r4 > > mark_precise: frame0: regs=r0 stack= before 13: (0f) r0 += r0 > > mark_precise: frame0: regs=r0 stack= before 12: (1f) r8 -= r4 > > mark_precise: frame0: regs=r0 stack= before 11: (45) if r0 & > > 0xfffffffe goto pc+3 > > mark_precise: frame0: parent state regs=r0 stack=: R0_rw=Pscalar() > > R4_rw=Pscalar() R8_rw=P0x1a000000be R10=fp0 > > mark_precise: frame0: last_idx 10 first_idx 0 subseq_idx 11 > > mark_precise: frame0: regs=r0 stack= before 10: (84) w4 = -w4 > > mark_precise: frame0: regs=r0 stack= before 8: (18) r4 = 0x52 > > mark_precise: frame0: regs=r0 stack= before 6: (36) if w8 >= 0x69 
goto pc+1 > > mark_precise: frame0: regs=r0 stack= before 5: (85) call > > bpf_get_current_cgroup_id#80 > > 26: R0=scalar(smin=smin32=0,smax=umax=smax32=umax32=2,var_off=(0x0; > > 0x3)) R8=scalar(id=1,smin=0x800000000000000f,smax=0x7fffffff0000000f,umin=smin32=umin32=15,umax=0xffffffff0000000f,smax32=umax32=15,var_off=(0xf; > > 0xffffffff00000000)) > > 32: (95) exit > > > > from 18 to 22: R0_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=2,var_off=(0x0; > > 0x3)) R3_w=0x1f00000034 > > R4_w=scalar(smin=0,smax=umax=0xffffffff,smin32=-4,smax32=3,var_off=(0x0; > > 0xffffffff)) R8_w=scalar() R10=fp0 > > 22: R0_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=2,var_off=(0x0; > > 0x3)) R3_w=0x1f00000034 > > R4_w=scalar(smin=0,smax=umax=0xffffffff,smin32=-4,smax32=3,var_off=(0x0; > > 0xffffffff)) R8_w=scalar() R10=fp0 > > 22: (67) r4 <<= 2 ; > > R4_w=scalar(smin=0,smax=umax=0x3fffffffc,smax32=0x7ffffffc,umax32=0xfffffffc,var_off=(0x0; > > 0x3fffffffc)) > > 23: (bf) r5 = r8 ; R5_w=scalar(id=2) R8_w=scalar(id=2) > > 24: (18) r2 = 0x4 ; R2=4 > > 26: (7e) if w8 s>= w0 goto pc+5 ; > > R0=scalar(smin=smin32=0,smax=umax=smax32=umax32=2,var_off=(0x0; 0x3)) > > R8=scalar(id=2,smax32=1) > > we didn't prune here, assuming w8 < w0, so w8=w5 is at most 1 (because > r0 is [0, 2]) > > > 27: (4f) r8 |= r8 ; R8_w=scalar() > > here r5 and r8 are disassociated > > > 28: (0f) r8 += r8 ; R8_w=scalar() > > 29: (d6) if w5 s<= 0x1d goto pc+2 > > w5 is at most 1 (signed), so this is always true, so we just to exit, > 30: is still never visited > > > mark_precise: frame0: last_idx 29 first_idx 26 subseq_idx -1 > > mark_precise: frame0: regs=r5 stack= before 28: (0f) r8 += r8 > > mark_precise: frame0: regs=r5 stack= before 27: (4f) r8 |= r8 > > mark_precise: frame0: regs=r5 stack= before 26: (7e) if w8 s>= w0 goto pc+5 > > mark_precise: frame0: parent state regs=r5 stack=: > > R0_rw=scalar(smin=smin32=0,smax=umax=smax32=umax32=2,var_off=(0x0; > > 0x3)) R2_w=4 R3_w=0x1f00000034 > > 
R4_w=scalar(smin=0,smax=umax=0x3fffffffc,smax32=0x7ffffffc,umax32=0xfffffffc,var_off=(0x0; > > 0x3fffffffc)) R5_rw=Pscalar(id=2) R8_rw=scalar(id=2) R10=fp0 > > mark_precise: frame0: last_idx 24 first_idx 11 subseq_idx 26 > > mark_precise: frame0: regs=r5,r8 stack= before 24: (18) r2 = 0x4 > > mark_precise: frame0: regs=r5,r8 stack= before 23: (bf) r5 = r8 > > mark_precise: frame0: regs=r8 stack= before 22: (67) r4 <<= 2 > > mark_precise: frame0: regs=r8 stack= before 18: (56) if w8 != 0xf goto pc+3 > > mark_precise: frame0: regs=r8 stack= before 17: (c4) w4 s>>= 29 > > mark_precise: frame0: regs=r8 stack= before 15: (18) r3 = 0x1f00000034 > > mark_precise: frame0: regs=r8 stack= before 14: (2f) r4 *= r4 > > mark_precise: frame0: regs=r8 stack= before 13: (0f) r0 += r0 > > mark_precise: frame0: regs=r8 stack= before 12: (1f) r8 -= r4 > > mark_precise: frame0: regs=r4,r8 stack= before 11: (45) if r0 & > > 0xfffffffe goto pc+3 > > mark_precise: frame0: parent state regs= stack=: R0_rw=Pscalar() > > R4_rw=Pscalar() R8_rw=P0x1a000000be R10=fp0 > > 29: R5=scalar(id=2,smax32=1) > > 32: (95) exit > > > > from 26 to 32: R0=scalar(smin=smin32=0,smax=umax=smax32=umax32=2,var_off=(0x0; > > 0x3)) R2=4 R3=0x1f00000034 > > R4=scalar(smin=0,smax=umax=0x3fffffffc,smax32=0x7ffffffc,umax32=0xfffffffc,var_off=(0x0; > > 0x3fffffffc)) R5=scalar(id=2,smax=0x7fffffff7fffffff,umax=0xffffffff7fffffff,smin32=0,umax32=0x7fffffff,var_off=(0x0; > > 0xffffffff7fffffff)) > > R8=scalar(id=2,smax=0x7fffffff7fffffff,umax=0xffffffff7fffffff,smin32=0,umax32=0x7fffffff,var_off=(0x0; > > 0xffffffff7fffffff)) R10=fp0 > > 32: R0=scalar(smin=smin32=0,smax=umax=smax32=umax32=2,var_off=(0x0; > > 0x3)) R2=4 R3=0x1f00000034 > > R4=scalar(smin=0,smax=umax=0x3fffffffc,smax32=0x7ffffffc,umax32=0xfffffffc,var_off=(0x0; > > 0x3fffffffc)) R5=scalar(id=2,smax=0x7fffffff7fffffff,umax=0xffffffff7fffffff,smin32=0,umax32=0x7fffffff,var_off=(0x0; > > 0xffffffff7fffffff)) > > 
R8=scalar(id=2,smax=0x7fffffff7fffffff,umax=0xffffffff7fffffff,smin32=0,umax32=0x7fffffff,var_off=(0x0; > > 0xffffffff7fffffff)) R10=fp0 > > 32: (95) exit > > here we also skipped 30:, and w8 was in [0,0x7fffffff] range, r0 is > [0,2], but it's precision doesn't matter as we didn't do any pruning > > NOTE this one. > > > > > from 11 to 15: R0=scalar() R4=scalar() R8=0x1a000000be R10=fp0 > > 15: R0=scalar() R4=scalar() R8=0x1a000000be R10=fp0 > > 15: (18) r3 = 0x1f00000034 ; R3_w=0x1f00000034 > > 17: (c4) w4 s>>= 29 ; > > R4=scalar(smin=0,smax=umax=0xffffffff,smin32=-4,smax32=3,var_off=(0x0; > > 0xffffffff)) > > 18: (56) if w8 != 0xf goto pc+3 > > known true, always taken > > > mark_precise: frame0: last_idx 18 first_idx 18 subseq_idx -1 > > mark_precise: frame0: parent state regs=r8 stack=: R0=scalar() > > R3_w=0x1f00000034 > > R4_w=scalar(smin=0,smax=umax=0xffffffff,smin32=-4,smax32=3,var_off=(0x0; > > 0xffffffff)) R8_r=P0x1a000000be R10=fp0 > > mark_precise: frame0: last_idx 17 first_idx 11 subseq_idx 18 > > mark_precise: frame0: regs=r8 stack= before 17: (c4) w4 s>>= 29 > > mark_precise: frame0: regs=r8 stack= before 15: (18) r3 = 0x1f00000034 > > mark_precise: frame0: regs=r8 stack= before 11: (45) if r0 & > > 0xfffffffe goto pc+3 > > mark_precise: frame0: parent state regs= stack=: R0_rw=Pscalar() > > R4_rw=Pscalar() R8_rw=P0x1a000000be R10=fp0 > > 18: R8=0x1a000000be > > 22: (67) r4 <<= 2 ; > > R4_w=scalar(smin=0,smax=umax=0x3fffffffc,smax32=0x7ffffffc,umax32=0xfffffffc,var_off=(0x0; > > 0x3fffffffc)) > > 23: (bf) r5 = r8 ; R5_w=0x1a000000be R8=0x1a000000be > > 24: (18) r2 = 0x4 > > frame 0: propagating r5 > > mark_precise: frame0: last_idx 26 first_idx 18 subseq_idx -1 > > mark_precise: frame0: regs=r5 stack= before 24: (18) r2 = 0x4 > > mark_precise: frame0: regs=r5 stack= before 23: (bf) r5 = r8 > > mark_precise: frame0: regs=r8 stack= before 22: (67) r4 <<= 2 > > mark_precise: frame0: regs=r8 stack= before 18: (56) if w8 != 0xf goto pc+3 > > mark_precise: 
frame0: parent state regs= stack=: R0_r=scalar() > > R3_w=0x1f00000034 > > R4_rw=scalar(smin=0,smax=umax=0xffffffff,smin32=-4,smax32=3,var_off=(0x0; > > 0xffffffff)) R8_r=P0x1a000000be R10=fp0 > > 26: safe > > and here we basically need to evaluate > > if w8 s>= w0 goto pc+5 > > w8 is precisely known to be 0x000000be, while w0 is unknown. Now go > back to "NOTE this one" mark above. w8 is inside [0, 0xffffffff] > range, right? And w0 is unknown, while up in "NOTE this one" w0 didn't > matter, so it stayed imprecise. This is a match. It seems correct. > Thanks for the detailed explanation. This is the exact point where I got confused. w8 and w5 share the same id at this point, the range of w5 is also updated according to w0. Even though r5 and r8 are disassociated later, w0 actually matters. Assume the verifier does not prune at this point, then w5 would be unknown after this point, and #30 will be explored. The branch "from 29 to 30" is the taken path at runtime, see below. > [...] > > r0 is marked precise at 26: > > mark_precise: frame0: last_idx 26 first_idx 22 subseq_idx -1 > mark_precise: frame0: regs=r0 stack= before 24: (18) r2 = 0x4 > mark_precise: frame0: regs=r0 stack= before 23: (bf) r5 = r8 > mark_precise: frame0: regs=r0 stack= before 22: (67) r4 <<= 2 > mark_precise: frame0: parent state regs=r0 stack=: > R0_rw=Pscalar(smin=smin32=0,sm > ax=umax=smax32=umax32=2,var_off=(0x0; > 0x3)) R2_w=28 R3_w=scalar() > R4_rw=scalar(smin=0,smax=umax=0xffffffff,smin32=-4,smax32=3,var_off=(0x0; > 0xffffffff)) R8_rw=Pscalar(smin=0x800000000000000f,smax=0x7fffffff0000000f,umin=smin32=umin32=15,umax=0xffffffff0000000f,smax32=umax32=15,var_off=(0xf; > 0xffffffff00000000)) R10=fp0 > [...] > > However, seems it's not, so the next time when the verifier checks > > #26, R0 is incorrectly ignored. > > We have mark_precise_scalar_ids(), but it's called before calculating > > the mask once. > > I'm not following the remark about mark_precise_scalar_ids(). 
That
> works fine, but has nothing to do with r0. mark_precise_scalar_ids()
> identifies that r8 and r5 are linked together, and you can see from
> the log that we mark both r5 and r8 as precise.
>
> > I investigated for quite a while, but mark_chain_precision() is really
> > hard to follow.
> >
> > Here is a reduced C repro, maybe someone else can shed some light on this.
> > C repro: https://pastebin.com/raw/chrshhGQ
>
> So your claim is that
>
> > 30: (18) r0 = 0x4 ; incorrectly pruned here
>
> Can you please show a detailed code path in which we do reach 30
> actually? I might have missed it, but so far it looks like the verifier
> is doing everything right.
>

I tried to convert the repro to a valid test case in inline asm, but it
seems JSET (if r0 & 0xfffffffe goto pc+3) is currently not supported in
clang-17. Will try after clang-18 is released.

#30 is expected to be executed, see below where everything after ";" is
the runtime value:
...
6: (36) if w8 >= 0x69 goto pc+1 ; w8 = 0xbe, always taken
...
11: (45) if r0 & 0xfffffffe goto pc+3 ; r0 = 0x616, taken
...
18: (56) if w8 != 0xf goto pc+3 ; w8 not touched, taken
...
23: (bf) r5 = r8 ; w5 = 0xbe
24: (18) r2 = 0x4
26: (7e) if w8 s>= w0 goto pc+5 ; non-taken
27: (4f) r8 |= r8
28: (0f) r8 += r8
29: (d6) if w5 s<= 0x1d goto pc+2 ; non-taken
30: (18) r0 = 0x4 ; executed

Since the verifier prunes at #26, #30 is dead and eliminated. So, #30
is executed only after manually commenting out the dead-code rewrite pass.

From my understanding, r0 should be marked as precise when we first
backtrack from #29, because r5's range at this point depends on w0,
as r8 and r5 share the same id at #26.

^ permalink raw reply [flat|nested] 24+ messages in thread
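To make the claimed path concrete, the branch outcomes above can be replayed with the runtime values quoted in the trace (w8 = 0xbe, r0 = 0x616; the helper return value is the one reported at runtime, not something the verifier can know). A small Python sketch, with the signed 32-bit compares modeled by hand:

```python
def sint32(v):
    """Reinterpret the low 32 bits of v as a signed 32-bit value."""
    v &= 0xffffffff
    return v - (1 << 32) if v & 0x80000000 else v

w8 = 0xbe   # low 32 bits of r8 = 0x1a000000be
r0 = 0x616  # helper return value at runtime, as quoted in the trace above
w0 = r0 & 0xffffffff
w5 = w8     # 23: (bf) r5 = r8

path = []
path.append("6->8" if w8 >= 0x69 else "6->7")                    # 6: jge32
path.append("11->15" if r0 & 0xfffffffe else "11->12")           # 11: jset
path.append("18->22" if w8 != 0xf else "18->19")                 # 18: jne32
path.append("26->32" if sint32(w8) >= sint32(w0) else "26->27")  # 26: jsge32
path.append("29->32" if sint32(w5) <= 0x1d else "29->30")        # 29: jsle32
print(path)  # ['6->8', '11->15', '18->22', '26->27', '29->30']
```

With these values both #26 and #29 fall through, so insn #30 is reached, matching the reported runtime path.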
* Re: [Bug Report] bpf: incorrectly pruning runtime execution path 2023-12-13 10:25 ` Hao Sun @ 2023-12-13 23:30 ` Andrii Nakryiko 2023-12-14 0:08 ` Eduard Zingerman 2023-12-13 23:35 ` Eduard Zingerman 1 sibling, 1 reply; 24+ messages in thread From: Andrii Nakryiko @ 2023-12-13 23:30 UTC (permalink / raw) To: Hao Sun Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, bpf, Linux Kernel Mailing List, Eduard Zingerman On Wed, Dec 13, 2023 at 2:25 AM Hao Sun <sunhao.th@gmail.com> wrote: > > On Wed, Dec 13, 2023 at 1:51 AM Andrii Nakryiko > <andrii.nakryiko@gmail.com> wrote: > > > > On Mon, Dec 11, 2023 at 7:31 AM Hao Sun <sunhao.th@gmail.com> wrote: > > > > > > Hi, > > > > > > The verifier incorrectly prunes a path expected to be executed at > > > runtime. In the following program, the execution path is: > > > from 6 to 8 (taken) -> from 11 to 15 (taken) -> from 18 to 22 > > > (taken) -> from 26 to 27 (fall-through) -> from 29 to 30 > > > (fall-through) > > > The verifier prunes the checking path at #26, skipping the actual > > > execution path. > > > > > > 0: (18) r2 = 0x1a000000be > > > 2: (bf) r5 = r1 > > > 3: (bf) r8 = r2 > > > 4: (bc) w4 = w5 > > > 5: (85) call bpf_get_current_cgroup_id#680112 > > > 6: (36) if w8 >= 0x69 goto pc+1 > > > 7: (95) exit > > > 8: (18) r4 = 0x52 > > > 10: (84) w4 = -w4 > > > 11: (45) if r0 & 0xfffffffe goto pc+3 > > > 12: (1f) r8 -= r4 > > > 13: (0f) r0 += r0 > > > 14: (2f) r4 *= r4 > > > 15: (18) r3 = 0x1f00000034 > > > 17: (c4) w4 s>>= 29 > > > 18: (56) if w8 != 0xf goto pc+3 > > > 19: r3 = bswap32 r3 > > > 20: (18) r2 = 0x1c > > > 22: (67) r4 <<= 2 > > > 23: (bf) r5 = r8 > > > 24: (18) r2 = 0x4 > > > 26: (7e) if w8 s>= w0 goto pc+5 > > > 27: (4f) r8 |= r8 > > > 28: (0f) r8 += r8 > > > 29: (d6) if w5 s<= 0x1d goto pc+2 > > > 30: (18) r0 = 0x4 ; incorrectly pruned here > > > > > > > > > 32: (95) exit > > > [...] > > > > > > Here is a reduced C repro, maybe someone else can shed some light on this. 
> > > C repro: https://pastebin.com/raw/chrshhGQ
> >
> > So your claim is that
> >
> > > 30: (18) r0 = 0x4 ; incorrectly pruned here
> >
> > Can you please show a detailed code path in which we do reach 30
> > actually? I might have missed it, but so far it looks like the verifier
> > is doing everything right.
> >
>
> I tried to convert the repro to a valid test case in inline asm, but it
> seems JSET (if r0 & 0xfffffffe goto pc+3) is currently not supported in
> clang-17. Will try after clang-18 is released.

JSET has nothing to do with the issue, it's just a distraction; I'm not
sure why you bring JSET into the discussion at all.

Your repro could actually be much smaller, as it was really hard to follow
which parts are important and which weren't. If you can try to minimize
repros, it will definitely help with analysis for future issues.

>
> #30 is expected to be executed, see below where everything after ";" is
> the runtime value:
> ...
> 6: (36) if w8 >= 0x69 goto pc+1 ; w8 = 0xbe, always taken
> ...
> 11: (45) if r0 & 0xfffffffe goto pc+3 ; r0 = 0x616, taken
> ...
> 18: (56) if w8 != 0xf goto pc+3 ; w8 not touched, taken
> ...
> 23: (bf) r5 = r8 ; w5 = 0xbe
> 24: (18) r2 = 0x4
> 26: (7e) if w8 s>= w0 goto pc+5 ; non-taken
> 27: (4f) r8 |= r8
> 28: (0f) r8 += r8
> 29: (d6) if w5 s<= 0x1d goto pc+2 ; non-taken
> 30: (18) r0 = 0x4 ; executed
>
> Since the verifier prunes at #26, #30 is dead and eliminated. So, #30
> is executed only after manually commenting out the dead-code rewrite pass.
>
> From my understanding, r0 should be marked as precise when we first
> backtrack from #29, because r5's range at this point depends on w0,
> as r8 and r5 share the same id at #26.

Yes, thanks, the execution trace above was helpful.
Let's try to minimize the example here, I'll keep original instruction indices, though: 23: (bf) r5 = r8 ; here we link r5 and r8 together 26: (7e) if w8 s>= w0 goto pc+5 ; here it's not always/never taken, so w8 and w0 remain imprecise 28: (0f) r8 += r8 ; here link between r8 and r5 is broken 29: (d6) if w5 s<= 0x1d goto pc+2 ; here we know value of w5 and so it's always/never taken, r5 is marked precise Now, if we look at r5's precision log at this instruction: 29: (d6) if w5 s<= 0x1d goto pc+2 mark_precise: frame0: last_idx 29 first_idx 26 subseq_idx -1 mark_precise: frame0: regs=r5 stack= before 28: (0f) r8 += r8 mark_precise: frame0: regs=r5 stack= before 27: (4f) r8 |= r8 mark_precise: frame0: regs=r5 stack= before 26: (7e) if w8 s>= w0 goto pc+5 Note how at this instruction r5 and r8 *WERE* linked together, but we already lost this information for backtracking. So we don't mark w8 as precise. That's one part of the problem. The second part is that even if we knew that w8/r8 is precise, should we mark w0/r0 as precise? I actually need to think about this some more. Right now for conditional jumps we eagerly mark precision for both registers only in always/never taken scenarios. For now just narrowing down the issue, as I'm sure not many people followed all the above stuff carefully. P.S. For solving tracking of linked registers we can probably utilize instruction history, though technically they can be spread across multiple frames, between registers and stack slots, so that's a bit tricky. ^ permalink raw reply [flat|nested] 24+ messages in thread
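The link loss described above can be modeled in a few lines. This is a toy Python model of scalar-id tracking, not the verifier's actual data structures; the point is only that a backtracker which sees just the final register state cannot know that r5 and r8 were still linked when insn 26 ran:

```python
import itertools

_fresh = itertools.count(1)

def track_links(insns):
    """Per-instruction snapshots of scalar-id links (toy model only)."""
    snapshots, ids = [], {}
    for op, dst, src in insns:
        if op == "mov":          # dst = src: both registers share one id
            ids[dst] = ids.setdefault(src, next(_fresh))
        elif op == "alu":        # dst's value changes: its link is broken
            ids[dst] = next(_fresh)
        # "jmp" reads registers but does not relink them
        snapshots.append(dict(ids))
    return snapshots

links = track_links([
    ("mov", "r5", "r8"),    # 23: (bf) r5 = r8           r5/r8 linked
    ("jmp", "r8", "r0"),    # 26: (7e) if w8 s>= w0      link still alive
    ("alu", "r8", "r8"),    # 27: (4f) r8 |= r8          link broken
    ("alu", "r8", "r8"),    # 28: (0f) r8 += r8
    ("jmp", "r5", None),    # 29: (d6) if w5 s<= 0x1d    r5 marked precise
])

print(links[1]["r5"] == links[1]["r8"])  # True: linked when insn 26 ran
print(links[4]["r5"] == links[4]["r8"])  # False: all the final state shows
```

Only the per-instruction snapshots (which the verifier does not keep) preserve the fact that marking r5 precise at insn 29 should, transitively, have marked r8 precise at insn 26.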
* Re: [Bug Report] bpf: incorrectly pruning runtime execution path 2023-12-13 23:30 ` Andrii Nakryiko @ 2023-12-14 0:08 ` Eduard Zingerman 2023-12-14 0:36 ` Andrii Nakryiko 0 siblings, 1 reply; 24+ messages in thread From: Eduard Zingerman @ 2023-12-14 0:08 UTC (permalink / raw) To: Andrii Nakryiko, Hao Sun Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, bpf, Linux Kernel Mailing List On Wed, 2023-12-13 at 15:30 -0800, Andrii Nakryiko wrote: [...] > Yes, thanks, the execution trace above was helpful. Let's try to > minimize the example here, I'll keep original instruction indices, > though: > > 23: (bf) r5 = r8 ; here we link r5 and r8 together > 26: (7e) if w8 s>= w0 goto pc+5 ; here it's not always/never > taken, so w8 and w0 remain imprecise > 28: (0f) r8 += r8 ; here link between r8 and r5 is broken > 29: (d6) if w5 s<= 0x1d goto pc+2 ; here we know value of w5 and > so it's always/never taken, r5 is marked precise > > Now, if we look at r5's precision log at this instruction: > > 29: (d6) if w5 s<= 0x1d goto pc+2 > mark_precise: frame0: last_idx 29 first_idx 26 subseq_idx -1 > mark_precise: frame0: regs=r5 stack= before 28: (0f) r8 += r8 > mark_precise: frame0: regs=r5 stack= before 27: (4f) r8 |= r8 > mark_precise: frame0: regs=r5 stack= before 26: (7e) if w8 s>= w0 goto pc+5 Sorry, maybe it's time for me to get some sleep, but I don't see an issue here. The "before" log is printed by backtrack_insn() before instruction is backtracked. So the following: > mark_precise: frame0: regs=r5 stack= before 26: (7e) if w8 s>= w0 goto pc+5 Is a state of backtracker before "if w8 s>= w0 ..." is processed. 
But running the test case I've shared, the wider precision trace for
this instruction looks as follows:

26: (7e) if w8 s>= w0 goto pc+5 ; R0=scalar(smin=smin32=0,smax=umax=smax32=umax32=2,var_off=(0x0; 0x3))
R8=scalar(id=2,smax32=1)
27: (4f) r8 |= r8 ; R8_w=scalar()
28: (0f) r8 += r8 ; R8_w=scalar()
29: (d6) if w5 s<= 0x1d goto pc+2
mark_precise: frame0: last_idx 29 first_idx 26 subseq_idx -1
mark_precise: frame0: regs=r5 stack= before 28: (0f) r8 += r8
mark_precise: frame0: regs=r5 stack= before 27: (4f) r8 |= r8
mark_precise: frame0: regs=r5 stack= before 26: (7e) if w8 s>= w0 goto pc+5
mark_precise: frame0: parent state regs=r5 stack=:
R0_rw=scalar(smin=smin32=0,smax=umax=smax32=umax32=2,var_off=(0x0; 0x3))
R2_w=4
R3_w=0x1f00000034
R4_w=scalar(smin=0,smax=umax=0x3fffffffc,smax32=0x7ffffffc,umax32=0xfffffffc,
var_off=(0x0; 0x3fffffffc))
R5_rw=Pscalar(id=2)
R8_rw=scalar(id=2) R10=fp0
mark_precise: frame0: last_idx 24 first_idx 11 subseq_idx 26
mark_precise: frame0: regs=r5,r8 stack= before 24: (18) r2 = 0x4  <------ !!!
mark_precise: frame0: regs=r5,r8 stack= before 23: (bf) r5 = r8
mark_precise: frame0: regs=r8 stack= before 22: (67) r4 <<= 2
...

Note that right after "if w8 s>= w0 goto pc+5" is processed, the
backtracker state is:

mark_precise: frame0: regs=r5,r8 stack= before 24: (18) r2 = 0x4

So both r5 and r8 are accounted for.

> Note how at this instruction r5 and r8 *WERE* linked together, but we
> already lost this information for backtracking. So we don't mark w8 as
> precise. That's one part of the problem.
>
> The second part is that even if we knew that w8/r8 is precise, should
> we mark w0/r0 as precise? I actually need to think about this some
> more. Right now for conditional jumps we eagerly mark precision for
> both registers only in always/never taken scenarios.
>
> For now just narrowing down the issue, as I'm sure not many people
> followed all the above stuff carefully.
>
>
> P.S. For solving tracking of linked registers we can probably utilize
> instruction history, though technically they can be spread across
> multiple frames, between registers and stack slots, so that's a bit
> tricky.

For precision back-propagation we currently rely on id values stored
in the parent state when moving up across the child-to-parent state
boundary.

^ permalink raw reply [flat|nested] 24+ messages in thread
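That reliance on parent-state ids can be sketched as follows. This loosely mirrors what mark_precise_scalar_ids() does at a state boundary; the names and data layout here are illustrative, not the kernel's:

```python
def widen_mask(mask, parent_ids):
    """Grow a precision mask with every register that shares a scalar id
    (in the parent state) with a register already in the mask."""
    shared = {parent_ids[r] for r in mask if parent_ids.get(r) is not None}
    return mask | {r for r, i in parent_ids.items() if i in shared}

# Parent state from the log above: R5_rw=Pscalar(id=2), R8_rw=scalar(id=2);
# r0 carries no id link.
parent_ids = {"r5": 2, "r8": 2, "r0": None}
print(sorted(widen_mask({"r5"}, parent_ids)))  # ['r5', 'r8']
```

When the backtracker crosses into the parent state with r5 in the mask, the shared id pulls r8 in as well, which is why the log shows `regs=r5,r8` there.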
* Re: [Bug Report] bpf: incorrectly pruning runtime execution path 2023-12-14 0:08 ` Eduard Zingerman @ 2023-12-14 0:36 ` Andrii Nakryiko 0 siblings, 0 replies; 24+ messages in thread From: Andrii Nakryiko @ 2023-12-14 0:36 UTC (permalink / raw) To: Eduard Zingerman Cc: Hao Sun, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, bpf, Linux Kernel Mailing List On Wed, Dec 13, 2023 at 4:08 PM Eduard Zingerman <eddyz87@gmail.com> wrote: > > On Wed, 2023-12-13 at 15:30 -0800, Andrii Nakryiko wrote: > [...] > > Yes, thanks, the execution trace above was helpful. Let's try to > > minimize the example here, I'll keep original instruction indices, > > though: > > > > 23: (bf) r5 = r8 ; here we link r5 and r8 together > > 26: (7e) if w8 s>= w0 goto pc+5 ; here it's not always/never > > taken, so w8 and w0 remain imprecise > > 28: (0f) r8 += r8 ; here link between r8 and r5 is broken > > 29: (d6) if w5 s<= 0x1d goto pc+2 ; here we know value of w5 and > > so it's always/never taken, r5 is marked precise > > > > Now, if we look at r5's precision log at this instruction: > > > > 29: (d6) if w5 s<= 0x1d goto pc+2 > > mark_precise: frame0: last_idx 29 first_idx 26 subseq_idx -1 > > mark_precise: frame0: regs=r5 stack= before 28: (0f) r8 += r8 > > mark_precise: frame0: regs=r5 stack= before 27: (4f) r8 |= r8 > > mark_precise: frame0: regs=r5 stack= before 26: (7e) if w8 s>= w0 goto pc+5 > > Sorry, maybe it's time for me to get some sleep, but I don't see an > issue here. The "before" log is printed by backtrack_insn() before > instruction is backtracked. So the following: > > > mark_precise: frame0: regs=r5 stack= before 26: (7e) if w8 s>= w0 goto pc+5 > > Is a state of backtracker before "if w8 s>= w0 ..." is processed. 
> But running the test case I've shared wider precision trace for > this instruction looks as follows: > > 26: (7e) if w8 s>= w0 goto pc+5 ; R0=scalar(smin=smin32=0,smax=umax=smax32=umax32=2,var_off=(0x0; 0x3)) > R8=scalar(id=2,smax32=1) > 27: (4f) r8 |= r8 ; R8_w=scalar() > 28: (0f) r8 += r8 ; R8_w=scalar() > 29: (d6) if w5 s<= 0x1d goto pc+2 > mark_precise: frame0: last_idx 29 first_idx 26 subseq_idx -1 > mark_precise: frame0: regs=r5 stack= before 28: (0f) r8 += r8 What if we had a checkpoint here and not have a checkpoint at conditional jump instruction? The general point is that the checkpoint has information about linked registers at the very end of the instruction span that it represents, but any intermediate changes will be lost. It's a similar issue to stack access tracking. At some point we know that r3 is actually fp-8, but we will lose it by the time we actually get to the checkpoint. Yet this information is important in the context of some instruction before the checkpoint. I might be missing something as well, my brain is fried from all these verifier issues. > mark_precise: frame0: regs=r5 stack= before 27: (4f) r8 |= r8 > mark_precise: frame0: regs=r5 stack= before 26: (7e) if w8 s>= w0 goto pc+5 > mark_precise: frame0: parent state regs=r5 stack=: > R0_rw=scalar(smin=smin32=0,smax=umax=smax32=umax32=2,var_off=(0x0; 0x3)) > R2_w=4 > R3_w=0x1f00000034 > R4_w=scalar(smin=0,smax=umax=0x3fffffffc,smax32=0x7ffffffc,umax32=0xfffffffc, > var_off=(0x0; 0x3fffffffc)) > R5_rw=Pscalar(id=2) > R8_rw=scalar(id=2) R10=fp0 > mark_precise: frame0: last_idx 24 first_idx 11 subseq_idx 26 would this all work if we didn't have a checkpoint here? > mark_precise: frame0: regs=r5,r8 stack= before 24: (18) r2 = 0x4 <------ !!! > mark_precise: frame0: regs=r5,r8 stack= before 23: (bf) r5 = r8 > mark_precise: frame0: regs=r8 stack= before 22: (67) r4 <<= 2 > ... 
> > Note, that right after "if w8 s>= w0 goto pc+5" is processed the > backtracker state is: > > mark_precise: frame0: regs=r5,r8 stack= before 24: (18) r2 = 0x4 > > So both r5 and r8 are accounted for. > > > Note how at this instruction r5 and r8 *WERE* linked together, but we > > already lost this information for backtracking. So we don't mark w8 as > > precise. That's one part of the problem. > > > > The second part is that even if we knew that w8/r8 is precise, should > > we mark w0/r0 as precise? I actually need to think about this some > > more. Right now for conditional jumps we eagerly mark precision for > > both registers only in always/never taken scenarios. > > > > For now just narrowing down the issue, as I'm sure not many people > > followed all the above stuff carefully. > > > > > > P.S. For solving tracking of linked registers we can probably utilize > > instruction history, though technically they can be spread across > > multiple frames, between registers and stack slots, so that's a bit > > tricky. > > For precision back-propagation we currently rely on id values stored > in the parent state, when moving up from child to parent boundary. Everything might be good with linked IDs. But there is a real issue here, let's find what it is. ^ permalink raw reply [flat|nested] 24+ messages in thread
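The consequence of leaving r0 imprecise can be sketched as a toy state-equivalence check. This is an illustration of the pruning idea only, not the kernel's states_equal() code: a register that was never marked precise is treated as "don't care", so an already-verified state with w0 in [0, 2] can subsume a new state where w0 is unknown, and the path is pruned:

```python
def subsumes(old_rng, cur_rng):
    """Old (verified) range covers the current range."""
    return old_rng[0] <= cur_rng[0] and cur_rng[1] <= old_rng[1]

def states_match(old, cur):
    """Prune iff every *precise* register of the old state subsumes cur;
    imprecise registers are skipped entirely."""
    return all(
        subsumes(rng, cur[reg][0])
        for reg, (rng, precise) in old.items()
        if precise
    )

FULL = (-(1 << 63), (1 << 63) - 1)          # completely unknown scalar
old_imprecise = {"r0": ((0, 2), False)}     # r0 never marked precise
old_precise = {"r0": ((0, 2), True)}        # what arguably should happen
cur = {"r0": (FULL, False)}

print(states_match(old_imprecise, cur))     # True: pruned despite mismatch
print(states_match(old_precise, cur))       # False: pruning refused
```

Under this model, the missing precise mark on r0 is exactly what lets the narrower verified state match the wider runtime state at insn 26.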
* Re: [Bug Report] bpf: incorrectly pruning runtime execution path 2023-12-13 10:25 ` Hao Sun 2023-12-13 23:30 ` Andrii Nakryiko @ 2023-12-13 23:35 ` Eduard Zingerman 2023-12-13 23:40 ` Andrii Nakryiko 2023-12-14 9:38 ` Hao Sun 1 sibling, 2 replies; 24+ messages in thread From: Eduard Zingerman @ 2023-12-13 23:35 UTC (permalink / raw) To: Hao Sun, Andrii Nakryiko Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, bpf, Linux Kernel Mailing List On Wed, 2023-12-13 at 11:25 +0100, Hao Sun wrote: [...] > I tried to convert the repro to a valid test case in inline asm, but seems > JSET (if r0 & 0xfffffffe goto pc+3) is currently not supported in clang-17. > Will try after clang-18 is released. > > #30 is expected to be executed, see below where everything after ";" is > the runtime value: > ... > 6: (36) if w8 >= 0x69 goto pc+1 ; w8 = 0xbe, always taken > ... > 11: (45) if r0 & 0xfffffffe goto pc+3 ; r0 = 0x616, taken > ... > 18: (56) if w8 != 0xf goto pc+3 ; w8 not touched, taken > ... > 23: (bf) r5 = r8 ; w5 = 0xbe > 24: (18) r2 = 0x4 > 26: (7e) if w8 s>= w0 goto pc+5 ; non-taken > 27: (4f) r8 |= r8 > 28: (0f) r8 += r8 > 29: (d6) if w5 s<= 0x1d goto pc+2 ; non-taken > 30: (18) r0 = 0x4 ; executed > > Since the verifier prunes at #26, #30 is dead and eliminated. So, #30 > is executed after manually commenting out the dead code rewrite pass. > > From my understanding, I think r0 should be marked as precise when > first backtrack from #29, because r5 range at this point depends on w0 > as r8 and r5 share the same id at #26. 
Hi Hao, Andrii, I converted program in question to a runnable test, here is a link to the patch adding it and disabling dead code removal: https://gist.github.com/eddyz87/e888ad70c947f28f94146a47e33cd378 Run the test as follows: ./test_progs -vvv -a verifier_and/pruning_test And inspect the retval: do_prog_test_run:PASS:bpf_prog_test_run 0 nsec run_subtest:FAIL:647 Unexpected retval: 1353935089 != 4 Note that I tried this test with two functions: - bpf_get_current_cgroup_id, with this function I get retval 2, not 4 :) - bpf_get_prandom_u32, with this function I get a random retval each time. What is the expectation when 'bpf_get_current_cgroup_id' is used? That it is some known (to us) number, but verifier treats it as unknown scalar? Also, I find this portion of the verification log strange: ... 13: (0f) r0 += r0 ; R0_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=2, var_off=(0x0; 0x3)) 14: (2f) r4 *= r4 ; R4_w=scalar() 15: (18) r3 = 0x1f00000034 ; R3_w=0x1f00000034 17: (c4) w4 s>>= 29 ; R4_w=scalar(smin=0,smax=umax=0xffffffff,smin32=-4,smax32=3, var_off=(0x0; 0xffffffff)) 18: (56) if w8 != 0xf goto pc+3 ; R8_w=scalar(smin=0x800000000000000f,smax=0x7fffffff0000000f, umin=smin32=umin32=15,umax=0xffffffff0000000f, smax32=umax32=15,var_off=(0xf; 0xffffffff00000000)) 19: (d7) r3 = bswap32 r3 ; R3_w=scalar() 20: (18) r2 = 0x1c ; R2=28 22: (67) r4 <<= 2 ; R4_w=scalar(smin=0,smax=umax=0x3fffffffc, smax32=0x7ffffffc,umax32=0xfffffffc, var_off=(0x0; 0x3fffffffc)) 23: (bf) r5 = r8 ; R5_w=scalar(id=1,smin=0x800000000000000f, smax=0x7fffffff0000000f, umin=smin32=umin32=15, umax=0xffffffff0000000f, smax32=umax32=15, var_off=(0xf; 0xffffffff00000000)) R8=scalar(id=1,smin=0x800000000000000f, smax=0x7fffffff0000000f, umin=smin32=umin32=15, umax=0xffffffff0000000f, smax32=umax32=15, var_off=(0xf; 0xffffffff00000000)) 24: (18) r2 = 0x4 ; R2_w=4 26: (7e) if w8 s>= w0 goto pc+5 mark_precise: frame0: last_idx 26 first_idx 22 subseq_idx -1 mark_precise: frame0: regs=r5,r8 stack= before 
24: (18) r2 = 0x4 ... ^^^^^^^^^^ ^^^^^^^^^^ Here w8 == 15, w0 in range [0, 2], so the jump is being predicted, but for some reason R0 is not among the registers that would be marked precise. ^ permalink raw reply [flat|nested] 24+ messages in thread
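The prediction Eduard points at can be checked with simple arithmetic (a back-of-the-envelope sketch, not verifier code): after the insn-18 fall-through w8 is exactly 15, and the log at insn 13 bounds w0 to [0, 2], so `if w8 s>= w0` at insn 26 must be predicted taken.

```python
# Ranges taken from the verifier log above: w8 == 15 on the insn-18
# fall-through, w0 in [0, 2] after 'r0 += r0' at insn 13.
w8 = 0xf
# 's>=' is a signed 32-bit compare; all values here are small
# non-negative ints, so plain Python '>=' models it exactly
assert all(w8 >= w0 for w0 in (0, 1, 2))  # insn 26: always taken
```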
* Re: [Bug Report] bpf: incorrectly pruning runtime execution path 2023-12-13 23:35 ` Eduard Zingerman @ 2023-12-13 23:40 ` Andrii Nakryiko 2023-12-13 23:47 ` Eduard Zingerman 2023-12-14 9:38 ` Hao Sun 1 sibling, 1 reply; 24+ messages in thread From: Andrii Nakryiko @ 2023-12-13 23:40 UTC (permalink / raw) To: Eduard Zingerman Cc: Hao Sun, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, bpf, Linux Kernel Mailing List On Wed, Dec 13, 2023 at 3:35 PM Eduard Zingerman <eddyz87@gmail.com> wrote: > > On Wed, 2023-12-13 at 11:25 +0100, Hao Sun wrote: > [...] > > > I tried to convert the repro to a valid test case in inline asm, but seems > > JSET (if r0 & 0xfffffffe goto pc+3) is currently not supported in clang-17. > > Will try after clang-18 is released. > > > > #30 is expected to be executed, see below where everything after ";" is > > the runtime value: > > ... > > 6: (36) if w8 >= 0x69 goto pc+1 ; w8 = 0xbe, always taken > > ... > > 11: (45) if r0 & 0xfffffffe goto pc+3 ; r0 = 0x616, taken > > ... > > 18: (56) if w8 != 0xf goto pc+3 ; w8 not touched, taken > > ... > > 23: (bf) r5 = r8 ; w5 = 0xbe > > 24: (18) r2 = 0x4 > > 26: (7e) if w8 s>= w0 goto pc+5 ; non-taken > > 27: (4f) r8 |= r8 > > 28: (0f) r8 += r8 > > 29: (d6) if w5 s<= 0x1d goto pc+2 ; non-taken > > 30: (18) r0 = 0x4 ; executed > > > > Since the verifier prunes at #26, #30 is dead and eliminated. So, #30 > > is executed after manually commenting out the dead code rewrite pass. > > > > From my understanding, I think r0 should be marked as precise when > > first backtrack from #29, because r5 range at this point depends on w0 > > as r8 and r5 share the same id at #26. 
> > Hi Hao, Andrii, > > I converted program in question to a runnable test, here is a link to > the patch adding it and disabling dead code removal: > https://gist.github.com/eddyz87/e888ad70c947f28f94146a47e33cd378 > > Run the test as follows: > ./test_progs -vvv -a verifier_and/pruning_test > > And inspect the retval: > do_prog_test_run:PASS:bpf_prog_test_run 0 nsec > run_subtest:FAIL:647 Unexpected retval: 1353935089 != 4 > > Note that I tried this test with two functions: > - bpf_get_current_cgroup_id, with this function I get retval 2, not 4 :) > - bpf_get_prandom_u32, with this function I get a random retval each time. > > What is the expectation when 'bpf_get_current_cgroup_id' is used? > That it is some known (to us) number, but verifier treats it as unknown scalar? > > Also, I find this portion of the verification log strange: > > ... > 13: (0f) r0 += r0 ; R0_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=2, > var_off=(0x0; 0x3)) > 14: (2f) r4 *= r4 ; R4_w=scalar() > 15: (18) r3 = 0x1f00000034 ; R3_w=0x1f00000034 > 17: (c4) w4 s>>= 29 ; R4_w=scalar(smin=0,smax=umax=0xffffffff,smin32=-4,smax32=3, > var_off=(0x0; 0xffffffff)) > 18: (56) if w8 != 0xf goto pc+3 ; R8_w=scalar(smin=0x800000000000000f,smax=0x7fffffff0000000f, > umin=smin32=umin32=15,umax=0xffffffff0000000f, > smax32=umax32=15,var_off=(0xf; 0xffffffff00000000)) > 19: (d7) r3 = bswap32 r3 ; R3_w=scalar() > 20: (18) r2 = 0x1c ; R2=28 > 22: (67) r4 <<= 2 ; R4_w=scalar(smin=0,smax=umax=0x3fffffffc, > smax32=0x7ffffffc,umax32=0xfffffffc, > var_off=(0x0; 0x3fffffffc)) > 23: (bf) r5 = r8 ; R5_w=scalar(id=1,smin=0x800000000000000f, > smax=0x7fffffff0000000f, > umin=smin32=umin32=15, > umax=0xffffffff0000000f, > smax32=umax32=15, > var_off=(0xf; 0xffffffff00000000)) > R8=scalar(id=1,smin=0x800000000000000f, > smax=0x7fffffff0000000f, > umin=smin32=umin32=15, > umax=0xffffffff0000000f, > smax32=umax32=15, > var_off=(0xf; 0xffffffff00000000)) > 24: (18) r2 = 0x4 ; R2_w=4 > 26: (7e) if w8 s>= w0 goto pc+5 > 
mark_precise: frame0: last_idx 26 first_idx 22 subseq_idx -1 > mark_precise: frame0: regs=r5,r8 stack= before 24: (18) r2 = 0x4 > ... ^^^^^^^^^^ > ^^^^^^^^^^ > Here w8 == 15, w0 in range [0, 2], so the jump is being predicted, > but for some reason R0 is not among the registers that would be marked precise. It is, as a second step. There are two concatenated precision logs: mark_precise: frame0: last_idx 26 first_idx 22 subseq_idx -1 mark_precise: frame0: regs=r0 stack= before 24: (18) r2 = 0x4 mark_precise: frame0: regs=r0 stack= before 23: (bf) r5 = r8 mark_precise: frame0: regs=r0 stack= before 22: (67) r4 <<= 2 The issue is elsewhere, see my last email. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [Bug Report] bpf: incorrectly pruning runtime execution path 2023-12-13 23:40 ` Andrii Nakryiko @ 2023-12-13 23:47 ` Eduard Zingerman 2023-12-13 23:50 ` Andrii Nakryiko 0 siblings, 1 reply; 24+ messages in thread From: Eduard Zingerman @ 2023-12-13 23:47 UTC (permalink / raw) To: Andrii Nakryiko Cc: Hao Sun, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, bpf, Linux Kernel Mailing List On Wed, 2023-12-13 at 15:40 -0800, Andrii Nakryiko wrote: [...] > > 24: (18) r2 = 0x4 ; R2_w=4 > > 26: (7e) if w8 s>= w0 goto pc+5 > > mark_precise: frame0: last_idx 26 first_idx 22 subseq_idx -1 > > mark_precise: frame0: regs=r5,r8 stack= before 24: (18) r2 = 0x4 > > ... ^^^^^^^^^^ > > ^^^^^^^^^^ > > Here w8 == 15, w0 in range [0, 2], so the jump is being predicted, > > but for some reason R0 is not among the registers that would be marked precise. > > It is, as a second step. There are two concatenated precision logs: > > mark_precise: frame0: last_idx 26 first_idx 22 subseq_idx -1 > mark_precise: frame0: regs=r0 stack= before 24: (18) r2 = 0x4 > mark_precise: frame0: regs=r0 stack= before 23: (bf) r5 = r8 > mark_precise: frame0: regs=r0 stack= before 22: (67) r4 <<= 2 > > > The issue is elsewhere, see my last email. Oh, right, there are two calls to mark_chain_precision in a row, thanks ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [Bug Report] bpf: incorrectly pruning runtime execution path 2023-12-13 23:47 ` Eduard Zingerman @ 2023-12-13 23:50 ` Andrii Nakryiko 0 siblings, 0 replies; 24+ messages in thread From: Andrii Nakryiko @ 2023-12-13 23:50 UTC (permalink / raw) To: Eduard Zingerman Cc: Hao Sun, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, bpf, Linux Kernel Mailing List On Wed, Dec 13, 2023 at 3:47 PM Eduard Zingerman <eddyz87@gmail.com> wrote: > > On Wed, 2023-12-13 at 15:40 -0800, Andrii Nakryiko wrote: > [...] > > > 24: (18) r2 = 0x4 ; R2_w=4 > > > 26: (7e) if w8 s>= w0 goto pc+5 > > > mark_precise: frame0: last_idx 26 first_idx 22 subseq_idx -1 > > > mark_precise: frame0: regs=r5,r8 stack= before 24: (18) r2 = 0x4 > > > ... ^^^^^^^^^^ > > > ^^^^^^^^^^ > > > Here w8 == 15, w0 in range [0, 2], so the jump is being predicted, > > > but for some reason R0 is not among the registers that would be marked precise. > > > > It is, as a second step. There are two concatenated precision logs: > > > > mark_precise: frame0: last_idx 26 first_idx 22 subseq_idx -1 > > mark_precise: frame0: regs=r0 stack= before 24: (18) r2 = 0x4 > > mark_precise: frame0: regs=r0 stack= before 23: (bf) r5 = r8 > > mark_precise: frame0: regs=r0 stack= before 22: (67) r4 <<= 2 > > > > > > The issue is elsewhere, see my last email. > > Oh, right, there are two calls to mark_chain_precision in a row, thanks We should probably combine those two steps, though, backtrack_state allows us that now (see how propagate_precision() is doing that in one go). It used to be very hard to mark two registers at the same time, but now it's trivial. So not a bad idea to improve this and remove confusion, especially in big real-world programs. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [Bug Report] bpf: incorrectly pruning runtime execution path 2023-12-13 23:35 ` Eduard Zingerman 2023-12-13 23:40 ` Andrii Nakryiko @ 2023-12-14 9:38 ` Hao Sun 2023-12-14 15:10 ` Eduard Zingerman 1 sibling, 1 reply; 24+ messages in thread From: Hao Sun @ 2023-12-14 9:38 UTC (permalink / raw) To: Eduard Zingerman Cc: Andrii Nakryiko, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, bpf, Linux Kernel Mailing List > On 14 Dec 2023, at 12:35 AM, Eduard Zingerman <eddyz87@gmail.com> wrote: > > On Wed, 2023-12-13 at 11:25 +0100, Hao Sun wrote: > [...] > >> I tried to convert the repro to a valid test case in inline asm, but seems >> JSET (if r0 & 0xfffffffe goto pc+3) is currently not supported in clang-17. >> Will try after clang-18 is released. >> >> #30 is expected to be executed, see below where everything after ";" is >> the runtime value: >> ... >> 6: (36) if w8 >= 0x69 goto pc+1 ; w8 = 0xbe, always taken >> ... >> 11: (45) if r0 & 0xfffffffe goto pc+3 ; r0 = 0x616, taken >> ... >> 18: (56) if w8 != 0xf goto pc+3 ; w8 not touched, taken >> ... >> 23: (bf) r5 = r8 ; w5 = 0xbe >> 24: (18) r2 = 0x4 >> 26: (7e) if w8 s>= w0 goto pc+5 ; non-taken >> 27: (4f) r8 |= r8 >> 28: (0f) r8 += r8 >> 29: (d6) if w5 s<= 0x1d goto pc+2 ; non-taken >> 30: (18) r0 = 0x4 ; executed >> >> Since the verifier prunes at #26, #30 is dead and eliminated. So, #30 >> is executed after manually commenting out the dead code rewrite pass. >> >> From my understanding, I think r0 should be marked as precise when >> first backtrack from #29, because r5 range at this point depends on w0 >> as r8 and r5 share the same id at #26. 
> > Hi Hao, Andrii, > > I converted program in question to a runnable test, here is a link to > the patch adding it and disabling dead code removal: > https://gist.github.com/eddyz87/e888ad70c947f28f94146a47e33cd378 > > Run the test as follows: > ./test_progs -vvv -a verifier_and/pruning_test > > And inspect the retval: > do_prog_test_run:PASS:bpf_prog_test_run 0 nsec > run_subtest:FAIL:647 Unexpected retval: 1353935089 != 4 > Thanks for the runnable test! The reason the retval check fails is that the way you disable the dead code removal pass is incomplete. Disabling opt_remove_dead_code() just prevents instruction #30 from being removed, but note also opt_hard_wire_dead_code_branches(), which converts conditional jumps into unconditional ones, so #30 is still skipped. > Note that I tried this test with two functions: > - bpf_get_current_cgroup_id, with this function I get retval 2, not 4 :) > - bpf_get_prandom_u32, with this function I get a random retval each time. > > What is the expectation when 'bpf_get_current_cgroup_id' is used? > That it is some known (to us) number, but verifier treats it as unknown scalar? > Either one would work, but to make #30 always taken, r0 should be non-zero. ^ permalink raw reply [flat|nested] 24+ messages in thread
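Hao's annotated branch outcomes can be replayed mechanically in userspace (a sketch, not verifier code; `r0 = 0x616` is the runtime helper result taken from the annotation above, and `s32` models the verifier's signed 32-bit compares):

```python
M32 = 0xffffffff

def s32(v):
    """Interpret the low 32 bits of v as a signed value."""
    v &= M32
    return v - (1 << 32) if v & (1 << 31) else v

r2 = 0x1a000000be                  # 0: r2 = 0x1a000000be
r8 = r2                            # 3: r8 = r2
r0 = 0x616                         # 5: helper result, per the annotation
assert (r8 & M32) >= 0x69          # 6: taken (w8 = 0xbe)
assert r0 & 0xfffffffe             # 11: JSET taken, so 13: 'r0 += r0' is skipped
assert (r8 & M32) != 0xf           # 18: taken
r5 = r8                            # 23: r5 = r8 (w5 = 0xbe)
assert not (s32(r8) >= s32(r0))    # 26: non-taken (0xbe s< 0x616)
r8 = (r8 + r8) & ((1 << 64) - 1)   # 28: r8 += r8
assert not (s32(r5) <= 0x1d)       # 29: non-taken
# so insn 30 ('r0 = 0x4') is executed at runtime, as Hao reports
```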
* Re: [Bug Report] bpf: incorrectly pruning runtime execution path 2023-12-14 9:38 ` Hao Sun @ 2023-12-14 15:10 ` Eduard Zingerman 2023-12-14 16:26 ` Eduard Zingerman 0 siblings, 1 reply; 24+ messages in thread From: Eduard Zingerman @ 2023-12-14 15:10 UTC (permalink / raw) To: Hao Sun Cc: Andrii Nakryiko, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, bpf, Linux Kernel Mailing List [...] > The reason why retval checks fails is that the way you disable dead > code removal pass is not complete. Disable opt_remove_dead_code() > just prevent the instruction #30 from being removed, but also note > opt_hard_wire_dead_code_branches(), which convert conditional jump > into unconditional one, so #30 is still skipped. > > > Note that I tried this test with two functions: > > - bpf_get_current_cgroup_id, with this function I get retval 2, not 4 :) > > - bpf_get_prandom_u32, with this function I get a random retval each time. > > > > What is the expectation when 'bpf_get_current_cgroup_id' is used? > > That it is some known (to us) number, but verifier treats it as unknown scalar? > > > > Either one would work, but to make #30 always taken, r0 should be > non-zero. Oh, thank you, I made opt_hard_wire_dead_code_branches() a noop, replaced r0 = 0x4 by r0 /= 0 and see "divide error: 0000 [#1] PREEMPT SMP NOPTI" error in the kernel log on every second or third run of the test (when using prandom). Working to minimize the test case will share results a bit later. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [Bug Report] bpf: incorrectly pruning runtime execution path 2023-12-14 15:10 ` Eduard Zingerman @ 2023-12-14 16:26 ` Eduard Zingerman 2023-12-15 0:06 ` Andrii Nakryiko 0 siblings, 1 reply; 24+ messages in thread From: Eduard Zingerman @ 2023-12-14 16:26 UTC (permalink / raw) To: Hao Sun Cc: Andrii Nakryiko, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, bpf, Linux Kernel Mailing List On Thu, 2023-12-14 at 17:10 +0200, Eduard Zingerman wrote: > [...] > > The reason why retval checks fails is that the way you disable dead > > code removal pass is not complete. Disable opt_remove_dead_code() > > just prevent the instruction #30 from being removed, but also note > > opt_hard_wire_dead_code_branches(), which convert conditional jump > > into unconditional one, so #30 is still skipped. > > > > > Note that I tried this test with two functions: > > > - bpf_get_current_cgroup_id, with this function I get retval 2, not 4 :) > > > - bpf_get_prandom_u32, with this function I get a random retval each time. > > > > > > What is the expectation when 'bpf_get_current_cgroup_id' is used? > > > That it is some known (to us) number, but verifier treats it as unknown scalar? > > > > > > > Either one would work, but to make #30 always taken, r0 should be > > non-zero. > > Oh, thank you, I made opt_hard_wire_dead_code_branches() a noop, > replaced r0 = 0x4 by r0 /= 0 and see "divide error: 0000 [#1] PREEMPT SMP NOPTI" > error in the kernel log on every second or third run of the test > (when using prandom). > > Working to minimize the test case will share results a bit later. Here is the minimized version of the test: https://gist.github.com/eddyz87/fb4d3c7d5aabdc2ae247ed73fefccd32 If executed several times: ./test_progs -vvv -a verifier_and/pruning_test it eventually crashes VM with the following error: [ 2.039066] divide error: 0000 [#1] PREEMPT SMP NOPTI ... [ 2.039987] Call Trace: [ 2.039987] <TASK> [ 2.039987] ? die+0x36/0x90 [ 2.039987] ? 
do_trap+0xdb/0x100 [ 2.039987] ? bpf_prog_32cfdb2c00b08250_pruning_test+0x4d/0x60 [ 2.039987] ? do_error_trap+0x7d/0x110 [ 2.039987] ? bpf_prog_32cfdb2c00b08250_pruning_test+0x4d/0x60 [ 2.039987] ? exc_divide_error+0x38/0x50 [ 2.039987] ? bpf_prog_32cfdb2c00b08250_pruning_test+0x4d/0x60 [ 2.039987] ? asm_exc_divide_error+0x1a/0x20 [ 2.039987] ? bpf_prog_32cfdb2c00b08250_pruning_test+0x4d/0x60 [ 2.039987] bpf_test_run+0x1b5/0x350 [ 2.039987] ? bpf_test_run+0x115/0x350 ... I'll continue debugging this a bit later today. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [Bug Report] bpf: incorrectly pruning runtime execution path 2023-12-14 16:26 ` Eduard Zingerman @ 2023-12-15 0:06 ` Andrii Nakryiko 2023-12-15 0:16 ` Eduard Zingerman 2023-12-15 0:49 ` Eduard Zingerman 0 siblings, 2 replies; 24+ messages in thread From: Andrii Nakryiko @ 2023-12-15 0:06 UTC (permalink / raw) To: Eduard Zingerman Cc: Hao Sun, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, bpf, Linux Kernel Mailing List On Thu, Dec 14, 2023 at 8:26 AM Eduard Zingerman <eddyz87@gmail.com> wrote: > > On Thu, 2023-12-14 at 17:10 +0200, Eduard Zingerman wrote: > > [...] > > > The reason why retval checks fails is that the way you disable dead > > > code removal pass is not complete. Disable opt_remove_dead_code() > > > just prevent the instruction #30 from being removed, but also note > > > opt_hard_wire_dead_code_branches(), which convert conditional jump > > > into unconditional one, so #30 is still skipped. > > > > > > > Note that I tried this test with two functions: > > > > - bpf_get_current_cgroup_id, with this function I get retval 2, not 4 :) > > > > - bpf_get_prandom_u32, with this function I get a random retval each time. > > > > > > > > What is the expectation when 'bpf_get_current_cgroup_id' is used? > > > > That it is some known (to us) number, but verifier treats it as unknown scalar? > > > > > > > > > > Either one would work, but to make #30 always taken, r0 should be > > > non-zero. > > > > Oh, thank you, I made opt_hard_wire_dead_code_branches() a noop, > > replaced r0 = 0x4 by r0 /= 0 and see "divide error: 0000 [#1] PREEMPT SMP NOPTI" > > error in the kernel log on every second or third run of the test > > (when using prandom). > > > > Working to minimize the test case will share results a bit later. 
> > Here is the minimized version of the test: > https://gist.github.com/eddyz87/fb4d3c7d5aabdc2ae247ed73fefccd32 > > If executed several times: ./test_progs -vvv -a verifier_and/pruning_test > it eventually crashes VM with the following error: > > [ 2.039066] divide error: 0000 [#1] PREEMPT SMP NOPTI > ... > [ 2.039987] Call Trace: > [ 2.039987] <TASK> > [ 2.039987] ? die+0x36/0x90 > [ 2.039987] ? do_trap+0xdb/0x100 > [ 2.039987] ? bpf_prog_32cfdb2c00b08250_pruning_test+0x4d/0x60 > [ 2.039987] ? do_error_trap+0x7d/0x110 > [ 2.039987] ? bpf_prog_32cfdb2c00b08250_pruning_test+0x4d/0x60 > [ 2.039987] ? exc_divide_error+0x38/0x50 > [ 2.039987] ? bpf_prog_32cfdb2c00b08250_pruning_test+0x4d/0x60 > [ 2.039987] ? asm_exc_divide_error+0x1a/0x20 > [ 2.039987] ? bpf_prog_32cfdb2c00b08250_pruning_test+0x4d/0x60 > [ 2.039987] bpf_test_run+0x1b5/0x350 > [ 2.039987] ? bpf_test_run+0x115/0x350 > ... > > I'll continue debugging this a bit later today. > Great, thanks a lot, Eduard. Let's paste the program here for discussion: $ cat progs/verifier_blah.c // SPDX-License-Identifier: GPL-2.0 /* Copyright (C) 2023 SUSE LLC */ #include <linux/bpf.h> #include <bpf/bpf_helpers.h> #include "bpf_misc.h" SEC("socket") __success __flag(BPF_F_TEST_STATE_FREQ) __retval(42) __naked void pruning_test(void) { asm volatile ( " call %[bpf_get_prandom_u32];\n" " r7 = r0;\n" " call %[bpf_get_prandom_u32];\n" " r8 = 2;\n" " if r0 > 1 goto 1f;\n" " r8 = r7;\n" "1: r5 = r8;\n" " if r8 >= r0 goto 2f;\n" " r8 += r8;\n" " if r5 == 0 goto 2f;\n" " r0 /= 0;\n" "2: r0 = 42;\n" " exit;\n" : : __imm(bpf_get_prandom_u32) : __clobber_all); } char _license[] SEC("license") = "GPL"; If we look at relevant portion of verifier log for `if r5 == 0` we see this: 9: (15) if r5 == 0x0 goto pc+1 mark_precise: frame0: last_idx 9 first_idx 7 subseq_idx -1 mark_precise: frame0: regs=r5,r7 stack= before 8: (0f) r8 += r8 mark_precise: frame0: regs=r5,r7 stack= before 7: (3d) if r8 >= r0 goto pc+3 ^^ Note here that we only have 
r5 and r7, not r8. mark_precise: frame0: parent state regs=r5,r7 stack=: R0_rw=scalar(smin=smin32=0,smax=umax=smax32=umax32=1,var_off=(0x0; 0x1)) R5_rw=Pscalar(id=1) R7_w=Pscalar(id=1) R8_rw=scalar(id=1) R10=fp0 mark_precise: frame0: last_idx 6 first_idx 0 subseq_idx 7 mark_precise: frame0: regs=r5,r7,r8 stack= before 6: (bf) r5 = r8 mark_precise: frame0: regs=r7,r8 stack= before 5: (bf) r8 = r7 mark_precise: frame0: regs=r7 stack= before 4: (25) if r0 > 0x1 goto pc+1 mark_precise: frame0: regs=r7 stack= before 3: (b7) r8 = 2 mark_precise: frame0: regs=r7 stack= before 2: (85) call bpf_get_prandom_u32#7 mark_precise: frame0: regs=r7 stack= before 1: (bf) r7 = r0 mark_precise: frame0: regs=r0 stack= before 0: (85) call bpf_get_prandom_u32#7 Note above that r0 in `if r8 >= r0` is not marked as precise because at that point we don't know that r8 should be precise (due to us "forgetting" linked ID information). Now, let's comment out the "r8 += r8" instruction so that we preserve linkage between r5 and r8 (and also r7, but that's less relevant here). 8: (15) if r5 == 0x0 goto pc+1 mark_precise: frame0: last_idx 8 first_idx 7 subseq_idx -1 mark_precise: frame0: regs=r5,r7,r8 stack= before 7: (3d) if r8 >= r0 goto pc+2 ^^ Here note how we seek for r5,r7, *and* r8 to be precise... mark_precise: frame0: parent state regs=r0,r5,r7,r8 stack=: ... which leads to us adding r0 to the set due to that `if r8 >= r0` instruction. 
(btw, I was wrong yesterday, we do have logic to mark *both* registers of conditional jump if at least one of them is precise, so seems like we handle that well) R0_rw=Pscalar(smin=smin32=0,smax=umax=smax32=umax32=1,var_off=(0x0; 0x1)) R5_rw=Pscalar(id=1) R7_w=Pscalar(id=1) R8_rw=Pscalar(id=1) R10=fp0 mark_precise: frame0: last_idx 6 first_idx 0 subseq_idx 7 mark_precise: frame0: regs=r0,r5,r7,r8 stack= before 6: (bf) r5 = r8 mark_precise: frame0: regs=r0,r7,r8 stack= before 5: (bf) r8 = r7 mark_precise: frame0: regs=r0,r7 stack= before 4: (25) if r0 > 0x1 goto pc+1 mark_precise: frame0: regs=r0,r7 stack= before 3: (b7) r8 = 2 mark_precise: frame0: regs=r0,r7 stack= before 2: (85) call bpf_get_prandom_u32#7 mark_precise: frame0: regs=r7 stack= before 1: (bf) r7 = r0 mark_precise: frame0: regs=r0 stack= before 0: (85) call bpf_get_prandom_u32#7 So all in all, I still think that the root cause is what I said yesterday. We don't preserve information about linked registers at the per-instruction level, but we should. If you agree with the analysis, we can start discussing what's the best way to fix this. ^ permalink raw reply [flat|nested] 24+ messages in thread
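To see that the pruned path is genuinely reachable, the pasted program can be replayed in userspace (a Python model of the BPF semantics under my reading of the listing, not verifier code; `rand1`/`rand2` stand for the two `bpf_get_prandom_u32()` results):

```python
def run(rand1, rand2):
    r7 = rand1                # first call; r7 = r0
    r0 = rand2                # second call clobbers r0
    r8 = 2
    if not (r0 > 1):          # 'if r0 > 1 goto 1f' falls through
        r8 = r7
    r5 = r8                   # 1: r5 = r8
    if r8 >= r0:              # 'if r8 >= r0 goto 2f'
        return 42             # 2: r0 = 42; exit
    r8 += r8                  # the insn that breaks the r5/r8 id link
    if r5 == 0:               # 'if r5 == 0 goto 2f'
        return 42
    raise ZeroDivisionError   # 'r0 /= 0' -- the path the verifier pruned
```

Any run with `rand2 > 2` reaches the division: r8 stays 2, r5 = 2 is non-zero, and r8 < r0 — matching the sporadic "divide error" seen with prandom.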
* Re: [Bug Report] bpf: incorrectly pruning runtime execution path 2023-12-15 0:06 ` Andrii Nakryiko @ 2023-12-15 0:16 ` Eduard Zingerman 2023-12-15 0:49 ` Eduard Zingerman 1 sibling, 0 replies; 24+ messages in thread From: Eduard Zingerman @ 2023-12-15 0:16 UTC (permalink / raw) To: Andrii Nakryiko Cc: Hao Sun, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, bpf, Linux Kernel Mailing List On Thu, 2023-12-14 at 16:06 -0800, Andrii Nakryiko wrote: > On Thu, Dec 14, 2023 at 8:26 AM Eduard Zingerman <eddyz87@gmail.com> wrote: > > > > On Thu, 2023-12-14 at 17:10 +0200, Eduard Zingerman wrote: > > > [...] > > > > The reason why retval checks fails is that the way you disable dead > > > > code removal pass is not complete. Disable opt_remove_dead_code() > > > > just prevent the instruction #30 from being removed, but also note > > > > opt_hard_wire_dead_code_branches(), which convert conditional jump > > > > into unconditional one, so #30 is still skipped. > > > > > > > > > Note that I tried this test with two functions: > > > > > - bpf_get_current_cgroup_id, with this function I get retval 2, not 4 :) > > > > > - bpf_get_prandom_u32, with this function I get a random retval each time. > > > > > > > > > > What is the expectation when 'bpf_get_current_cgroup_id' is used? > > > > > That it is some known (to us) number, but verifier treats it as unknown scalar? > > > > > > > > > > > > > Either one would work, but to make #30 always taken, r0 should be > > > > non-zero. > > > > > > Oh, thank you, I made opt_hard_wire_dead_code_branches() a noop, > > > replaced r0 = 0x4 by r0 /= 0 and see "divide error: 0000 [#1] PREEMPT SMP NOPTI" > > > error in the kernel log on every second or third run of the test > > > (when using prandom). > > > > > > Working to minimize the test case will share results a bit later. 
> > Here is the minimized version of the test: > > https://gist.github.com/eddyz87/fb4d3c7d5aabdc2ae247ed73fefccd32 > > > > If executed several times: ./test_progs -vvv -a verifier_and/pruning_test > > it eventually crashes VM with the following error: > > > > [ 2.039066] divide error: 0000 [#1] PREEMPT SMP NOPTI > > ... > > [ 2.039987] Call Trace: > > [ 2.039987] <TASK> > > [ 2.039987] ? die+0x36/0x90 > > [ 2.039987] ? do_trap+0xdb/0x100 > > [ 2.039987] ? bpf_prog_32cfdb2c00b08250_pruning_test+0x4d/0x60 > > [ 2.039987] ? do_error_trap+0x7d/0x110 > > [ 2.039987] ? bpf_prog_32cfdb2c00b08250_pruning_test+0x4d/0x60 > > [ 2.039987] ? exc_divide_error+0x38/0x50 > > [ 2.039987] ? bpf_prog_32cfdb2c00b08250_pruning_test+0x4d/0x60 > > [ 2.039987] ? asm_exc_divide_error+0x1a/0x20 > > [ 2.039987] ? bpf_prog_32cfdb2c00b08250_pruning_test+0x4d/0x60 > > [ 2.039987] bpf_test_run+0x1b5/0x350 > > [ 2.039987] ? bpf_test_run+0x115/0x350 > > ... > > > > I'll continue debugging this a bit later today. > > > > Great, thanks a lot, Eduard. Let's paste the program here for discussion: > > ... > I managed to minimize it a bit more, getting rid of r5 (not that it changes anything): SEC("socket") __success __flag(BPF_F_TEST_STATE_FREQ) __retval(42) __naked void pruning_test(void) { asm volatile ( " call %[bpf_get_prandom_u32];\n" " r7 = r0;\n" " r8 = r0;\n" " call %[bpf_get_prandom_u32];\n" " if r0 > 1 goto +0;\n" " if r8 >= r0 goto 1f;\n" " r8 += r8;\n" " if r7 == 0 goto 1f;\n" " r0 /= 0;\n" "1: r0 = 42;\n" " exit;\n" : : __imm(bpf_get_prandom_u32) : __clobber_all); } > If you agree with the analysis, we can start discussing what's the > best way to fix this. Please give me some more time, I'm adding some prints to understand why the current logic does not mark r8 for the state that has "if r8 >= r0 goto 1f;\n" as its first instruction; on the surface it should. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [Bug Report] bpf: incorrectly pruning runtime execution path 2023-12-15 0:06 ` Andrii Nakryiko 2023-12-15 0:16 ` Eduard Zingerman @ 2023-12-15 0:49 ` Eduard Zingerman 2023-12-15 1:24 ` Eduard Zingerman 1 sibling, 1 reply; 24+ messages in thread From: Eduard Zingerman @ 2023-12-15 0:49 UTC (permalink / raw) To: Andrii Nakryiko Cc: Hao Sun, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, bpf, Linux Kernel Mailing List On Thu, 2023-12-14 at 16:06 -0800, Andrii Nakryiko wrote: [...] > If you agree with the analysis, we can start discussing what's the > best way to fix this. Ok, yeap, I agree with you. Backtracker marks both registers in 'if' statement if one of them is tracked, but r8 is not marked at block entry and we miss r0. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [Bug Report] bpf: incorrectly pruning runtime execution path 2023-12-15 0:49 ` Eduard Zingerman @ 2023-12-15 1:24 ` Eduard Zingerman 2023-12-15 1:43 ` Eduard Zingerman 2023-12-15 2:16 ` Alexei Starovoitov 0 siblings, 2 replies; 24+ messages in thread From: Eduard Zingerman @ 2023-12-15 1:24 UTC (permalink / raw) To: Andrii Nakryiko Cc: Hao Sun, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, bpf, Linux Kernel Mailing List On Fri, 2023-12-15 at 02:49 +0200, Eduard Zingerman wrote: > On Thu, 2023-12-14 at 16:06 -0800, Andrii Nakryiko wrote: > [...] > > If you agree with the analysis, we can start discussing what's the > > best way to fix this. > > Ok, yeap, I agree with you. > Backtracker marks both registers in 'if' statement if one of them is > tracked, but r8 is not marked at block entry and we miss r0. The brute-force solution is to keep a special mask for each conditional jump in jump history. In this mask, mark all registers and stack slots that gained range because of find_equal_scalars() executed for this conditional jump. Use this mask to extend precise registers set. However, such mask would be prohibitively large: (10+64)*8 bits. --- Here is an option that would fix the test in question, but I'm not sure if it covers all cases: 1. At the last instruction of each state (first instruction to be backtracked) we know the set of IDs that should be tracked for precision, as currently marked by mark_precise_scalar_ids(). 2. In jump history we can record IDs for src and dst registers when new entry is pushed. 3. While backtracking 'if' statement, if one of the recorded IDs is in the set identified at (1), add src/dst regs to precise registers set. E.g. 
for the test-case at hand: 0: (85) call bpf_get_prandom_u32#7 ; R0=scalar() 1: (bf) r7 = r0 ; R0=scalar(id=1) R7_w=scalar(id=1) 2: (bf) r8 = r0 ; R0=scalar(id=1) R8_w=scalar(id=1) 3: (85) call bpf_get_prandom_u32#7 ; R0=scalar() --- checkpoint #1 r7.id = 1, r8.id = 1 --- 4: (25) if r0 > 0x1 goto pc+0 ; R0=scalar(smin=smin32=0,smax=umax=smax32=umax32=1,...) --- checkpoint #2 r7.id = 1, r8.id = 1 --- 5: (3d) if r8 >= r0 goto pc+3 ; R0=1 R8=0 | record r8.id=1 in jump history 6: (0f) r8 += r8 ; R8=0 --- checkpoint #3 r7.id = 1, r8.id = 0 --- 7: (15) if r7 == 0x0 goto pc+1 The precise set for checkpoint #3 state is {1}. When insn (5) is backtracked r8.id would be in jump history and in "precise set" => r8 and r0 would be added to backtracker state. But this seems a bit ad-hoc. ^ permalink raw reply [flat|nested] 24+ messages in thread
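Step 3 of the option above can be sketched like this (a toy model: single frame, registers only, all names mine rather than verifier code, and per the caveat above it may not cover every case):

```python
# A jump-history entry pushed for an 'if' records the scalar ids of its
# dst/src registers at push time; id None means the register had no id.
def extend_precise_regs(precise_ids, precise_regs, hist_entry):
    dst_reg, dst_id, src_reg, src_id = hist_entry
    if dst_id in precise_ids or src_id in precise_ids:
        # a tracked id participated in this 'if', so both of its
        # registers must be marked precise as well
        return precise_regs | {dst_reg, src_reg}
    return precise_regs
```

For the trace above: checkpoint #3 tracks id 1 via r7, and insn 5 would be recorded as ('r8', 1, 'r0', None), so backtracking it extends the precise set to {r7, r8, r0}, including the previously missed r0.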
* Re: [Bug Report] bpf: incorrectly pruning runtime execution path 2023-12-15 1:24 ` Eduard Zingerman @ 2023-12-15 1:43 ` Eduard Zingerman 2023-12-15 2:16 ` Alexei Starovoitov 1 sibling, 0 replies; 24+ messages in thread From: Eduard Zingerman @ 2023-12-15 1:43 UTC (permalink / raw) To: Andrii Nakryiko Cc: Hao Sun, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, bpf, Linux Kernel Mailing List On Fri, 2023-12-15 at 03:24 +0200, Eduard Zingerman wrote: [...] > Here is an option that would fix the test in question, but I'm not > sure if it covers all cases: > 1. At the last instruction of each state (first instruction to be > backtracked) we know the set of IDs that should be tracked for > precision, as currently marked by mark_precise_scalar_ids(). > 2. In jump history we can record IDs for src and dst registers when new > entry is pushed. > 3. While backtracking 'if' statement, if one of the recorded IDs is in > the set identified at (1), add src/dst regs to precise registers set. Nah... this won't work for "second order" ids. --- suppose r2.id == r3.id here if r3 > 10 goto exit; r1 += r2 ... use r1 as precise ... ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [Bug Report] bpf: incorrectly pruning runtime execution path 2023-12-15 1:24 ` Eduard Zingerman 2023-12-15 1:43 ` Eduard Zingerman @ 2023-12-15 2:16 ` Alexei Starovoitov 2023-12-15 2:28 ` Eduard Zingerman 2023-12-15 20:55 ` Eduard Zingerman 1 sibling, 2 replies; 24+ messages in thread From: Alexei Starovoitov @ 2023-12-15 2:16 UTC (permalink / raw) To: Eduard Zingerman Cc: Andrii Nakryiko, Hao Sun, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, bpf, Linux Kernel Mailing List On Thu, Dec 14, 2023 at 5:24 PM Eduard Zingerman <eddyz87@gmail.com> wrote: > > On Fri, 2023-12-15 at 02:49 +0200, Eduard Zingerman wrote: > > On Thu, 2023-12-14 at 16:06 -0800, Andrii Nakryiko wrote: > > [...] > > > If you agree with the analysis, we can start discussing what's the > > > best way to fix this. > > > > Ok, yeap, I agree with you. > > Backtracker marks both registers in 'if' statement if one of them is > > tracked, but r8 is not marked at block entry and we miss r0. > > The brute-force solution is to keep a special mask for each > conditional jump in jump history. In this mask, mark all registers and > stack slots that gained range because of find_equal_scalars() executed > for this conditional jump. Use this mask to extend precise registers set. > However, such mask would be prohibitively large: (10+64)*8 bits. > > --- > > Here is an option that would fix the test in question, but I'm not > sure if it covers all cases: > 1. At the last instruction of each state (first instruction to be > backtracked) we know the set of IDs that should be tracked for > precision, as currently marked by mark_precise_scalar_ids(). > 2. In jump history we can record IDs for src and dst registers when new > entry is pushed. > 3. While backtracking 'if' statement, if one of the recorded IDs is in > the set identified at (1), add src/dst regs to precise registers set. > > E.g. 
for the test-case at hand: > > 0: (85) call bpf_get_prandom_u32#7 ; R0=scalar() > 1: (bf) r7 = r0 ; R0=scalar(id=1) R7_w=scalar(id=1) > 2: (bf) r8 = r0 ; R0=scalar(id=1) R8_w=scalar(id=1) > 3: (85) call bpf_get_prandom_u32#7 ; R0=scalar() > --- checkpoint #1 r7.id = 1, r8.id = 1 --- > 4: (25) if r0 > 0x1 goto pc+0 ; R0=scalar(smin=smin32=0,smax=umax=smax32=umax32=1,...) > --- checkpoint #2 r7.id = 1, r8.id = 1 --- > 5: (3d) if r8 >= r0 goto pc+3 ; R0=1 R8=0 | record r8.id=1 in jump history > 6: (0f) r8 += r8 ; R8=0 can we detect that any register link is broken and force checkpoint here? > --- checkpoint #3 r7.id = 1, r8.id = 0 --- > 7: (15) if r7 == 0x0 goto pc+1 ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [Bug Report] bpf: incorrectly pruning runtime execution path
  2023-12-15  2:16 ` Alexei Starovoitov
@ 2023-12-15  2:28   ` Eduard Zingerman
  2023-12-15  5:20     ` Andrii Nakryiko
  2023-12-15 20:55   ` Eduard Zingerman
  1 sibling, 1 reply; 24+ messages in thread
From: Eduard Zingerman @ 2023-12-15  2:28 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: Andrii Nakryiko, Hao Sun, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, bpf, Linux Kernel Mailing List

On Thu, 2023-12-14 at 18:16 -0800, Alexei Starovoitov wrote:
[...]
> > E.g. for the test-case at hand:
> >
> > 0: (85) call bpf_get_prandom_u32#7    ; R0=scalar()
> > 1: (bf) r7 = r0                       ; R0=scalar(id=1) R7_w=scalar(id=1)
> > 2: (bf) r8 = r0                       ; R0=scalar(id=1) R8_w=scalar(id=1)
> > 3: (85) call bpf_get_prandom_u32#7    ; R0=scalar()
> > --- checkpoint #1 r7.id = 1, r8.id = 1 ---
> > 4: (25) if r0 > 0x1 goto pc+0         ; R0=scalar(smin=smin32=0,smax=umax=smax32=umax32=1,...)
> > --- checkpoint #2 r7.id = 1, r8.id = 1 ---
> > 5: (3d) if r8 >= r0 goto pc+3         ; R0=1 R8=0 | record r8.id=1 in jump history
> > 6: (0f) r8 += r8                      ; R8=0
>
> can we detect that any register link is broken and force checkpoint here?

Should be possible. I'll try this in the morning and check veristat results.

By the way, I added some stats collection for find_equal_scalars() and see
the following results when run on ./test_progs:
- maximal number of registers with same id per call: 3
- average number of registers with same id per call: 1.4

^ permalink raw reply	[flat|nested] 24+ messages in thread
* Re: [Bug Report] bpf: incorrectly pruning runtime execution path
  2023-12-15  2:28 ` Eduard Zingerman
@ 2023-12-15  5:20   ` Andrii Nakryiko
  2023-12-15 16:22     ` Eduard Zingerman
  0 siblings, 1 reply; 24+ messages in thread
From: Andrii Nakryiko @ 2023-12-15  5:20 UTC (permalink / raw)
To: Eduard Zingerman
Cc: Alexei Starovoitov, Hao Sun, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, bpf, Linux Kernel Mailing List

On Thu, Dec 14, 2023 at 6:28 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Thu, 2023-12-14 at 18:16 -0800, Alexei Starovoitov wrote:
> [...]
> > > E.g. for the test-case at hand:
> > >
> > > 0: (85) call bpf_get_prandom_u32#7    ; R0=scalar()
> > > 1: (bf) r7 = r0                       ; R0=scalar(id=1) R7_w=scalar(id=1)
> > > 2: (bf) r8 = r0                       ; R0=scalar(id=1) R8_w=scalar(id=1)
> > > 3: (85) call bpf_get_prandom_u32#7    ; R0=scalar()
> > > --- checkpoint #1 r7.id = 1, r8.id = 1 ---
> > > 4: (25) if r0 > 0x1 goto pc+0         ; R0=scalar(smin=smin32=0,smax=umax=smax32=umax32=1,...)
> > > --- checkpoint #2 r7.id = 1, r8.id = 1 ---
> > > 5: (3d) if r8 >= r0 goto pc+3         ; R0=1 R8=0 | record r8.id=1 in jump history
> > > 6: (0f) r8 += r8                      ; R8=0
> >
> > can we detect that any register link is broken and force checkpoint here?
>
> Should be possible. I'll try this in the morning and check veristat results.
>
> By the way, I added some stats collection for find_equal_scalars() and see
> the following results when run on ./test_progs:
> - maximal number of registers with same id per call: 3
> - average number of registers with same id per call: 1.4

What if we keep 8 extra bytes in jump/instruction history and encode
up to 8 linked registers/slots:

1. 1 bit to mark whether it's a src_reg set, or dst_reg set
2. 1 bit to mark whether it's a stack slot or register
3. 6 bits (0..63 values) to record register or slot number

If we ever need more than 8 linked registers, we can just forcefully
some "links" by resetting some IDs?
BTW, is it only conditional jumps that need to record this linked
register sets? Did we previously discuss why we don't need this for
any other operation?

^ permalink raw reply	[flat|nested] 24+ messages in thread
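[Editor's sketch] One possible shape for the 8-byte encoding proposed above. This is a sketch under stated assumptions: the helper names and the exact bit positions are invented here, not taken from the kernel sources, and an all-zero byte doubles as "empty slot", which a real implementation would have to disambiguate (e.g. with a count field).

```c
#include <assert.h>
#include <stdint.h>

/* Up to 8 linked registers/slots packed one byte each into a u64 kept
 * alongside a jump-history entry:
 *   bit 7     - 1 if the entry belongs to the src_reg set, 0 for dst_reg
 *   bit 6     - 1 if the entry is a stack slot, 0 if a register
 *   bits 0..5 - register number or stack slot index (0..63)
 */
typedef uint64_t linked_regs_t;

static inline uint8_t linked_entry(int is_src, int is_slot, int num)
{
	return (uint8_t)(((is_src & 1) << 7) | ((is_slot & 1) << 6) | (num & 0x3f));
}

/* Store an entry in byte i (0..7) of the packed set. */
static inline linked_regs_t linked_regs_add(linked_regs_t set, int i,
					    int is_src, int is_slot, int num)
{
	return set | ((linked_regs_t)linked_entry(is_src, is_slot, num) << (i * 8));
}

static inline int linked_regs_num(linked_regs_t set, int i)
{
	return (int)((set >> (i * 8)) & 0x3f);
}

static inline int linked_regs_is_slot(linked_regs_t set, int i)
{
	return (int)((set >> (i * 8 + 6)) & 1);
}
```

As Eduard notes in the reply, the src/dst bit could likely be dropped, which would free a bit for an "occupied" flag.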
* Re: [Bug Report] bpf: incorrectly pruning runtime execution path
  2023-12-15  5:20 ` Andrii Nakryiko
@ 2023-12-15 16:22   ` Eduard Zingerman
  2023-12-15 17:01     ` Andrii Nakryiko
  0 siblings, 1 reply; 24+ messages in thread
From: Eduard Zingerman @ 2023-12-15 16:22 UTC (permalink / raw)
To: Andrii Nakryiko
Cc: Alexei Starovoitov, Hao Sun, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, bpf, Linux Kernel Mailing List

On Thu, 2023-12-14 at 21:20 -0800, Andrii Nakryiko wrote:
[...]
> > > can we detect that any register link is broken and force checkpoint here?
> >
> > Should be possible. I'll try this in the morning and check veristat results.

{still working on this}

> > By the way, I added some stats collection for find_equal_scalars() and see
> > the following results when run on ./test_progs:
> > - maximal number of registers with same id per call: 3
> > - average number of registers with same id per call: 1.4
>
> What if we keep 8 extra bytes in jump/instruction history and encode
> up to 8 linked registers/slots:
>
> 1. 1 bit to mark whether it's a src_reg set, or dst_reg set
> 2. 1 bit to mark whether it's a stack slot or register
> 3. 6 bits (0..63 values) to record register or slot number
>
> If we ever need more than 8 linked registers, we can just forcefully
> some "links" by resetting some IDs?

That should work as well.
Probably don't need src/dst bit, as backtracker marks both as precise
when processing conditional jump.

You mean "just forcefully [breaking] some "links" by resetting ...", right?

> BTW, is it only conditional jumps that need to record this linked
> register sets? Did we previously discuss why we don't need this for
> any other operation?

Don't think that we discussed it.
Here is my reasoning: the range transfer happens at find_equal_scalars()
which is called only from check_cond_jmp_op().
I think there are no other effects IDs have for scalar values.
Thus, covering conditional jumps seems sufficient.
^ permalink raw reply [flat|nested] 24+ messages in thread
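[Editor's sketch] The range transfer Eduard describes can be illustrated with a toy model. The types below are simplified stand-ins, not the kernel's `bpf_reg_state`, and only smin/smax are copied here, while the real `find_equal_scalars()` syncs the full range and tnum of the known register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model: after a conditional jump refines the range of one scalar,
 * the same range is copied to every register sharing its id. */
struct toy_reg {
	uint32_t id;          /* 0 = not linked to anything */
	int64_t smin, smax;   /* signed range bounds */
};

#define TOY_NREGS 11

static void toy_find_equal_scalars(struct toy_reg *regs, const struct toy_reg *known)
{
	for (size_t i = 0; i < TOY_NREGS; i++)
		if (known->id && regs[i].id == known->id) {
			regs[i].smin = known->smin;
			regs[i].smax = known->smax;
		}
}
```

In the trace earlier in the thread, this is why `if r0 > 0x1` at insn 4 narrows r7 and r8 as well: all three share id=1 at that point.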
* Re: [Bug Report] bpf: incorrectly pruning runtime execution path
  2023-12-15 16:22 ` Eduard Zingerman
@ 2023-12-15 17:01   ` Andrii Nakryiko
  0 siblings, 0 replies; 24+ messages in thread
From: Andrii Nakryiko @ 2023-12-15 17:01 UTC (permalink / raw)
To: Eduard Zingerman
Cc: Alexei Starovoitov, Hao Sun, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, bpf, Linux Kernel Mailing List

On Fri, Dec 15, 2023 at 8:22 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Thu, 2023-12-14 at 21:20 -0800, Andrii Nakryiko wrote:
> [...]
> > > > can we detect that any register link is broken and force checkpoint here?
> > >
> > > Should be possible. I'll try this in the morning and check veristat results.
>
> {still working on this}
>
> > > By the way, I added some stats collection for find_equal_scalars() and see
> > > the following results when run on ./test_progs:
> > > - maximal number of registers with same id per call: 3
> > > - average number of registers with same id per call: 1.4
> >
> > What if we keep 8 extra bytes in jump/instruction history and encode
> > up to 8 linked registers/slots:
> >
> > 1. 1 bit to mark whether it's a src_reg set, or dst_reg set
> > 2. 1 bit to mark whether it's a stack slot or register
> > 3. 6 bits (0..63 values) to record register or slot number
> >
> > If we ever need more than 8 linked registers, we can just forcefully
> > some "links" by resetting some IDs?
>
> That should work as well.
> Probably don't need src/dst bit, as backtracker marks both as precise
> when processing conditional jump.

yeah, probably

> You mean "just forcefully [breaking] some "links" by resetting ...", right?

yeah, breaking, sorry, inattentive brain :)

> > BTW, is it only conditional jumps that need to record this linked
> > register sets? Did we previously discuss why we don't need this for
> > any other operation?
>
> Don't think that we discussed it.
> Here is my reasoning: the range transfer happens at find_equal_scalars()
> which is called only from check_cond_jmp_op().
> I think there are no other effects IDs have for scalar values.
> Thus, covering conditional jumps seems sufficient.

yep, makes sense, any other operation is breaking the link

^ permalink raw reply	[flat|nested] 24+ messages in thread
* Re: [Bug Report] bpf: incorrectly pruning runtime execution path
  2023-12-15  2:16 ` Alexei Starovoitov
  2023-12-15  2:28   ` Eduard Zingerman
@ 2023-12-15 20:55   ` Eduard Zingerman
  1 sibling, 0 replies; 24+ messages in thread
From: Eduard Zingerman @ 2023-12-15 20:55 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: Andrii Nakryiko, Hao Sun, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, bpf, Linux Kernel Mailing List

On Thu, 2023-12-14 at 18:16 -0800, Alexei Starovoitov wrote:
[...]
> > E.g. for the test-case at hand:
> >
> > 0: (85) call bpf_get_prandom_u32#7    ; R0=scalar()
> > 1: (bf) r7 = r0                       ; R0=scalar(id=1) R7_w=scalar(id=1)
> > 2: (bf) r8 = r0                       ; R0=scalar(id=1) R8_w=scalar(id=1)
> > 3: (85) call bpf_get_prandom_u32#7    ; R0=scalar()
> > --- checkpoint #1 r7.id = 1, r8.id = 1 ---
> > 4: (25) if r0 > 0x1 goto pc+0         ; R0=scalar(smin=smin32=0,smax=umax=smax32=umax32=1,...)
> > --- checkpoint #2 r7.id = 1, r8.id = 1 ---
> > 5: (3d) if r8 >= r0 goto pc+3         ; R0=1 R8=0 | record r8.id=1 in jump history
> > 6: (0f) r8 += r8                      ; R8=0
>
> can we detect that any register link is broken and force checkpoint here?

I implemented this and pushed it to GitHub [0] for the moment.
The minimized test case and the original reproducer are both passing.
About 15 self-tests are failing; I looked through each one, and the failures
seem to be caused by changes in the log. I might have missed something, though.

Veristat results are "not great, not terrible", the full table comparing
this patch to master is at [1], the summary is as follows:
- average increase in number of processed instructions: 3%
- max increase in number of processed instructions: 81%
- average increase in number of processed states: 76%
- max increase in number of processed states: 482%

The hack with adding a BPF_ID_TRANSFERED_RANGE bit to the scalar id, if that
id was used for range transfer, is ugly but necessary. W/o it, the number of
processed states might increase 10x for some selftests.
I will now implement the 8-byte jump history modification suggested by
Andrii in order to compare the patches.

[0] https://github.com/eddyz87/bpf/tree/find-equal-scalars-and-precision-fix-new-states
[1] https://gist.github.com/eddyz87/73e3c6df31a80ad8660ae079e16ae365

^ permalink raw reply	[flat|nested] 24+ messages in thread
end of thread, other threads:[~2023-12-15 20:55 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-12-11 15:31 [Bug Report] bpf: incorrectly pruning runtime execution path Hao Sun
2023-12-13  0:51 ` Andrii Nakryiko
2023-12-13 10:25 ` Hao Sun
2023-12-13 23:30 ` Andrii Nakryiko
2023-12-14  0:08 ` Eduard Zingerman
2023-12-14  0:36 ` Andrii Nakryiko
2023-12-13 23:35 ` Eduard Zingerman
2023-12-13 23:40 ` Andrii Nakryiko
2023-12-13 23:47 ` Eduard Zingerman
2023-12-13 23:50 ` Andrii Nakryiko
2023-12-14  9:38 ` Hao Sun
2023-12-14 15:10 ` Eduard Zingerman
2023-12-14 16:26 ` Eduard Zingerman
2023-12-15  0:06 ` Andrii Nakryiko
2023-12-15  0:16 ` Eduard Zingerman
2023-12-15  0:49 ` Eduard Zingerman
2023-12-15  1:24 ` Eduard Zingerman
2023-12-15  1:43 ` Eduard Zingerman
2023-12-15  2:16 ` Alexei Starovoitov
2023-12-15  2:28 ` Eduard Zingerman
2023-12-15  5:20 ` Andrii Nakryiko
2023-12-15 16:22 ` Eduard Zingerman
2023-12-15 17:01 ` Andrii Nakryiko
2023-12-15 20:55 ` Eduard Zingerman