public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Eduard Zingerman <eddyz87@gmail.com>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>,
	Hao Sun <sunhao.th@gmail.com>
Cc: Alexei Starovoitov <ast@kernel.org>,
	Andrii Nakryiko <andrii@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>, bpf <bpf@vger.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [Bug Report] bpf: incorrectly pruning runtime execution path
Date: Thu, 14 Dec 2023 02:08:56 +0200	[thread overview]
Message-ID: <35be3c5f29ee0e9a49ed29e71044f0ad25d97d9d.camel@gmail.com> (raw)
In-Reply-To: <CAEf4BzbGSrU4NgM1Ps0g_ch8G68CWEsP50Y+Wy8-SfYnpHwVGA@mail.gmail.com>

On Wed, 2023-12-13 at 15:30 -0800, Andrii Nakryiko wrote:
[...]
> Yes, thanks, the execution trace above was helpful. Let's try to
> minimize the example here, I'll keep original instruction indices,
> though:
> 
>    23: (bf) r5 = r8                   ; here we link r5 and r8 together
>    26: (7e) if w8 s>= w0 goto pc+5    ; here it's not always/never
> taken, so w8 and w0 remain imprecise
>    28: (0f) r8 += r8                  ; here link between r8 and r5 is broken
>    29: (d6) if w5 s<= 0x1d goto pc+2  ; here we know value of w5 and
> so it's always/never taken, r5 is marked precise
> 
> Now, if we look at r5's precision log at this instruction:
> 
> 29: (d6) if w5 s<= 0x1d goto pc+2
> mark_precise: frame0: last_idx 29 first_idx 26 subseq_idx -1
> mark_precise: frame0: regs=r5 stack= before 28: (0f) r8 += r8
> mark_precise: frame0: regs=r5 stack= before 27: (4f) r8 |= r8
> mark_precise: frame0: regs=r5 stack= before 26: (7e) if w8 s>= w0 goto pc+5

Sorry, maybe it's time for me to get some sleep, but I don't see an
issue here. The "before" log is printed by backtrack_insn() before
instruction is backtracked. So the following:

> mark_precise: frame0: regs=r5 stack= before 26: (7e) if w8 s>= w0 goto pc+5

Is a state of backtracker before "if w8 s>= w0 ..." is processed.
But running the test case I've shared wider precision trace for
this instruction looks as follows:

  26: (7e) if w8 s>= w0 goto pc+5       ; R0=scalar(smin=smin32=0,smax=umax=smax32=umax32=2,var_off=(0x0; 0x3))
                                          R8=scalar(id=2,smax32=1)
  27: (4f) r8 |= r8                     ; R8_w=scalar()
  28: (0f) r8 += r8                     ; R8_w=scalar()
  29: (d6) if w5 s<= 0x1d goto pc+2
  mark_precise: frame0: last_idx 29 first_idx 26 subseq_idx -1 
  mark_precise: frame0: regs=r5 stack= before 28: (0f) r8 += r8
  mark_precise: frame0: regs=r5 stack= before 27: (4f) r8 |= r8
  mark_precise: frame0: regs=r5 stack= before 26: (7e) if w8 s>= w0 goto pc+5
  mark_precise: frame0: parent state regs=r5 stack=: 
     R0_rw=scalar(smin=smin32=0,smax=umax=smax32=umax32=2,var_off=(0x0; 0x3))
     R2_w=4
     R3_w=0x1f00000034
     R4_w=scalar(smin=0,smax=umax=0x3fffffffc,smax32=0x7ffffffc,umax32=0xfffffffc,
                 var_off=(0x0; 0x3fffffffc))
     R5_rw=Pscalar(id=2)
     R8_rw=scalar(id=2) R10=fp0
  mark_precise: frame0: last_idx 24 first_idx 11 subseq_idx 26 
  mark_precise: frame0: regs=r5,r8 stack= before 24: (18) r2 = 0x4     <------ !!!
  mark_precise: frame0: regs=r5,r8 stack= before 23: (bf) r5 = r8
  mark_precise: frame0: regs=r8 stack= before 22: (67) r4 <<= 2
  ...

Note, that right after "if w8 s>= w0 goto pc+5" is processed the
backtracker state is:

  mark_precise: frame0: regs=r5,r8 stack= before 24: (18) r2 = 0x4

So both r5 and r8 are accounted for.

> Note how at this instruction r5 and r8 *WERE* linked together, but we
> already lost this information for backtracking. So we don't mark w8 as
> precise. That's one part of the problem.
> 
> The second part is that even if we knew that w8/r8 is precise, should
> we mark w0/r0 as precise? I actually need to think about this some
> more. Right now for conditional jumps we eagerly mark precision for
> both registers only in always/never taken scenarios.
> 
> For now just narrowing down the issue, as I'm sure not many people
> followed all the above stuff carefully.
> 
> 
> P.S. For solving tracking of linked registers we can probably utilize
> instruction history, though technically they can be spread across
> multiple frames, between registers and stack slots, so that's a bit
> tricky.

For precision back-propagation we currently rely on id values stored
in the parent state, when moving up from child to parent boundary.

  reply	other threads:[~2023-12-14  0:09 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-12-11 15:31 [Bug Report] bpf: incorrectly pruning runtime execution path Hao Sun
2023-12-13  0:51 ` Andrii Nakryiko
2023-12-13 10:25   ` Hao Sun
2023-12-13 23:30     ` Andrii Nakryiko
2023-12-14  0:08       ` Eduard Zingerman [this message]
2023-12-14  0:36         ` Andrii Nakryiko
2023-12-13 23:35     ` Eduard Zingerman
2023-12-13 23:40       ` Andrii Nakryiko
2023-12-13 23:47         ` Eduard Zingerman
2023-12-13 23:50           ` Andrii Nakryiko
2023-12-14  9:38       ` Hao Sun
2023-12-14 15:10         ` Eduard Zingerman
2023-12-14 16:26           ` Eduard Zingerman
2023-12-15  0:06             ` Andrii Nakryiko
2023-12-15  0:16               ` Eduard Zingerman
2023-12-15  0:49               ` Eduard Zingerman
2023-12-15  1:24                 ` Eduard Zingerman
2023-12-15  1:43                   ` Eduard Zingerman
2023-12-15  2:16                   ` Alexei Starovoitov
2023-12-15  2:28                     ` Eduard Zingerman
2023-12-15  5:20                       ` Andrii Nakryiko
2023-12-15 16:22                         ` Eduard Zingerman
2023-12-15 17:01                           ` Andrii Nakryiko
2023-12-15 20:55                     ` Eduard Zingerman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=35be3c5f29ee0e9a49ed29e71044f0ad25d97d9d.camel@gmail.com \
    --to=eddyz87@gmail.com \
    --cc=andrii.nakryiko@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=sunhao.th@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox