BPF List
 help / color / mirror / Atom feed
From: Eduard Zingerman <eddyz87@gmail.com>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>,
	Andrew Werner <awerner32@gmail.com>, bpf <bpf@vger.kernel.org>,
	Andrei Matei <andreimatei1@gmail.com>,
	Tamir Duberstein <tamird@gmail.com>,
	Joanne Koong <joannelkoong@gmail.com>,
	kernel-team@dataexmachina.dev, Song Liu <song@kernel.org>,
	Kumar Kartikeya Dwivedi <memxor@gmail.com>
Subject: Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)
Date: Fri, 22 Sep 2023 04:01:05 +0300	[thread overview]
Message-ID: <ec118c24a33fb740ecaafd9a55416d56fcb77776.camel@gmail.com> (raw)
In-Reply-To: <52df1240415be1ee8827cb6395fd339a720e229c.camel@gmail.com>

On Thu, 2023-09-21 at 21:16 +0300, Eduard Zingerman wrote:
> On Thu, 2023-09-21 at 09:35 -0700, Andrii Nakryiko wrote:
> > I've been thinking in a similar direction as Alexei, overnight. Here
> > are some raw thoughts.
> > 
> > I think the overall approach with iter verification is sound. If we
> > loop and see an equivalent state at the entry to next loop iteration,
> > then it's safe to assume doing many iterations is safe. The problem is
> > that delayed precision and read marks make this state equivalence
> > wrong in some case. So we need to find a solution that will ensure
> > that all precision and read marks are propagated to parent state
> > before making a decision about convergence.
> > 
> > The idea is to let verifier explore all code paths starting from
> > iteration #N, except the code paths that lead to looping into
> > iteration #N+1. I tried to do that with validating NULL case first and
> > exiting from loop on each iteration (first), but clearly that doesn't
> > capture all the cases, as Eduard have shown.
> > 
> > So what if we delay convergence state checks (and then further
> > exploration at iteration #N+1) using BFS instead of DFS? That is, when
> > we are about to start new iteration and check state convergence, we
> > enqueue current state to be processed later after all the states that
> > "belong" to iteration #N are processed.
> 
> This sounds correct if one iterator is involved.
> 
> > We should work out exact details on how to do this hybrid BFS+DFS, but
> > let's think carefully if this will solve the problems?
> > 
> > I think this is conceptually similar to what Alexei proposed above.
> > Basically, we "unroll" loop verification iteration by iteration, but
> > make sure that we explore all the branching within iteration #N before
> > going one iteration deeper.
> > 
> > Let's think if there are any cases which wouldn't be handled. And then
> > think how to implement this elegantly (e.g., some sort of queue within
> > a parent state, which sounds similar to this separate "branches"
> > counter that Alexei is proposing above).
> 
> To better understand the suggestion, suppose there is some iterator
> 'I' and two states:
> - state S1 where depth(I) == N and pending instruction is "next(I)"
> - state S2 where depth(I) == N and pending instruction is *not* "next(I)"
> In such situation state S2 should be verified first, right?
> And in general, any state that is not at "next(I)" should be explored
> before S1, right?
> 
> Such interpretation seems to be prone to deadlocks, e.g. suppose there
> are two iterators: I1 and I2, and two states:
> - state S1 with depth(I1) == N, depth(I2) == M, at instruction "next(I1)";
> - state S2 with depth(I1) == N, depth(I2) == M, at instruction "next(I2)".
> 
> E.g. like in the following loop:
> 
>     for (;;) {
>       if (<random condition>)
>         if (!next(it1)) break; // a
>       else
>         if (!next(it2)) break; // b
>       ...
>     }
> 
> I think it is possible to get to two states here:
> - (a) it1(active, depth 0), it2(active, depth 0) at insn "next(it1)"
> - (b) it1(active, depth 0), it2(active, depth 0) at insn "next(it2)"
> 
> And it is unclear which one should be processed next.
> Am I missing something?

Not sure if such ordering issues a real issues or any of the
candidates could be picked.

How about a more dumb solution which, imho, is easier to convince
oneself is correct (at-least from logical pov, not necessarily
implementation): substitute the goal of finding exact precision marks
with the goal of finding conservative precision marks. These marks
could be used instead of exact for states_equal() when .branches > 0.

A straightforward way to compute such marks is to run a flow-sensitive
context/path-insensitive backwards DFA before do_check(). The result
of this DFA is a bitmask encoding which registers / stack slots may be
precise at each instruction. Information for instructions other than
is_iter_next_insn() could be discarded.

Looking at the tests that are currently failing with my local fixes
(which force exact states comparions for .branches > 0) I think that
DFA based version would cover all of them.

This sure adds some code to the verifier, however changing current
states traversal logic from DFS to combination of DFS+BFS is a
significant change as well.

Wdyt?

  reply	other threads:[~2023-09-22  1:01 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-07-07 14:04 [BUG] verifier escape with iteration helpers (bpf_loop, ...) Andrew Werner
2023-07-07 16:44 ` Alexei Starovoitov
2023-07-07 18:08   ` Andrii Nakryiko
2023-07-07 18:21     ` Andrew Werner
2023-09-17 21:37     ` Eduard Zingerman
2023-09-17 22:09       ` Kumar Kartikeya Dwivedi
2023-09-18 13:06         ` Eduard Zingerman
2023-09-19 16:28           ` Eduard Zingerman
2023-09-19 23:02             ` Andrii Nakryiko
2023-09-20  0:19               ` Eduard Zingerman
2023-09-20 16:20                 ` Eduard Zingerman
2023-09-20 16:57                   ` Andrii Nakryiko
2023-09-21  9:14                 ` Alexei Starovoitov
2023-09-21 11:03                   ` Eduard Zingerman
2023-09-21 12:56                     ` Alexei Starovoitov
2023-09-21 16:23                       ` Eduard Zingerman
2023-09-21 16:35                         ` Andrii Nakryiko
2023-09-21 18:16                           ` Eduard Zingerman
2023-09-22  1:01                             ` Eduard Zingerman [this message]
2023-09-22  2:48                               ` Andrii Nakryiko
2023-09-22 18:36                                 ` Eduard Zingerman
2023-09-22 20:52                                   ` Andrii Nakryiko
2023-09-25  1:01                                     ` Eduard Zingerman
2023-09-26  0:33                                       ` Andrii Nakryiko
2023-09-26 15:55                                         ` Eduard Zingerman
2023-09-26 16:25                                           ` Andrii Nakryiko
2023-09-28  1:09                                             ` Eduard Zingerman
2023-09-28 18:30                                               ` Andrii Nakryiko
2023-10-02  3:26                                                 ` Eduard Zingerman
2023-09-30  0:41                                               ` Alexei Starovoitov
2023-10-02  1:40                                                 ` Eduard Zingerman
2023-10-02 16:29                                                   ` Alexei Starovoitov
2023-10-02 17:18                                                     ` Eduard Zingerman
2023-10-03  0:05                                                       ` Alexei Starovoitov
2023-10-03  2:00                                                         ` Alexei Starovoitov
2023-10-03 15:33                                                         ` Eduard Zingerman
2023-10-03 16:07                                                           ` Alexei Starovoitov
2023-10-03 18:50                                                           ` Alexei Starovoitov
2023-10-03 21:52                                                             ` Eduard Zingerman
2023-10-03 22:03                                                               ` Eduard Zingerman
2023-10-03 23:08                                                               ` Alexei Starovoitov
2023-10-03 23:14                                                                 ` Eduard Zingerman
2023-10-04  0:22                                                                   ` Andrii Nakryiko
2023-10-04  1:05                                                                 ` Eduard Zingerman
2023-10-04  2:57                                                                   ` Alexei Starovoitov
2023-10-04  5:50                                                                     ` Alexei Starovoitov
2023-10-04  9:49                                                                       ` Eduard Zingerman
2023-10-04 11:52                                                                     ` Eduard Zingerman
2023-09-19 23:14       ` Andrii Nakryiko
2023-09-20  0:06         ` Eduard Zingerman
2023-09-20 16:37           ` Andrii Nakryiko
2023-09-20 17:13             ` Eduard Zingerman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ec118c24a33fb740ecaafd9a55416d56fcb77776.camel@gmail.com \
    --to=eddyz87@gmail.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=andreimatei1@gmail.com \
    --cc=andrii.nakryiko@gmail.com \
    --cc=awerner32@gmail.com \
    --cc=bpf@vger.kernel.org \
    --cc=joannelkoong@gmail.com \
    --cc=kernel-team@dataexmachina.dev \
    --cc=memxor@gmail.com \
    --cc=song@kernel.org \
    --cc=tamird@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox