From: Eduard Zingerman <eddyz87@gmail.com>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>,
Andrew Werner <awerner32@gmail.com>, bpf <bpf@vger.kernel.org>,
Andrei Matei <andreimatei1@gmail.com>,
Tamir Duberstein <tamird@gmail.com>,
Joanne Koong <joannelkoong@gmail.com>,
kernel-team@dataexmachina.dev, Song Liu <song@kernel.org>,
Kumar Kartikeya Dwivedi <memxor@gmail.com>
Subject: Re: [BUG] verifier escape with iteration helpers (bpf_loop, ...)
Date: Fri, 22 Sep 2023 04:01:05 +0300 [thread overview]
Message-ID: <ec118c24a33fb740ecaafd9a55416d56fcb77776.camel@gmail.com> (raw)
In-Reply-To: <52df1240415be1ee8827cb6395fd339a720e229c.camel@gmail.com>
On Thu, 2023-09-21 at 21:16 +0300, Eduard Zingerman wrote:
> On Thu, 2023-09-21 at 09:35 -0700, Andrii Nakryiko wrote:
> > I've been thinking in a similar direction as Alexei, overnight. Here
> > are some raw thoughts.
> >
> > I think the overall approach with iter verification is sound. If we
> > loop and see an equivalent state at the entry to next loop iteration,
> > then it's safe to assume doing many iterations is safe. The problem is
> > that delayed precision and read marks make this state equivalence
> > wrong in some case. So we need to find a solution that will ensure
> > that all precision and read marks are propagated to parent state
> > before making a decision about convergence.
> >
> > The idea is to let verifier explore all code paths starting from
> > iteration #N, except the code paths that lead to looping into
> > iteration #N+1. I tried to do that with validating NULL case first and
> > exiting from loop on each iteration (first), but clearly that doesn't
> > capture all the cases, as Eduard have shown.
> >
> > So what if we delay convergence state checks (and then further
> > exploration at iteration #N+1) using BFS instead of DFS? That is, when
> > we are about to start new iteration and check state convergence, we
> > enqueue current state to be processed later after all the states that
> > "belong" to iteration #N are processed.
>
> This sounds correct if one iterator is involved.
>
> > We should work out exact details on how to do this hybrid BFS+DFS, but
> > let's think carefully if this will solve the problems?
> >
> > I think this is conceptually similar to what Alexei proposed above.
> > Basically, we "unroll" loop verification iteration by iteration, but
> > make sure that we explore all the branching within iteration #N before
> > going one iteration deeper.
> >
> > Let's think if there are any cases which wouldn't be handled. And then
> > think how to implement this elegantly (e.g., some sort of queue within
> > a parent state, which sounds similar to this separate "branches"
> > counter that Alexei is proposing above).
>
> To better understand the suggestion, suppose there is some iterator
> 'I' and two states:
> - state S1 where depth(I) == N and pending instruction is "next(I)"
> - state S2 where depth(I) == N and pending instruction is *not* "next(I)"
> In such situation state S2 should be verified first, right?
> And in general, any state that is not at "next(I)" should be explored
> before S1, right?
>
> Such interpretation seems to be prone to deadlocks, e.g. suppose there
> are two iterators: I1 and I2, and two states:
> - state S1 with depth(I1) == N, depth(I2) == M, at instruction "next(I1)";
> - state S2 with depth(I1) == N, depth(I2) == M, at instruction "next(I2)".
>
> E.g. like in the following loop:
>
> for (;;) {
> if (<random condition>)
> if (!next(it1)) break; // a
> else
> if (!next(it2)) break; // b
> ...
> }
>
> I think it is possible to get to two states here:
> - (a) it1(active, depth 0), it2(active, depth 0) at insn "next(it1)"
> - (b) it1(active, depth 0), it2(active, depth 0) at insn "next(it2)"
>
> And it is unclear which one should be processed next.
> Am I missing something?
Not sure if such ordering issues a real issues or any of the
candidates could be picked.
How about a more dumb solution which, imho, is easier to convince
oneself is correct (at-least from logical pov, not necessarily
implementation): substitute the goal of finding exact precision marks
with the goal of finding conservative precision marks. These marks
could be used instead of exact for states_equal() when .branches > 0.
A straightforward way to compute such marks is to run a flow-sensitive
context/path-insensitive backwards DFA before do_check(). The result
of this DFA is a bitmask encoding which registers / stack slots may be
precise at each instruction. Information for instructions other than
is_iter_next_insn() could be discarded.
Looking at the tests that are currently failing with my local fixes
(which force exact states comparions for .branches > 0) I think that
DFA based version would cover all of them.
This sure adds some code to the verifier, however changing current
states traversal logic from DFS to combination of DFS+BFS is a
significant change as well.
Wdyt?
next prev parent reply other threads:[~2023-09-22 1:01 UTC|newest]
Thread overview: 52+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-07-07 14:04 [BUG] verifier escape with iteration helpers (bpf_loop, ...) Andrew Werner
2023-07-07 16:44 ` Alexei Starovoitov
2023-07-07 18:08 ` Andrii Nakryiko
2023-07-07 18:21 ` Andrew Werner
2023-09-17 21:37 ` Eduard Zingerman
2023-09-17 22:09 ` Kumar Kartikeya Dwivedi
2023-09-18 13:06 ` Eduard Zingerman
2023-09-19 16:28 ` Eduard Zingerman
2023-09-19 23:02 ` Andrii Nakryiko
2023-09-20 0:19 ` Eduard Zingerman
2023-09-20 16:20 ` Eduard Zingerman
2023-09-20 16:57 ` Andrii Nakryiko
2023-09-21 9:14 ` Alexei Starovoitov
2023-09-21 11:03 ` Eduard Zingerman
2023-09-21 12:56 ` Alexei Starovoitov
2023-09-21 16:23 ` Eduard Zingerman
2023-09-21 16:35 ` Andrii Nakryiko
2023-09-21 18:16 ` Eduard Zingerman
2023-09-22 1:01 ` Eduard Zingerman [this message]
2023-09-22 2:48 ` Andrii Nakryiko
2023-09-22 18:36 ` Eduard Zingerman
2023-09-22 20:52 ` Andrii Nakryiko
2023-09-25 1:01 ` Eduard Zingerman
2023-09-26 0:33 ` Andrii Nakryiko
2023-09-26 15:55 ` Eduard Zingerman
2023-09-26 16:25 ` Andrii Nakryiko
2023-09-28 1:09 ` Eduard Zingerman
2023-09-28 18:30 ` Andrii Nakryiko
2023-10-02 3:26 ` Eduard Zingerman
2023-09-30 0:41 ` Alexei Starovoitov
2023-10-02 1:40 ` Eduard Zingerman
2023-10-02 16:29 ` Alexei Starovoitov
2023-10-02 17:18 ` Eduard Zingerman
2023-10-03 0:05 ` Alexei Starovoitov
2023-10-03 2:00 ` Alexei Starovoitov
2023-10-03 15:33 ` Eduard Zingerman
2023-10-03 16:07 ` Alexei Starovoitov
2023-10-03 18:50 ` Alexei Starovoitov
2023-10-03 21:52 ` Eduard Zingerman
2023-10-03 22:03 ` Eduard Zingerman
2023-10-03 23:08 ` Alexei Starovoitov
2023-10-03 23:14 ` Eduard Zingerman
2023-10-04 0:22 ` Andrii Nakryiko
2023-10-04 1:05 ` Eduard Zingerman
2023-10-04 2:57 ` Alexei Starovoitov
2023-10-04 5:50 ` Alexei Starovoitov
2023-10-04 9:49 ` Eduard Zingerman
2023-10-04 11:52 ` Eduard Zingerman
2023-09-19 23:14 ` Andrii Nakryiko
2023-09-20 0:06 ` Eduard Zingerman
2023-09-20 16:37 ` Andrii Nakryiko
2023-09-20 17:13 ` Eduard Zingerman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ec118c24a33fb740ecaafd9a55416d56fcb77776.camel@gmail.com \
--to=eddyz87@gmail.com \
--cc=alexei.starovoitov@gmail.com \
--cc=andreimatei1@gmail.com \
--cc=andrii.nakryiko@gmail.com \
--cc=awerner32@gmail.com \
--cc=bpf@vger.kernel.org \
--cc=joannelkoong@gmail.com \
--cc=kernel-team@dataexmachina.dev \
--cc=memxor@gmail.com \
--cc=song@kernel.org \
--cc=tamird@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox