From: Eduard Zingerman <eddyz87@gmail.com>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: bpf@vger.kernel.org, ast@kernel.org, andrii@kernel.org,
daniel@iogearbox.net, martin.lau@linux.dev, kernel-team@fb.com,
yonghong.song@linux.dev, memxor@gmail.com, awerner32@gmail.com,
john.fastabend@gmail.com,
Alexei Starovoitov <alexei.starovoitov@gmail.com>
Subject: Re: [PATCH bpf-next v2 3/7] bpf: exact states comparison for iterator convergence checks
Date: Mon, 23 Oct 2023 16:38:56 +0300 [thread overview]
Message-ID: <058074a906af81ce86b0d70005289177d94f6a65.camel@gmail.com> (raw)
In-Reply-To: <CAEf4BzbLq7rN-nsgx86wqGPg_kUEwOc=Mvh8OL6=icPk3tf1Aw@mail.gmail.com>
On Sat, 2023-10-21 at 21:16 -0700, Andrii Nakryiko wrote:
> On Sat, Oct 21, 2023 at 6:08 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> >
> > Convergence for open coded iterators is computed in is_state_visited()
> > by examining states with branches count > 1 and using states_equal().
> > states_equal() computes sub-state relation using read and precision marks.
> > Read and precision marks are propagated from children states,
> > thus are not guaranteed to be complete inside a loop when branches
> > count > 1. This could be demonstrated using the following unsafe program:
> >
> > 1. r7 = -16
> > 2. r6 = bpf_get_prandom_u32()
> > 3. while (bpf_iter_num_next(&fp[-8])) {
> > 4. if (r6 != 42) {
> > 5. r7 = -32
> > 6. r6 = bpf_get_prandom_u32()
> > 7. continue
> > 8. }
> > 9. r0 = r10
> > 10. r0 += r7
> > 11. r8 = *(u64 *)(r0 + 0)
> > 12. r6 = bpf_get_prandom_u32()
> > 13. }
> >
> > Here verifier would first visit path 1-3, create a checkpoint at 3
> > with r7=-16, continue to 4-7,3 with r7=-32.
> >
> > Because instructions at 9-12 had not been visitied yet existing
> > checkpoint at 3 does not have read or precision mark for r7.
> > Thus states_equal() would return true and verifier would discard
> > current state, thus unsafe memory access at 11 would not be caught.
> >
> > This commit fixes this loophole by introducing exact state comparisons
> > for iterator convergence logic:
> > - registers are compared using regs_exact() regardless of read or
> > precision marks;
> > - stack slots have to have identical type.
> >
> > Unfortunately, this is too strict even for simple programs like below:
> >
> > i = 0;
> > while(iter_next(&it))
> > i++;
> >
> > At each iteration step i++ would produce a new distinct state and
> > eventually instruction processing limit would be reached.
> >
> > To avoid such behavior speculatively forget (widen) range for
> > imprecise scalar registers, if those registers were not precise at the
> > end of the previous iteration and do not match exactly.
> >
> > This a conservative heuristic that allows to verify wide range of
> > programs, however it precludes verification of programs that conjure
> > an imprecise value on the first loop iteration and use it as precise
> > on the second.
> >
> > Test case iter_task_vma_for_each() presents one of such cases:
> >
> > unsigned int seen = 0;
> > ...
> > bpf_for_each(task_vma, vma, task, 0) {
> > if (seen >= 1000)
> > break;
> > ...
> > seen++;
> > }
> >
> > Here clang generates the following code:
> >
> > <LBB0_4>:
> > 24: r8 = r6 ; stash current value of
> > ... body ... 'seen'
> > 29: r1 = r10
> > 30: r1 += -0x8
> > 31: call bpf_iter_task_vma_next
> > 32: r6 += 0x1 ; seen++;
> > 33: if r0 == 0x0 goto +0x2 <LBB0_6> ; exit on next() == NULL
> > 34: r7 += 0x10
> > 35: if r8 < 0x3e7 goto -0xc <LBB0_4> ; loop on seen < 1000
> >
> > <LBB0_6>:
> > ... exit ...
> >
> > Note that counter in r6 is copied to r8 and then incremented,
> > conditional jump is done using r8. Because of this precision mark for
> > r6 lags one state behind of precision mark on r8 and widening logic
> > kicks in.
> >
> > Adding barrier_var(seen) after conditional is sufficient to force
> > clang use the same register for both counting and conditional jump.
> >
> > This issue was discussed in the thread [1] which was started by
> > Andrew Werner <awerner32@gmail.com> demonstrating a similar bug
> > in callback functions handling. The callbacks would be addressed
> > in a followup patch.
> >
> > [1] https://lore.kernel.org/bpf/97a90da09404c65c8e810cf83c94ac703705dc0e.camel@gmail.com/
> >
> > Co-developed-by: Andrii Nakryiko <andrii.nakryiko@gmail.com>
> > Co-developed-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
> > Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
> > ---
> > include/linux/bpf_verifier.h | 1 +
> > kernel/bpf/verifier.c | 212 +++++++++++++++---
> > .../selftests/bpf/progs/iters_task_vma.c | 1 +
> > 3 files changed, 184 insertions(+), 30 deletions(-)
> >
>
> [...]
>
> > +static int widen_imprecise_scalars(struct bpf_verifier_env *env,
> > + struct bpf_verifier_state *old,
> > + struct bpf_verifier_state *cur)
> > +{
> > + struct bpf_func_state *fold, *fcur;
> > + int i, fr;
> > +
> > + reset_idmap_scratch(env);
> > + for (fr = old->curframe; fr >= 0; fr--) {
> > + fold = old->frame[fr];
> > + fcur = cur->frame[fr];
> > +
> > + for (i = 0; i < MAX_BPF_REG; i++)
> > + maybe_widen_reg(env,
> > + &fold->regs[i],
> > + &fcur->regs[i],
> > + &env->idmap_scratch);
> > +
> > + for (i = 0; i < fold->allocated_stack / BPF_REG_SIZE; i++) {
> > + if (is_spilled_scalar_reg(&fold->stack[i]))
>
> should this be !is_spilled_scalar_reg()?
>
> but also your should make sure that fcur->stack[i] is also a spilled
> reg (it could be iter or dynptr, or just hard-coded ZERO or MISC stack
> slot, so accessing fcur->stack[i].spilled_ptr below might be invalid)
Yeap, this is a bug, thank you.
I'll add a test case and submit V3 today.
>
> > + continue;
>
> > +
> > + maybe_widen_reg(env,
> > + &fold->stack[i].spilled_ptr,
> > + &fcur->stack[i].spilled_ptr,
> > + &env->idmap_scratch);
> > + }
> > + }
> > + return 0;
> > +}
> > +
>
> [...]
next prev parent reply other threads:[~2023-10-23 13:39 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-10-22 1:08 [PATCH bpf-next v2 0/7] exact states comparison for iterator convergence checks Eduard Zingerman
2023-10-22 1:08 ` [PATCH bpf-next v2 1/7] bpf: move explored_state() closer to the beginning of verifier.c Eduard Zingerman
2023-10-22 1:08 ` [PATCH bpf-next v2 2/7] bpf: extract same_callsites() as utility function Eduard Zingerman
2023-10-22 1:08 ` [PATCH bpf-next v2 3/7] bpf: exact states comparison for iterator convergence checks Eduard Zingerman
2023-10-22 4:16 ` Andrii Nakryiko
2023-10-23 13:38 ` Eduard Zingerman [this message]
2023-10-22 1:08 ` [PATCH bpf-next v2 4/7] selftests/bpf: tests with delayed read/precision makrs in loop body Eduard Zingerman
2023-10-22 3:00 ` kernel test robot
2023-10-22 1:08 ` [PATCH bpf-next v2 5/7] bpf: correct loop detection for iterators convergence Eduard Zingerman
2023-10-22 4:28 ` Andrii Nakryiko
2023-10-23 14:47 ` Eduard Zingerman
2023-10-23 16:16 ` Andrii Nakryiko
2023-10-22 1:08 ` [PATCH bpf-next v2 6/7] selftests/bpf: test if state loops are detected in a tricky case Eduard Zingerman
2023-10-22 3:11 ` kernel test robot
2023-10-22 1:08 ` [PATCH bpf-next v2 7/7] bpf: print full verifier states on infinite loop detection Eduard Zingerman
2023-10-22 4:28 ` Andrii Nakryiko
2023-10-23 17:17 ` [PATCH bpf-next v2 0/7] exact states comparison for iterator convergence checks Eduard Zingerman
2023-10-23 21:40 ` Eduard Zingerman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=058074a906af81ce86b0d70005289177d94f6a65.camel@gmail.com \
--to=eddyz87@gmail.com \
--cc=alexei.starovoitov@gmail.com \
--cc=andrii.nakryiko@gmail.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=awerner32@gmail.com \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=john.fastabend@gmail.com \
--cc=kernel-team@fb.com \
--cc=martin.lau@linux.dev \
--cc=memxor@gmail.com \
--cc=yonghong.song@linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox