* [PATCH bpf-next v2 1/7] bpf: move explored_state() closer to the beginning of verifier.c
2023-10-22 1:08 [PATCH bpf-next v2 0/7] exact states comparison for iterator convergence checks Eduard Zingerman
@ 2023-10-22 1:08 ` Eduard Zingerman
2023-10-22 1:08 ` [PATCH bpf-next v2 2/7] bpf: extract same_callsites() as utility function Eduard Zingerman
` (6 subsequent siblings)
7 siblings, 0 replies; 18+ messages in thread
From: Eduard Zingerman @ 2023-10-22 1:08 UTC (permalink / raw)
To: bpf, ast
Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song, memxor,
awerner32, john.fastabend, Eduard Zingerman
Subsequent patches would make use of explored_state() function.
Move it up to avoid adding unnecessary prototype.
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
kernel/bpf/verifier.c | 28 +++++++++++++---------------
1 file changed, 13 insertions(+), 15 deletions(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index e9bc5d4a25a1..e6232b5d3964 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -1817,6 +1817,19 @@ static int copy_verifier_state(struct bpf_verifier_state *dst_state,
return 0;
}
+static u32 state_htab_size(struct bpf_verifier_env *env)
+{
+ return env->prog->len;
+}
+
+static struct bpf_verifier_state_list **explored_state(struct bpf_verifier_env *env, int idx)
+{
+ struct bpf_verifier_state *cur = env->cur_state;
+ struct bpf_func_state *state = cur->frame[cur->curframe];
+
+ return &env->explored_states[(idx ^ state->callsite) % state_htab_size(env)];
+}
+
static void update_branch_counts(struct bpf_verifier_env *env, struct bpf_verifier_state *st)
{
while (st) {
@@ -15020,21 +15033,6 @@ enum {
BRANCH = 2,
};
-static u32 state_htab_size(struct bpf_verifier_env *env)
-{
- return env->prog->len;
-}
-
-static struct bpf_verifier_state_list **explored_state(
- struct bpf_verifier_env *env,
- int idx)
-{
- struct bpf_verifier_state *cur = env->cur_state;
- struct bpf_func_state *state = cur->frame[cur->curframe];
-
- return &env->explored_states[(idx ^ state->callsite) % state_htab_size(env)];
-}
-
static void mark_prune_point(struct bpf_verifier_env *env, int idx)
{
env->insn_aux_data[idx].prune_point = true;
--
2.42.0
^ permalink raw reply related [flat|nested] 18+ messages in thread* [PATCH bpf-next v2 2/7] bpf: extract same_callsites() as utility function
2023-10-22 1:08 [PATCH bpf-next v2 0/7] exact states comparison for iterator convergence checks Eduard Zingerman
2023-10-22 1:08 ` [PATCH bpf-next v2 1/7] bpf: move explored_state() closer to the beginning of verifier.c Eduard Zingerman
@ 2023-10-22 1:08 ` Eduard Zingerman
2023-10-22 1:08 ` [PATCH bpf-next v2 3/7] bpf: exact states comparison for iterator convergence checks Eduard Zingerman
` (5 subsequent siblings)
7 siblings, 0 replies; 18+ messages in thread
From: Eduard Zingerman @ 2023-10-22 1:08 UTC (permalink / raw)
To: bpf, ast
Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song, memxor,
awerner32, john.fastabend, Eduard Zingerman
Extract same_callsites() from clean_live_states() as a utility function.
This function would be used by the next patch in the set.
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
kernel/bpf/verifier.c | 20 +++++++++++++++-----
1 file changed, 15 insertions(+), 5 deletions(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index e6232b5d3964..366029e484a0 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -1830,6 +1830,20 @@ static struct bpf_verifier_state_list **explored_state(struct bpf_verifier_env *
return &env->explored_states[(idx ^ state->callsite) % state_htab_size(env)];
}
+static bool same_callsites(struct bpf_verifier_state *a, struct bpf_verifier_state *b)
+{
+ int fr;
+
+ if (a->curframe != b->curframe)
+ return false;
+
+ for (fr = a->curframe; fr >= 0; fr--)
+ if (a->frame[fr]->callsite != b->frame[fr]->callsite)
+ return false;
+
+ return true;
+}
+
static void update_branch_counts(struct bpf_verifier_env *env, struct bpf_verifier_state *st)
{
while (st) {
@@ -15909,18 +15923,14 @@ static void clean_live_states(struct bpf_verifier_env *env, int insn,
struct bpf_verifier_state *cur)
{
struct bpf_verifier_state_list *sl;
- int i;
sl = *explored_state(env, insn);
while (sl) {
if (sl->state.branches)
goto next;
if (sl->state.insn_idx != insn ||
- sl->state.curframe != cur->curframe)
+ !same_callsites(&sl->state, cur))
goto next;
- for (i = 0; i <= cur->curframe; i++)
- if (sl->state.frame[i]->callsite != cur->frame[i]->callsite)
- goto next;
clean_verifier_state(env, &sl->state);
next:
sl = sl->next;
--
2.42.0
^ permalink raw reply related [flat|nested] 18+ messages in thread* [PATCH bpf-next v2 3/7] bpf: exact states comparison for iterator convergence checks
2023-10-22 1:08 [PATCH bpf-next v2 0/7] exact states comparison for iterator convergence checks Eduard Zingerman
2023-10-22 1:08 ` [PATCH bpf-next v2 1/7] bpf: move explored_state() closer to the beginning of verifier.c Eduard Zingerman
2023-10-22 1:08 ` [PATCH bpf-next v2 2/7] bpf: extract same_callsites() as utility function Eduard Zingerman
@ 2023-10-22 1:08 ` Eduard Zingerman
2023-10-22 4:16 ` Andrii Nakryiko
2023-10-22 1:08 ` [PATCH bpf-next v2 4/7] selftests/bpf: tests with delayed read/precision marks in loop body Eduard Zingerman
` (4 subsequent siblings)
7 siblings, 1 reply; 18+ messages in thread
From: Eduard Zingerman @ 2023-10-22 1:08 UTC (permalink / raw)
To: bpf, ast
Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song, memxor,
awerner32, john.fastabend, Eduard Zingerman, Andrii Nakryiko,
Alexei Starovoitov
Convergence for open coded iterators is computed in is_state_visited()
by examining states with branches count > 1 and using states_equal().
states_equal() computes sub-state relation using read and precision marks.
Read and precision marks are propagated from children states,
thus are not guaranteed to be complete inside a loop when branches
count > 1. This could be demonstrated using the following unsafe program:
1. r7 = -16
2. r6 = bpf_get_prandom_u32()
3. while (bpf_iter_num_next(&fp[-8])) {
4. if (r6 != 42) {
5. r7 = -32
6. r6 = bpf_get_prandom_u32()
7. continue
8. }
9. r0 = r10
10. r0 += r7
11. r8 = *(u64 *)(r0 + 0)
12. r6 = bpf_get_prandom_u32()
13. }
Here verifier would first visit path 1-3, create a checkpoint at 3
with r7=-16, continue to 4-7,3 with r7=-32.
Because instructions at 9-12 had not been visited yet, the existing
checkpoint at 3 does not have read or precision mark for r7.
Thus states_equal() would return true and verifier would discard
current state, thus unsafe memory access at 11 would not be caught.
This commit fixes this loophole by introducing exact state comparisons
for iterator convergence logic:
- registers are compared using regs_exact() regardless of read or
precision marks;
- stack slots have to have identical type.
Unfortunately, this is too strict even for simple programs like below:
i = 0;
while(iter_next(&it))
i++;
At each iteration step i++ would produce a new distinct state and
eventually instruction processing limit would be reached.
To avoid such behavior speculatively forget (widen) range for
imprecise scalar registers, if those registers were not precise at the
end of the previous iteration and do not match exactly.
This is a conservative heuristic that allows verification of a wide range
of programs, however it precludes verification of programs that conjure
an imprecise value on the first loop iteration and use it as precise
on the second.
Test case iter_task_vma_for_each() presents one of such cases:
unsigned int seen = 0;
...
bpf_for_each(task_vma, vma, task, 0) {
if (seen >= 1000)
break;
...
seen++;
}
Here clang generates the following code:
<LBB0_4>:
24: r8 = r6 ; stash current value of
... body ... 'seen'
29: r1 = r10
30: r1 += -0x8
31: call bpf_iter_task_vma_next
32: r6 += 0x1 ; seen++;
33: if r0 == 0x0 goto +0x2 <LBB0_6> ; exit on next() == NULL
34: r7 += 0x10
35: if r8 < 0x3e7 goto -0xc <LBB0_4> ; loop on seen < 1000
<LBB0_6>:
... exit ...
Note that counter in r6 is copied to r8 and then incremented,
conditional jump is done using r8. Because of this precision mark for
r6 lags one state behind of precision mark on r8 and widening logic
kicks in.
Adding barrier_var(seen) after the conditional is sufficient to force
clang to use the same register for both counting and conditional jump.
This issue was discussed in the thread [1] which was started by
Andrew Werner <awerner32@gmail.com> demonstrating a similar bug
in callback functions handling. The callbacks would be addressed
in a followup patch.
[1] https://lore.kernel.org/bpf/97a90da09404c65c8e810cf83c94ac703705dc0e.camel@gmail.com/
Co-developed-by: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Co-developed-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
include/linux/bpf_verifier.h | 1 +
kernel/bpf/verifier.c | 212 +++++++++++++++---
.../selftests/bpf/progs/iters_task_vma.c | 1 +
3 files changed, 184 insertions(+), 30 deletions(-)
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index e67cd45a85be..38b788228594 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -384,6 +384,7 @@ struct bpf_verifier_state {
*/
struct bpf_idx_pair *jmp_history;
u32 jmp_history_cnt;
+ u32 dfs_depth;
};
#define bpf_get_spilled_reg(slot, frame, mask) \
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 366029e484a0..23e4eb0a5c23 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -1802,6 +1802,7 @@ static int copy_verifier_state(struct bpf_verifier_state *dst_state,
dst_state->parent = src->parent;
dst_state->first_insn_idx = src->first_insn_idx;
dst_state->last_insn_idx = src->last_insn_idx;
+ dst_state->dfs_depth = src->dfs_depth;
for (i = 0; i <= src->curframe; i++) {
dst = dst_state->frame[i];
if (!dst) {
@@ -7723,6 +7724,80 @@ static int process_iter_arg(struct bpf_verifier_env *env, int regno, int insn_id
return 0;
}
+/* Look for a previous loop entry at insn_idx: nearest parent state
+ * stopped at insn_idx with callsites matching those in cur->frame.
+ */
+static struct bpf_verifier_state *find_prev_entry(struct bpf_verifier_env *env,
+ struct bpf_verifier_state *cur,
+ int insn_idx)
+{
+ struct bpf_verifier_state_list *sl;
+ struct bpf_verifier_state *st;
+
+ /* Explored states are pushed in stack order, most recent states come first */
+ sl = *explored_state(env, insn_idx);
+ for (; sl; sl = sl->next) {
+ /* If st->branches != 0 state is a part of current DFS verification path,
+ * hence cur & st for a loop.
+ */
+ st = &sl->state;
+ if (st->insn_idx == insn_idx && st->branches && same_callsites(st, cur) &&
+ st->dfs_depth < cur->dfs_depth)
+ return st;
+ }
+
+ return NULL;
+}
+
+static void reset_idmap_scratch(struct bpf_verifier_env *env);
+static bool regs_exact(const struct bpf_reg_state *rold,
+ const struct bpf_reg_state *rcur,
+ struct bpf_idmap *idmap);
+
+static void maybe_widen_reg(struct bpf_verifier_env *env,
+ struct bpf_reg_state *rold, struct bpf_reg_state *rcur,
+ struct bpf_idmap *idmap)
+{
+ if (rold->type != SCALAR_VALUE)
+ return;
+ if (rold->type != rcur->type)
+ return;
+ if (rold->precise || rcur->precise || regs_exact(rold, rcur, idmap))
+ return;
+ __mark_reg_unknown(env, rcur);
+}
+
+static int widen_imprecise_scalars(struct bpf_verifier_env *env,
+ struct bpf_verifier_state *old,
+ struct bpf_verifier_state *cur)
+{
+ struct bpf_func_state *fold, *fcur;
+ int i, fr;
+
+ reset_idmap_scratch(env);
+ for (fr = old->curframe; fr >= 0; fr--) {
+ fold = old->frame[fr];
+ fcur = cur->frame[fr];
+
+ for (i = 0; i < MAX_BPF_REG; i++)
+ maybe_widen_reg(env,
+ &fold->regs[i],
+ &fcur->regs[i],
+ &env->idmap_scratch);
+
+ for (i = 0; i < fold->allocated_stack / BPF_REG_SIZE; i++) {
+ if (is_spilled_scalar_reg(&fold->stack[i]))
+ continue;
+
+ maybe_widen_reg(env,
+ &fold->stack[i].spilled_ptr,
+ &fcur->stack[i].spilled_ptr,
+ &env->idmap_scratch);
+ }
+ }
+ return 0;
+}
+
/* process_iter_next_call() is called when verifier gets to iterator's next
* "method" (e.g., bpf_iter_num_next() for numbers iterator) call. We'll refer
* to it as just "iter_next()" in comments below.
@@ -7764,25 +7839,47 @@ static int process_iter_arg(struct bpf_verifier_env *env, int regno, int insn_id
* is some statically known limit on number of iterations (e.g., if there is
* an explicit `if n > 100 then break;` statement somewhere in the loop).
*
- * One very subtle but very important aspect is that we *always* simulate NULL
- * condition first (as the current state) before we simulate non-NULL case.
- * This has to do with intricacies of scalar precision tracking. By simulating
- * "exit condition" of iter_next() returning NULL first, we make sure all the
- * relevant precision marks *that will be set **after** we exit iterator loop*
- * are propagated backwards to common parent state of NULL and non-NULL
- * branches. Thanks to that, state equivalence checks done later in forked
- * state, when reaching iter_next() for ACTIVE iterator, can assume that
- * precision marks are finalized and won't change. Because simulating another
- * ACTIVE iterator iteration won't change them (because given same input
- * states we'll end up with exactly same output states which we are currently
- * comparing; and verification after the loop already propagated back what
- * needs to be **additionally** tracked as precise). It's subtle, grok
- * precision tracking for more intuitive understanding.
+ * Iteration convergence logic in is_state_visited() relies on exact
+ * states comparison, which ignores read and precision marks.
+ * This is necessary because read and precision marks are not finalized
+ * while in the loop. Exact comparison might preclude convergence for
+ * simple programs like below:
+ *
+ * i = 0;
+ * while(iter_next(&it))
+ * i++;
+ *
+ * At each iteration step i++ would produce a new distinct state and
+ * eventually instruction processing limit would be reached.
+ *
+ * To avoid such behavior speculatively forget (widen) range for
+ * imprecise scalar registers, if those registers were not precise at the
+ * end of the previous iteration and do not match exactly.
+ *
+ * This is a conservative heuristic that allows to verify wide range of programs,
+ * however it precludes verification of programs that conjure an
+ * imprecise value on the first loop iteration and use it as precise on a second.
+ * For example, the following safe program would fail to verify:
+ *
+ * struct bpf_num_iter it;
+ * int arr[10];
+ * int i = 0, a = 0;
+ * bpf_iter_num_new(&it, 0, 10);
+ * while (bpf_iter_num_next(&it)) {
+ * if (a == 0) {
+ * a = 1;
+ * i = 7; // Because i changed verifier would forget
+ * // it's range on second loop entry.
+ * } else {
+ * arr[i] = 42; // This would fail to verify.
+ * }
+ * }
+ * bpf_iter_num_destroy(&it);
*/
static int process_iter_next_call(struct bpf_verifier_env *env, int insn_idx,
struct bpf_kfunc_call_arg_meta *meta)
{
- struct bpf_verifier_state *cur_st = env->cur_state, *queued_st;
+ struct bpf_verifier_state *cur_st = env->cur_state, *queued_st, *prev_st;
struct bpf_func_state *cur_fr = cur_st->frame[cur_st->curframe], *queued_fr;
struct bpf_reg_state *cur_iter, *queued_iter;
int iter_frameno = meta->iter.frameno;
@@ -7800,6 +7897,19 @@ static int process_iter_next_call(struct bpf_verifier_env *env, int insn_idx,
}
if (cur_iter->iter.state == BPF_ITER_STATE_ACTIVE) {
+ /* Because iter_next() call is a checkpoint is_state_visitied()
+ * should guarantee parent state with same call sites and insn_idx.
+ */
+ if (!cur_st->parent || cur_st->parent->insn_idx != insn_idx ||
+ !same_callsites(cur_st->parent, cur_st)) {
+ verbose(env, "bug: bad parent state for iter next call");
+ return -EFAULT;
+ }
+ /* Note cur_st->parent in the call below, it is necessary to skip
+ * checkpoint created for cur_st by is_state_visited()
+ * right at this instruction.
+ */
+ prev_st = find_prev_entry(env, cur_st->parent, insn_idx);
/* branch out active iter state */
queued_st = push_stack(env, insn_idx + 1, insn_idx, false);
if (!queued_st)
@@ -7808,6 +7918,8 @@ static int process_iter_next_call(struct bpf_verifier_env *env, int insn_idx,
queued_iter = &queued_st->frame[iter_frameno]->stack[iter_spi].spilled_ptr;
queued_iter->iter.state = BPF_ITER_STATE_ACTIVE;
queued_iter->iter.depth++;
+ if (prev_st)
+ widen_imprecise_scalars(env, prev_st, queued_st);
queued_fr = queued_st->frame[queued_st->curframe];
mark_ptr_not_null_reg(&queued_fr->regs[BPF_REG_0]);
@@ -15948,8 +16060,11 @@ static bool regs_exact(const struct bpf_reg_state *rold,
/* Returns true if (rold safe implies rcur safe) */
static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
- struct bpf_reg_state *rcur, struct bpf_idmap *idmap)
+ struct bpf_reg_state *rcur, struct bpf_idmap *idmap, bool exact)
{
+ if (exact)
+ return regs_exact(rold, rcur, idmap);
+
if (!(rold->live & REG_LIVE_READ))
/* explored state didn't use this */
return true;
@@ -16066,7 +16181,7 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
}
static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old,
- struct bpf_func_state *cur, struct bpf_idmap *idmap)
+ struct bpf_func_state *cur, struct bpf_idmap *idmap, bool exact)
{
int i, spi;
@@ -16079,7 +16194,12 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old,
spi = i / BPF_REG_SIZE;
- if (!(old->stack[spi].spilled_ptr.live & REG_LIVE_READ)) {
+ if (exact &&
+ old->stack[spi].slot_type[i % BPF_REG_SIZE] !=
+ cur->stack[spi].slot_type[i % BPF_REG_SIZE])
+ return false;
+
+ if (!(old->stack[spi].spilled_ptr.live & REG_LIVE_READ) && !exact) {
i += BPF_REG_SIZE - 1;
/* explored state didn't use this */
continue;
@@ -16129,7 +16249,7 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old,
* return false to continue verification of this path
*/
if (!regsafe(env, &old->stack[spi].spilled_ptr,
- &cur->stack[spi].spilled_ptr, idmap))
+ &cur->stack[spi].spilled_ptr, idmap, exact))
return false;
break;
case STACK_DYNPTR:
@@ -16211,16 +16331,16 @@ static bool refsafe(struct bpf_func_state *old, struct bpf_func_state *cur,
* the current state will reach 'bpf_exit' instruction safely
*/
static bool func_states_equal(struct bpf_verifier_env *env, struct bpf_func_state *old,
- struct bpf_func_state *cur)
+ struct bpf_func_state *cur, bool exact)
{
int i;
for (i = 0; i < MAX_BPF_REG; i++)
if (!regsafe(env, &old->regs[i], &cur->regs[i],
- &env->idmap_scratch))
+ &env->idmap_scratch, exact))
return false;
- if (!stacksafe(env, old, cur, &env->idmap_scratch))
+ if (!stacksafe(env, old, cur, &env->idmap_scratch, exact))
return false;
if (!refsafe(old, cur, &env->idmap_scratch))
@@ -16229,17 +16349,23 @@ static bool func_states_equal(struct bpf_verifier_env *env, struct bpf_func_stat
return true;
}
+static void reset_idmap_scratch(struct bpf_verifier_env *env)
+{
+ env->idmap_scratch.tmp_id_gen = env->id_gen;
+ memset(&env->idmap_scratch.map, 0, sizeof(env->idmap_scratch.map));
+}
+
static bool states_equal(struct bpf_verifier_env *env,
struct bpf_verifier_state *old,
- struct bpf_verifier_state *cur)
+ struct bpf_verifier_state *cur,
+ bool exact)
{
int i;
if (old->curframe != cur->curframe)
return false;
- env->idmap_scratch.tmp_id_gen = env->id_gen;
- memset(&env->idmap_scratch.map, 0, sizeof(env->idmap_scratch.map));
+ reset_idmap_scratch(env);
/* Verification state from speculative execution simulation
* must never prune a non-speculative execution one.
@@ -16269,7 +16395,7 @@ static bool states_equal(struct bpf_verifier_env *env,
for (i = 0; i <= old->curframe; i++) {
if (old->frame[i]->callsite != cur->frame[i]->callsite)
return false;
- if (!func_states_equal(env, old->frame[i], cur->frame[i]))
+ if (!func_states_equal(env, old->frame[i], cur->frame[i], exact))
return false;
}
return true;
@@ -16579,9 +16705,33 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
* It's safe to assume that iterator loop will finish, taking into
* account iter_next() contract of eventually returning
* sticky NULL result.
+ *
+ * Note, that states have to be compared exactly in this case because
+ * read and precision marks might not be finalized inside the loop.
+ * E.g. as in the program below:
+ *
+ * 1. r7 = -16
+ * 2. r6 = bpf_get_prandom_u32()
+ * 3. while (bpf_iter_num_next(&fp[-8])) {
+ * 4. if (r6 != 42) {
+ * 5. r7 = -32
+ * 6. r6 = bpf_get_prandom_u32()
+ * 7. continue
+ * 8. }
+ * 9. r0 = r10
+ * 10. r0 += r7
+ * 11. r8 = *(u64 *)(r0 + 0)
+ * 12. r6 = bpf_get_prandom_u32()
+ * 13. }
+ *
+ * Here verifier would first visit path 1-3, create a checkpoint at 3
+ * with r7=-16, continue to 4-7,3. Existing checkpoint at 3 does
+ * not have read or precision mark for r7 yet, thus inexact states
+ * comparison would discard current state with r7=-32
+ * => unsafe memory access at 11 would not be caught.
*/
if (is_iter_next_insn(env, insn_idx)) {
- if (states_equal(env, &sl->state, cur)) {
+ if (states_equal(env, &sl->state, cur, true)) {
struct bpf_func_state *cur_frame;
struct bpf_reg_state *iter_state, *iter_reg;
int spi;
@@ -16604,7 +16754,7 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
}
/* attempt to detect infinite loop to avoid unnecessary doomed work */
if (states_maybe_looping(&sl->state, cur) &&
- states_equal(env, &sl->state, cur) &&
+ states_equal(env, &sl->state, cur, false) &&
!iter_active_depths_differ(&sl->state, cur)) {
verbose_linfo(env, insn_idx, "; ");
verbose(env, "infinite loop detected at insn %d\n", insn_idx);
@@ -16629,7 +16779,7 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
add_new_state = false;
goto miss;
}
- if (states_equal(env, &sl->state, cur)) {
+ if (states_equal(env, &sl->state, cur, false)) {
hit:
sl->hit_cnt++;
/* reached equivalent register/stack state,
@@ -16669,7 +16819,8 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
* Higher numbers increase max_states_per_insn and verification time,
* but do not meaningfully decrease insn_processed.
*/
- if (sl->miss_cnt > sl->hit_cnt * 3 + 3) {
+ if (sl->miss_cnt > sl->hit_cnt * 3 + 3 &&
+ !(is_force_checkpoint(env, insn_idx) && sl->state.branches > 0)) {
/* the state is unlikely to be useful. Remove it to
* speed up verification
*/
@@ -16743,6 +16894,7 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
cur->parent = new;
cur->first_insn_idx = insn_idx;
+ cur->dfs_depth = new->dfs_depth + 1;
clear_jmp_history(cur);
new_sl->next = *explored_state(env, insn_idx);
*explored_state(env, insn_idx) = new_sl;
diff --git a/tools/testing/selftests/bpf/progs/iters_task_vma.c b/tools/testing/selftests/bpf/progs/iters_task_vma.c
index 44edecfdfaee..e085a51d153e 100644
--- a/tools/testing/selftests/bpf/progs/iters_task_vma.c
+++ b/tools/testing/selftests/bpf/progs/iters_task_vma.c
@@ -30,6 +30,7 @@ int iter_task_vma_for_each(const void *ctx)
bpf_for_each(task_vma, vma, task, 0) {
if (seen >= 1000)
break;
+ barrier_var(seen);
vm_ranges[seen].vm_start = vma->vm_start;
vm_ranges[seen].vm_end = vma->vm_end;
--
2.42.0
^ permalink raw reply related [flat|nested] 18+ messages in thread* Re: [PATCH bpf-next v2 3/7] bpf: exact states comparison for iterator convergence checks
2023-10-22 1:08 ` [PATCH bpf-next v2 3/7] bpf: exact states comparison for iterator convergence checks Eduard Zingerman
@ 2023-10-22 4:16 ` Andrii Nakryiko
2023-10-23 13:38 ` Eduard Zingerman
0 siblings, 1 reply; 18+ messages in thread
From: Andrii Nakryiko @ 2023-10-22 4:16 UTC (permalink / raw)
To: Eduard Zingerman
Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
memxor, awerner32, john.fastabend, Alexei Starovoitov
On Sat, Oct 21, 2023 at 6:08 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> Convergence for open coded iterators is computed in is_state_visited()
> by examining states with branches count > 1 and using states_equal().
> states_equal() computes sub-state relation using read and precision marks.
> Read and precision marks are propagated from children states,
> thus are not guaranteed to be complete inside a loop when branches
> count > 1. This could be demonstrated using the following unsafe program:
>
> 1. r7 = -16
> 2. r6 = bpf_get_prandom_u32()
> 3. while (bpf_iter_num_next(&fp[-8])) {
> 4. if (r6 != 42) {
> 5. r7 = -32
> 6. r6 = bpf_get_prandom_u32()
> 7. continue
> 8. }
> 9. r0 = r10
> 10. r0 += r7
> 11. r8 = *(u64 *)(r0 + 0)
> 12. r6 = bpf_get_prandom_u32()
> 13. }
>
> Here verifier would first visit path 1-3, create a checkpoint at 3
> with r7=-16, continue to 4-7,3 with r7=-32.
>
> Because instructions at 9-12 had not been visitied yet existing
> checkpoint at 3 does not have read or precision mark for r7.
> Thus states_equal() would return true and verifier would discard
> current state, thus unsafe memory access at 11 would not be caught.
>
> This commit fixes this loophole by introducing exact state comparisons
> for iterator convergence logic:
> - registers are compared using regs_exact() regardless of read or
> precision marks;
> - stack slots have to have identical type.
>
> Unfortunately, this is too strict even for simple programs like below:
>
> i = 0;
> while(iter_next(&it))
> i++;
>
> At each iteration step i++ would produce a new distinct state and
> eventually instruction processing limit would be reached.
>
> To avoid such behavior speculatively forget (widen) range for
> imprecise scalar registers, if those registers were not precise at the
> end of the previous iteration and do not match exactly.
>
> This a conservative heuristic that allows to verify wide range of
> programs, however it precludes verification of programs that conjure
> an imprecise value on the first loop iteration and use it as precise
> on the second.
>
> Test case iter_task_vma_for_each() presents one of such cases:
>
> unsigned int seen = 0;
> ...
> bpf_for_each(task_vma, vma, task, 0) {
> if (seen >= 1000)
> break;
> ...
> seen++;
> }
>
> Here clang generates the following code:
>
> <LBB0_4>:
> 24: r8 = r6 ; stash current value of
> ... body ... 'seen'
> 29: r1 = r10
> 30: r1 += -0x8
> 31: call bpf_iter_task_vma_next
> 32: r6 += 0x1 ; seen++;
> 33: if r0 == 0x0 goto +0x2 <LBB0_6> ; exit on next() == NULL
> 34: r7 += 0x10
> 35: if r8 < 0x3e7 goto -0xc <LBB0_4> ; loop on seen < 1000
>
> <LBB0_6>:
> ... exit ...
>
> Note that counter in r6 is copied to r8 and then incremented,
> conditional jump is done using r8. Because of this precision mark for
> r6 lags one state behind of precision mark on r8 and widening logic
> kicks in.
>
> Adding barrier_var(seen) after conditional is sufficient to force
> clang use the same register for both counting and conditional jump.
>
> This issue was discussed in the thread [1] which was started by
> Andrew Werner <awerner32@gmail.com> demonstrating a similar bug
> in callback functions handling. The callbacks would be addressed
> in a followup patch.
>
> [1] https://lore.kernel.org/bpf/97a90da09404c65c8e810cf83c94ac703705dc0e.camel@gmail.com/
>
> Co-developed-by: Andrii Nakryiko <andrii.nakryiko@gmail.com>
> Co-developed-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
> ---
> include/linux/bpf_verifier.h | 1 +
> kernel/bpf/verifier.c | 212 +++++++++++++++---
> .../selftests/bpf/progs/iters_task_vma.c | 1 +
> 3 files changed, 184 insertions(+), 30 deletions(-)
>
[...]
> +static int widen_imprecise_scalars(struct bpf_verifier_env *env,
> + struct bpf_verifier_state *old,
> + struct bpf_verifier_state *cur)
> +{
> + struct bpf_func_state *fold, *fcur;
> + int i, fr;
> +
> + reset_idmap_scratch(env);
> + for (fr = old->curframe; fr >= 0; fr--) {
> + fold = old->frame[fr];
> + fcur = cur->frame[fr];
> +
> + for (i = 0; i < MAX_BPF_REG; i++)
> + maybe_widen_reg(env,
> + &fold->regs[i],
> + &fcur->regs[i],
> + &env->idmap_scratch);
> +
> + for (i = 0; i < fold->allocated_stack / BPF_REG_SIZE; i++) {
> + if (is_spilled_scalar_reg(&fold->stack[i]))
should this be !is_spilled_scalar_reg()?
but also your should make sure that fcur->stack[i] is also a spilled
reg (it could be iter or dynptr, or just hard-coded ZERO or MISC stack
slot, so accessing fcur->stack[i].spilled_ptr below might be invalid)
> + continue;
> +
> + maybe_widen_reg(env,
> + &fold->stack[i].spilled_ptr,
> + &fcur->stack[i].spilled_ptr,
> + &env->idmap_scratch);
> + }
> + }
> + return 0;
> +}
> +
[...]
^ permalink raw reply [flat|nested] 18+ messages in thread* Re: [PATCH bpf-next v2 3/7] bpf: exact states comparison for iterator convergence checks
2023-10-22 4:16 ` Andrii Nakryiko
@ 2023-10-23 13:38 ` Eduard Zingerman
0 siblings, 0 replies; 18+ messages in thread
From: Eduard Zingerman @ 2023-10-23 13:38 UTC (permalink / raw)
To: Andrii Nakryiko
Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
memxor, awerner32, john.fastabend, Alexei Starovoitov
On Sat, 2023-10-21 at 21:16 -0700, Andrii Nakryiko wrote:
> On Sat, Oct 21, 2023 at 6:08 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> >
> > Convergence for open coded iterators is computed in is_state_visited()
> > by examining states with branches count > 1 and using states_equal().
> > states_equal() computes sub-state relation using read and precision marks.
> > Read and precision marks are propagated from children states,
> > thus are not guaranteed to be complete inside a loop when branches
> > count > 1. This could be demonstrated using the following unsafe program:
> >
> > 1. r7 = -16
> > 2. r6 = bpf_get_prandom_u32()
> > 3. while (bpf_iter_num_next(&fp[-8])) {
> > 4. if (r6 != 42) {
> > 5. r7 = -32
> > 6. r6 = bpf_get_prandom_u32()
> > 7. continue
> > 8. }
> > 9. r0 = r10
> > 10. r0 += r7
> > 11. r8 = *(u64 *)(r0 + 0)
> > 12. r6 = bpf_get_prandom_u32()
> > 13. }
> >
> > Here verifier would first visit path 1-3, create a checkpoint at 3
> > with r7=-16, continue to 4-7,3 with r7=-32.
> >
> > Because instructions at 9-12 had not been visitied yet existing
> > checkpoint at 3 does not have read or precision mark for r7.
> > Thus states_equal() would return true and verifier would discard
> > current state, thus unsafe memory access at 11 would not be caught.
> >
> > This commit fixes this loophole by introducing exact state comparisons
> > for iterator convergence logic:
> > - registers are compared using regs_exact() regardless of read or
> > precision marks;
> > - stack slots have to have identical type.
> >
> > Unfortunately, this is too strict even for simple programs like below:
> >
> > i = 0;
> > while(iter_next(&it))
> > i++;
> >
> > At each iteration step i++ would produce a new distinct state and
> > eventually instruction processing limit would be reached.
> >
> > To avoid such behavior speculatively forget (widen) range for
> > imprecise scalar registers, if those registers were not precise at the
> > end of the previous iteration and do not match exactly.
> >
> > This a conservative heuristic that allows to verify wide range of
> > programs, however it precludes verification of programs that conjure
> > an imprecise value on the first loop iteration and use it as precise
> > on the second.
> >
> > Test case iter_task_vma_for_each() presents one of such cases:
> >
> > unsigned int seen = 0;
> > ...
> > bpf_for_each(task_vma, vma, task, 0) {
> > if (seen >= 1000)
> > break;
> > ...
> > seen++;
> > }
> >
> > Here clang generates the following code:
> >
> > <LBB0_4>:
> > 24: r8 = r6 ; stash current value of
> > ... body ... 'seen'
> > 29: r1 = r10
> > 30: r1 += -0x8
> > 31: call bpf_iter_task_vma_next
> > 32: r6 += 0x1 ; seen++;
> > 33: if r0 == 0x0 goto +0x2 <LBB0_6> ; exit on next() == NULL
> > 34: r7 += 0x10
> > 35: if r8 < 0x3e7 goto -0xc <LBB0_4> ; loop on seen < 1000
> >
> > <LBB0_6>:
> > ... exit ...
> >
> > Note that counter in r6 is copied to r8 and then incremented,
> > conditional jump is done using r8. Because of this precision mark for
> > r6 lags one state behind of precision mark on r8 and widening logic
> > kicks in.
> >
> > Adding barrier_var(seen) after conditional is sufficient to force
> > clang use the same register for both counting and conditional jump.
> >
> > This issue was discussed in the thread [1] which was started by
> > Andrew Werner <awerner32@gmail.com> demonstrating a similar bug
> > in callback functions handling. The callbacks would be addressed
> > in a followup patch.
> >
> > [1] https://lore.kernel.org/bpf/97a90da09404c65c8e810cf83c94ac703705dc0e.camel@gmail.com/
> >
> > Co-developed-by: Andrii Nakryiko <andrii.nakryiko@gmail.com>
> > Co-developed-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
> > Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
> > ---
> > include/linux/bpf_verifier.h | 1 +
> > kernel/bpf/verifier.c | 212 +++++++++++++++---
> > .../selftests/bpf/progs/iters_task_vma.c | 1 +
> > 3 files changed, 184 insertions(+), 30 deletions(-)
> >
>
> [...]
>
> > +static int widen_imprecise_scalars(struct bpf_verifier_env *env,
> > + struct bpf_verifier_state *old,
> > + struct bpf_verifier_state *cur)
> > +{
> > + struct bpf_func_state *fold, *fcur;
> > + int i, fr;
> > +
> > + reset_idmap_scratch(env);
> > + for (fr = old->curframe; fr >= 0; fr--) {
> > + fold = old->frame[fr];
> > + fcur = cur->frame[fr];
> > +
> > + for (i = 0; i < MAX_BPF_REG; i++)
> > + maybe_widen_reg(env,
> > + &fold->regs[i],
> > + &fcur->regs[i],
> > + &env->idmap_scratch);
> > +
> > + for (i = 0; i < fold->allocated_stack / BPF_REG_SIZE; i++) {
> > + if (is_spilled_scalar_reg(&fold->stack[i]))
>
> should this be !is_spilled_scalar_reg()?
>
> but also your should make sure that fcur->stack[i] is also a spilled
> reg (it could be iter or dynptr, or just hard-coded ZERO or MISC stack
> slot, so accessing fcur->stack[i].spilled_ptr below might be invalid)
Yeap, this is a bug, thank you.
I'll add a test case and submit V3 today.
>
> > + continue;
>
> > +
> > + maybe_widen_reg(env,
> > + &fold->stack[i].spilled_ptr,
> > + &fcur->stack[i].spilled_ptr,
> > + &env->idmap_scratch);
> > + }
> > + }
> > + return 0;
> > +}
> > +
>
> [...]
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH bpf-next v2 4/7] selftests/bpf: tests with delayed read/precision marks in loop body
2023-10-22 1:08 [PATCH bpf-next v2 0/7] exact states comparison for iterator convergence checks Eduard Zingerman
` (2 preceding siblings ...)
2023-10-22 1:08 ` [PATCH bpf-next v2 3/7] bpf: exact states comparison for iterator convergence checks Eduard Zingerman
@ 2023-10-22 1:08 ` Eduard Zingerman
2023-10-22 3:00 ` kernel test robot
2023-10-22 1:08 ` [PATCH bpf-next v2 5/7] bpf: correct loop detection for iterators convergence Eduard Zingerman
` (3 subsequent siblings)
7 siblings, 1 reply; 18+ messages in thread
From: Eduard Zingerman @ 2023-10-22 1:08 UTC (permalink / raw)
To: bpf, ast
Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song, memxor,
awerner32, john.fastabend, Eduard Zingerman
These test cases try to hide read and precision marks from loop
convergence logic: marks would only be assigned on subsequent loop
iterations or after exploring states pushed to env->head stack first.
Without verifier fix to use exact states comparison logic for
iterators convergence these tests (except 'triple_continue') would be
erroneously marked as safe.
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
tools/testing/selftests/bpf/progs/iters.c | 443 ++++++++++++++++++++++
1 file changed, 443 insertions(+)
diff --git a/tools/testing/selftests/bpf/progs/iters.c b/tools/testing/selftests/bpf/progs/iters.c
index 6b9b3c56f009..ee85cc6d3444 100644
--- a/tools/testing/selftests/bpf/progs/iters.c
+++ b/tools/testing/selftests/bpf/progs/iters.c
@@ -14,6 +14,13 @@ int my_pid;
int arr[256];
int small_arr[16] SEC(".data.small_arr");
+struct {
+ __uint(type, BPF_MAP_TYPE_HASH);
+ __uint(max_entries, 10);
+ __type(key, int);
+ __type(value, int);
+} amap SEC(".maps");
+
#ifdef REAL_TEST
#define MY_PID_GUARD() if (my_pid != (bpf_get_current_pid_tgid() >> 32)) return 0
#else
@@ -716,4 +723,440 @@ int iter_pass_iter_ptr_to_subprog(const void *ctx)
return 0;
}
+SEC("?raw_tp")
+__failure
+__msg("R1 type=scalar expected=fp")
+__naked int delayed_read_mark(void)
+{
+ /* This is equivalent to C program below.
+ * The call to bpf_iter_num_next() is reachable with r7 values &fp[-16] and 0xdead.
+ * State with r7=&fp[-16] is visited first and follows r6 != 42 ... continue branch.
+ * At this point iterator next() call is reached with r7 that has no read mark.
+ * Loop body with r7=0xdead would only be visited if verifier would decide to continue
+ * with second loop iteration. Absence of read mark on r7 might affect state
+ * equivalence logic used for iterator convergence tracking.
+ *
+ * r7 = &fp[-16]
+ * fp[-16] = 0
+ * r6 = bpf_get_prandom_u32()
+ * bpf_iter_num_new(&fp[-8], 0, 10)
+ * while (bpf_iter_num_next(&fp[-8])) {
+ * r6++
+ * if (r6 != 42) {
+ * r7 = 0xdead
+ * continue;
+ * }
+ * bpf_probe_read_user(r7, 8, 0xdeadbeef); // this is not safe
+ * }
+ * bpf_iter_num_destroy(&fp[-8])
+ * return 0
+ */
+ asm volatile (
+ "r7 = r10;"
+ "r7 += -16;"
+ "r0 = 0;"
+ "*(u64 *)(r7 + 0) = r0;"
+ "call %[bpf_get_prandom_u32];"
+ "r6 = r0;"
+ "r1 = r10;"
+ "r1 += -8;"
+ "r2 = 0;"
+ "r3 = 10;"
+ "call %[bpf_iter_num_new];"
+ "1:"
+ "r1 = r10;"
+ "r1 += -8;"
+ "call %[bpf_iter_num_next];"
+ "if r0 == 0 goto 2f;"
+ "r6 += 1;"
+ "if r6 != 42 goto 3f;"
+ "r7 = 0xdead;"
+ "goto 1b;"
+ "3:"
+ "r1 = r7;"
+ "r2 = 8;"
+ "r3 = 0xdeadbeef;"
+ "call %[bpf_probe_read_user];"
+ "goto 1b;"
+ "2:"
+ "r1 = r10;"
+ "r1 += -8;"
+ "call %[bpf_iter_num_destroy];"
+ "r0 = 0;"
+ "exit;"
+ :
+ : __imm(bpf_get_prandom_u32),
+ __imm(bpf_iter_num_new),
+ __imm(bpf_iter_num_next),
+ __imm(bpf_iter_num_destroy),
+ __imm(bpf_probe_read_user)
+ : __clobber_all
+ );
+}
+
+SEC("?raw_tp")
+__failure
+__msg("math between fp pointer and register with unbounded")
+__naked int delayed_precision_mark(void)
+{
+ /* This is equivalent to C program below.
+ * The test is similar to delayed_read_mark but verifies that incomplete
+ * precision marks don't fool the verifier.
+ * The call to bpf_iter_num_next() is reachable with r7 values -16 and -32.
+ * State with r7=-16 is visited first and follows r6 != 42 ... continue branch.
+ * At this point iterator next() call is reached with r7 that has no read
+ * and precision marks.
+ * Loop body with r7=-32 would only be visited if verifier would decide to continue
+ * with second loop iteration. Absence of precision mark on r7 might affect state
+ * equivalence logic used for iterator convergence tracking.
+ *
+ * r8 = 0
+ * fp[-16] = 0
+ * r7 = -16
+ * r6 = bpf_get_prandom_u32()
+ * bpf_iter_num_new(&fp[-8], 0, 10)
+ * while (bpf_iter_num_next(&fp[-8])) {
+ * if (r6 != 42) {
+ * r7 = -32
+ * r6 = bpf_get_prandom_u32()
+ * continue;
+ * }
+ * r0 = r10
+ * r0 += r7
+ * r8 = *(u64 *)(r0 + 0) // this is not safe
+ * r6 = bpf_get_prandom_u32()
+ * }
+ * bpf_iter_num_destroy(&fp[-8])
+ * return r8
+ */
+ asm volatile (
+ "r8 = 0;"
+ "*(u64 *)(r10 - 16) = r8;"
+ "r7 = -16;"
+ "call %[bpf_get_prandom_u32];"
+ "r6 = r0;"
+ "r1 = r10;"
+ "r1 += -8;"
+ "r2 = 0;"
+ "r3 = 10;"
+ "call %[bpf_iter_num_new];"
+ "1:"
+ "r1 = r10;"
+ "r1 += -8;\n"
+ "call %[bpf_iter_num_next];"
+ "if r0 == 0 goto 2f;"
+ "if r6 != 42 goto 3f;"
+ "r7 = -32;"
+ "call %[bpf_get_prandom_u32];"
+ "r6 = r0;"
+ "goto 1b;\n"
+ "3:"
+ "r0 = r10;"
+ "r0 += r7;"
+ "r8 = *(u64 *)(r0 + 0);"
+ "call %[bpf_get_prandom_u32];"
+ "r6 = r0;"
+ "goto 1b;\n"
+ "2:"
+ "r1 = r10;"
+ "r1 += -8;"
+ "call %[bpf_iter_num_destroy];"
+ "r0 = r8;"
+ "exit;"
+ :
+ : __imm(bpf_get_prandom_u32),
+ __imm(bpf_iter_num_new),
+ __imm(bpf_iter_num_next),
+ __imm(bpf_iter_num_destroy),
+ __imm(bpf_probe_read_user)
+ : __clobber_all
+ );
+}
+
+SEC("?raw_tp")
+__failure
+__msg("math between fp pointer and register with unbounded")
+__flag(BPF_F_TEST_STATE_FREQ)
+__naked int loop_state_deps1(void)
+{
+ /* This is equivalent to C program below.
+ *
+ * The case turns out to be tricky in a sense that:
+ * - states with c=-25 are explored only on a second iteration
+ * of the outer loop;
+ * - states with read+precise mark on c are explored only on
+ * second iteration of the inner loop and in a state which
+ * is pushed to states stack first.
+ *
+ * Depending on the details of iterator convergence logic
+ * verifier might stop states traversal too early and miss
+ * unsafe c=-25 memory access.
+ *
+ * j = iter_new(); // fp[-16]
+ * a = 0; // r6
+ * b = 0; // r7
+ * c = -24; // r8
+ * while (iter_next(j)) {
+ * i = iter_new(); // fp[-8]
+ * a = 0; // r6
+ * b = 0; // r7
+ * while (iter_next(i)) {
+ * if (a == 1) {
+ * a = 0;
+ * b = 1;
+ * } else if (a == 0) {
+ * a = 1;
+ * if (random() == 42)
+ * continue;
+ * if (b == 1) {
+ * *(r10 + c) = 7; // this is not safe
+ * iter_destroy(i);
+ * iter_destroy(j);
+ * return;
+ * }
+ * }
+ * }
+ * iter_destroy(i);
+ * a = 0;
+ * b = 0;
+ * c = -25;
+ * }
+ * iter_destroy(j);
+ * return;
+ */
+ asm volatile (
+ "r1 = r10;"
+ "r1 += -16;"
+ "r2 = 0;"
+ "r3 = 10;"
+ "call %[bpf_iter_num_new];"
+ "r6 = 0;"
+ "r7 = 0;"
+ "r8 = -24;"
+ "j_loop_%=:"
+ "r1 = r10;"
+ "r1 += -16;"
+ "call %[bpf_iter_num_next];"
+ "if r0 == 0 goto j_loop_end_%=;"
+ "r1 = r10;"
+ "r1 += -8;"
+ "r2 = 0;"
+ "r3 = 10;"
+ "call %[bpf_iter_num_new];"
+ "r6 = 0;"
+ "r7 = 0;"
+ "i_loop_%=:"
+ "r1 = r10;"
+ "r1 += -8;"
+ "call %[bpf_iter_num_next];"
+ "if r0 == 0 goto i_loop_end_%=;"
+ "check_one_r6_%=:"
+ "if r6 != 1 goto check_zero_r6_%=;"
+ "r6 = 0;"
+ "r7 = 1;"
+ "goto i_loop_%=;"
+ "check_zero_r6_%=:"
+ "if r6 != 0 goto i_loop_%=;"
+ "r6 = 1;"
+ "call %[bpf_get_prandom_u32];"
+ "if r0 != 42 goto check_one_r7_%=;"
+ "goto i_loop_%=;"
+ "check_one_r7_%=:"
+ "if r7 != 1 goto i_loop_%=;"
+ "r0 = r10;"
+ "r0 += r8;"
+ "r1 = 7;"
+ "*(u64 *)(r0 + 0) = r1;"
+ "r1 = r10;"
+ "r1 += -8;"
+ "call %[bpf_iter_num_destroy];"
+ "r1 = r10;"
+ "r1 += -16;"
+ "call %[bpf_iter_num_destroy];"
+ "r0 = 0;"
+ "exit;"
+ "i_loop_end_%=:"
+ "r1 = r10;"
+ "r1 += -8;"
+ "call %[bpf_iter_num_destroy];"
+ "r6 = 0;"
+ "r7 = 0;"
+ "r8 = -25;"
+ "goto j_loop_%=;"
+ "j_loop_end_%=:"
+ "r1 = r10;"
+ "r1 += -16;"
+ "call %[bpf_iter_num_destroy];"
+ "r0 = 0;"
+ "exit;"
+ :
+ : __imm(bpf_get_prandom_u32),
+ __imm(bpf_iter_num_new),
+ __imm(bpf_iter_num_next),
+ __imm(bpf_iter_num_destroy)
+ : __clobber_all
+ );
+}
+
+SEC("?raw_tp")
+__success
+__naked int triple_continue(void)
+{
+ /* This is equivalent to C program below.
+ * High branching factor of the loop body turned out to be
+ * problematic for one of the iterator convergence tracking
+ * algorithms explored.
+ *
+ * r6 = bpf_get_prandom_u32()
+ * bpf_iter_num_new(&fp[-8], 0, 10)
+ * while (bpf_iter_num_next(&fp[-8])) {
+ * if (bpf_get_prandom_u32() != 42)
+ * continue;
+ * if (bpf_get_prandom_u32() != 42)
+ * continue;
+ * if (bpf_get_prandom_u32() != 42)
+ * continue;
+ * r0 += 0;
+ * }
+ * bpf_iter_num_destroy(&fp[-8])
+ * return 0
+ */
+ asm volatile (
+ "r1 = r10;"
+ "r1 += -8;"
+ "r2 = 0;"
+ "r3 = 10;"
+ "call %[bpf_iter_num_new];"
+ "loop_%=:"
+ "r1 = r10;"
+ "r1 += -8;"
+ "call %[bpf_iter_num_next];"
+ "if r0 == 0 goto loop_end_%=;"
+ "call %[bpf_get_prandom_u32];"
+ "if r0 != 42 goto loop_%=;"
+ "call %[bpf_get_prandom_u32];"
+ "if r0 != 42 goto loop_%=;"
+ "call %[bpf_get_prandom_u32];"
+ "if r0 != 42 goto loop_%=;"
+ "r0 += 0;"
+ "goto loop_%=;"
+ "loop_end_%=:"
+ "r1 = r10;"
+ "r1 += -8;"
+ "call %[bpf_iter_num_destroy];"
+ "r0 = 0;"
+ "exit;"
+ :
+ : __imm(bpf_get_prandom_u32),
+ __imm(bpf_iter_num_new),
+ __imm(bpf_iter_num_next),
+ __imm(bpf_iter_num_destroy)
+ : __clobber_all
+ );
+}
+
+SEC("raw_tp")
+__success
+__naked int checkpoint_states_deletion(void)
+{
+ /* This is equivalent to C program below.
+ *
+ * int i;
+ * int sum = 0;
+ * bpf_for(i, 0, 10) {
+ * int *a = bpf_map_lookup_elem(&amap, &i);
+ * int *b = bpf_map_lookup_elem(&amap, &i);
+ * int *c = bpf_map_lookup_elem(&amap, &i);
+ * if (a)
+ * sum += 1;
+ * if (b)
+ * sum += 1;
+ * if (c)
+ * sum += 1;
+ * // a = NULL;
+ * }
+ * return 0;
+ *
+ * Note the commented out 'a = NULL;'.
+ * The body of the loop spawns multiple simulation paths
+ * with different combination of NULL/non-NULL information for a/b/c.
+ * Each combination is unique from states_equal() point of view.
+ * Explored states checkpoint is created after each iterator next call.
+ * Iterator convergence logic expects that eventually current state
+ * would get equal to one of the explored states and thus loop
+ * exploration would be finished (at least for a specific path).
+ * Verifier evicts explored states with high miss to hit ratio
+ * to avoid comparing current state with too many explored
+ * states per instruction.
+ * This test is designed to trick the following eviction policy:
+ *
+ * sl->miss_cnt > sl->hit_cnt * 3 + 3 // if true sl->state is evicted
+ *
+ * *If* checkpoint states are allowed to be evicted and the
+ * policy above is used, verifier would remove states too early,
+ * before loop convergence could be established.
+ * Uncommenting 'a = NULL;' reduces number of distinct states
+ * and program verifies.
+ */
+ asm volatile (
+ "r6 = 0;" /* a */
+ "r7 = 0;" /* b */
+ "r8 = 0;" /* c */
+ "r9 = 0;" /* sum */
+ "r1 = r10;"
+ "r1 += -8;"
+ "r2 = 0;"
+ "r3 = 10;"
+ "call %[bpf_iter_num_new];"
+ "loop_%=:"
+ "r1 = r10;"
+ "r1 += -8;"
+ "call %[bpf_iter_num_next];"
+ "if r0 == 0 goto loop_end_%=;"
+
+ "*(u64 *)(r10 - 16) = r0;"
+
+ "r1 = %[amap] ll;"
+ "r2 = r10;"
+ "r2 += -16;"
+ "call %[bpf_map_lookup_elem];"
+ "r6 = r0;"
+
+ "r1 = %[amap] ll;"
+ "r2 = r10;"
+ "r2 += -16;"
+ "call %[bpf_map_lookup_elem];"
+ "r7 = r0;"
+
+ "r1 = %[amap] ll;"
+ "r2 = r10;"
+ "r2 += -16;"
+ "call %[bpf_map_lookup_elem];"
+ "r8 = r0;"
+
+ "if r6 == 0 goto +1;"
+ "r9 += 1;"
+ "if r7 == 0 goto +1;"
+ "r9 += 1;"
+ "if r8 == 0 goto +1;"
+ "r9 += 1;"
+
+ /* "r6 = 0;" // Commented out 'a = NULL;' */
+ "goto loop_%=;"
+ "loop_end_%=:"
+ "r1 = r10;"
+ "r1 += -8;"
+ "call %[bpf_iter_num_destroy];"
+ "r0 = 0;"
+ "exit;"
+ :
+ : __imm(bpf_map_lookup_elem),
+ __imm(bpf_iter_num_new),
+ __imm(bpf_iter_num_next),
+ __imm(bpf_iter_num_destroy),
+ __imm_addr(amap)
+ : __clobber_all
+ );
+}
+
char _license[] SEC("license") = "GPL";
--
2.42.0
^ permalink raw reply related [flat|nested] 18+ messages in thread* Re: [PATCH bpf-next v2 4/7] selftests/bpf: tests with delayed read/precision makrs in loop body
2023-10-22 1:08 ` [PATCH bpf-next v2 4/7] selftests/bpf: tests with delayed read/precision makrs in loop body Eduard Zingerman
@ 2023-10-22 3:00 ` kernel test robot
0 siblings, 0 replies; 18+ messages in thread
From: kernel test robot @ 2023-10-22 3:00 UTC (permalink / raw)
To: Eduard Zingerman, bpf, ast; +Cc: oe-kbuild-all
Hi Eduard,
kernel test robot noticed the following build warnings:
[auto build test WARNING on bpf-next/master]
url: https://github.com/intel-lab-lkp/linux/commits/Eduard-Zingerman/bpf-move-explored_state-closer-to-the-beginning-of-verifier-c/20231022-091124
base: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
patch link: https://lore.kernel.org/r/20231022010812.9201-5-eddyz87%40gmail.com
patch subject: [PATCH bpf-next v2 4/7] selftests/bpf: tests with delayed read/precision makrs in loop body
reproduce: (https://download.01.org/0day-ci/archive/20231022/202310221039.YDfIS2hn-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202310221039.YDfIS2hn-lkp@intel.com/
# many are suggestions rather than must-fix
WARNING:SPLIT_STRING: quoted string split across lines
#82: FILE: tools/testing/selftests/bpf/progs/iters.c:767:
+ "1:"
+ "r1 = r10;"
WARNING:SPLIT_STRING: quoted string split across lines
#91: FILE: tools/testing/selftests/bpf/progs/iters.c:776:
+ "3:"
+ "r1 = r7;"
WARNING:SPLIT_STRING: quoted string split across lines
#97: FILE: tools/testing/selftests/bpf/progs/iters.c:782:
+ "2:"
+ "r1 = r10;"
WARNING:SPLIT_STRING: quoted string split across lines
#159: FILE: tools/testing/selftests/bpf/progs/iters.c:844:
+ "1:"
+ "r1 = r10;"
WARNING:SPLIT_STRING: quoted string split across lines
#169: FILE: tools/testing/selftests/bpf/progs/iters.c:854:
+ "3:"
+ "r0 = r10;"
WARNING:SPLIT_STRING: quoted string split across lines
#176: FILE: tools/testing/selftests/bpf/progs/iters.c:861:
+ "2:"
+ "r1 = r10;"
WARNING:SPLIT_STRING: quoted string split across lines
#252: FILE: tools/testing/selftests/bpf/progs/iters.c:937:
+ "j_loop_%=:"
+ "r1 = r10;"
WARNING:SPLIT_STRING: quoted string split across lines
#264: FILE: tools/testing/selftests/bpf/progs/iters.c:949:
+ "i_loop_%=:"
+ "r1 = r10;"
WARNING:SPLIT_STRING: quoted string split across lines
#269: FILE: tools/testing/selftests/bpf/progs/iters.c:954:
+ "check_one_r6_%=:"
+ "if r6 != 1 goto check_zero_r6_%=;"
WARNING:SPLIT_STRING: quoted string split across lines
#274: FILE: tools/testing/selftests/bpf/progs/iters.c:959:
+ "check_zero_r6_%=:"
+ "if r6 != 0 goto i_loop_%=;"
WARNING:SPLIT_STRING: quoted string split across lines
#280: FILE: tools/testing/selftests/bpf/progs/iters.c:965:
+ "check_one_r7_%=:"
+ "if r7 != 1 goto i_loop_%=;"
WARNING:SPLIT_STRING: quoted string split across lines
#294: FILE: tools/testing/selftests/bpf/progs/iters.c:979:
+ "i_loop_end_%=:"
+ "r1 = r10;"
WARNING:SPLIT_STRING: quoted string split across lines
#302: FILE: tools/testing/selftests/bpf/progs/iters.c:987:
+ "j_loop_end_%=:"
+ "r1 = r10;"
WARNING:SPLIT_STRING: quoted string split across lines
#346: FILE: tools/testing/selftests/bpf/progs/iters.c:1031:
+ "loop_%=:"
+ "r1 = r10;"
WARNING:SPLIT_STRING: quoted string split across lines
#359: FILE: tools/testing/selftests/bpf/progs/iters.c:1044:
+ "loop_end_%=:"
+ "r1 = r10;"
WARNING:REPEATED_WORD: Possible repeated word: 'to'
#404: FILE: tools/testing/selftests/bpf/progs/iters.c:1089:
+ * to to avoid comparing current state with too many explored
WARNING:SPLIT_STRING: quoted string split across lines
#427: FILE: tools/testing/selftests/bpf/progs/iters.c:1112:
+ "loop_%=:"
+ "r1 = r10;"
WARNING:SPLIT_STRING: quoted string split across lines
#462: FILE: tools/testing/selftests/bpf/progs/iters.c:1147:
+ "loop_end_%=:"
+ "r1 = r10;"
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH bpf-next v2 5/7] bpf: correct loop detection for iterators convergence
2023-10-22 1:08 [PATCH bpf-next v2 0/7] exact states comparison for iterator convergence checks Eduard Zingerman
` (3 preceding siblings ...)
2023-10-22 1:08 ` [PATCH bpf-next v2 4/7] selftests/bpf: tests with delayed read/precision makrs in loop body Eduard Zingerman
@ 2023-10-22 1:08 ` Eduard Zingerman
2023-10-22 4:28 ` Andrii Nakryiko
2023-10-22 1:08 ` [PATCH bpf-next v2 6/7] selftests/bpf: test if state loops are detected in a tricky case Eduard Zingerman
` (2 subsequent siblings)
7 siblings, 1 reply; 18+ messages in thread
From: Eduard Zingerman @ 2023-10-22 1:08 UTC (permalink / raw)
To: bpf, ast
Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song, memxor,
awerner32, john.fastabend, Eduard Zingerman, Andrii Nakryiko,
Alexei Starovoitov
It turns out that .branches > 0 in is_state_visited() is not a
sufficient condition to identify if two verifier states form a loop
when iterators convergence is computed. This commit adds logic to
distinguish situations like below:
(I) initial (II) initial
| |
V V
.---------> hdr ..
| | |
| V V
| .------... .------..
| | | | |
| V V V V
| ... ... .-> hdr ..
| | | | | |
| V V | V V
| succ <- cur | succ <- cur
| | | |
| V | V
| ... | ...
| | | |
'----' '----'
For both (I) and (II) successor 'succ' of the current state 'cur' was
previously explored and has branches count at 0. However, loop entry
'hdr' corresponding to 'succ' might be a part of current DFS path.
If that is the case 'succ' and 'cur' are members of the same loop
and have to be compared exactly.
Co-developed-by: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Co-developed-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
include/linux/bpf_verifier.h | 15 +++
kernel/bpf/verifier.c | 207 ++++++++++++++++++++++++++++++++++-
2 files changed, 218 insertions(+), 4 deletions(-)
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 38b788228594..24213a99cc79 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -373,10 +373,25 @@ struct bpf_verifier_state {
struct bpf_active_lock active_lock;
bool speculative;
bool active_rcu_lock;
+ /* If this state was ever pointed-to by other state's loop_entry field
+ * this flag would be set to true. Used to avoid freeing such states
+ * while they are still in use.
+ */
+ bool used_as_loop_entry;
/* first and last insn idx of this verifier state */
u32 first_insn_idx;
u32 last_insn_idx;
+ /* If this state is a part of states loop this field points to some
+ * parent of this state such that:
+ * - it is also a member of the same states loop;
+ * - DFS states traversal starting from initial state visits loop_entry
+ * state before this state.
+ * Used to compute topmost loop entry for state loops.
+ * State loops might appear because of open coded iterators logic.
+ * See get_loop_entry() for more information.
+ */
+ struct bpf_verifier_state *loop_entry;
/* jmp history recorded from first to last.
* backtracking is using it to go from last to first.
* For most states jmp_history_cnt is [0-3].
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 23e4eb0a5c23..baf31b61b3ff 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -1803,6 +1803,7 @@ static int copy_verifier_state(struct bpf_verifier_state *dst_state,
dst_state->first_insn_idx = src->first_insn_idx;
dst_state->last_insn_idx = src->last_insn_idx;
dst_state->dfs_depth = src->dfs_depth;
+ dst_state->used_as_loop_entry = src->used_as_loop_entry;
for (i = 0; i <= src->curframe; i++) {
dst = dst_state->frame[i];
if (!dst) {
@@ -1845,11 +1846,176 @@ static bool same_callsites(struct bpf_verifier_state *a, struct bpf_verifier_sta
return true;
}
+/* Open coded iterators allow back-edges in the state graph in order to
+ * check unbounded loops created by iterators.
+ *
+ * In is_state_visited() it is necessary to know if explored states are
+ * part of some loops in order to decide whether non-exact states
+ * comparison could be used:
+ * - non-exact states comparison establishes sub-state relation and uses
+ * read and precision marks to do so, these marks are propagated from
+ * children states and thus are not guaranteed to be final in a loop;
+ * - exact states comparison just checks if current and explored states
+ * are identical (and thus form a back-edge).
+ *
+ * Paper "A New Algorithm for Identifying Loops in Decompilation"
+ * by Tao Wei, Jian Mao, Wei Zou and Yu Chen [1] presents a convenient
+ * algorithm for loop structure detection and gives an overview of
+ * relevant terminology. It also has helpful illustrations.
+ *
+ * [1] https://api.semanticscholar.org/CorpusID:15784067
+ *
+ * We use a similar algorithm but because loop nested structure is
+ * irrelevant for verifier ours is significantly simpler and resembles
+ * strongly connected components algorithm from Sedgewick's textbook.
+ *
+ * Define topmost loop entry as a first node of the loop traversed in a
+ * depth first search starting from initial state. The goal of the loop
+ * tracking algorithm is to associate topmost loop entries with states
+ * derived from these entries.
+ *
+ * For each step in the DFS states traversal, the algorithm needs to identify
+ * the following situations:
+ *
+ * initial initial initial
+ * | | |
+ * V V V
+ * ... ... .---------> hdr
+ * | | | |
+ * V V | V
+ * cur .-> succ | .------...
+ * | | | | | |
+ * V | V | V V
+ * succ '-- cur | ... ...
+ * | | |
+ * | V V
+ * | succ <- cur
+ * | |
+ * | V
+ * | ...
+ * | |
+ * '----'
+ *
+ * (A) successor state of cur (B) successor state of cur or its entry
+ * not yet traversed are in current DFS path, thus cur and succ
+ * are members of the same outermost loop
+ *
+ * initial initial
+ * | |
+ * V V
+ * ... ...
+ * | |
+ * V V
+ * .------... .------...
+ * | | | |
+ * V V V V
+ * .-> hdr ... ... ...
+ * | | | | |
+ * | V V V V
+ * | succ <- cur succ <- cur
+ * | | |
+ * | V V
+ * | ... ...
+ * | | |
+ * '----' exit
+ *
+ * (C) successor state of cur is a part of some loop but this loop
+ * does not include cur or successor state is not in a loop at all.
+ *
+ * Algorithm could be described as the following python code:
+ *
+ * traversed = set() # Set of traversed nodes
+ * entries = {} # Mapping from node to loop entry
+ * depths = {} # Depth level assigned to graph node
+ * path = set() # Current DFS path
+ *
+ * # Find outermost loop entry known for n
+ * def get_loop_entry(n):
+ * h = entries.get(n, None)
+ * while h in entries and entries[h] != h:
+ * h = entries[h]
+ * return h
+ *
+ * # Update n's loop entry if h's outermost entry comes
+ * # before n's outermost entry in current DFS path.
+ * def update_loop_entry(n, h):
+ * n1 = get_loop_entry(n) or n
+ * h1 = get_loop_entry(h) or h
+ * if h1 in path and depths[h1] <= depths[n1]:
+ * entries[n] = h1
+ *
+ * def dfs(n, depth):
+ * traversed.add(n)
+ * path.add(n)
+ * depths[n] = depth
+ * for succ in G.successors(n):
+ * if succ not in traversed:
+ * # Case A: explore succ and update cur's loop entry
+ * # only if succ's entry is in current DFS path.
+ * dfs(succ, depth + 1)
+ * h = get_loop_entry(succ)
+ * update_loop_entry(n, h)
+ * else:
+ * # Case B or C depending on `h1 in path` check in update_loop_entry().
+ * update_loop_entry(n, succ)
+ * path.remove(n)
+ *
+ * To adapt this algorithm for use with verifier:
+ * - use st->branch == 0 as a signal that DFS of succ had been finished
+ * and cur's loop entry has to be updated (case A), handle this in
+ * update_branch_counts();
+ * - use st->branch > 0 as a signal that st is in the current DFS path;
+ * - handle cases B and C in is_state_visited();
+ * - update topmost loop entry for intermediate states in get_loop_entry().
+ */
+static struct bpf_verifier_state *get_loop_entry(struct bpf_verifier_state *st)
+{
+ struct bpf_verifier_state *topmost = st->loop_entry, *old;
+
+ while (topmost && topmost->loop_entry && topmost != topmost->loop_entry)
+ topmost = topmost->loop_entry;
+ /* Update loop entries for intermediate states to avoid this
+ * traversal in future get_loop_entry() calls.
+ */
+ while (st && st->loop_entry != topmost) {
+ old = st->loop_entry;
+ st->loop_entry = topmost;
+ st = old;
+ }
+ return topmost;
+}
+
+static void update_loop_entry(struct bpf_verifier_state *cur, struct bpf_verifier_state *hdr)
+{
+ struct bpf_verifier_state *cur1, *hdr1;
+
+ cur1 = get_loop_entry(cur) ?: cur;
+ hdr1 = get_loop_entry(hdr) ?: hdr;
+ * The hdr1->branches check decides between cases B and C in
+ * comment for get_loop_entry(). If hdr1->branches == 0 then
+ * hdr's topmost loop entry is not in current DFS path,
+ * hence 'cur' and 'hdr' are not in the same loop and there is
+ * no need to update cur->loop_entry.
+ */
+ if (hdr1->branches && hdr1->dfs_depth <= cur1->dfs_depth) {
+ cur->loop_entry = hdr;
+ hdr->used_as_loop_entry = true;
+ }
+}
+
static void update_branch_counts(struct bpf_verifier_env *env, struct bpf_verifier_state *st)
{
while (st) {
u32 br = --st->branches;
+ /* br == 0 signals that DFS exploration for 'st' is finished,
+ * thus it is necessary to update parent's loop entry if it
+ * turned out that st is a part of some loop.
+ * This is a part of 'case A' in get_loop_entry() comment.
+ */
+ if (br == 0 && st->parent && st->loop_entry)
+ update_loop_entry(st->parent, st->loop_entry);
+
/* WARN_ON(br > 1) technically makes sense here,
* but see comment in push_stack(), hence:
*/
@@ -16649,10 +16815,11 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
{
struct bpf_verifier_state_list *new_sl;
struct bpf_verifier_state_list *sl, **pprev;
- struct bpf_verifier_state *cur = env->cur_state, *new;
+ struct bpf_verifier_state *cur = env->cur_state, *new, *loop_entry;
int i, j, err, states_cnt = 0;
bool force_new_state = env->test_state_freq || is_force_checkpoint(env, insn_idx);
bool add_new_state = force_new_state;
+ bool force_exact;
/* bpf progs typically have pruning point every 4 instructions
* http://vger.kernel.org/bpfconf2019.html#session-1
@@ -16747,8 +16914,10 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
*/
spi = __get_spi(iter_reg->off + iter_reg->var_off.value);
iter_state = &func(env, iter_reg)->stack[spi].spilled_ptr;
- if (iter_state->iter.state == BPF_ITER_STATE_ACTIVE)
+ if (iter_state->iter.state == BPF_ITER_STATE_ACTIVE) {
+ update_loop_entry(cur, &sl->state);
goto hit;
+ }
}
goto skip_inf_loop_check;
}
@@ -16779,7 +16948,36 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
add_new_state = false;
goto miss;
}
- if (states_equal(env, &sl->state, cur, false)) {
+ /* If sl->state is a part of a loop and this loop's entry is a part of
+ * current verification path then states have to be compared exactly.
+ * 'force_exact' is needed to catch the following case:
+ *
+ * initial Here state 'succ' was processed first,
+ * | it was eventually tracked to produce a
+ * V state identical to 'hdr'.
+ * .---------> hdr All branches from 'succ' had been explored
+ * | | and thus 'succ' has its .branches == 0.
+ * | V
+ * | .------... Suppose states 'cur' and 'succ' correspond
+ * | | | to the same instruction + callsites.
+ * | V V In such case it is necessary to check
+ * | ... ... if 'succ' and 'cur' are states_equal().
+ * | | | If 'succ' and 'cur' are a part of the
+ * | V V same loop exact flag has to be set.
+ * | succ <- cur To check if that is the case, verify
+ * | | if loop entry of 'succ' is in current
+ * | V DFS path.
+ * | ...
+ * | |
+ * '----'
+ *
+ * Additional details are in the comment before get_loop_entry().
+ */
+ loop_entry = get_loop_entry(&sl->state);
+ force_exact = loop_entry && loop_entry->branches > 0;
+ if (states_equal(env, &sl->state, cur, force_exact)) {
+ if (force_exact)
+ update_loop_entry(cur, loop_entry);
hit:
sl->hit_cnt++;
/* reached equivalent register/stack state,
@@ -16825,7 +17023,8 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
* speed up verification
*/
*pprev = sl->next;
- if (sl->state.frame[0]->regs[0].live & REG_LIVE_DONE) {
+ if (sl->state.frame[0]->regs[0].live & REG_LIVE_DONE &&
+ !sl->state.used_as_loop_entry) {
u32 br = sl->state.branches;
WARN_ONCE(br,
--
2.42.0
^ permalink raw reply related [flat|nested] 18+ messages in thread* Re: [PATCH bpf-next v2 5/7] bpf: correct loop detection for iterators convergence
2023-10-22 1:08 ` [PATCH bpf-next v2 5/7] bpf: correct loop detection for iterators convergence Eduard Zingerman
@ 2023-10-22 4:28 ` Andrii Nakryiko
2023-10-23 14:47 ` Eduard Zingerman
0 siblings, 1 reply; 18+ messages in thread
From: Andrii Nakryiko @ 2023-10-22 4:28 UTC (permalink / raw)
To: Eduard Zingerman
Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
memxor, awerner32, john.fastabend, Alexei Starovoitov
On Sat, Oct 21, 2023 at 6:08 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> It turns out that .branches > 0 in is_state_visited() is not a
> sufficient condition to identify if two verifier states form a loop
> when iterators convergence is computed. This commit adds logic to
> distinguish situations like below:
>
> (I) initial (II) initial
> | |
> V V
> .---------> hdr ..
> | | |
> | V V
> | .------... .------..
> | | | | |
> | V V V V
> | ... ... .-> hdr ..
> | | | | | |
> | V V | V V
> | succ <- cur | succ <- cur
> | | | |
> | V | V
> | ... | ...
> | | | |
> '----' '----'
>
> For both (I) and (II) successor 'succ' of the current state 'cur' was
> previously explored and has branches count at 0. However, loop entry
> 'hdr' corresponding to 'succ' might be a part of current DFS path.
> If that is the case 'succ' and 'cur' are members of the same loop
> and have to be compared exactly.
>
> Co-developed-by: Andrii Nakryiko <andrii.nakryiko@gmail.com>
> Co-developed-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
> ---
> include/linux/bpf_verifier.h | 15 +++
> kernel/bpf/verifier.c | 207 ++++++++++++++++++++++++++++++++++-
> 2 files changed, 218 insertions(+), 4 deletions(-)
>
LGTM, but see the note below about state being its own loop_entry. It
feels like a bit better approach would be to use "loop_entry_ref_cnt"
instead of just a bool used_as_loop_entry, and do a proper accounting
when child state is freed and when propagating loop_entries. But
perhaps that can be done in a follow up, if you think it's a good
idea.
Reviewed-by: Andrii Nakryiko <andrii@kernel.org>
> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> index 38b788228594..24213a99cc79 100644
> --- a/include/linux/bpf_verifier.h
> +++ b/include/linux/bpf_verifier.h
[...]
> + * To adapt this algorithm for use with verifier:
> + * - use st->branch == 0 as a signal that DFS of succ had been finished
> + * and cur's loop entry has to be updated (case A), handle this in
> + * update_branch_counts();
> + * - use st->branch > 0 as a signal that st is in the current DFS path;
> + * - handle cases B and C in is_state_visited();
> + * - update topmost loop entry for intermediate states in get_loop_entry().
> + */
> +static struct bpf_verifier_state *get_loop_entry(struct bpf_verifier_state *st)
> +{
> + struct bpf_verifier_state *topmost = st->loop_entry, *old;
> +
> + while (topmost && topmost->loop_entry && topmost != topmost->loop_entry)
> + topmost = topmost->loop_entry;
> + /* Update loop entries for intermediate states to avoid this
> + * traversal in future get_loop_entry() calls.
> + */
> + while (st && st->loop_entry != topmost) {
> + old = st->loop_entry;
> + st->loop_entry = topmost;
> + st = old;
> + }
> + return topmost;
> +}
> +
> +static void update_loop_entry(struct bpf_verifier_state *cur, struct bpf_verifier_state *hdr)
> +{
> + struct bpf_verifier_state *cur1, *hdr1;
> +
> + cur1 = get_loop_entry(cur) ?: cur;
> + hdr1 = get_loop_entry(hdr) ?: hdr;
> + /* The head1->branches check decides between cases B and C in
> + * comment for get_loop_entry(). If hdr1->branches == 0 then
> + * head's topmost loop entry is not in current DFS path,
> + * hence 'cur' and 'hdr' are not in the same loop and there is
> + * no need to update cur->loop_entry.
> + */
> + if (hdr1->branches && hdr1->dfs_depth <= cur1->dfs_depth) {
> + cur->loop_entry = hdr;
> + hdr->used_as_loop_entry = true;
> + }
> +}
> +
> static void update_branch_counts(struct bpf_verifier_env *env, struct bpf_verifier_state *st)
> {
> while (st) {
> u32 br = --st->branches;
>
> + /* br == 0 signals that DFS exploration for 'st' is finished,
> + * thus it is necessary to update parent's loop entry if it
> + * turned out that st is a part of some loop.
> + * This is a part of 'case A' in get_loop_entry() comment.
> + */
> + if (br == 0 && st->parent && st->loop_entry)
> + update_loop_entry(st->parent, st->loop_entry);
> +
> /* WARN_ON(br > 1) technically makes sense here,
> * but see comment in push_stack(), hence:
> */
> @@ -16649,10 +16815,11 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
> {
> struct bpf_verifier_state_list *new_sl;
> struct bpf_verifier_state_list *sl, **pprev;
> - struct bpf_verifier_state *cur = env->cur_state, *new;
> + struct bpf_verifier_state *cur = env->cur_state, *new, *loop_entry;
> int i, j, err, states_cnt = 0;
> bool force_new_state = env->test_state_freq || is_force_checkpoint(env, insn_idx);
> bool add_new_state = force_new_state;
> + bool force_exact;
>
> /* bpf progs typically have pruning point every 4 instructions
> * http://vger.kernel.org/bpfconf2019.html#session-1
> @@ -16747,8 +16914,10 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
> */
> spi = __get_spi(iter_reg->off + iter_reg->var_off.value);
> iter_state = &func(env, iter_reg)->stack[spi].spilled_ptr;
> - if (iter_state->iter.state == BPF_ITER_STATE_ACTIVE)
> + if (iter_state->iter.state == BPF_ITER_STATE_ACTIVE) {
> + update_loop_entry(cur, &sl->state);
> goto hit;
> + }
> }
> goto skip_inf_loop_check;
> }
> @@ -16779,7 +16948,36 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
> add_new_state = false;
> goto miss;
> }
> - if (states_equal(env, &sl->state, cur, false)) {
> + /* If sl->state is a part of a loop and this loop's entry is a part of
> + * current verification path then states have to be compared exactly.
> + * 'force_exact' is needed to catch the following case:
> + *
> + * initial Here state 'succ' was processed first,
> + * | it was eventually tracked to produce a
> + * V state identical to 'hdr'.
> + * .---------> hdr All branches from 'succ' had been explored
> + * | | and thus 'succ' has its .branches == 0.
> + * | V
> + * | .------... Suppose states 'cur' and 'succ' correspond
> + * | | | to the same instruction + callsites.
> + * | V V In such case it is necessary to check
> + * | ... ... if 'succ' and 'cur' are states_equal().
> + * | | | If 'succ' and 'cur' are a part of the
> + * | V V same loop exact flag has to be set.
> + * | succ <- cur To check if that is the case, verify
> + * | | if loop entry of 'succ' is in current
> + * | V DFS path.
> + * | ...
> + * | |
> + * '----'
> + *
> + * Additional details are in the comment before get_loop_entry().
> + */
> + loop_entry = get_loop_entry(&sl->state);
> + force_exact = loop_entry && loop_entry->branches > 0;
> + if (states_equal(env, &sl->state, cur, force_exact)) {
> + if (force_exact)
> + update_loop_entry(cur, loop_entry);
> hit:
> sl->hit_cnt++;
> /* reached equivalent register/stack state,
> @@ -16825,7 +17023,8 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
> * speed up verification
> */
> *pprev = sl->next;
> - if (sl->state.frame[0]->regs[0].live & REG_LIVE_DONE) {
> + if (sl->state.frame[0]->regs[0].live & REG_LIVE_DONE &&
> + !sl->state.used_as_loop_entry) {
In get_loop_entry() you have an additional `topmost !=
topmost->loop_entry` check, suggesting that state can be its own
loop_entry. Can that happen? And if yes, should we account for that
here?
> u32 br = sl->state.branches;
>
> WARN_ONCE(br,
> --
> 2.42.0
>
^ permalink raw reply [flat|nested] 18+ messages in thread* Re: [PATCH bpf-next v2 5/7] bpf: correct loop detection for iterators convergence
2023-10-22 4:28 ` Andrii Nakryiko
@ 2023-10-23 14:47 ` Eduard Zingerman
2023-10-23 16:16 ` Andrii Nakryiko
0 siblings, 1 reply; 18+ messages in thread
From: Eduard Zingerman @ 2023-10-23 14:47 UTC (permalink / raw)
To: Andrii Nakryiko
Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
memxor, awerner32, john.fastabend, Alexei Starovoitov
On Sat, 2023-10-21 at 21:28 -0700, Andrii Nakryiko wrote:
> On Sat, Oct 21, 2023 at 6:08 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> >
> > It turns out that .branches > 0 in is_state_visited() is not a
> > sufficient condition to identify if two verifier states form a loop
> > when iterators convergence is computed. This commit adds logic to
> > distinguish situations like below:
> >
> > (I) initial (II) initial
> > | |
> > V V
> > .---------> hdr ..
> > | | |
> > | V V
> > | .------... .------..
> > | | | | |
> > | V V V V
> > | ... ... .-> hdr ..
> > | | | | | |
> > | V V | V V
> > | succ <- cur | succ <- cur
> > | | | |
> > | V | V
> > | ... | ...
> > | | | |
> > '----' '----'
> >
> > For both (I) and (II) successor 'succ' of the current state 'cur' was
> > previously explored and has branches count at 0. However, loop entry
> > 'hdr' corresponding to 'succ' might be a part of current DFS path.
> > If that is the case 'succ' and 'cur' are members of the same loop
> > and have to be compared exactly.
> >
> > Co-developed-by: Andrii Nakryiko <andrii.nakryiko@gmail.com>
> > Co-developed-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
> > Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
> > ---
> > include/linux/bpf_verifier.h | 15 +++
> > kernel/bpf/verifier.c | 207 ++++++++++++++++++++++++++++++++++-
> > 2 files changed, 218 insertions(+), 4 deletions(-)
> >
>
> LGTM, but see the note below about state being its own loop_entry. It
> feels like a bit better approach would be to use "loop_entry_ref_cnt"
> instead of just a bool used_as_loop_entry, and do a proper accounting
> when child state is freed and when propagating loop_entries. But
> perhaps that can be done in a follow up, if you think it's a good
> idea.
I thought about reference counting but decided to use a flag instead
because it's a bit simpler. In any case the full mechanism is
opportunistic and having a few stale states shouldn't be a big deal,
those would be freed when syscall exits.
I'll make a ref_cnt version and send it as a follow-up, so we can decide
looking at the code whether to pick it or drop it.
>
> Reviewed-by: Andrii Nakryiko <andrii@kernel.org>
>
> > diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> > index 38b788228594..24213a99cc79 100644
> > --- a/include/linux/bpf_verifier.h
> > +++ b/include/linux/bpf_verifier.h
>
[...]
> > @@ -16825,7 +17023,8 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
> > * speed up verification
> > */
> > *pprev = sl->next;
> > - if (sl->state.frame[0]->regs[0].live & REG_LIVE_DONE) {
> > + if (sl->state.frame[0]->regs[0].live & REG_LIVE_DONE &&
> > + !sl->state.used_as_loop_entry) {
>
> In get_loop_entry() you have an additional `topmost !=
> topmost->loop_entry` check, suggesting that state can be its own
> loop_entry. Can that happen?
It can, e.g. in the following loop:
loop: r1 = r10;
r1 += -8;
--- checkpoint here ---
call %[bpf_iter_num_next];
goto loop;
> And if yes, should we account for that here?
With a flag I don't think we need to account for it here because it's a
best-effort thing anyway. (E.g. it misses the situation when something
was marked as a loop entry initially, then an entry higher in the DFS chain
was found). With reference counting -- yes, it would be more important.
^ permalink raw reply [flat|nested] 18+ messages in thread* Re: [PATCH bpf-next v2 5/7] bpf: correct loop detection for iterators convergence
2023-10-23 14:47 ` Eduard Zingerman
@ 2023-10-23 16:16 ` Andrii Nakryiko
0 siblings, 0 replies; 18+ messages in thread
From: Andrii Nakryiko @ 2023-10-23 16:16 UTC (permalink / raw)
To: Eduard Zingerman
Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
memxor, awerner32, john.fastabend, Alexei Starovoitov
On Mon, Oct 23, 2023 at 7:47 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Sat, 2023-10-21 at 21:28 -0700, Andrii Nakryiko wrote:
> > On Sat, Oct 21, 2023 at 6:08 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > >
> > > It turns out that .branches > 0 in is_state_visited() is not a
> > > sufficient condition to identify if two verifier states form a loop
> > > when iterators convergence is computed. This commit adds logic to
> > > distinguish situations like below:
> > >
> > > (I) initial (II) initial
> > > | |
> > > V V
> > > .---------> hdr ..
> > > | | |
> > > | V V
> > > | .------... .------..
> > > | | | | |
> > > | V V V V
> > > | ... ... .-> hdr ..
> > > | | | | | |
> > > | V V | V V
> > > | succ <- cur | succ <- cur
> > > | | | |
> > > | V | V
> > > | ... | ...
> > > | | | |
> > > '----' '----'
> > >
> > > For both (I) and (II) successor 'succ' of the current state 'cur' was
> > > previously explored and has branches count at 0. However, loop entry
> > > 'hdr' corresponding to 'succ' might be a part of current DFS path.
> > > If that is the case 'succ' and 'cur' are members of the same loop
> > > and have to be compared exactly.
> > >
> > > Co-developed-by: Andrii Nakryiko <andrii.nakryiko@gmail.com>
> > > Co-developed-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
> > > Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
> > > ---
> > > include/linux/bpf_verifier.h | 15 +++
> > > kernel/bpf/verifier.c | 207 ++++++++++++++++++++++++++++++++++-
> > > 2 files changed, 218 insertions(+), 4 deletions(-)
> > >
> >
> > LGTM, but see the note below about state being its own loop_entry. It
> > feels like a bit better approach would be to use "loop_entry_ref_cnt"
> > instead of just a bool used_as_loop_entry, and do a proper accounting
> > when child state is freed and when propagating loop_entries. But
> > perhaps that can be done in a follow up, if you think it's a good
> > idea.
>
> I thought about reference counting but decided to use a flag instead
> because it's a bit simpler. In any case the full mechanism is
> opportunistic and having a few stale states shouldn't be a big deal,
> those would be freed when syscall exits.
> I'll make a ref_cnt version and send it as a follow-up, so we can decide
> looking at the code whether to pick it or drop it.
>
> >
> > Reviewed-by: Andrii Nakryiko <andrii@kernel.org>
> >
> > > diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> > > index 38b788228594..24213a99cc79 100644
> > > --- a/include/linux/bpf_verifier.h
> > > +++ b/include/linux/bpf_verifier.h
> >
> [...]
> > > @@ -16825,7 +17023,8 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
> > > * speed up verification
> > > */
> > > *pprev = sl->next;
> > > - if (sl->state.frame[0]->regs[0].live & REG_LIVE_DONE) {
> > > + if (sl->state.frame[0]->regs[0].live & REG_LIVE_DONE &&
> > > + !sl->state.used_as_loop_entry) {
> >
> > In get_loop_entry() you have an additional `topmost !=
> > topmost->loop_entry` check, suggesting that state can be its own
> > loop_entry. Can that happen?
>
> It can, e.g. in the following loop:
>
> loop: r1 = r10;
> r1 += -8;
> --- checkpoint here ---
> call %[bpf_iter_num_next];
> goto loop;
>
>
> > And if yes, should we account for that here?
>
> With a flag I don't think we need to account for it here because it's a
> best-effort thing anyway. (E.g. it misses the situation when something
> was marked as a loop entry initially, then an entry higher in the DFS chain
> was found). With reference counting -- yes, it would be more important.
>
OK, no big deal, I wanted to make sure we won't leak some states, if
they are their own loop_entry.
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH bpf-next v2 6/7] selftests/bpf: test if state loops are detected in a tricky case
2023-10-22 1:08 [PATCH bpf-next v2 0/7] exact states comparison for iterator convergence checks Eduard Zingerman
` (4 preceding siblings ...)
2023-10-22 1:08 ` [PATCH bpf-next v2 5/7] bpf: correct loop detection for iterators convergence Eduard Zingerman
@ 2023-10-22 1:08 ` Eduard Zingerman
2023-10-22 3:11 ` kernel test robot
2023-10-22 1:08 ` [PATCH bpf-next v2 7/7] bpf: print full verifier states on infinite loop detection Eduard Zingerman
2023-10-23 17:17 ` [PATCH bpf-next v2 0/7] exact states comparison for iterator convergence checks Eduard Zingerman
7 siblings, 1 reply; 18+ messages in thread
From: Eduard Zingerman @ 2023-10-22 1:08 UTC (permalink / raw)
To: bpf, ast
Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song, memxor,
awerner32, john.fastabend, Eduard Zingerman
A convoluted test case for iterators convergence logic that
demonstrates that states with branch count equal to 0 might still be
a part of a not completely explored loop.
E.g. consider the following state diagram:
initial Here state 'succ' was processed first,
| it was eventually tracked to produce a
V state identical to 'hdr'.
.---------> hdr All branches from 'succ' had been explored
| | and thus 'succ' has its .branches == 0.
| V
| .------... Suppose states 'cur' and 'succ' correspond
| | | to the same instruction + callsites.
| V V In such case it is necessary to check
| ... ... whether 'succ' and 'cur' are identical.
| | | If 'succ' and 'cur' are a part of the same loop
| V V they have to be compared exactly.
| succ <- cur
| |
| V
| ...
| |
'----'
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
tools/testing/selftests/bpf/progs/iters.c | 177 ++++++++++++++++++++++
1 file changed, 177 insertions(+)
diff --git a/tools/testing/selftests/bpf/progs/iters.c b/tools/testing/selftests/bpf/progs/iters.c
index ee85cc6d3444..89aaddec9a6d 100644
--- a/tools/testing/selftests/bpf/progs/iters.c
+++ b/tools/testing/selftests/bpf/progs/iters.c
@@ -998,6 +998,183 @@ __naked int loop_state_deps1(void)
);
}
+SEC("?raw_tp")
+__failure
+__msg("math between fp pointer and register with unbounded")
+__flag(BPF_F_TEST_STATE_FREQ)
+__naked int loop_state_deps2(void)
+{
+ /* This is equivalent to C program below.
+ *
+ * The case turns out to be tricky in a sense that:
+ * - states with read+precise mark on c are explored only on a second
+ * iteration of the first inner loop and in a state which is pushed to
+ * states stack first.
+ * - states with c=-25 are explored only on a second iteration of the
+ * second inner loop and in a state which is pushed to states stack
+ * first.
+ *
+ * Depending on the details of iterator convergence logic
+ * verifier might stop states traversal too early and miss
+ * unsafe c=-25 memory access.
+ *
+ * j = iter_new(); // fp[-16]
+ * a = 0; // r6
+ * b = 0; // r7
+ * c = -24; // r8
+ * while (iter_next(j)) {
+ * i = iter_new(); // fp[-8]
+ * a = 0; // r6
+ * b = 0; // r7
+ * while (iter_next(i)) {
+ * if (a == 1) {
+ * a = 0;
+ * b = 1;
+ * } else if (a == 0) {
+ * a = 1;
+ * if (random() == 42)
+ * continue;
+ * if (b == 1) {
+ * *(r10 + c) = 7; // this is not safe
+ * iter_destroy(i);
+ * iter_destroy(j);
+ * return;
+ * }
+ * }
+ * }
+ * iter_destroy(i);
+ * i = iter_new(); // fp[-8]
+ * a = 0; // r6
+ * b = 0; // r7
+ * while (iter_next(i)) {
+ * if (a == 1) {
+ * a = 0;
+ * b = 1;
+ * } else if (a == 0) {
+ * a = 1;
+ * if (random() == 42)
+ * continue;
+ * if (b == 1) {
+ * a = 0;
+ * c = -25;
+ * }
+ * }
+ * }
+ * iter_destroy(i);
+ * }
+ * iter_destroy(j);
+ * return;
+ */
+ asm volatile (
+ "r1 = r10;"
+ "r1 += -16;"
+ "r2 = 0;"
+ "r3 = 10;"
+ "call %[bpf_iter_num_new];"
+ "r6 = 0;"
+ "r7 = 0;"
+ "r8 = -24;"
+ "j_loop_%=:"
+ "r1 = r10;"
+ "r1 += -16;"
+ "call %[bpf_iter_num_next];"
+ "if r0 == 0 goto j_loop_end_%=;"
+
+ /* first inner loop */
+ "r1 = r10;"
+ "r1 += -8;"
+ "r2 = 0;"
+ "r3 = 10;"
+ "call %[bpf_iter_num_new];"
+ "r6 = 0;"
+ "r7 = 0;"
+ "i_loop_%=:"
+ "r1 = r10;"
+ "r1 += -8;"
+ "call %[bpf_iter_num_next];"
+ "if r0 == 0 goto i_loop_end_%=;"
+ "check_one_r6_%=:"
+ "if r6 != 1 goto check_zero_r6_%=;"
+ "r6 = 0;"
+ "r7 = 1;"
+ "goto i_loop_%=;"
+ "check_zero_r6_%=:"
+ "if r6 != 0 goto i_loop_%=;"
+ "r6 = 1;"
+ "call %[bpf_get_prandom_u32];"
+ "if r0 != 42 goto check_one_r7_%=;"
+ "goto i_loop_%=;"
+ "check_one_r7_%=:"
+ "if r7 != 1 goto i_loop_%=;"
+ "r0 = r10;"
+ "r0 += r8;"
+ "r1 = 7;"
+ "*(u64 *)(r0 + 0) = r1;"
+ "r1 = r10;"
+ "r1 += -8;"
+ "call %[bpf_iter_num_destroy];"
+ "r1 = r10;"
+ "r1 += -16;"
+ "call %[bpf_iter_num_destroy];"
+ "r0 = 0;"
+ "exit;"
+ "i_loop_end_%=:"
+ "r1 = r10;"
+ "r1 += -8;"
+ "call %[bpf_iter_num_destroy];"
+
+ /* second inner loop */
+ "r1 = r10;"
+ "r1 += -8;"
+ "r2 = 0;"
+ "r3 = 10;"
+ "call %[bpf_iter_num_new];"
+ "r6 = 0;"
+ "r7 = 0;"
+ "i2_loop_%=:"
+ "r1 = r10;"
+ "r1 += -8;"
+ "call %[bpf_iter_num_next];"
+ "if r0 == 0 goto i2_loop_end_%=;"
+ "check2_one_r6_%=:"
+ "if r6 != 1 goto check2_zero_r6_%=;"
+ "r6 = 0;"
+ "r7 = 1;"
+ "goto i2_loop_%=;"
+ "check2_zero_r6_%=:"
+ "if r6 != 0 goto i2_loop_%=;"
+ "r6 = 1;"
+ "call %[bpf_get_prandom_u32];"
+ "if r0 != 42 goto check2_one_r7_%=;"
+ "goto i2_loop_%=;"
+ "check2_one_r7_%=:"
+ "if r7 != 1 goto i2_loop_%=;"
+ "r6 = 0;"
+ "r8 = -25;"
+ "goto i2_loop_%=;"
+ "i2_loop_end_%=:"
+ "r1 = r10;"
+ "r1 += -8;"
+ "call %[bpf_iter_num_destroy];"
+
+ "r6 = 0;"
+ "r7 = 0;"
+ "goto j_loop_%=;"
+ "j_loop_end_%=:"
+ "r1 = r10;"
+ "r1 += -16;"
+ "call %[bpf_iter_num_destroy];"
+ "r0 = 0;"
+ "exit;"
+ :
+ : __imm(bpf_get_prandom_u32),
+ __imm(bpf_iter_num_new),
+ __imm(bpf_iter_num_next),
+ __imm(bpf_iter_num_destroy)
+ : __clobber_all
+ );
+}
+
SEC("?raw_tp")
__success
__naked int triple_continue(void)
--
2.42.0
^ permalink raw reply related [flat|nested] 18+ messages in thread* [PATCH bpf-next v2 7/7] bpf: print full verifier states on infinite loop detection
2023-10-22 1:08 [PATCH bpf-next v2 0/7] exact states comparison for iterator convergence checks Eduard Zingerman
` (5 preceding siblings ...)
2023-10-22 1:08 ` [PATCH bpf-next v2 6/7] selftests/bpf: test if state loops are detected in a tricky case Eduard Zingerman
@ 2023-10-22 1:08 ` Eduard Zingerman
2023-10-22 4:28 ` Andrii Nakryiko
2023-10-23 17:17 ` [PATCH bpf-next v2 0/7] exact states comparison for iterator convergence checks Eduard Zingerman
7 siblings, 1 reply; 18+ messages in thread
From: Eduard Zingerman @ 2023-10-22 1:08 UTC (permalink / raw)
To: bpf, ast
Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song, memxor,
awerner32, john.fastabend, Eduard Zingerman
Additional logging in is_state_visited(): if infinite loop is detected
print full verifier state for both current and equivalent states.
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
kernel/bpf/verifier.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index baf31b61b3ff..a91aa8638dba 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -16927,6 +16927,10 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
!iter_active_depths_differ(&sl->state, cur)) {
verbose_linfo(env, insn_idx, "; ");
verbose(env, "infinite loop detected at insn %d\n", insn_idx);
+ verbose(env, "cur state:");
+ print_verifier_state(env, cur->frame[cur->curframe], true);
+ verbose(env, "old state:");
+ print_verifier_state(env, sl->state.frame[cur->curframe], true);
return -EINVAL;
}
/* if the verifier is processing a loop, avoid adding new state
--
2.42.0
^ permalink raw reply related [flat|nested] 18+ messages in thread* Re: [PATCH bpf-next v2 7/7] bpf: print full verifier states on infinite loop detection
2023-10-22 1:08 ` [PATCH bpf-next v2 7/7] bpf: print full verifier states on infinite loop detection Eduard Zingerman
@ 2023-10-22 4:28 ` Andrii Nakryiko
0 siblings, 0 replies; 18+ messages in thread
From: Andrii Nakryiko @ 2023-10-22 4:28 UTC (permalink / raw)
To: Eduard Zingerman
Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song,
memxor, awerner32, john.fastabend
On Sat, Oct 21, 2023 at 6:08 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> Additional logging in is_state_visited(): if infinite loop is detected
> print full verifier state for both current and equivalent states.
>
> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
> ---
> kernel/bpf/verifier.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
Great, thanks!
Acked-by: Andrii Nakryiko <andrii@kernel.org>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index baf31b61b3ff..a91aa8638dba 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -16927,6 +16927,10 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
> !iter_active_depths_differ(&sl->state, cur)) {
> verbose_linfo(env, insn_idx, "; ");
> verbose(env, "infinite loop detected at insn %d\n", insn_idx);
> + verbose(env, "cur state:");
> + print_verifier_state(env, cur->frame[cur->curframe], true);
> + verbose(env, "old state:");
> + print_verifier_state(env, sl->state.frame[cur->curframe], true);
> return -EINVAL;
> }
> /* if the verifier is processing a loop, avoid adding new state
> --
> 2.42.0
>
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH bpf-next v2 0/7] exact states comparison for iterator convergence checks
2023-10-22 1:08 [PATCH bpf-next v2 0/7] exact states comparison for iterator convergence checks Eduard Zingerman
` (6 preceding siblings ...)
2023-10-22 1:08 ` [PATCH bpf-next v2 7/7] bpf: print full verifier states on infinite loop detection Eduard Zingerman
@ 2023-10-23 17:17 ` Eduard Zingerman
2023-10-23 21:40 ` Eduard Zingerman
7 siblings, 1 reply; 18+ messages in thread
From: Eduard Zingerman @ 2023-10-23 17:17 UTC (permalink / raw)
To: bpf, ast
Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song, memxor,
awerner32, john.fastabend
On Sun, 2023-10-22 at 04:08 +0300, Eduard Zingerman wrote:
[...]
> Changelog:
> V1 -> V2 [2], applied changes suggested by Alexei offlist:
> - __explored_state() function removed;
> - same_callsites() function is now used in clean_live_states();
> - patches #1,2 are added as preparatory code movement;
> - in process_iter_next_call() a safeguard is added to verify that
> cur_st->parent exists and has expected insn index / call sites.
I have V3 ready and passing CI.
However I checked on Alexei's concerns regarding performance on
explored states cache miss and verifier does not behave well with this
patch-set. For example, the program at the end of the email causes
verifier to "hang" (loop inside is_state_visited() to no end).
There are several options to fix this:
(a) limit total iteration depth, as in [1], the limit would have to be
at-least 1000 to make iters/task_vma pass;
(b) limit maximal number of checkpoint states associated with
instruction and evict those with lowest dfs_depth;
(c) choose bigger constants in `sl->miss_cnt > sl->hit_cnt * 3 + 3` for
checkpoint states.
Given that current failure mode is bad, should I submit V3 as-is or
should I explore options (b,c) and add one of those to V3?
[1] https://github.com/eddyz87/bpf/tree/bpf-iter-next-exact-widening-max-iter-depth
---
SEC("?raw_tp")
__failure
__naked int max_iter_depth(void)
{
/* This is equivalent to C program below.
* The counter stored in r6 is used as precise after the loop,
* thus preventing widening. Verifier won't be able to conclude
* that such program terminates but it should gracefully exit.
*
* r6 = 0
* bpf_iter_num_new(&fp[-8], 0, 10)
* while (bpf_iter_num_next(&fp[-8])) {
* r6 += 1;
* }
* bpf_iter_num_destroy(&fp[-8])
* ... force r6 precise ...
* return 0
*/
asm volatile (
"r6 = 0;"
"r1 = r10;"
"r1 += -8;"
"r2 = 0;"
"r3 = 10;"
"call %[bpf_iter_num_new];"
"loop_%=:"
"r1 = r10;"
"r1 += -8;"
"call %[bpf_iter_num_next];"
"if r0 == 0 goto loop_end_%=;"
"r6 += 1;"
"goto loop_%=;"
"loop_end_%=:"
"r1 = r10;"
"r1 += -8;"
"call %[bpf_iter_num_destroy];"
"r0 = r10;"
"r0 += r6;" /* this forces r6 to be precise */
"r0 = 0;"
"exit;"
:
: __imm(bpf_iter_num_new),
__imm(bpf_iter_num_next),
__imm(bpf_iter_num_destroy)
: __clobber_all
);
}
^ permalink raw reply [flat|nested] 18+ messages in thread* Re: [PATCH bpf-next v2 0/7] exact states comparison for iterator convergence checks
2023-10-23 17:17 ` [PATCH bpf-next v2 0/7] exact states comparison for iterator convergence checks Eduard Zingerman
@ 2023-10-23 21:40 ` Eduard Zingerman
0 siblings, 0 replies; 18+ messages in thread
From: Eduard Zingerman @ 2023-10-23 21:40 UTC (permalink / raw)
To: bpf, ast
Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song, memxor,
awerner32, john.fastabend
On Mon, 2023-10-23 at 20:17 +0300, Eduard Zingerman wrote:
> On Sun, 2023-10-22 at 04:08 +0300, Eduard Zingerman wrote:
> [...]
> > Changelog:
> > V1 -> V2 [2], applied changes suggested by Alexei offlist:
> > - __explored_state() function removed;
> > - same_callsites() function is now used in clean_live_states();
> > - patches #1,2 are added as preparatory code movement;
> > - in process_iter_next_call() a safeguard is added to verify that
> > cur_st->parent exists and has expected insn index / call sites.
>
> I have V3 ready and passing CI.
>
> However I checked on Alexei's concerns regarding performance on
> explored states cache miss and verifier does not behave well with this
> patch-set. For example, the program at the end of the email causes
> verifier to "hang" (loop inside is_state_visited() to no end).
>
> There are several options to fix this:
> (a) limit total iteration depth, as in [1], the limit would have to be
> at-least 1000 to make iters/task_vma pass;
> (b) limit maximal number of checkpoint states associated with
> instruction and evict those with lowest dfs_depth;
> (c) choose bigger constants in `sl->miss_cnt > sl->hit_cnt * 3 + 3` for
> checkpoint states.
I played a bit with constants in 'eviction on miss' formula using [1] (option c).
There are three relevant tests:
- iters/max_iter_depth: should report load failure within a reasonable time;
- iters/checkpoint_states_deletion: should pass;
- verif_scale_pyperf600_iter: should pass.
I think iters/checkpoint_states_deletion represents the worst case scenario,
because depending on number of variables N, it produces 2**N distinct states.
The formula for eviction that does not lose relevant states is:
sl->miss_cnt > sl->hit_cnt * 2**N + 2**N
(because states start to repeat after 2**N iterations).
W/o eviction for checkpoint states maximal number of variables
verifier could handle in this test is 9, with reported 958,883 insns processed.
Which corresponds to formula (sl->miss_cnt > sl->hit_cnt * 512 + 512).
Using these values I get the following execution times:
| test | time ./test_progs -a <test> |
|----------------------------+-----------------------------|
| verif_scale_pyperf600_iter | 0.2s |
| checkpoint_states_deletion | 5.8s |
| max_iter_depth | 23.9s |
Going one step lower to 8 variables (and 256 as a constant),
checkpoint_states_deletion takes 248,133 insns to complete
and timings table looks as follows:
| test | time ./test_progs -a <test> |
|----------------------------+-----------------------------|
| verif_scale_pyperf600_iter | 0.2s |
| checkpoint_states_deletion | 1.0s |
| max_iter_depth | 15.2s |
So, it is possible to get speedup for worst case scenario by leaving
some instruction budget on the table.
IMO using formula (sl->miss_cnt > sl->hit_cnt * 512 + 512) to evict
checkpoints kind-of sort-of makes sense but is very hacky.
(Or 256).
I think that better solution would be to go for option (b) from a
previous email:
- evict old checkpoints based on dfs_depth;
- also use a secondary hash-table for checkpoints and hash not only
insn_idx but also some fingerprint of register states, thus avoiding
long state list walks.
But that would require some additional coding and I know that Alexei
wants to land this thing sooner than later.
[1] https://github.com/eddyz87/bpf/tree/iter-exact-eviction-formula-experiments
^ permalink raw reply [flat|nested] 18+ messages in thread