Re: [PATCH bpf-next v1 3/4] bpf: use SCC info instead of loop_entry

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Eduard Zingerman <eddyz87@gmail.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: bpf <bpf@vger.kernel.org>,  Alexei Starovoitov <ast@kernel.org>,
	 Andrii Nakryiko <andrii@kernel.org>,
	 Daniel Borkmann <daniel@iogearbox.net>,
	Martin KaFai Lau <martin.lau@linux.dev>,
	 Kernel Team <kernel-team@fb.com>,
	 Yonghong Song <yonghong.song@linux.dev>
Subject: Re: [PATCH bpf-next v1 3/4] bpf: use SCC info instead of loop_entry
Date: Mon, 28 Apr 2025 12:26:25 -0700	[thread overview]
Message-ID: <m2plgwjdem.fsf@gmail.com> (raw)
In-Reply-To: <CAADnVQK1tP1_of=pn7HdeZNqmPu=4AqpRETeOVeQMjDfSt0NOw@mail.gmail.com> (Alexei Starovoitov's message of "Mon, 28 Apr 2025 10:31:20 -0700")

Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:

[...]

>> @@ -466,11 +456,10 @@ struct bpf_verifier_state {
>>         u32 dfs_depth;
>>         u32 callback_unroll_depth;
>>         u32 may_goto_depth;
>> -       /* If this state was ever pointed-to by other state's loop_entry field
>> -        * this flag would be set to true. Used to avoid freeing such states
>> -        * while they are still in use.
>> +       /* If this state is a checkpoint at insn_idx that belongs to an SCC,
>> +        * record the SCC epoch at the time of checkpoint creation.
>>          */
>
> Please use normal kernel comment style for all new code:
> /*
>  * multi-
>  * line
>  * comment
>  */
>

Ack.

>> -       u32 used_as_loop_entry;
>> +       u32 scc_epoch;
>>  };
>>
>>  #define bpf_get_spilled_reg(slot, frame, mask)                         \
>> @@ -717,6 +706,29 @@ struct bpf_idset {
>>         u32 ids[BPF_ID_MAP_SIZE];
>>  };
>>
>> +/* Information tracked for CFG strongly connected components */
>> +struct bpf_scc_info {
>> +       /* True if states_equal(... RANGE_WITHIN) ever returned
>> +        * true for a state with insn_idx within this SCC.
>> +        * E.g. for iterator next call.
>
> I feel RANGE_WITHIN is unnecessary information here.
> Maybe reword it as:
> Set to true when is_state_visited() detected convergence of open coded iterator.
> ?

Technically this could also happen for callbacks, e.g. for bpf_loop().
Will reword as you suggest, should be clear enough for a few people
working with this code.

[...]

>> @@ -18222,15 +18129,17 @@ static void clean_func_state(struct bpf_verifier_env *env,
>>         }
>>  }
>>
>> +static bool verifier_state_cleaned(struct bpf_verifier_state *st)
>> +{
>> +       /* all regs in this state in all frames were already marked */
>> +       return st->frame[0]->regs[0].live & REG_LIVE_DONE;
>> +}
>
> move refactor into pre-patch.
>

Ack.

[...]

>> @@ -18243,6 +18152,114 @@ static u32 frame_insn_idx(struct bpf_verifier_state *st, u32 frame)
>>                : st->frame[frame + 1]->callsite;
>>  }
>>
>> +/* Open coded iterators introduce loops in the verifier state graph.
>> + * State graph loops can result in incomplete read and precision marks
>> + * on individual states. E.g. consider the following states graph:
>> + *
>> + *  .-> A --.  Assume the states are visited in the order A, B, C.
>> + *  |   |   |  Assume that state B reaches a state equivalent to state A.
>> + *  |   v   v  At this point, state C has not been processed yet,
>> + *  '-- B   C  so state A does not have any read or precision marks from C yet.
>> + *             As a result, these marks won't be propagated to B.
>> + *
>> + * If the marks on B are incomplete, it would be unsafe to use it in
>> + * states_equal() checks.
>
> earlier RANGE_WITHIN distinction was unnecessary,
> but here we should clarify that
> states_equal(NOT_EXACT) is unsafe,
> while states_equal(RANGE_WITHIN) is fine.
> Right?
>

Ack.

>> + *
>> + * To avoid this safety issue, and since states with incomplete read
>> + * marks can only occur within control flow graph loops, the verifier
>> + * assumes that any state with bpf_verifier_state->insn_idx residing
>> + * in a strongly connected component (SCC) has read and precision
>> + * marks for all registers. This assumption is enforced by the
>> + * function mark_all_regs_read_and_precise(), which assigns
>> + * corresponding marks.
>> + *
>> + * An intuitive point to call mark_all_regs_read_and_precise() would
>> + * be when a new state is created in is_state_visited().
>> + * However, doing so would interfere with widen_imprecise_scalars(),
>> + * which widens scalars in the current state after checking registers in a
>> + * parent state. Registers are not widened if they are marked as precise
>> + * in the parent state.
>> + *
>> + * To avoid interfering with widening logic,
>> + * a call to mark_all_regs_read_and_precise() for state is postponed
>> + * until no widening is possible in any descendant of state S.
>> + *
>> + * Another intuitive spot to call mark_all_regs_read_and_precise()
>> + * would be in update_branch_counts() when S's branches counter
>> + * reaches 0. However, this falls short in the following case:
>> + *
>> + *     sum = 0
>> + *     bpf_repeat(10) {                              // a
>> + *             if (unlikely(bpf_get_prandom_u32()))  // b
>> + *                     sum += 1;
>> + *             if (bpf_get_prandom_u32())            // c
>> + *                     asm volatile ("");
>> + *             asm volatile ("goto +0;");            // d
>> + *     }
>> + *
>> + * Here a checkpoint is created at (d) with {sum=0} and the branch counter
>> + * for (d) reaches 0, so 'sum' would be marked precise.
>> + * When second branch of (c) reaches (d), checkpoint would be hit,
>> + * and the precision mark for 'sum' propagated to (a).
>> + * When the second branch of (b) reaches (a), the state would be {sum=1},
>> + * no widening would occur, causing verification to continue forever.
>> + *
>> + * To avoid such premature precision markings, the verifier postpones
>> + * the call to mark_all_regs_read_and_precise() for state S even further.
>> + * Suppose state P is a [grand]parent of state S and is the first state
>> + * in the current state chain with state->insn_idx within current SCC.
>> + * mark_all_regs_read_and_precise() for state S is only called once P
>> + * is fully explored.
>> + *
>> + * The struct 'bpf_scc_info' is used to track this condition:
>> + * - bpf_scc_info->branches counts how many states currently
>> + *   in env->cur_state or env->head originate from this SCC;
>
> I'm still struggling with this definition of scc_info->branches
> and this extra 'enter':
>
>>         cur->insn_hist_start = cur->insn_hist_end;
>>         cur->dfs_depth = new->dfs_depth + 1;
>>         list_add(&new_sl->node, head);
>> +       parent_scc_enter(env, env->cur_state);
>
> since we don't do it for st->branches.
>
> Can we make scc->branches symmetrical to st->branches ?
> Only push_stack() will do scc_enter(),
> and scc_exit() in update_branch_counts()
> even if st->branches > 0.
>
> If that's not possible let's pick a different name for scc_info->branches
> to make it clear that the logic is quite different to st->branches.
>

It is simmetrical in a way.

Operations on bpf_verifier_state->branches:
- do_check_common() initializes env->cur_state->branches == 1
  (and this is maintained as invariant);
- push_stack() does env->cur_state->parent->branches++;
- update_branch_counts() does env->cur_state->parent->branches--
  (and continues recursively).

Operations on bpf_scc_info->branches:
- is_state_visited() does insn_scc(env->cur_state->parent->insn_idx)->branches++
- push_stack() does insn_scc(env->cur_state->parent->insn_idx)->branches++;
- update_branch_counts() does
  insn_scc(env->cur_state->parent->insn_idx)->branches--
  (and continues recursively);

The main difference is that bpf_verifier_state->branches is initialized to 1,
while bpf_scc_info->branches is initialized to 0.
Hence a call to parent_scc_enter() in is_state_visited().

>> + * - bpf_scc_info->scc_epoch counts how many times 'branches'
>> + *   has reached zero;
>> + * - bpf_verifier_state->scc_epoch records the epoch of the SCC
>> + *   corresponding to bpf_verifier_state->insn_idx at the moment
>> + *   of state creation.
>> + *
>> + * Functions parent_scc_enter() and parent_scc_exit() maintain the
>> + * bpf_scc_info->{branches,scc_epoch} counters.
>> + *
>> + * bpf_scc_info->branches reaching zero indicates that state P is
>> + * fully explored. Its descendants residing in the same SCC have
>> + * state->scc_epoch == scc_info->scc_epoch. parent_scc_exit()
>> + * increments scc_info->scc_epoch, allowing clean_live_states() to
>> + * detect these states and apply mark_all_regs_read_and_precise().
>> + */

next prev parent reply	other threads:[~2025-04-28 19:26 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-04-26 10:46 [PATCH bpf-next v1 0/4] bpf: compute SCC for BPF program control flow graph Eduard Zingerman
2025-04-26 10:46 ` [PATCH bpf-next v1 1/4] bpf: compute SCCs in " Eduard Zingerman
2025-04-26 10:46 ` [PATCH bpf-next v1 2/4] bpf: frame_insn_idx() utility function Eduard Zingerman
2025-04-26 10:46 ` [PATCH bpf-next v1 3/4] bpf: use SCC info instead of loop_entry Eduard Zingerman
2025-04-26 14:45   ` kernel test robot
2025-04-26 19:51     ` Eduard Zingerman
2025-04-28 17:31   ` Alexei Starovoitov
2025-04-28 19:26     ` Eduard Zingerman [this message]
2025-04-28 20:25       ` Alexei Starovoitov
2025-04-28 21:00         ` Eduard Zingerman
2025-04-26 10:46 ` [PATCH bpf-next v1 4/4] selftests/bpf: tests with a loop state missing read/precision mark Eduard Zingerman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=m2plgwjdem.fsf@gmail.com \
    --to=eddyz87@gmail.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=kernel-team@fb.com \
    --cc=martin.lau@linux.dev \
    --cc=yonghong.song@linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.