BPF List
 help / color / mirror / Atom feed
From: Eduard Zingerman <eddyz87@gmail.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: bpf <bpf@vger.kernel.org>,  Alexei Starovoitov <ast@kernel.org>,
	 Andrii Nakryiko <andrii@kernel.org>,
	 Daniel Borkmann <daniel@iogearbox.net>,
	Martin KaFai Lau <martin.lau@linux.dev>,
	 Kernel Team <kernel-team@fb.com>,
	 Yonghong Song <yonghong.song@linux.dev>
Subject: Re: [PATCH bpf-next v1 3/4] bpf: use SCC info instead of loop_entry
Date: Mon, 28 Apr 2025 12:26:25 -0700	[thread overview]
Message-ID: <m2plgwjdem.fsf@gmail.com> (raw)
In-Reply-To: <CAADnVQK1tP1_of=pn7HdeZNqmPu=4AqpRETeOVeQMjDfSt0NOw@mail.gmail.com> (Alexei Starovoitov's message of "Mon, 28 Apr 2025 10:31:20 -0700")

Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:

[...]

>> @@ -466,11 +456,10 @@ struct bpf_verifier_state {
>>         u32 dfs_depth;
>>         u32 callback_unroll_depth;
>>         u32 may_goto_depth;
>> -       /* If this state was ever pointed-to by other state's loop_entry field
>> -        * this flag would be set to true. Used to avoid freeing such states
>> -        * while they are still in use.
>> +       /* If this state is a checkpoint at insn_idx that belongs to an SCC,
>> +        * record the SCC epoch at the time of checkpoint creation.
>>          */
>
> Please use normal kernel comment style for all new code:
> /*
>  * multi-
>  * line
>  * comment
>  */
>

Ack.

>> -       u32 used_as_loop_entry;
>> +       u32 scc_epoch;
>>  };
>>
>>  #define bpf_get_spilled_reg(slot, frame, mask)                         \
>> @@ -717,6 +706,29 @@ struct bpf_idset {
>>         u32 ids[BPF_ID_MAP_SIZE];
>>  };
>>
>> +/* Information tracked for CFG strongly connected components */
>> +struct bpf_scc_info {
>> +       /* True if states_equal(... RANGE_WITHIN) ever returned
>> +        * true for a state with insn_idx within this SCC.
>> +        * E.g. for iterator next call.
>
> I feel RANGE_WITHIN is unnecessary information here.
> Maybe reword it as:
> Set to true when is_state_visited() detected convergence of open coded iterator.
> ?

Technically this could also happen for callbacks, e.g. for bpf_loop().
Will reword as you suggest, should be clear enough for a few people
working with this code.

[...]

>> @@ -18222,15 +18129,17 @@ static void clean_func_state(struct bpf_verifier_env *env,
>>         }
>>  }
>>
>> +static bool verifier_state_cleaned(struct bpf_verifier_state *st)
>> +{
>> +       /* all regs in this state in all frames were already marked */
>> +       return st->frame[0]->regs[0].live & REG_LIVE_DONE;
>> +}
>
> move refactor into pre-patch.
>

Ack.

[...]

>> @@ -18243,6 +18152,114 @@ static u32 frame_insn_idx(struct bpf_verifier_state *st, u32 frame)
>>                : st->frame[frame + 1]->callsite;
>>  }
>>
>> +/* Open coded iterators introduce loops in the verifier state graph.
>> + * State graph loops can result in incomplete read and precision marks
>> + * on individual states. E.g. consider the following states graph:
>> + *
>> + *  .-> A --.  Assume the states are visited in the order A, B, C.
>> + *  |   |   |  Assume that state B reaches a state equivalent to state A.
>> + *  |   v   v  At this point, state C has not been processed yet,
>> + *  '-- B   C  so state A does not have any read or precision marks from C yet.
>> + *             As a result, these marks won't be propagated to B.
>> + *
>> + * If the marks on B are incomplete, it would be unsafe to use it in
>> + * states_equal() checks.
>
> earlier RANGE_WITHIN distinction was unnecessary,
> but here we should clarify that
> states_equal(NOT_EXACT) is unsafe,
> while states_equal(RANGE_WITHIN) is fine.
> Right?
>

Ack.

>> + *
>> + * To avoid this safety issue, and since states with incomplete read
>> + * marks can only occur within control flow graph loops, the verifier
>> + * assumes that any state with bpf_verifier_state->insn_idx residing
>> + * in a strongly connected component (SCC) has read and precision
>> + * marks for all registers. This assumption is enforced by the
>> + * function mark_all_regs_read_and_precise(), which assigns
>> + * corresponding marks.
>> + *
>> + * An intuitive point to call mark_all_regs_read_and_precise() would
>> + * be when a new state is created in is_state_visited().
>> + * However, doing so would interfere with widen_imprecise_scalars(),
>> + * which widens scalars in the current state after checking registers in a
>> + * parent state. Registers are not widened if they are marked as precise
>> + * in the parent state.
>> + *
>> + * To avoid interfering with widening logic,
>> + * a call to mark_all_regs_read_and_precise() for state is postponed
>> + * until no widening is possible in any descendant of state S.
>> + *
>> + * Another intuitive spot to call mark_all_regs_read_and_precise()
>> + * would be in update_branch_counts() when S's branches counter
>> + * reaches 0. However, this falls short in the following case:
>> + *
>> + *     sum = 0
>> + *     bpf_repeat(10) {                              // a
>> + *             if (unlikely(bpf_get_prandom_u32()))  // b
>> + *                     sum += 1;
>> + *             if (bpf_get_prandom_u32())            // c
>> + *                     asm volatile ("");
>> + *             asm volatile ("goto +0;");            // d
>> + *     }
>> + *
>> + * Here a checkpoint is created at (d) with {sum=0} and the branch counter
>> + * for (d) reaches 0, so 'sum' would be marked precise.
>> + * When second branch of (c) reaches (d), checkpoint would be hit,
>> + * and the precision mark for 'sum' propagated to (a).
>> + * When the second branch of (b) reaches (a), the state would be {sum=1},
>> + * no widening would occur, causing verification to continue forever.
>> + *
>> + * To avoid such premature precision markings, the verifier postpones
>> + * the call to mark_all_regs_read_and_precise() for state S even further.
>> + * Suppose state P is a [grand]parent of state S and is the first state
>> + * in the current state chain with state->insn_idx within current SCC.
>> + * mark_all_regs_read_and_precise() for state S is only called once P
>> + * is fully explored.
>> + *
>> + * The struct 'bpf_scc_info' is used to track this condition:
>> + * - bpf_scc_info->branches counts how many states currently
>> + *   in env->cur_state or env->head originate from this SCC;
>
> I'm still struggling with this definition of scc_info->branches
> and this extra 'enter':
>
>>         cur->insn_hist_start = cur->insn_hist_end;
>>         cur->dfs_depth = new->dfs_depth + 1;
>>         list_add(&new_sl->node, head);
>> +       parent_scc_enter(env, env->cur_state);
>
> since we don't do it for st->branches.
>
> Can we make scc->branches symmetrical to st->branches ?
> Only push_stack() will do scc_enter(),
> and scc_exit() in update_branch_counts()
> even if st->branches > 0.
>
> If that's not possible let's pick a different name for scc_info->branches
> to make it clear that the logic is quite different to st->branches.
>

It is simmetrical in a way.

Operations on bpf_verifier_state->branches:
- do_check_common() initializes env->cur_state->branches == 1
  (and this is maintained as invariant);
- push_stack() does env->cur_state->parent->branches++;
- update_branch_counts() does env->cur_state->parent->branches--
  (and continues recursively).

Operations on bpf_scc_info->branches:
- is_state_visited() does insn_scc(env->cur_state->parent->insn_idx)->branches++
- push_stack() does insn_scc(env->cur_state->parent->insn_idx)->branches++;
- update_branch_counts() does
  insn_scc(env->cur_state->parent->insn_idx)->branches--
  (and continues recursively);

The main difference is that bpf_verifier_state->branches is initialized to 1,
while bpf_scc_info->branches is initialized to 0.
Hence a call to parent_scc_enter() in is_state_visited().

>> + * - bpf_scc_info->scc_epoch counts how many times 'branches'
>> + *   has reached zero;
>> + * - bpf_verifier_state->scc_epoch records the epoch of the SCC
>> + *   corresponding to bpf_verifier_state->insn_idx at the moment
>> + *   of state creation.
>> + *
>> + * Functions parent_scc_enter() and parent_scc_exit() maintain the
>> + * bpf_scc_info->{branches,scc_epoch} counters.
>> + *
>> + * bpf_scc_info->branches reaching zero indicates that state P is
>> + * fully explored. Its descendants residing in the same SCC have
>> + * state->scc_epoch == scc_info->scc_epoch. parent_scc_exit()
>> + * increments scc_info->scc_epoch, allowing clean_live_states() to
>> + * detect these states and apply mark_all_regs_read_and_precise().
>> + */

  reply	other threads:[~2025-04-28 19:26 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-04-26 10:46 [PATCH bpf-next v1 0/4] bpf: compute SCC for BPF program control flow graph Eduard Zingerman
2025-04-26 10:46 ` [PATCH bpf-next v1 1/4] bpf: compute SCCs in " Eduard Zingerman
2025-04-26 10:46 ` [PATCH bpf-next v1 2/4] bpf: frame_insn_idx() utility function Eduard Zingerman
2025-04-26 10:46 ` [PATCH bpf-next v1 3/4] bpf: use SCC info instead of loop_entry Eduard Zingerman
2025-04-26 14:45   ` kernel test robot
2025-04-26 19:51     ` Eduard Zingerman
2025-04-28 17:31   ` Alexei Starovoitov
2025-04-28 19:26     ` Eduard Zingerman [this message]
2025-04-28 20:25       ` Alexei Starovoitov
2025-04-28 21:00         ` Eduard Zingerman
2025-04-26 10:46 ` [PATCH bpf-next v1 4/4] selftests/bpf: tests with a loop state missing read/precision mark Eduard Zingerman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=m2plgwjdem.fsf@gmail.com \
    --to=eddyz87@gmail.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=kernel-team@fb.com \
    --cc=martin.lau@linux.dev \
    --cc=yonghong.song@linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox