From: Eduard Zingerman <eddyz87@gmail.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: bpf <bpf@vger.kernel.org>, Alexei Starovoitov <ast@kernel.org>,
Andrii Nakryiko <andrii@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Martin KaFai Lau <martin.lau@linux.dev>,
Kernel Team <kernel-team@fb.com>,
Yonghong Song <yonghong.song@linux.dev>
Subject: Re: [PATCH bpf-next v1 3/4] bpf: use SCC info instead of loop_entry
Date: Mon, 28 Apr 2025 12:26:25 -0700 [thread overview]
Message-ID: <m2plgwjdem.fsf@gmail.com> (raw)
In-Reply-To: <CAADnVQK1tP1_of=pn7HdeZNqmPu=4AqpRETeOVeQMjDfSt0NOw@mail.gmail.com> (Alexei Starovoitov's message of "Mon, 28 Apr 2025 10:31:20 -0700")
Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:
[...]
>> @@ -466,11 +456,10 @@ struct bpf_verifier_state {
>> u32 dfs_depth;
>> u32 callback_unroll_depth;
>> u32 may_goto_depth;
>> - /* If this state was ever pointed-to by other state's loop_entry field
>> - * this flag would be set to true. Used to avoid freeing such states
>> - * while they are still in use.
>> + /* If this state is a checkpoint at insn_idx that belongs to an SCC,
>> + * record the SCC epoch at the time of checkpoint creation.
>> */
>
> Please use normal kernel comment style for all new code:
> /*
> * multi-
> * line
> * comment
> */
>
Ack.
>> - u32 used_as_loop_entry;
>> + u32 scc_epoch;
>> };
>>
>> #define bpf_get_spilled_reg(slot, frame, mask) \
>> @@ -717,6 +706,29 @@ struct bpf_idset {
>> u32 ids[BPF_ID_MAP_SIZE];
>> };
>>
>> +/* Information tracked for CFG strongly connected components */
>> +struct bpf_scc_info {
>> + /* True if states_equal(... RANGE_WITHIN) ever returned
>> + * true for a state with insn_idx within this SCC.
>> + * E.g. for iterator next call.
>
> I feel RANGE_WITHIN is unnecessary information here.
> Maybe reword it as:
> Set to true when is_state_visited() detected convergence of open coded iterator.
> ?
Technically this could also happen for callbacks, e.g. for bpf_loop().
Will reword as you suggest, should be clear enough for a few people
working with this code.
[...]
>> @@ -18222,15 +18129,17 @@ static void clean_func_state(struct bpf_verifier_env *env,
>> }
>> }
>>
>> +static bool verifier_state_cleaned(struct bpf_verifier_state *st)
>> +{
>> + /* all regs in this state in all frames were already marked */
>> + return st->frame[0]->regs[0].live & REG_LIVE_DONE;
>> +}
>
> move refactor into pre-patch.
>
Ack.
[...]
>> @@ -18243,6 +18152,114 @@ static u32 frame_insn_idx(struct bpf_verifier_state *st, u32 frame)
>> : st->frame[frame + 1]->callsite;
>> }
>>
>> +/* Open coded iterators introduce loops in the verifier state graph.
>> + * State graph loops can result in incomplete read and precision marks
>> + * on individual states. E.g. consider the following states graph:
>> + *
>> + * .-> A --. Assume the states are visited in the order A, B, C.
>> + * | | | Assume that state B reaches a state equivalent to state A.
>> + * | v v At this point, state C has not been processed yet,
>> + * '-- B C so state A does not have any read or precision marks from C yet.
>> + * As a result, these marks won't be propagated to B.
>> + *
>> + * If the marks on B are incomplete, it would be unsafe to use it in
>> + * states_equal() checks.
>
> earlier RANGE_WITHIN distinction was unnecessary,
> but here we should clarify that
> states_equal(NOT_EXACT) is unsafe,
> while states_equal(RANGE_WITHIN) is fine.
> Right?
>
Ack.
>> + *
>> + * To avoid this safety issue, and since states with incomplete read
>> + * marks can only occur within control flow graph loops, the verifier
>> + * assumes that any state with bpf_verifier_state->insn_idx residing
>> + * in a strongly connected component (SCC) has read and precision
>> + * marks for all registers. This assumption is enforced by the
>> + * function mark_all_regs_read_and_precise(), which assigns
>> + * corresponding marks.
>> + *
>> + * An intuitive point to call mark_all_regs_read_and_precise() would
>> + * be when a new state is created in is_state_visited().
>> + * However, doing so would interfere with widen_imprecise_scalars(),
>> + * which widens scalars in the current state after checking registers in a
>> + * parent state. Registers are not widened if they are marked as precise
>> + * in the parent state.
>> + *
>> + * To avoid interfering with widening logic,
>> + * a call to mark_all_regs_read_and_precise() for state is postponed
>> + * until no widening is possible in any descendant of state S.
>> + *
>> + * Another intuitive spot to call mark_all_regs_read_and_precise()
>> + * would be in update_branch_counts() when S's branches counter
>> + * reaches 0. However, this falls short in the following case:
>> + *
>> + * sum = 0
>> + * bpf_repeat(10) { // a
>> + * if (unlikely(bpf_get_prandom_u32())) // b
>> + * sum += 1;
>> + * if (bpf_get_prandom_u32()) // c
>> + * asm volatile ("");
>> + * asm volatile ("goto +0;"); // d
>> + * }
>> + *
>> + * Here a checkpoint is created at (d) with {sum=0} and the branch counter
>> + * for (d) reaches 0, so 'sum' would be marked precise.
>> + * When second branch of (c) reaches (d), checkpoint would be hit,
>> + * and the precision mark for 'sum' propagated to (a).
>> + * When the second branch of (b) reaches (a), the state would be {sum=1},
>> + * no widening would occur, causing verification to continue forever.
>> + *
>> + * To avoid such premature precision markings, the verifier postpones
>> + * the call to mark_all_regs_read_and_precise() for state S even further.
>> + * Suppose state P is a [grand]parent of state S and is the first state
>> + * in the current state chain with state->insn_idx within current SCC.
>> + * mark_all_regs_read_and_precise() for state S is only called once P
>> + * is fully explored.
>> + *
>> + * The struct 'bpf_scc_info' is used to track this condition:
>> + * - bpf_scc_info->branches counts how many states currently
>> + * in env->cur_state or env->head originate from this SCC;
>
> I'm still struggling with this definition of scc_info->branches
> and this extra 'enter':
>
>> cur->insn_hist_start = cur->insn_hist_end;
>> cur->dfs_depth = new->dfs_depth + 1;
>> list_add(&new_sl->node, head);
>> + parent_scc_enter(env, env->cur_state);
>
> since we don't do it for st->branches.
>
> Can we make scc->branches symmetrical to st->branches ?
> Only push_stack() will do scc_enter(),
> and scc_exit() in update_branch_counts()
> even if st->branches > 0.
>
> If that's not possible let's pick a different name for scc_info->branches
> to make it clear that the logic is quite different to st->branches.
>
It is simmetrical in a way.
Operations on bpf_verifier_state->branches:
- do_check_common() initializes env->cur_state->branches == 1
(and this is maintained as invariant);
- push_stack() does env->cur_state->parent->branches++;
- update_branch_counts() does env->cur_state->parent->branches--
(and continues recursively).
Operations on bpf_scc_info->branches:
- is_state_visited() does insn_scc(env->cur_state->parent->insn_idx)->branches++
- push_stack() does insn_scc(env->cur_state->parent->insn_idx)->branches++;
- update_branch_counts() does
insn_scc(env->cur_state->parent->insn_idx)->branches--
(and continues recursively);
The main difference is that bpf_verifier_state->branches is initialized to 1,
while bpf_scc_info->branches is initialized to 0.
Hence a call to parent_scc_enter() in is_state_visited().
>> + * - bpf_scc_info->scc_epoch counts how many times 'branches'
>> + * has reached zero;
>> + * - bpf_verifier_state->scc_epoch records the epoch of the SCC
>> + * corresponding to bpf_verifier_state->insn_idx at the moment
>> + * of state creation.
>> + *
>> + * Functions parent_scc_enter() and parent_scc_exit() maintain the
>> + * bpf_scc_info->{branches,scc_epoch} counters.
>> + *
>> + * bpf_scc_info->branches reaching zero indicates that state P is
>> + * fully explored. Its descendants residing in the same SCC have
>> + * state->scc_epoch == scc_info->scc_epoch. parent_scc_exit()
>> + * increments scc_info->scc_epoch, allowing clean_live_states() to
>> + * detect these states and apply mark_all_regs_read_and_precise().
>> + */
next prev parent reply other threads:[~2025-04-28 19:26 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-04-26 10:46 [PATCH bpf-next v1 0/4] bpf: compute SCC for BPF program control flow graph Eduard Zingerman
2025-04-26 10:46 ` [PATCH bpf-next v1 1/4] bpf: compute SCCs in " Eduard Zingerman
2025-04-26 10:46 ` [PATCH bpf-next v1 2/4] bpf: frame_insn_idx() utility function Eduard Zingerman
2025-04-26 10:46 ` [PATCH bpf-next v1 3/4] bpf: use SCC info instead of loop_entry Eduard Zingerman
2025-04-26 14:45 ` kernel test robot
2025-04-26 19:51 ` Eduard Zingerman
2025-04-28 17:31 ` Alexei Starovoitov
2025-04-28 19:26 ` Eduard Zingerman [this message]
2025-04-28 20:25 ` Alexei Starovoitov
2025-04-28 21:00 ` Eduard Zingerman
2025-04-26 10:46 ` [PATCH bpf-next v1 4/4] selftests/bpf: tests with a loop state missing read/precision mark Eduard Zingerman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=m2plgwjdem.fsf@gmail.com \
--to=eddyz87@gmail.com \
--cc=alexei.starovoitov@gmail.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=kernel-team@fb.com \
--cc=martin.lau@linux.dev \
--cc=yonghong.song@linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox