From: sashiko-bot@kernel.org
To: "Eduard Zingerman" <eddyz87@gmail.com>
Cc: bpf@vger.kernel.org
Subject: Re: [PATCH RFC bpf-next v2 5/6] bpf: report register diff summary for hot callchains
Date: Tue, 26 May 2026 21:31:36 +0000 [thread overview]
Message-ID: <20260526213137.382441F000E9@smtp.kernel.org> (raw)
In-Reply-To: <20260526-better-1m-reporting-v2-5-e7ec61c45d41@gmail.com>
Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider:
- [High] O(N^2) state comparisons without cond_resched() can cause CPU lockups
- [Medium] callchain_matches_state() does not check the leaf instruction index
- [Medium] print_hotspots() failure masks the primary -E2BIG complexity limit error
- [Low] Integer overflow in state_diff_cmp sorting function
--
commit 33b39095313d5bead6d88744178c2d3f55d7b64b
Author: Eduard Zingerman <eddyz87@gmail.com>
bpf: report register diff summary for hot callchains
This commit introduces a feature to report the most frequently varying
locations (registers or stack slots) when a hot callchain ends. It modifies
states_equal() to identify the first differing location between states and
counts occurrences across explored states to approximate a diff summary.
> diff --git a/kernel/bpf/states.c b/kernel/bpf/states.c
> index 8773381360095..79e76bc2d1712 100644
> --- a/kernel/bpf/states.c
> +++ b/kernel/bpf/states.c
[ ... ]
> @@ -1588,3 +1629,98 @@ int bpf_is_state_visited(struct bpf_verifier_env *env, int insn_idx)
> list_add(&new_sl->node, head);
> return 0;
> }
> +
> +static bool callchain_matches_state(struct bpf_callchain *cc,
> + struct bpf_verifier_state *st)
> +{
> + int i;
> +
> + if (st->curframe != cc->curframe)
> + return false;
> + for (i = 0; i < (int)cc->curframe; i++)
> + if (st->frame[i + 1]->callsite != cc->insn_idx[i])
> + return false;
> + return true;
> +}
[Severity: Medium]
Does this function need to verify the leaf instruction index
(st->insn_idx == cc->insn_idx[cc->curframe])?
States for different instructions can hash to the same bucket via
(insn_idx ^ callsite) % env->prog->len. Without checking the leaf
instruction, a hash collision could cause states from different instructions
to be incorrectly matched, violating assumptions in func_states_equal()
which relies on live_regs_before for a specific instruction.
> +
> +struct state_diff_cnt {
> + struct bpf_state_diff diff;
> + u32 cnt;
> +};
> +
> +static int state_diff_cmp(const void *a, const void *b)
> +{
> + return ((struct state_diff_cnt *)b)->cnt - ((struct state_diff_cnt *)a)->cnt;
> +}
[Severity: Low]
Can this subtraction overflow?
The cnt fields are unsigned 32-bit integers, and the function returns a
signed 32-bit integer. If the difference exceeds INT_MAX, the subtraction
will overflow into a negative value, resulting in an inverted sort order.
> +
> +static bool state_diff_eq(struct bpf_state_diff *a, struct bpf_state_diff *b)
> +{
> + return a->frame == b->frame && a->slot == b->slot && a->kind == b->kind;
> +}
> +
> +int bpf_sample_state_diffs(struct bpf_verifier_env *env,
> + struct bpf_callchain *cc,
> + struct bpf_state_diff *top_diffs,
> + int *nr_diffs)
> +{
> + struct bpf_verifier_state_list *sl_i, *sl_j;
> + struct state_diff_cnt *diff_cnts = NULL;
> + struct list_head *pos_i, *pos_j, *head;
> + u32 leaf_insn, callsite, hash_idx;
> + int i, cap = 0, nr_locs = 0;
> +
> + leaf_insn = cc->insn_idx[cc->curframe];
> + callsite = cc->curframe > 0 ? cc->insn_idx[cc->curframe - 1] : BPF_MAIN_FUNC;
> + hash_idx = (leaf_insn ^ callsite) % env->prog->len;
> + head = &env->explored_states[hash_idx];
> +
> + list_for_each(pos_i, head) {
> + sl_i = container_of(pos_i, struct bpf_verifier_state_list, node);
> + if (!callchain_matches_state(cc, &sl_i->state))
> + continue;
> + list_for_each(pos_j, head) {
> + struct bpf_state_diff diff = {};
> +
> + if (pos_i == pos_j)
> + continue;
> + sl_j = container_of(pos_j, struct bpf_verifier_state_list, node);
> + if (!callchain_matches_state(cc, &sl_j->state))
> + continue;
> + if (states_equal(env, &sl_i->state, &sl_j->state, NOT_EXACT, &diff))
[Severity: High]
Can these nested list_for_each loops cause a CPU soft lockup?
A single hash bucket can accumulate tens to hundreds of thousands of cached
states for a single instruction in complex programs. Without cond_resched()
calls within these O(N^2) loops, billions of states_equal() comparisons
could execute, monopolizing the CPU and leading to a local DoS.
[ ... ]
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 54b7ad65b7fc1..d09c014462f13 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
[ ... ]
> @@ -17527,7 +17564,9 @@ static int do_check(struct bpf_verifier_env *env)
> insn_aux = &env->insn_aux_data[env->insn_idx];
>
> if (++env->insn_processed > BPF_COMPLEXITY_LIMIT_INSNS) {
> - print_hotspots(env);
> + err = print_hotspots(env);
> + if (err)
> + return err;
[Severity: Medium]
If print_hotspots() returns an error (like -ENOMEM from kvrealloc()), does
this mask the primary -E2BIG complexity limit error?
Returning err directly will skip logging "BPF program is too large" and
return a misleading error code to the user, who might attribute it to system
memory exhaustion instead of hitting the verifier complexity limit.
> verbose(env,
> "BPF program is too large. Processed %d insn\n",
> env->insn_processed);
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260526-better-1m-reporting-v2-0-e7ec61c45d41@gmail.com?part=5
next prev parent reply other threads:[~2026-05-26 21:31 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-26 19:37 [PATCH RFC bpf-next v2 0/6] bpf: better error reporting when verifier hits 1M instructions limit Eduard Zingerman
2026-05-26 19:37 ` [PATCH RFC bpf-next v2 1/6] bpf: move live registers and scc printout to a standalone function Eduard Zingerman
2026-05-26 19:37 ` [PATCH RFC bpf-next v2 2/6] bpf: compute loops hierarchy Eduard Zingerman
2026-05-26 20:26 ` sashiko-bot
2026-05-26 20:33 ` Eduard Zingerman
2026-05-26 19:37 ` [PATCH RFC bpf-next v2 3/6] selftests/bpf: test cases for loop hierarchy computation Eduard Zingerman
2026-05-26 19:37 ` [PATCH RFC bpf-next v2 4/6] bpf: report hot simulated callchains when 1M instructions limit is met Eduard Zingerman
2026-05-26 19:37 ` [PATCH RFC bpf-next v2 5/6] bpf: report register diff summary for hot callchains Eduard Zingerman
2026-05-26 20:17 ` bot+bpf-ci
2026-05-26 20:35 ` Eduard Zingerman
2026-05-26 21:31 ` sashiko-bot [this message]
2026-05-26 23:24 ` Eduard Zingerman
2026-05-26 19:37 ` [PATCH RFC bpf-next v2 6/6] selftests/bpf: test budget exhaustion profiling report Eduard Zingerman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260526213137.382441F000E9@smtp.kernel.org \
--to=sashiko-bot@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=eddyz87@gmail.com \
--cc=sashiko-reviews@lists.linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox