Re: [PATCH RFC bpf-next v2 5/6] bpf: report register diff summary for hot callchains

BPF List
 help / color / mirror / Atom feed

From: Eduard Zingerman <eddyz87@gmail.com>
To: sashiko-reviews@lists.linux.dev
Cc: bpf@vger.kernel.org
Subject: Re: [PATCH RFC bpf-next v2 5/6] bpf: report register diff summary for hot callchains
Date: Tue, 26 May 2026 16:24:17 -0700	[thread overview]
Message-ID: <8a5b6bb66a6e2011fcba82f29a8d4eac03f2d191.camel@gmail.com> (raw)
In-Reply-To: <20260526213137.382441F000E9@smtp.kernel.org>

On Tue, 2026-05-26 at 21:31 +0000, sashiko-bot@kernel.org wrote:

[...]

> > diff --git a/kernel/bpf/states.c b/kernel/bpf/states.c
> > index 8773381360095..79e76bc2d1712 100644
> > --- a/kernel/bpf/states.c
> > +++ b/kernel/bpf/states.c
> [ ... ]
> > @@ -1588,3 +1629,98 @@ int bpf_is_state_visited(struct bpf_verifier_env *env, int insn_idx)
> >  	list_add(&new_sl->node, head);
> >  	return 0;
> >  }
> > +
> > +static bool callchain_matches_state(struct bpf_callchain *cc,
> > +				    struct bpf_verifier_state *st)
> > +{
> > +	int i;
> > +
> > +	if (st->curframe != cc->curframe)
> > +		return false;
> > +	for (i = 0; i < (int)cc->curframe; i++)
> > +		if (st->frame[i + 1]->callsite != cc->insn_idx[i])
> > +			return false;
> > +	return true;
> > +}
> 
> [Severity: Medium]
> Does this function need to verify the leaf instruction index 
> (st->insn_idx == cc->insn_idx[cc->curframe])? 
> 
> States for different instructions can hash to the same bucket via 
> (insn_idx ^ callsite) % env->prog->len. Without checking the leaf
> instruction, a hash collision could cause states from different instructions
> to be incorrectly matched, violating assumptions in func_states_equal()
> which relies on live_regs_before for a specific instruction.

This should be fixed.

> > +
> > +struct state_diff_cnt {
> > +	struct bpf_state_diff diff;
> > +	u32 cnt;
> > +};
> > +
> > +static int state_diff_cmp(const void *a, const void *b)
> > +{
> > +	return ((struct state_diff_cnt *)b)->cnt - ((struct state_diff_cnt *)a)->cnt;
> > +}
> 
> [Severity: Low]
> Can this subtraction overflow? 
> 
> The cnt fields are unsigned 32-bit integers, and the function returns a
> signed 32-bit integer. If the difference exceeds INT_MAX, the subtraction
> will overflow into a negative value, resulting in an inverted sort order.

I don't think overflow can happen here.
But I'll rewrite to make the robot happy.

> > +
> > +static bool state_diff_eq(struct bpf_state_diff *a, struct bpf_state_diff *b)
> > +{
> > +	return a->frame == b->frame && a->slot == b->slot && a->kind == b->kind;
> > +}
> > +
> > +int bpf_sample_state_diffs(struct bpf_verifier_env *env,
> > +			   struct bpf_callchain *cc,
> > +			   struct bpf_state_diff *top_diffs,
> > +			   int *nr_diffs)
> > +{
> > +	struct bpf_verifier_state_list *sl_i, *sl_j;
> > +	struct state_diff_cnt *diff_cnts = NULL;
> > +	struct list_head *pos_i, *pos_j, *head;
> > +	u32 leaf_insn, callsite, hash_idx;
> > +	int i, cap = 0, nr_locs = 0;
> > +
> > +	leaf_insn = cc->insn_idx[cc->curframe];
> > +	callsite = cc->curframe > 0 ? cc->insn_idx[cc->curframe - 1] : BPF_MAIN_FUNC;
> > +	hash_idx = (leaf_insn ^ callsite) % env->prog->len;
> > +	head = &env->explored_states[hash_idx];
> > +
> > +	list_for_each(pos_i, head) {
> > +		sl_i = container_of(pos_i, struct bpf_verifier_state_list, node);
> > +		if (!callchain_matches_state(cc, &sl_i->state))
> > +			continue;
> > +		list_for_each(pos_j, head) {
> > +			struct bpf_state_diff diff = {};
> > +
> > +			if (pos_i == pos_j)
> > +				continue;
> > +			sl_j = container_of(pos_j, struct bpf_verifier_state_list, node);
> > +			if (!callchain_matches_state(cc, &sl_j->state))
> > +				continue;
> > +			if (states_equal(env, &sl_i->state, &sl_j->state, NOT_EXACT, &diff))
> 
> [Severity: High]
> Can these nested list_for_each loops cause a CPU soft lockup? 
> 
> A single hash bucket can accumulate tens to hundreds of thousands of cached
> states for a single instruction in complex programs. Without cond_resched()
> calls within these O(N^2) loops, billions of states_equal() comparisons
> could execute, monopolizing the CPU and leading to a local DoS.

I'll add cond_resched().

> [ ... ]
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index 54b7ad65b7fc1..d09c014462f13 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> [ ... ]
> > @@ -17527,7 +17564,9 @@ static int do_check(struct bpf_verifier_env *env)
> >  		insn_aux = &env->insn_aux_data[env->insn_idx];
> >  
> >  		if (++env->insn_processed > BPF_COMPLEXITY_LIMIT_INSNS) {
> > -			print_hotspots(env);
> > +			err = print_hotspots(env);
> > +			if (err)
> > +				return err;
> 
> [Severity: Medium]
> If print_hotspots() returns an error (like -ENOMEM from kvrealloc()), does
> this mask the primary -E2BIG complexity limit error? 
> 
> Returning err directly will skip logging "BPF program is too large" and
> return a misleading error code to the user, who might attribute it to system
> memory exhaustion instead of hitting the verifier complexity limit.
> 
> >  			verbose(env,
> >  				"BPF program is too large. Processed %d insn\n",
> >  				env->insn_processed);

Replied already.

next prev parent reply	other threads:[~2026-05-26 23:24 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-26 19:37 [PATCH RFC bpf-next v2 0/6] bpf: better error reporting when verifier hits 1M instructions limit Eduard Zingerman
2026-05-26 19:37 ` [PATCH RFC bpf-next v2 1/6] bpf: move live registers and scc printout to a standalone function Eduard Zingerman
2026-05-26 19:37 ` [PATCH RFC bpf-next v2 2/6] bpf: compute loops hierarchy Eduard Zingerman
2026-05-26 20:26   ` sashiko-bot
2026-05-26 20:33     ` Eduard Zingerman
2026-05-26 19:37 ` [PATCH RFC bpf-next v2 3/6] selftests/bpf: test cases for loop hierarchy computation Eduard Zingerman
2026-05-26 19:37 ` [PATCH RFC bpf-next v2 4/6] bpf: report hot simulated callchains when 1M instructions limit is met Eduard Zingerman
2026-05-26 19:37 ` [PATCH RFC bpf-next v2 5/6] bpf: report register diff summary for hot callchains Eduard Zingerman
2026-05-26 20:17   ` bot+bpf-ci
2026-05-26 20:35     ` Eduard Zingerman
2026-05-26 21:31   ` sashiko-bot
2026-05-26 23:24     ` Eduard Zingerman [this message]
2026-05-26 19:37 ` [PATCH RFC bpf-next v2 6/6] selftests/bpf: test budget exhaustion profiling report Eduard Zingerman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8a5b6bb66a6e2011fcba82f29a8d4eac03f2d191.camel@gmail.com \
    --to=eddyz87@gmail.com \
    --cc=bpf@vger.kernel.org \
    --cc=sashiko-reviews@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox