* [PATCH RFC bpf-next v3 0/6] bpf: better error reporting when verifier hits 1M instructions limit
@ 2026-05-27 7:29 Eduard Zingerman
2026-05-27 7:29 ` [PATCH RFC bpf-next v3 1/6] bpf: move live registers and scc printout to a standalone function Eduard Zingerman
` (5 more replies)
0 siblings, 6 replies; 22+ messages in thread
From: Eduard Zingerman @ 2026-05-27 7:29 UTC (permalink / raw)
To: bpf, ast
Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song,
Eduard Zingerman
When the BPF verifier exceeds the 1M instruction budget, the current
error output shows a random execution trace that happens to be active
at the moment, which is not very helpful for debugging.
This series improves the error report using a profiler-inspired
approach: collect and count "callchain" stack traces that the verifier
visits during program validation, and report the top 3 hottest traces
when the budget is exhausted. To minimize performance an memory impact
of such profiling, only collect samples when verifier visits loop
headers, iterator next, may_goto and callback-calling instructions.
For callchains ending at iterator next, may_goto, or callback-calling
instructions, identify which registers or stack slots most frequently
differ between cached and current states.
Here is an example of the report for scx lavd_dispatch (an old
version), with verifier limited to 200K instructions to trigger the
error:
lavd_dispatch():
; void BPF_STRUCT_OPS(lavd_dispatch, s32 cpu, struct task_struct *prev) @ main.bpf.c:889
... disassembly ...
consume_task():
; bool consume_task(u64 cpu_dsq_id, u64 cpdom_dsq_id) @ balance.bpf.c:410
... disassembly ...
#1 most visited simulated stacktrace (visited 1807 times):
lavd_dispatch/124 (.../scx/scheds/rust/scx_lavd/src/bpf/main.bpf.c:1107)
consume_task/2715 (.../scx/scheds/rust/scx_lavd/src/bpf/balance.bpf.c:316)
#2 most visited simulated stacktrace (visited 1682 times):
lavd_dispatch/124 (.../scx/scheds/rust/scx_lavd/src/bpf/main.bpf.c:1107)
consume_task/2994 (.../scx/scheds/rust/scx_lavd/src/bpf/balance.bpf.c:386)
#3 most visited simulated stacktrace (visited 8 times):
lavd_dispatch/255 (.../scx/scheds/rust/scx_lavd/src/bpf/main.bpf.c:1022)
Most varying: R7 (frame 0)
BPF program is too large. Processed 200001 insn
Changelog:
v2 -> v3 (bots):
- Compare leaf instruction index in callchain_matches_state().
- Use cmp_int() in stat_diff_cmp() to avoid overflow warnings.
- Add a comment in bpf_sample_state_diffs() on why it can't stale.
- Add cond_resched() call in bpf_compute_loops() to guard against
pathological inputs.
v1 -> v2 (bots):
- Use kvfree() in bpf_compute_loops().
- Adjust fwd_edges_no_loop test case to avoid dead code elimination
converting 'if' to 'goto'.
- Use GFP_KERNEL_ACCOUNT for callchain entry allocation in
update_callchain_profile().
- Zero-initialize 'cc' in update_callchain_profile() to avoid
copying uninitialized stack memory to the heap.
- Use %td instead of %ld for ptrdiff_t format specifier in
print_callchain_entry() and disasm_subprog().
- Size printed_subs bitmap as BPF_MAX_SUBPROGS + 2 to account for
fake and exception subprograms.
- Fix bpf_sample_state_diffs() inner loop to iterate from head
instead of pos_i, avoiding container_of() on the dummy list head.
- Add DIFF_OTHER to distinguish states that differ because of idmap
or other inconsistencies.
v1: https://lore.kernel.org/bpf/20260526-better-1m-reporting-v1-0-51e4f2c59780@gmail.com/T/
v2: https://lore.kernel.org/bpf/20260526-better-1m-reporting-v2-0-e7ec61c45d41@gmail.com/T/
---
Eduard Zingerman (6):
bpf: move live registers and scc printout to a standalone function
bpf: compute loops hierarchy
selftests/bpf: test cases for loop hierarchy computation
bpf: report hot simulated callchains when 1M instructions limit is met
bpf: report register diff summary for hot callchains
selftests/bpf: test budget exhaustion profiling report
include/linux/bpf_verifier.h | 39 ++++
kernel/bpf/Makefile | 2 +-
kernel/bpf/fixups.c | 5 +
kernel/bpf/liveness.c | 22 +-
kernel/bpf/loops.c | 196 +++++++++++++++++
kernel/bpf/states.c | 184 ++++++++++++++--
kernel/bpf/verifier.c | 233 +++++++++++++++++++++
tools/testing/selftests/bpf/prog_tests/verifier.c | 4 +
.../selftests/bpf/progs/verifier_budget_report.c | 175 ++++++++++++++++
.../selftests/bpf/progs/verifier_live_stack.c | 2 +-
.../selftests/bpf/progs/verifier_loop_hierarchy.c | 233 +++++++++++++++++++++
11 files changed, 1050 insertions(+), 45 deletions(-)
---
base-commit: 8496d9020ff37a33c2a7b2fc84350fd03ffbde78
change-id: 20260525-better-1m-reporting-1d795a21cf72
^ permalink raw reply [flat|nested] 22+ messages in thread* [PATCH RFC bpf-next v3 1/6] bpf: move live registers and scc printout to a standalone function 2026-05-27 7:29 [PATCH RFC bpf-next v3 0/6] bpf: better error reporting when verifier hits 1M instructions limit Eduard Zingerman @ 2026-05-27 7:29 ` Eduard Zingerman 2026-06-01 5:50 ` Emil Tsalapatis 2026-05-27 7:29 ` [PATCH RFC bpf-next v3 2/6] bpf: compute loops hierarchy Eduard Zingerman ` (4 subsequent siblings) 5 siblings, 1 reply; 22+ messages in thread From: Eduard Zingerman @ 2026-05-27 7:29 UTC (permalink / raw) To: bpf, ast Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song, Eduard Zingerman Tests for multiple analyses performed by the verifier need the verifier log to contain analysis results alongside the program disassembly. This patch moves log level 2 program dump from bpf_compute_live_registers() to a standalone function called from bpf_check(), in order to provide a common logging function for such analyses (and thus avoid printing program disassembly multiple times). This is a preparatory refactoring, subsequent commit extends this log with loop hierarchy. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> --- kernel/bpf/liveness.c | 22 +-------------- kernel/bpf/verifier.c | 31 ++++++++++++++++++++++ .../selftests/bpf/progs/verifier_live_stack.c | 2 +- 3 files changed, 33 insertions(+), 22 deletions(-) diff --git a/kernel/bpf/liveness.c b/kernel/bpf/liveness.c index 0aadfbae0acc..16d030fdd83a 100644 --- a/kernel/bpf/liveness.c +++ b/kernel/bpf/liveness.c @@ -2209,7 +2209,7 @@ int bpf_compute_live_registers(struct bpf_verifier_env *env) struct bpf_insn *insns = env->prog->insnsi; struct insn_live_regs *state; int insn_cnt = env->prog->len; - int err = 0, i, j; + int err = 0, i; bool changed; /* Use the following algorithm: @@ -2270,26 +2270,6 @@ int bpf_compute_live_registers(struct bpf_verifier_env *env) for (i = 0; i < insn_cnt; ++i) insn_aux[i].live_regs_before = state[i].in; - if (env->log.level & BPF_LOG_LEVEL2) { - verbose(env, "Live regs before insn:\n"); - for (i = 0; i < insn_cnt; ++i) { - if (env->insn_aux_data[i].scc) - verbose(env, "%3d ", env->insn_aux_data[i].scc); - else - verbose(env, " "); - verbose(env, "%3d: ", i); - for (j = BPF_REG_0; j < BPF_REG_10; ++j) - if (insn_aux[i].live_regs_before & BIT(j)) - verbose(env, "%d", j); - else - verbose(env, "."); - verbose(env, " "); - bpf_verbose_insn(env, &insns[i]); - if (bpf_is_ldimm64(&insns[i])) - i++; - } - } - out: kvfree(state); return err; diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index c8d980fdd709..3d569e36128c 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -19619,6 +19619,34 @@ int bpf_fixup_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn, return 0; } +/* Various log level 2 information about the program */ +static void log_program(struct bpf_verifier_env *env) +{ + struct bpf_insn_aux_data *insn_aux = env->insn_aux_data; + struct bpf_insn *insns = env->prog->insnsi; + u32 insn_cnt = env->prog->len; + u32 i, j; + + verbose(env, "Program dump (scc? insn#: live_regs_before):\n"); + for (i = 0; i < insn_cnt; ++i) { + verbose_linfo(env, i, " ; "); + if (env->insn_aux_data[i].scc) + verbose(env, "%3d ", env->insn_aux_data[i].scc); + else + verbose(env, " "); + verbose(env, "%3d: ", i); + for (j = BPF_REG_0; j < BPF_REG_10; ++j) + if (insn_aux[i].live_regs_before & BIT(j)) + verbose(env, "%d", j); + else + verbose(env, "."); + verbose(env, " "); + bpf_verbose_insn(env, &insns[i]); + if (bpf_is_ldimm64(&insns[i])) + i++; + } +} + int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, struct bpf_log_attr *attr_log) { @@ -19771,6 +19799,9 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, if (ret < 0) goto skip_full_check; + if (env->log.level & BPF_LOG_LEVEL2) + log_program(env); + ret = mark_fastcall_patterns(env); if (ret < 0) goto skip_full_check; diff --git a/tools/testing/selftests/bpf/progs/verifier_live_stack.c b/tools/testing/selftests/bpf/progs/verifier_live_stack.c index 401152b2b64f..923572df4bbf 100644 --- a/tools/testing/selftests/bpf/progs/verifier_live_stack.c +++ b/tools/testing/selftests/bpf/progs/verifier_live_stack.c @@ -64,7 +64,7 @@ SEC("socket") __log_level(2) __msg("stack use/def subprog#0 must_write_not_same_slot (d0,cs0):") __msg("6: (7b) *(u64 *)(r2 +0) = r0{{$}}") -__msg("Live regs before insn:") +__msg("Program dump") __naked void must_write_not_same_slot(void) { asm volatile ( -- 2.53.0 ^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [PATCH RFC bpf-next v3 1/6] bpf: move live registers and scc printout to a standalone function 2026-05-27 7:29 ` [PATCH RFC bpf-next v3 1/6] bpf: move live registers and scc printout to a standalone function Eduard Zingerman @ 2026-06-01 5:50 ` Emil Tsalapatis 0 siblings, 0 replies; 22+ messages in thread From: Emil Tsalapatis @ 2026-06-01 5:50 UTC (permalink / raw) To: Eduard Zingerman, bpf, ast Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song On Wed May 27, 2026 at 3:29 AM EDT, Eduard Zingerman wrote: > Tests for multiple analyses performed by the verifier need the > verifier log to contain analysis results alongside the program > disassembly. This patch moves log level 2 program dump from > bpf_compute_live_registers() to a standalone function called from > bpf_check(), in order to provide a common logging function for such > analyses (and thus avoid printing program disassembly multiple times). > > This is a preparatory refactoring, subsequent commit extends this log > with loop hierarchy. > > Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> > --- > kernel/bpf/liveness.c | 22 +-------------- > kernel/bpf/verifier.c | 31 ++++++++++++++++++++++ > .../selftests/bpf/progs/verifier_live_stack.c | 2 +- > 3 files changed, 33 insertions(+), 22 deletions(-) > > diff --git a/kernel/bpf/liveness.c b/kernel/bpf/liveness.c > index 0aadfbae0acc..16d030fdd83a 100644 > --- a/kernel/bpf/liveness.c > +++ b/kernel/bpf/liveness.c > @@ -2209,7 +2209,7 @@ int bpf_compute_live_registers(struct bpf_verifier_env *env) > struct bpf_insn *insns = env->prog->insnsi; > struct insn_live_regs *state; > int insn_cnt = env->prog->len; > - int err = 0, i, j; > + int err = 0, i; > bool changed; > > /* Use the following algorithm: > @@ -2270,26 +2270,6 @@ int bpf_compute_live_registers(struct bpf_verifier_env *env) > for (i = 0; i < insn_cnt; ++i) > insn_aux[i].live_regs_before = state[i].in; > > - if (env->log.level & BPF_LOG_LEVEL2) { > - verbose(env, "Live regs before insn:\n"); > - for (i = 0; i < insn_cnt; ++i) { > - if (env->insn_aux_data[i].scc) > - verbose(env, "%3d ", env->insn_aux_data[i].scc); > - else > - verbose(env, " "); > - verbose(env, "%3d: ", i); > - for (j = BPF_REG_0; j < BPF_REG_10; ++j) > - if (insn_aux[i].live_regs_before & BIT(j)) > - verbose(env, "%d", j); > - else > - verbose(env, "."); > - verbose(env, " "); > - bpf_verbose_insn(env, &insns[i]); > - if (bpf_is_ldimm64(&insns[i])) > - i++; > - } > - } > - > out: > kvfree(state); > return err; > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > index c8d980fdd709..3d569e36128c 100644 > --- a/kernel/bpf/verifier.c > +++ b/kernel/bpf/verifier.c > @@ -19619,6 +19619,34 @@ int bpf_fixup_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn, > return 0; > } > > +/* Various log level 2 information about the program */ > +static void log_program(struct bpf_verifier_env *env) > +{ > + struct bpf_insn_aux_data *insn_aux = env->insn_aux_data; > + struct bpf_insn *insns = env->prog->insnsi; > + u32 insn_cnt = env->prog->len; > + u32 i, j; > + > + verbose(env, "Program dump (scc? insn#: live_regs_before):\n"); > + for (i = 0; i < insn_cnt; ++i) { > + verbose_linfo(env, i, " ; "); > + if (env->insn_aux_data[i].scc) > + verbose(env, "%3d ", env->insn_aux_data[i].scc); > + else > + verbose(env, " "); > + verbose(env, "%3d: ", i); > + for (j = BPF_REG_0; j < BPF_REG_10; ++j) > + if (insn_aux[i].live_regs_before & BIT(j)) > + verbose(env, "%d", j); > + else > + verbose(env, "."); > + verbose(env, " "); > + bpf_verbose_insn(env, &insns[i]); > + if (bpf_is_ldimm64(&insns[i])) > + i++; > + } > +} > + > int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, > struct bpf_log_attr *attr_log) > { > @@ -19771,6 +19799,9 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, > if (ret < 0) > goto skip_full_check; > > + if (env->log.level & BPF_LOG_LEVEL2) > + log_program(env); > + > ret = mark_fastcall_patterns(env); > if (ret < 0) > goto skip_full_check; > diff --git a/tools/testing/selftests/bpf/progs/verifier_live_stack.c b/tools/testing/selftests/bpf/progs/verifier_live_stack.c > index 401152b2b64f..923572df4bbf 100644 > --- a/tools/testing/selftests/bpf/progs/verifier_live_stack.c > +++ b/tools/testing/selftests/bpf/progs/verifier_live_stack.c > @@ -64,7 +64,7 @@ SEC("socket") > __log_level(2) > __msg("stack use/def subprog#0 must_write_not_same_slot (d0,cs0):") > __msg("6: (7b) *(u64 *)(r2 +0) = r0{{$}}") > -__msg("Live regs before insn:") > +__msg("Program dump") > __naked void must_write_not_same_slot(void) > { > asm volatile ( ^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH RFC bpf-next v3 2/6] bpf: compute loops hierarchy 2026-05-27 7:29 [PATCH RFC bpf-next v3 0/6] bpf: better error reporting when verifier hits 1M instructions limit Eduard Zingerman 2026-05-27 7:29 ` [PATCH RFC bpf-next v3 1/6] bpf: move live registers and scc printout to a standalone function Eduard Zingerman @ 2026-05-27 7:29 ` Eduard Zingerman 2026-06-01 19:12 ` Emil Tsalapatis 2026-05-27 7:29 ` [PATCH RFC bpf-next v3 3/6] selftests/bpf: test cases for loop hierarchy computation Eduard Zingerman ` (3 subsequent siblings) 5 siblings, 1 reply; 22+ messages in thread From: Eduard Zingerman @ 2026-05-27 7:29 UTC (permalink / raw) To: bpf, ast Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song, Eduard Zingerman Add an analysis phase computing the loops hierarchy to the bpf verifier. The algorithm is based on the paper: "A New Algorithm for Identifying Loops in Decompilation" by Wei et al, adapted to be non-recursive. Record the loops hierarchy per instruction: - The field insn_aux_data->loop_header records the innermost loop header containing it. - The field insn_aux_data->loop records additional information about the loop if this instruction is a loop header (at the moment it only tracks irreduciblity of the loop, additional information will be added later). Subsequent patches use loop header entries as sampling points for callchain profiling, for better error reports when the 1M instruction budget is exhausted. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> --- include/linux/bpf_verifier.h | 10 +++ kernel/bpf/Makefile | 2 +- kernel/bpf/fixups.c | 5 ++ kernel/bpf/loops.c | 196 +++++++++++++++++++++++++++++++++++++++++++ kernel/bpf/verifier.c | 10 ++- 5 files changed, 221 insertions(+), 2 deletions(-) diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h index 5cbad3b64130..d0a1fb135cbf 100644 --- a/include/linux/bpf_verifier.h +++ b/include/linux/bpf_verifier.h @@ -656,6 +656,10 @@ struct bpf_iarray { u32 items[]; }; +struct bpf_loop { + bool irreducible; +}; + struct bpf_insn_aux_data { union { enum bpf_reg_type ptr_type; /* pointer type for load/store insns */ @@ -742,6 +746,10 @@ struct bpf_insn_aux_data { u16 const_reg_map_mask; u16 const_reg_subprog_mask; u32 const_reg_vals[10]; + /* index of a loop header of the innermost loop containing this instruction, -1 if none */ + s32 loop_header; + /* additional information about the loop if this instruction is a loop header */ + struct bpf_loop *loop; }; #define MAX_USED_MAPS 64 /* max number of maps accessed by one eBPF program */ @@ -1608,4 +1616,6 @@ int bpf_jit_subprogs(struct bpf_verifier_env *env); int bpf_fixup_call_args(struct bpf_verifier_env *env); int bpf_do_misc_fixups(struct bpf_verifier_env *env); +int bpf_compute_loops(struct bpf_verifier_env *env); + #endif /* _LINUX_BPF_VERIFIER_H */ diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile index 4dc41bf5780c..787d183ba5b4 100644 --- a/kernel/bpf/Makefile +++ b/kernel/bpf/Makefile @@ -6,7 +6,7 @@ cflags-nogcse-$(CONFIG_X86)$(CONFIG_CC_IS_GCC) := -fno-gcse endif CFLAGS_core.o += -Wno-override-init $(cflags-nogcse-yy) -obj-$(CONFIG_BPF_SYSCALL) += syscall.o verifier.o inode.o helpers.o tnum.o cnum.o log.o token.o liveness.o const_fold.o +obj-$(CONFIG_BPF_SYSCALL) += syscall.o verifier.o inode.o helpers.o tnum.o cnum.o log.o token.o liveness.o const_fold.o loops.o obj-$(CONFIG_BPF_SYSCALL) += bpf_iter.o map_iter.o task_iter.o prog_iter.o link_iter.o obj-$(CONFIG_BPF_SYSCALL) += hashtab.o arraymap.o percpu_freelist.o bpf_lru_list.o lpm_trie.o map_in_map.o bloom_filter.o obj-$(CONFIG_BPF_SYSCALL) += local_storage.o queue_stack_maps.o ringbuf.o bpf_insn_array.o diff --git a/kernel/bpf/fixups.c b/kernel/bpf/fixups.c index 12739add2dda..57294524dab8 100644 --- a/kernel/bpf/fixups.c +++ b/kernel/bpf/fixups.c @@ -450,6 +450,11 @@ void bpf_clear_insn_aux_data(struct bpf_verifier_env *env, int start, int len) aux_data[i].jt = NULL; } + if (aux_data[i].loop) { + kvfree(aux_data[i].loop); + aux_data[i].loop = NULL; + } + if (bpf_is_ldimm64(&insns[i])) i++; } diff --git a/kernel/bpf/loops.c b/kernel/bpf/loops.c new file mode 100644 index 000000000000..aec11248d048 --- /dev/null +++ b/kernel/bpf/loops.c @@ -0,0 +1,196 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* Copyright (c) 2026 Meta Platforms, Inc. and affiliates. */ + +#include <linux/slab.h> +#include <linux/bpf_verifier.h> + +struct dfs_state { + u32 traversed:1; + u32 next_succ:31; +}; + +struct loops_dfs { + struct dfs_state *state; + int *dfs_pos; + int *stack; +}; + +static void mark_irreducible(struct bpf_verifier_env *env, int h) +{ + env->insn_aux_data[h].loop->irreducible = true; +} + +static int mark_header(struct bpf_verifier_env *env, int h) +{ + struct bpf_insn_aux_data *aux = env->insn_aux_data; + + if (!aux[h].loop) { + aux[h].loop = kvzalloc_obj(struct bpf_loop, GFP_KERNEL_ACCOUNT); + if (!aux[h].loop) + return -ENOMEM; + } + return 0; +} + +static int assign_header(struct bpf_verifier_env *env, struct loops_dfs *dfs, int n, int h) +{ + struct bpf_insn_aux_data *aux = env->insn_aux_data; + int *dfs_pos = dfs->dfs_pos; + int err, nh; + + err = mark_header(env, h); + if (err) + return err; + + /* Don't encode self-loops, otherwise can't reflect loops nesting structure. */ + if (n == h) + return 0; + + /* Make sure that loop headers up the chain are sorted by dfs_pos. */ + while (aux[n].loop_header != -1) { + nh = aux[n].loop_header; + if (nh == h) + return 0; + if (dfs_pos[nh] < dfs_pos[h]) { + aux[n].loop_header = h; + n = h; + h = nh; + } else { + n = nh; + } + } + aux[n].loop_header = h; + return 0; +} + +/* + * As described in "A New Algorithm for Identifying Loops in Decompilation" by Wei et al, + * adapted to be non-recursive. + */ +static int compute_loops_in_subprog(struct bpf_verifier_env *env, struct loops_dfs *dfs, + int subprog_idx) +{ + struct bpf_insn_aux_data *aux = env->insn_aux_data; + struct dfs_state *state = dfs->state; + int start = env->subprog_info[subprog_idx].start; + int *dfs_pos = dfs->dfs_pos; + int *stack = dfs->stack; + int i, s, h, err, cur, stack_sz; + struct bpf_iarray *succ; + + stack[0] = start; + state[start].traversed = true; + state[start].next_succ = 0; + dfs_pos[start] = 1; + stack_sz = 1; + i = 0; + do { + /* + * The algorithm should be very fast in practice, + * guard against pathological inputs, just in case. + */ + if (i++ == 1024) { + i = 0; + if (signal_pending(current)) + return -EAGAIN; + cond_resched(); + } + + cur = stack[stack_sz - 1]; + succ = bpf_insn_successors(env, cur); + if (state[cur].next_succ == succ->cnt) { + dfs_pos[cur] = 0; + stack_sz--; + continue; + } + s = succ->items[state[cur].next_succ]; + if (!state[s].traversed) { + /* Case A: start -> ... -> cur -> s [unxplored] */ + state[s].traversed = true; + state[s].next_succ = 0; + stack[stack_sz] = s; + dfs_pos[s] = stack_sz + 1; + stack_sz++; + continue; + } + /* 's' is fully explored at this point */ + if (dfs_pos[s]) { + /* + * start -> ... -> s -> cur --. + * ^ | + * '----------' + * Case B: 's' is in the current DFS path. + */ + err = assign_header(env, dfs, cur, s); + if (err) + return err; + } else if (aux[s].loop_header == -1) { + /* + * start -> ... -> ... -> s -> ... -> end + * | ^ + * '---> cur ---' + * Case C: 's' is explored, not in the current DFS path, + * and not a part of any loop. + */ + } else if (dfs_pos[aux[s].loop_header]) { + /* + * .----------------------. + * v | + * start -> ... -> h -> ... -> ... -> s --' + * | ^ + * '---> cur ---' + * Case D: 's' is explored, not in current DFS path, + * but it's innermost loop header is. + */ + err = assign_header(env, dfs, cur, aux[s].loop_header); + if (err) + return err; + } else { + /* case E */ + h = aux[s].loop_header; + mark_irreducible(env, h); + /* can also mark 's' as reentry, but no need for now */ + while (aux[h].loop_header != -1) { + h = aux[h].loop_header; + if (dfs_pos[h]) { + err = assign_header(env, dfs, cur, h); + if (err) + return err; + break; + } + mark_irreducible(env, h); + } + } + state[cur].next_succ++; + } while (stack_sz); + + return 0; +} + +int bpf_compute_loops(struct bpf_verifier_env *env) +{ + struct bpf_insn_aux_data *aux = env->insn_aux_data; + int i, err = 0, len = env->prog->len; + struct loops_dfs dfs = {}; + + dfs.dfs_pos = kvcalloc(len, sizeof(int), GFP_KERNEL_ACCOUNT); + dfs.state = kvcalloc(len, sizeof(struct dfs_state), GFP_KERNEL_ACCOUNT); + dfs.stack = kvcalloc(len, sizeof(int), GFP_KERNEL_ACCOUNT); + if (!dfs.dfs_pos || !dfs.state || !dfs.stack) { + err = -ENOMEM; + goto out; + } + for (i = 0; i < len; i++) + aux[i].loop_header = -1; + for (i = 0; i < env->subprog_cnt; i++) { + err = compute_loops_in_subprog(env, &dfs, i); + if (err) + goto out; + } + +out: + kvfree(dfs.dfs_pos); + kvfree(dfs.stack); + kvfree(dfs.state); + return err; +} diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 3d569e36128c..2b3584712ad2 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -19627,13 +19627,17 @@ static void log_program(struct bpf_verifier_env *env) u32 insn_cnt = env->prog->len; u32 i, j; - verbose(env, "Program dump (scc? insn#: live_regs_before):\n"); + verbose(env, "Program dump (scc? loop_header? insn#: live_regs_before):\n"); for (i = 0; i < insn_cnt; ++i) { verbose_linfo(env, i, " ; "); if (env->insn_aux_data[i].scc) verbose(env, "%3d ", env->insn_aux_data[i].scc); else verbose(env, " "); + if (env->insn_aux_data[i].loop_header >= 0) + verbose(env, "%3d ", env->insn_aux_data[i].loop_header); + else + verbose(env, " "); verbose(env, "%3d: ", i); for (j = BPF_REG_0; j < BPF_REG_10; ++j) if (insn_aux[i].live_regs_before & BIT(j)) @@ -19791,6 +19795,10 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, if (ret < 0) goto skip_full_check; + ret = bpf_compute_loops(env); + if (ret < 0) + goto skip_full_check; + ret = bpf_compute_scc(env); if (ret < 0) goto skip_full_check; -- 2.53.0 ^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [PATCH RFC bpf-next v3 2/6] bpf: compute loops hierarchy 2026-05-27 7:29 ` [PATCH RFC bpf-next v3 2/6] bpf: compute loops hierarchy Eduard Zingerman @ 2026-06-01 19:12 ` Emil Tsalapatis 2026-06-01 19:22 ` Eduard Zingerman 0 siblings, 1 reply; 22+ messages in thread From: Emil Tsalapatis @ 2026-06-01 19:12 UTC (permalink / raw) To: Eduard Zingerman, bpf, ast Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song On Wed May 27, 2026 at 3:29 AM EDT, Eduard Zingerman wrote: > Add an analysis phase computing the loops hierarchy to the bpf > verifier. The algorithm is based on the paper: > "A New Algorithm for Identifying Loops in Decompilation" by Wei et al, > adapted to be non-recursive. > > Record the loops hierarchy per instruction: > - The field insn_aux_data->loop_header records the innermost loop > header containing it. > - The field insn_aux_data->loop records additional information about > the loop if this instruction is a loop header (at the moment it only > tracks irreduciblity of the loop, additional information will be > added later). > > Subsequent patches use loop header entries as sampling points for > callchain profiling, for better error reports when the 1M instruction > budget is exhausted. > > Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> Nits below. > --- > include/linux/bpf_verifier.h | 10 +++ > kernel/bpf/Makefile | 2 +- > kernel/bpf/fixups.c | 5 ++ > kernel/bpf/loops.c | 196 +++++++++++++++++++++++++++++++++++++++++++ > kernel/bpf/verifier.c | 10 ++- > 5 files changed, 221 insertions(+), 2 deletions(-) > > diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h > index 5cbad3b64130..d0a1fb135cbf 100644 > --- a/include/linux/bpf_verifier.h > +++ b/include/linux/bpf_verifier.h > @@ -656,6 +656,10 @@ struct bpf_iarray { > u32 items[]; > }; > > +struct bpf_loop { > + bool irreducible; > +}; > + > struct bpf_insn_aux_data { > union { > enum bpf_reg_type ptr_type; /* pointer type for load/store insns */ > @@ -742,6 +746,10 @@ struct bpf_insn_aux_data { > u16 const_reg_map_mask; > u16 const_reg_subprog_mask; > u32 const_reg_vals[10]; > + /* index of a loop header of the innermost loop containing this instruction, -1 if none */ > + s32 loop_header; > + /* additional information about the loop if this instruction is a loop header */ > + struct bpf_loop *loop; > }; > > #define MAX_USED_MAPS 64 /* max number of maps accessed by one eBPF program */ > @@ -1608,4 +1616,6 @@ int bpf_jit_subprogs(struct bpf_verifier_env *env); > int bpf_fixup_call_args(struct bpf_verifier_env *env); > int bpf_do_misc_fixups(struct bpf_verifier_env *env); > > +int bpf_compute_loops(struct bpf_verifier_env *env); > + > #endif /* _LINUX_BPF_VERIFIER_H */ > diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile > index 4dc41bf5780c..787d183ba5b4 100644 > --- a/kernel/bpf/Makefile > +++ b/kernel/bpf/Makefile > @@ -6,7 +6,7 @@ cflags-nogcse-$(CONFIG_X86)$(CONFIG_CC_IS_GCC) := -fno-gcse > endif > CFLAGS_core.o += -Wno-override-init $(cflags-nogcse-yy) > > -obj-$(CONFIG_BPF_SYSCALL) += syscall.o verifier.o inode.o helpers.o tnum.o cnum.o log.o token.o liveness.o const_fold.o > +obj-$(CONFIG_BPF_SYSCALL) += syscall.o verifier.o inode.o helpers.o tnum.o cnum.o log.o token.o liveness.o const_fold.o loops.o > obj-$(CONFIG_BPF_SYSCALL) += bpf_iter.o map_iter.o task_iter.o prog_iter.o link_iter.o > obj-$(CONFIG_BPF_SYSCALL) += hashtab.o arraymap.o percpu_freelist.o bpf_lru_list.o lpm_trie.o map_in_map.o bloom_filter.o > obj-$(CONFIG_BPF_SYSCALL) += local_storage.o queue_stack_maps.o ringbuf.o bpf_insn_array.o > diff --git a/kernel/bpf/fixups.c b/kernel/bpf/fixups.c > index 12739add2dda..57294524dab8 100644 > --- a/kernel/bpf/fixups.c > +++ b/kernel/bpf/fixups.c > @@ -450,6 +450,11 @@ void bpf_clear_insn_aux_data(struct bpf_verifier_env *env, int start, int len) > aux_data[i].jt = NULL; > } > > + if (aux_data[i].loop) { > + kvfree(aux_data[i].loop); > + aux_data[i].loop = NULL; > + } > + > if (bpf_is_ldimm64(&insns[i])) > i++; > } > diff --git a/kernel/bpf/loops.c b/kernel/bpf/loops.c > new file mode 100644 > index 000000000000..aec11248d048 > --- /dev/null > +++ b/kernel/bpf/loops.c > @@ -0,0 +1,196 @@ > +// SPDX-License-Identifier: GPL-2.0-only > +/* Copyright (c) 2026 Meta Platforms, Inc. and affiliates. */ > + > +#include <linux/slab.h> > +#include <linux/bpf_verifier.h> > + > +struct dfs_state { > + u32 traversed:1; > + u32 next_succ:31; > +}; > + > +struct loops_dfs { > + struct dfs_state *state; > + int *dfs_pos; > + int *stack; > +}; > + > +static void mark_irreducible(struct bpf_verifier_env *env, int h) > +{ > + env->insn_aux_data[h].loop->irreducible = true; > +} > + > +static int mark_header(struct bpf_verifier_env *env, int h) Maybe a more descriptive name? Not sure what we are marking the header as. > +{ > + struct bpf_insn_aux_data *aux = env->insn_aux_data; > + > + if (!aux[h].loop) { > + aux[h].loop = kvzalloc_obj(struct bpf_loop, GFP_KERNEL_ACCOUNT); > + if (!aux[h].loop) > + return -ENOMEM; > + } > + return 0; > +} > + > +static int assign_header(struct bpf_verifier_env *env, struct loops_dfs *dfs, int n, int h) > +{ > + struct bpf_insn_aux_data *aux = env->insn_aux_data; > + int *dfs_pos = dfs->dfs_pos; > + int err, nh; > + > + err = mark_header(env, h); > + if (err) > + return err; > + > + /* Don't encode self-loops, otherwise can't reflect loops nesting structure. */ > + if (n == h) > + return 0; > + > + /* Make sure that loop headers up the chain are sorted by dfs_pos. */ > + while (aux[n].loop_header != -1) { > + nh = aux[n].loop_header; > + if (nh == h) > + return 0; > + if (dfs_pos[nh] < dfs_pos[h]) { > + aux[n].loop_header = h; > + n = h; > + h = nh; > + } else { > + n = nh; > + } > + } > + aux[n].loop_header = h; > + return 0; > +} > + > +/* > + * As described in "A New Algorithm for Identifying Loops in Decompilation" by Wei et al, > + * adapted to be non-recursive. > + */ > +static int compute_loops_in_subprog(struct bpf_verifier_env *env, struct loops_dfs *dfs, > + int subprog_idx) > +{ > + struct bpf_insn_aux_data *aux = env->insn_aux_data; > + struct dfs_state *state = dfs->state; > + int start = env->subprog_info[subprog_idx].start; > + int *dfs_pos = dfs->dfs_pos; > + int *stack = dfs->stack; > + int i, s, h, err, cur, stack_sz; > + struct bpf_iarray *succ; > + > + stack[0] = start; > + state[start].traversed = true; > + state[start].next_succ = 0; > + dfs_pos[start] = 1; > + stack_sz = 1; > + i = 0; > + do { > + /* > + * The algorithm should be very fast in practice, > + * guard against pathological inputs, just in case. > + */ > + if (i++ == 1024) { > + i = 0; Nit: Maybe if (!(++i % 1024)) ? > + if (signal_pending(current)) > + return -EAGAIN; > + cond_resched(); > + } > + > + cur = stack[stack_sz - 1]; > + succ = bpf_insn_successors(env, cur); > + if (state[cur].next_succ == succ->cnt) { > + dfs_pos[cur] = 0; > + stack_sz--; > + continue; > + } > + s = succ->items[state[cur].next_succ]; > + if (!state[s].traversed) { > + /* Case A: start -> ... -> cur -> s [unxplored] */ Typo: unxplored -> unexplored > + state[s].traversed = true; > + state[s].next_succ = 0; > + stack[stack_sz] = s; > + dfs_pos[s] = stack_sz + 1; > + stack_sz++; > + continue; > + } > + /* 's' is fully explored at this point */ > + if (dfs_pos[s]) { > + /* > + * start -> ... -> s -> cur --. > + * ^ | > + * '----------' > + * Case B: 's' is in the current DFS path. > + */ > + err = assign_header(env, dfs, cur, s); > + if (err) > + return err; > + } else if (aux[s].loop_header == -1) { > + /* > + * start -> ... -> ... -> s -> ... -> end > + * | ^ > + * '---> cur ---' > + * Case C: 's' is explored, not in the current DFS path, > + * and not a part of any loop. > + */ > + } else if (dfs_pos[aux[s].loop_header]) { > + /* > + * .----------------------. > + * v | > + * start -> ... -> h -> ... -> ... -> s --' > + * | ^ > + * '---> cur ---' > + * Case D: 's' is explored, not in current DFS path, > + * but it's innermost loop header is. Typo: it's -> its > + */ > + err = assign_header(env, dfs, cur, aux[s].loop_header); > + if (err) > + return err; > + } else { > + /* case E */ > + h = aux[s].loop_header; > + mark_irreducible(env, h); > + /* can also mark 's' as reentry, but no need for now */ > + while (aux[h].loop_header != -1) { > + h = aux[h].loop_header; > + if (dfs_pos[h]) { > + err = assign_header(env, dfs, cur, h); > + if (err) > + return err; > + break; > + } > + mark_irreducible(env, h); > + } > + } > + state[cur].next_succ++; > + } while (stack_sz); > + > + return 0; > +} > + > +int bpf_compute_loops(struct bpf_verifier_env *env) > +{ > + struct bpf_insn_aux_data *aux = env->insn_aux_data; > + int i, err = 0, len = env->prog->len; > + struct loops_dfs dfs = {}; > + > + dfs.dfs_pos = kvcalloc(len, sizeof(int), GFP_KERNEL_ACCOUNT); > + dfs.state = kvcalloc(len, sizeof(struct dfs_state), GFP_KERNEL_ACCOUNT); > + dfs.stack = kvcalloc(len, sizeof(int), GFP_KERNEL_ACCOUNT); > + if (!dfs.dfs_pos || !dfs.state || !dfs.stack) { > + err = -ENOMEM; > + goto out; > + } > + for (i = 0; i < len; i++) > + aux[i].loop_header = -1; > + for (i = 0; i < env->subprog_cnt; i++) { > + err = compute_loops_in_subprog(env, &dfs, i); > + if (err) > + goto out; > + } > + > +out: > + kvfree(dfs.dfs_pos); > + kvfree(dfs.stack); > + kvfree(dfs.state); > + return err; > +} > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > index 3d569e36128c..2b3584712ad2 100644 > --- a/kernel/bpf/verifier.c > +++ b/kernel/bpf/verifier.c > @@ -19627,13 +19627,17 @@ static void log_program(struct bpf_verifier_env *env) > u32 insn_cnt = env->prog->len; > u32 i, j; > > - verbose(env, "Program dump (scc? insn#: live_regs_before):\n"); > + verbose(env, "Program dump (scc? loop_header? insn#: live_regs_before):\n"); > for (i = 0; i < insn_cnt; ++i) { > verbose_linfo(env, i, " ; "); > if (env->insn_aux_data[i].scc) > verbose(env, "%3d ", env->insn_aux_data[i].scc); > else > verbose(env, " "); > + if (env->insn_aux_data[i].loop_header >= 0) > + verbose(env, "%3d ", env->insn_aux_data[i].loop_header); > + else > + verbose(env, " "); > verbose(env, "%3d: ", i); > for (j = BPF_REG_0; j < BPF_REG_10; ++j) > if (insn_aux[i].live_regs_before & BIT(j)) > @@ -19791,6 +19795,10 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, > if (ret < 0) > goto skip_full_check; > > + ret = bpf_compute_loops(env); > + if (ret < 0) > + goto skip_full_check; > + > ret = bpf_compute_scc(env); > if (ret < 0) > goto skip_full_check; ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH RFC bpf-next v3 2/6] bpf: compute loops hierarchy 2026-06-01 19:12 ` Emil Tsalapatis @ 2026-06-01 19:22 ` Eduard Zingerman 2026-06-01 19:29 ` Emil Tsalapatis 0 siblings, 1 reply; 22+ messages in thread From: Eduard Zingerman @ 2026-06-01 19:22 UTC (permalink / raw) To: Emil Tsalapatis, bpf, ast Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song On Mon, 2026-06-01 at 15:12 -0400, Emil Tsalapatis wrote: [...] > > +static int mark_header(struct bpf_verifier_env *env, int h) > > Maybe a more descriptive name? Not sure what we are marking the header > as. The idea is to mark instruction at index 'h' as a header, hence mark_header(). Can change to 'mark_as_header()', open to other suggestions. I'll add a comment on how this is different from assign_header() (which restores hierarchy, as far as I remember). > > +{ > > + struct bpf_insn_aux_data *aux = env->insn_aux_data; > > + > > + if (!aux[h].loop) { > > + aux[h].loop = kvzalloc_obj(struct bpf_loop, GFP_KERNEL_ACCOUNT); > > + if (!aux[h].loop) > > + return -ENOMEM; > > + } > > + return 0; > > +} > > + > > +static int assign_header(struct bpf_verifier_env *env, struct loops_dfs *dfs, int n, int h) > > +{ > > + struct bpf_insn_aux_data *aux = env->insn_aux_data; > > + int *dfs_pos = dfs->dfs_pos; > > + int err, nh; > > + > > + err = mark_header(env, h); > > + if (err) > > + return err; > > + > > + /* Don't encode self-loops, otherwise can't reflect loops nesting structure. */ > > + if (n == h) > > + return 0; > > + > > + /* Make sure that loop headers up the chain are sorted by dfs_pos. */ > > + while (aux[n].loop_header != -1) { > > + nh = aux[n].loop_header; > > + if (nh == h) > > + return 0; > > + if (dfs_pos[nh] < dfs_pos[h]) { > > + aux[n].loop_header = h; > > + n = h; > > + h = nh; > > + } else { > > + n = nh; > > + } > > + } > > + aux[n].loop_header = h; > > + return 0; > > +} > > + > > +/* > > + * As described in "A New Algorithm for Identifying Loops in Decompilation" by Wei et al, > > + * adapted to be non-recursive. > > + */ > > +static int compute_loops_in_subprog(struct bpf_verifier_env *env, struct loops_dfs *dfs, > > + int subprog_idx) > > +{ > > + struct bpf_insn_aux_data *aux = env->insn_aux_data; > > + struct dfs_state *state = dfs->state; > > + int start = env->subprog_info[subprog_idx].start; > > + int *dfs_pos = dfs->dfs_pos; > > + int *stack = dfs->stack; > > + int i, s, h, err, cur, stack_sz; > > + struct bpf_iarray *succ; > > + > > + stack[0] = start; > > + state[start].traversed = true; > > + state[start].next_succ = 0; > > + dfs_pos[start] = 1; > > + stack_sz = 1; > > + i = 0; > > + do { > > + /* > > + * The algorithm should be very fast in practice, > > + * guard against pathological inputs, just in case. > > + */ > > + if (i++ == 1024) { > > + i = 0; > Nit: Maybe > > if (!(++i % 1024)) > > ? Ack. > > + if (signal_pending(current)) > > + return -EAGAIN; > > + cond_resched(); > > + } > > + > > + cur = stack[stack_sz - 1]; > > + succ = bpf_insn_successors(env, cur); > > + if (state[cur].next_succ == succ->cnt) { > > + dfs_pos[cur] = 0; > > + stack_sz--; > > + continue; > > + } > > + s = succ->items[state[cur].next_succ]; > > + if (!state[s].traversed) { > > + /* Case A: start -> ... -> cur -> s [unxplored] */ > Typo: unxplored -> unexplored Ack. > > + state[s].traversed = true; > > + state[s].next_succ = 0; > > + stack[stack_sz] = s; > > + dfs_pos[s] = stack_sz + 1; > > + stack_sz++; > > + continue; > > + } > > + /* 's' is fully explored at this point */ > > + if (dfs_pos[s]) { > > + /* > > + * start -> ... -> s -> cur --. > > + * ^ | > > + * '----------' > > + * Case B: 's' is in the current DFS path. > > + */ > > + err = assign_header(env, dfs, cur, s); > > + if (err) > > + return err; > > + } else if (aux[s].loop_header == -1) { > > + /* > > + * start -> ... -> ... -> s -> ... -> end > > + * | ^ > > + * '---> cur ---' > > + * Case C: 's' is explored, not in the current DFS path, > > + * and not a part of any loop. > > + */ > > + } else if (dfs_pos[aux[s].loop_header]) { > > + /* > > + * .----------------------. > > + * v | > > + * start -> ... -> h -> ... -> ... -> s --' > > + * | ^ > > + * '---> cur ---' > > + * Case D: 's' is explored, not in current DFS path, > > + * but it's innermost loop header is. > Typo: it's -> its Ack. [...] ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH RFC bpf-next v3 2/6] bpf: compute loops hierarchy 2026-06-01 19:22 ` Eduard Zingerman @ 2026-06-01 19:29 ` Emil Tsalapatis 0 siblings, 0 replies; 22+ messages in thread From: Emil Tsalapatis @ 2026-06-01 19:29 UTC (permalink / raw) To: Eduard Zingerman, Emil Tsalapatis, bpf, ast Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song On Mon Jun 1, 2026 at 3:22 PM EDT, Eduard Zingerman wrote: > On Mon, 2026-06-01 at 15:12 -0400, Emil Tsalapatis wrote: > > [...] > >> > +static int mark_header(struct bpf_verifier_env *env, int h) >> >> Maybe a more descriptive name? Not sure what we are marking the header >> as. > > The idea is to mark instruction at index 'h' as a header, hence mark_header(). > Can change to 'mark_as_header()', open to other suggestions. > I'll add a comment on how this is different from assign_header() > (which restores hierarchy, as far as I remember). > mark_as_header() sounds much better, thanks. >> > +{ >> > + struct bpf_insn_aux_data *aux = env->insn_aux_data; >> > + >> > + if (!aux[h].loop) { >> > + aux[h].loop = kvzalloc_obj(struct bpf_loop, GFP_KERNEL_ACCOUNT); >> > + if (!aux[h].loop) >> > + return -ENOMEM; >> > + } >> > + return 0; >> > +} >> > + >> > +static int assign_header(struct bpf_verifier_env *env, struct loops_dfs *dfs, int n, int h) >> > +{ >> > + struct bpf_insn_aux_data *aux = env->insn_aux_data; >> > + int *dfs_pos = dfs->dfs_pos; >> > + int err, nh; >> > + >> > + err = mark_header(env, h); >> > + if (err) >> > + return err; >> > + >> > + /* Don't encode self-loops, otherwise can't reflect loops nesting structure. */ >> > + if (n == h) >> > + return 0; >> > + >> > + /* Make sure that loop headers up the chain are sorted by dfs_pos. */ >> > + while (aux[n].loop_header != -1) { >> > + nh = aux[n].loop_header; >> > + if (nh == h) >> > + return 0; >> > + if (dfs_pos[nh] < dfs_pos[h]) { >> > + aux[n].loop_header = h; >> > + n = h; >> > + h = nh; >> > + } else { >> > + n = nh; >> > + } >> > + } >> > + aux[n].loop_header = h; >> > + return 0; >> > +} >> > + >> > +/* >> > + * As described in "A New Algorithm for Identifying Loops in Decompilation" by Wei et al, >> > + * adapted to be non-recursive. >> > + */ >> > +static int compute_loops_in_subprog(struct bpf_verifier_env *env, struct loops_dfs *dfs, >> > + int subprog_idx) >> > +{ >> > + struct bpf_insn_aux_data *aux = env->insn_aux_data; >> > + struct dfs_state *state = dfs->state; >> > + int start = env->subprog_info[subprog_idx].start; >> > + int *dfs_pos = dfs->dfs_pos; >> > + int *stack = dfs->stack; >> > + int i, s, h, err, cur, stack_sz; >> > + struct bpf_iarray *succ; >> > + >> > + stack[0] = start; >> > + state[start].traversed = true; >> > + state[start].next_succ = 0; >> > + dfs_pos[start] = 1; >> > + stack_sz = 1; >> > + i = 0; >> > + do { >> > + /* >> > + * The algorithm should be very fast in practice, >> > + * guard against pathological inputs, just in case. >> > + */ >> > + if (i++ == 1024) { >> > + i = 0; >> Nit: Maybe >> >> if (!(++i % 1024)) >> >> ? > > Ack. > >> > + if (signal_pending(current)) >> > + return -EAGAIN; >> > + cond_resched(); >> > + } >> > + >> > + cur = stack[stack_sz - 1]; >> > + succ = bpf_insn_successors(env, cur); >> > + if (state[cur].next_succ == succ->cnt) { >> > + dfs_pos[cur] = 0; >> > + stack_sz--; >> > + continue; >> > + } >> > + s = succ->items[state[cur].next_succ]; >> > + if (!state[s].traversed) { >> > + /* Case A: start -> ... -> cur -> s [unxplored] */ >> Typo: unxplored -> unexplored > > Ack. > >> > + state[s].traversed = true; >> > + state[s].next_succ = 0; >> > + stack[stack_sz] = s; >> > + dfs_pos[s] = stack_sz + 1; >> > + stack_sz++; >> > + continue; >> > + } >> > + /* 's' is fully explored at this point */ >> > + if (dfs_pos[s]) { >> > + /* >> > + * start -> ... -> s -> cur --. >> > + * ^ | >> > + * '----------' >> > + * Case B: 's' is in the current DFS path. >> > + */ >> > + err = assign_header(env, dfs, cur, s); >> > + if (err) >> > + return err; >> > + } else if (aux[s].loop_header == -1) { >> > + /* >> > + * start -> ... -> ... -> s -> ... -> end >> > + * | ^ >> > + * '---> cur ---' >> > + * Case C: 's' is explored, not in the current DFS path, >> > + * and not a part of any loop. >> > + */ >> > + } else if (dfs_pos[aux[s].loop_header]) { >> > + /* >> > + * .----------------------. >> > + * v | >> > + * start -> ... -> h -> ... -> ... -> s --' >> > + * | ^ >> > + * '---> cur ---' >> > + * Case D: 's' is explored, not in current DFS path, >> > + * but it's innermost loop header is. >> Typo: it's -> its > > Ack. > > [...] ^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH RFC bpf-next v3 3/6] selftests/bpf: test cases for loop hierarchy computation 2026-05-27 7:29 [PATCH RFC bpf-next v3 0/6] bpf: better error reporting when verifier hits 1M instructions limit Eduard Zingerman 2026-05-27 7:29 ` [PATCH RFC bpf-next v3 1/6] bpf: move live registers and scc printout to a standalone function Eduard Zingerman 2026-05-27 7:29 ` [PATCH RFC bpf-next v3 2/6] bpf: compute loops hierarchy Eduard Zingerman @ 2026-05-27 7:29 ` Eduard Zingerman 2026-05-27 8:50 ` sashiko-bot 2026-06-01 19:37 ` Emil Tsalapatis 2026-05-27 7:29 ` [PATCH RFC bpf-next v3 4/6] bpf: report hot simulated callchains when 1M instructions limit is met Eduard Zingerman ` (2 subsequent siblings) 5 siblings, 2 replies; 22+ messages in thread From: Eduard Zingerman @ 2026-05-27 7:29 UTC (permalink / raw) To: bpf, ast Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song, Eduard Zingerman Test cases covering the following branches in bpf_compute_loops(): - Case B: simple backedge creating a loop (loop_single) - Case B: two independent loops (loop_two_independent) - Case B + D: nested loops where inner header's loop_header points to outer (loop_nested) - Case C: diamond CFG with no loops (fwd_edges_no_loop) - Case D: sibling inner loops within one outer loop (loop_nested_siblings) - Three levels of loop nesting (loop_three_levels) - Loop with if-else body containing forward branches (loop_with_if_else) Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> --- tools/testing/selftests/bpf/prog_tests/verifier.c | 2 + .../selftests/bpf/progs/verifier_loop_hierarchy.c | 233 +++++++++++++++++++++ 2 files changed, 235 insertions(+) diff --git a/tools/testing/selftests/bpf/prog_tests/verifier.c b/tools/testing/selftests/bpf/prog_tests/verifier.c index 219ff2969868..35d92d176c1f 100644 --- a/tools/testing/selftests/bpf/prog_tests/verifier.c +++ b/tools/testing/selftests/bpf/prog_tests/verifier.c @@ -57,6 +57,7 @@ #include "verifier_live_stack.skel.h" #include "verifier_liveness_exp.skel.h" #include "verifier_load_acquire.skel.h" +#include "verifier_loop_hierarchy.skel.h" #include "verifier_loops1.skel.h" #include "verifier_lwt.skel.h" #include "verifier_map_in_map.skel.h" @@ -208,6 +209,7 @@ void test_verifier_leak_ptr(void) { RUN(verifier_leak_ptr); } void test_verifier_linked_scalars(void) { RUN(verifier_linked_scalars); } void test_verifier_live_stack(void) { RUN(verifier_live_stack); } void test_verifier_liveness_exp(void) { RUN(verifier_liveness_exp); } +void test_verifier_loop_hierarchy(void) { RUN(verifier_loop_hierarchy); } void test_verifier_loops1(void) { RUN(verifier_loops1); } void test_verifier_lwt(void) { RUN(verifier_lwt); } void test_verifier_map_in_map(void) { RUN(verifier_map_in_map); } diff --git a/tools/testing/selftests/bpf/progs/verifier_loop_hierarchy.c b/tools/testing/selftests/bpf/progs/verifier_loop_hierarchy.c new file mode 100644 index 000000000000..84cabfa42485 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/verifier_loop_hierarchy.c @@ -0,0 +1,233 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2026 Meta Platforms, Inc. and affiliates. */ + +#include <linux/bpf.h> +#include <bpf/bpf_helpers.h> +#include "bpf_misc.h" + +/* + * kernel/bpf/loops.c:compute_loops() distinguish between + * the following cases: + * - B: backedge -> simple loop + * - C: cross edge to non-loop node -> no-op + * - D: edge to node whose header is in DFS path -> nested loop + * - E: edge to node whose header is NOT in DFS path -> irreducible + * + * Below test cases cover the above branches in various combinations. + */ + +/* Case B: single bounded loop. */ +SEC("socket") +__success +__log_level(2) +__msg("Program dump") +__msg(" 0: {{.*}} (b7) r0 = 0") +__msg(" 1 1: {{.*}} (07) r0 += 1") +__msg(" 1 1 2: {{.*}} (a5) if r0 < 0xa goto pc-2") +__msg(" 3: {{.*}} (95) exit") +__naked void loop_single(void) +{ + asm volatile (" \ + r0 = 0; \ +1: r0 += 1; \ + if r0 < 10 goto 1b; \ + exit; \ +" ::: __clobber_all); +} + +/* Case B: two independent loops at the same nesting level. */ +SEC("socket") +__success +__log_level(2) +__msg("Program dump") +__msg(" 0: {{.*}} (b7) r0 = 0") +__msg(" 2 1: {{.*}} (07) r0 += 1") +__msg(" 2 1 2: {{.*}} (a5) if r0 < 0xa goto pc-2") +__msg(" 3: {{.*}} (b7) r1 = 0") +__msg(" 1 4: {{.*}} (07) r1 += 1") +__msg(" 1 4 5: {{.*}} (a5) if r1 < 0xa goto pc-2") +__msg(" 6: {{.*}} (95) exit") +__naked void loop_two_independent(void) +{ + asm volatile (" \ + r0 = 0; \ +1: r0 += 1; \ + if r0 < 10 goto 1b; \ + r1 = 0; \ +2: r1 += 1; \ + if r1 < 10 goto 2b; \ + exit; \ +" ::: __clobber_all); +} + +/* Case B + D: nested loops. */ +SEC("socket") +__success +__log_level(2) +__msg("Program dump") +__msg(" 0: {{.*}} (b7) r0 = 0") +__msg(" 1 1: {{.*}} (07) r0 += 1") /* outer loop header */ +__msg(" 1 1 2: {{.*}} (b7) r1 = 0") /* outer loop insn */ +__msg(" 1 1 3: {{.*}} (07) r1 += 1") /* inner loop header */ +__msg(" 1 3 4: {{.*}} (a5) if r1 < 0x5 goto pc-2") /* inner loop insn */ +__msg(" 1 1 5: {{.*}} (a5) if r0 < 0xa goto pc-5") /* outer loop insn */ +__msg(" 6: {{.*}} (95) exit") +__naked void loop_nested(void) +{ + asm volatile (" \ + r0 = 0; \ +1: r0 += 1; \ + r1 = 0; \ +2: r1 += 1; \ + if r1 < 5 goto 2b; \ + if r0 < 10 goto 1b; \ + exit; \ +" ::: __clobber_all); +} + +/* Case C: forward edges, no loops. */ +SEC("socket") +__success +__log_level(2) +__msg("Program dump") +__msg(" 0: {{.*}} (85) call bpf_get_prandom_u32") +__msg(" 1: {{.*}} (25) if r0 > 0x0 goto pc+2") +__msg(" 2: {{.*}} (b7) r0 = 2") +__msg(" 3: {{.*}} (05) goto pc+1") +__msg(" 4: {{.*}} (b7) r0 = 3") +__msg(" 5: {{.*}} (95) exit") +__naked void fwd_edges_no_loop(void) +{ + asm volatile (" \ + call %[bpf_get_prandom_u32]; \ + if r0 > 0 goto 1f; \ + r0 = 2; \ + goto 2f; \ +1: r0 = 3; \ +2: exit; \ +" : + : __imm(bpf_get_prandom_u32) + : __clobber_all); +} + +/* Case B + D: two sibling inner loops within one outer loop. */ +SEC("socket") +__success +__log_level(2) +__msg("Program dump") +__msg(" 0: {{.*}} (b7) r0 = 0") +__msg(" 1 1: {{.*}} (07) r0 += 1") +__msg(" 1 1 2: {{.*}} (b7) r1 = 0") +__msg(" 1 1 3: {{.*}} (07) r1 += 1") +__msg(" 1 3 4: {{.*}} (a5) if r1 < 0x5 goto pc-2") +__msg(" 1 1 5: {{.*}} (b7) r2 = 0") +__msg(" 1 1 6: {{.*}} (07) r2 += 1") +__msg(" 1 6 7: {{.*}} (a5) if r2 < 0x5 goto pc-2") +__msg(" 1 1 8: {{.*}} (a5) if r0 < 0xa goto pc-8") +__msg(" 9: {{.*}} (95) exit") +__naked void loop_nested_siblings(void) +{ + asm volatile (" \ + r0 = 0; \ +1: r0 += 1; \ + r1 = 0; \ +2: r1 += 1; \ + if r1 < 5 goto 2b; \ + r2 = 0; \ +3: r2 += 1; \ + if r2 < 5 goto 3b; \ + if r0 < 10 goto 1b; \ + exit; \ +" ::: __clobber_all); +} + +/* Three levels of nesting. */ +SEC("socket") +__success +__log_level(2) +__msg("Program dump") +__msg(" 0: {{.*}} (b7) r0 = 0") +__msg(" 1 1: {{.*}} (07) r0 += 1") +__msg(" 1 1 2: {{.*}} (b7) r1 = 0") +__msg(" 1 1 3: {{.*}} (07) r1 += 1") +__msg(" 1 3 4: {{.*}} (b7) r2 = 0") +__msg(" 1 3 5: {{.*}} (07) r2 += 1") +__msg(" 1 5 6: {{.*}} (a5) if r2 < 0x3 goto pc-2") +__msg(" 1 3 7: {{.*}} (a5) if r1 < 0x5 goto pc-5") +__msg(" 1 1 8: {{.*}} (a5) if r0 < 0xa goto pc-8") +__msg(" 9: {{.*}} (95) exit") +__naked void loop_three_levels(void) +{ + asm volatile (" \ + r0 = 0; \ +1: r0 += 1; \ + r1 = 0; \ +2: r1 += 1; \ + r2 = 0; \ +3: r2 += 1; \ + if r2 < 3 goto 3b; \ + if r1 < 5 goto 2b; \ + if r0 < 10 goto 1b; \ + exit; \ +" ::: __clobber_all); +} + +/* Loop with an if-else body (forward branch inside loop, Case C). */ +SEC("socket") +__success +__log_level(2) +__msg("Program dump") +__msg(" 0: {{.*}} (b7) r0 = 0") +__msg(" 1 1: {{.*}} (07) r0 += 1") +__msg(" 1 1 2: {{.*}} (bf) r1 = r0") +__msg(" 1 1 3: {{.*}} (25) if r1 > 0x5 goto pc+1") +__msg(" 1 1 4: {{.*}} (b7) r1 = 1") +__msg(" 1 1 5: {{.*}} (0f) r0 += r1") +__msg(" 1 1 6: {{.*}} (a5) if r0 < 0x64 goto pc-6") +__msg(" 7: {{.*}} (95) exit") +__naked void loop_with_if_else(void) +{ + asm volatile (" \ + r0 = 0; \ +1: r0 += 1; \ + r1 = r0; \ + if r1 > 5 goto 2f; \ + r1 = 1; \ +2: r0 += r1; \ + if r0 < 100 goto 1b; \ + exit; \ +" ::: __clobber_all); +} + +/* Case E: irreducible loop. */ +SEC("socket") +__success +__log_level(2) +__msg("Program dump") +__msg(" 0: {{.*}} (85) call bpf_get_prandom_u32") +__msg(" 1: {{.*}} (b7) r1 = 0") +__msg(" 2: {{.*}} (25) if r0 > 0x5 goto pc+2") +__msg(" 1 3: {{.*}} (b7) r1 = 1") +__msg(" 1 3 4: {{.*}} (05) goto pc+1") +__msg(" 5: {{.*}} (b7) r1 = 2") +__msg(" 1 3 6: {{.*}} (0f) r0 += r1") +__msg(" 1 3 7: {{.*}} (a5) if r0 < 0x10 goto pc-5") +__msg(" 8: {{.*}} (95) exit") +__naked void loop_irreducible(void) +{ + asm volatile (" \ + call %[bpf_get_prandom_u32]; \ + r1 = 0; \ + if r0 > 5 goto 2f; \ +1: r1 = 1; \ + goto 3f; \ +2: r1 = 2; \ +3: r0 += r1; \ + if r0 < 16 goto 1b; \ + exit; \ +" : + : __imm(bpf_get_prandom_u32) + : __clobber_all); +} + +char _license[] SEC("license") = "GPL"; -- 2.53.0 ^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [PATCH RFC bpf-next v3 3/6] selftests/bpf: test cases for loop hierarchy computation 2026-05-27 7:29 ` [PATCH RFC bpf-next v3 3/6] selftests/bpf: test cases for loop hierarchy computation Eduard Zingerman @ 2026-05-27 8:50 ` sashiko-bot 2026-06-01 19:37 ` Emil Tsalapatis 1 sibling, 0 replies; 22+ messages in thread From: sashiko-bot @ 2026-05-27 8:50 UTC (permalink / raw) To: Eduard Zingerman; +Cc: bpf Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider: - [Low] The commit message omits the test case for irreducible loops (Case E), which is implemented in the patch. -- commit 0c9a2a0bab3cd380638bc61a05140fb934767741 Author: Eduard Zingerman <eddyz87@gmail.com> selftests/bpf: test cases for loop hierarchy computation This commit adds test cases for loop hierarchy computation to the BPF selftests. It covers various branches in bpf_compute_loops() to ensure correct handling of single, nested, independent, and irreducible loops. > Test cases covering the following branches in bpf_compute_loops(): > - Case B: simple backedge creating a loop (loop_single) > - Case B: two independent loops (loop_two_independent) > - Case B + D: nested loops where inner header's loop_header points to > outer (loop_nested) > - Case C: diamond CFG with no loops (fwd_edges_no_loop) > - Case D: sibling inner loops within one outer loop > (loop_nested_siblings) > - Three levels of loop nesting (loop_three_levels) > - Loop with if-else body containing forward branches > (loop_with_if_else) [Severity: Low] This isn't a bug, but could the commit message also mention the test case for irreducible loops? The patch includes the implementation for loop_irreducible() (Case E) in the code, but it appears to be omitted from this list. -- Sashiko AI review · https://sashiko.dev/#/patchset/20260527-better-1m-reporting-v3-0-b3ede0588a75@gmail.com?part=3 ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH RFC bpf-next v3 3/6] selftests/bpf: test cases for loop hierarchy computation 2026-05-27 7:29 ` [PATCH RFC bpf-next v3 3/6] selftests/bpf: test cases for loop hierarchy computation Eduard Zingerman 2026-05-27 8:50 ` sashiko-bot @ 2026-06-01 19:37 ` Emil Tsalapatis 2026-06-01 19:44 ` Eduard Zingerman 1 sibling, 1 reply; 22+ messages in thread From: Emil Tsalapatis @ 2026-06-01 19:37 UTC (permalink / raw) To: Eduard Zingerman, bpf, ast Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song On Wed May 27, 2026 at 3:29 AM EDT, Eduard Zingerman wrote: > Test cases covering the following branches in bpf_compute_loops(): > - Case B: simple backedge creating a loop (loop_single) > - Case B: two independent loops (loop_two_independent) > - Case B + D: nested loops where inner header's loop_header points to > outer (loop_nested) > - Case C: diamond CFG with no loops (fwd_edges_no_loop) > - Case D: sibling inner loops within one outer loop > (loop_nested_siblings) > - Three levels of loop nesting (loop_three_levels) > - Loop with if-else body containing forward branches > (loop_with_if_else) > > Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> I was debating if it's worth adding the edge case where the instruction is its own loop header. It is so obviously wrong that the only issue would be the verifier handling it gracefully, and we can probably skip it. > --- > tools/testing/selftests/bpf/prog_tests/verifier.c | 2 + > .../selftests/bpf/progs/verifier_loop_hierarchy.c | 233 +++++++++++++++++++++ > 2 files changed, 235 insertions(+) > > diff --git a/tools/testing/selftests/bpf/prog_tests/verifier.c b/tools/testing/selftests/bpf/prog_tests/verifier.c > index 219ff2969868..35d92d176c1f 100644 > --- a/tools/testing/selftests/bpf/prog_tests/verifier.c > +++ b/tools/testing/selftests/bpf/prog_tests/verifier.c > @@ -57,6 +57,7 @@ > #include "verifier_live_stack.skel.h" > #include "verifier_liveness_exp.skel.h" > #include "verifier_load_acquire.skel.h" > +#include "verifier_loop_hierarchy.skel.h" > #include "verifier_loops1.skel.h" > #include "verifier_lwt.skel.h" > #include "verifier_map_in_map.skel.h" > @@ -208,6 +209,7 @@ void test_verifier_leak_ptr(void) { RUN(verifier_leak_ptr); } > void test_verifier_linked_scalars(void) { RUN(verifier_linked_scalars); } > void test_verifier_live_stack(void) { RUN(verifier_live_stack); } > void test_verifier_liveness_exp(void) { RUN(verifier_liveness_exp); } > +void test_verifier_loop_hierarchy(void) { RUN(verifier_loop_hierarchy); } > void test_verifier_loops1(void) { RUN(verifier_loops1); } > void test_verifier_lwt(void) { RUN(verifier_lwt); } > void test_verifier_map_in_map(void) { RUN(verifier_map_in_map); } > diff --git a/tools/testing/selftests/bpf/progs/verifier_loop_hierarchy.c b/tools/testing/selftests/bpf/progs/verifier_loop_hierarchy.c > new file mode 100644 > index 000000000000..84cabfa42485 > --- /dev/null > +++ b/tools/testing/selftests/bpf/progs/verifier_loop_hierarchy.c > @@ -0,0 +1,233 @@ > +// SPDX-License-Identifier: GPL-2.0 > +/* Copyright (c) 2026 Meta Platforms, Inc. and affiliates. */ > + > +#include <linux/bpf.h> > +#include <bpf/bpf_helpers.h> > +#include "bpf_misc.h" > + > +/* > + * kernel/bpf/loops.c:compute_loops() distinguish between > + * the following cases: > + * - B: backedge -> simple loop > + * - C: cross edge to non-loop node -> no-op > + * - D: edge to node whose header is in DFS path -> nested loop > + * - E: edge to node whose header is NOT in DFS path -> irreducible > + * > + * Below test cases cover the above branches in various combinations. > + */ > + > +/* Case B: single bounded loop. */ > +SEC("socket") > +__success > +__log_level(2) > +__msg("Program dump") > +__msg(" 0: {{.*}} (b7) r0 = 0") > +__msg(" 1 1: {{.*}} (07) r0 += 1") > +__msg(" 1 1 2: {{.*}} (a5) if r0 < 0xa goto pc-2") > +__msg(" 3: {{.*}} (95) exit") > +__naked void loop_single(void) > +{ > + asm volatile (" \ > + r0 = 0; \ > +1: r0 += 1; \ > + if r0 < 10 goto 1b; \ > + exit; \ > +" ::: __clobber_all); > +} > + > +/* Case B: two independent loops at the same nesting level. */ > +SEC("socket") > +__success > +__log_level(2) > +__msg("Program dump") > +__msg(" 0: {{.*}} (b7) r0 = 0") > +__msg(" 2 1: {{.*}} (07) r0 += 1") > +__msg(" 2 1 2: {{.*}} (a5) if r0 < 0xa goto pc-2") > +__msg(" 3: {{.*}} (b7) r1 = 0") > +__msg(" 1 4: {{.*}} (07) r1 += 1") > +__msg(" 1 4 5: {{.*}} (a5) if r1 < 0xa goto pc-2") > +__msg(" 6: {{.*}} (95) exit") > +__naked void loop_two_independent(void) > +{ > + asm volatile (" \ > + r0 = 0; \ > +1: r0 += 1; \ > + if r0 < 10 goto 1b; \ > + r1 = 0; \ > +2: r1 += 1; \ > + if r1 < 10 goto 2b; \ > + exit; \ > +" ::: __clobber_all); > +} > + > +/* Case B + D: nested loops. */ > +SEC("socket") > +__success > +__log_level(2) > +__msg("Program dump") > +__msg(" 0: {{.*}} (b7) r0 = 0") > +__msg(" 1 1: {{.*}} (07) r0 += 1") /* outer loop header */ > +__msg(" 1 1 2: {{.*}} (b7) r1 = 0") /* outer loop insn */ > +__msg(" 1 1 3: {{.*}} (07) r1 += 1") /* inner loop header */ > +__msg(" 1 3 4: {{.*}} (a5) if r1 < 0x5 goto pc-2") /* inner loop insn */ > +__msg(" 1 1 5: {{.*}} (a5) if r0 < 0xa goto pc-5") /* outer loop insn */ > +__msg(" 6: {{.*}} (95) exit") > +__naked void loop_nested(void) > +{ > + asm volatile (" \ > + r0 = 0; \ > +1: r0 += 1; \ > + r1 = 0; \ > +2: r1 += 1; \ > + if r1 < 5 goto 2b; \ > + if r0 < 10 goto 1b; \ > + exit; \ > +" ::: __clobber_all); > +} > + > +/* Case C: forward edges, no loops. */ > +SEC("socket") > +__success > +__log_level(2) > +__msg("Program dump") > +__msg(" 0: {{.*}} (85) call bpf_get_prandom_u32") > +__msg(" 1: {{.*}} (25) if r0 > 0x0 goto pc+2") > +__msg(" 2: {{.*}} (b7) r0 = 2") > +__msg(" 3: {{.*}} (05) goto pc+1") > +__msg(" 4: {{.*}} (b7) r0 = 3") > +__msg(" 5: {{.*}} (95) exit") > +__naked void fwd_edges_no_loop(void) > +{ > + asm volatile (" \ > + call %[bpf_get_prandom_u32]; \ > + if r0 > 0 goto 1f; \ > + r0 = 2; \ > + goto 2f; \ > +1: r0 = 3; \ > +2: exit; \ > +" : > + : __imm(bpf_get_prandom_u32) > + : __clobber_all); > +} > + > +/* Case B + D: two sibling inner loops within one outer loop. */ > +SEC("socket") > +__success > +__log_level(2) > +__msg("Program dump") > +__msg(" 0: {{.*}} (b7) r0 = 0") > +__msg(" 1 1: {{.*}} (07) r0 += 1") > +__msg(" 1 1 2: {{.*}} (b7) r1 = 0") > +__msg(" 1 1 3: {{.*}} (07) r1 += 1") > +__msg(" 1 3 4: {{.*}} (a5) if r1 < 0x5 goto pc-2") > +__msg(" 1 1 5: {{.*}} (b7) r2 = 0") > +__msg(" 1 1 6: {{.*}} (07) r2 += 1") > +__msg(" 1 6 7: {{.*}} (a5) if r2 < 0x5 goto pc-2") > +__msg(" 1 1 8: {{.*}} (a5) if r0 < 0xa goto pc-8") > +__msg(" 9: {{.*}} (95) exit") > +__naked void loop_nested_siblings(void) > +{ > + asm volatile (" \ > + r0 = 0; \ > +1: r0 += 1; \ > + r1 = 0; \ > +2: r1 += 1; \ > + if r1 < 5 goto 2b; \ > + r2 = 0; \ > +3: r2 += 1; \ > + if r2 < 5 goto 3b; \ > + if r0 < 10 goto 1b; \ > + exit; \ > +" ::: __clobber_all); > +} > + > +/* Three levels of nesting. */ > +SEC("socket") > +__success > +__log_level(2) > +__msg("Program dump") > +__msg(" 0: {{.*}} (b7) r0 = 0") > +__msg(" 1 1: {{.*}} (07) r0 += 1") > +__msg(" 1 1 2: {{.*}} (b7) r1 = 0") > +__msg(" 1 1 3: {{.*}} (07) r1 += 1") > +__msg(" 1 3 4: {{.*}} (b7) r2 = 0") > +__msg(" 1 3 5: {{.*}} (07) r2 += 1") > +__msg(" 1 5 6: {{.*}} (a5) if r2 < 0x3 goto pc-2") > +__msg(" 1 3 7: {{.*}} (a5) if r1 < 0x5 goto pc-5") > +__msg(" 1 1 8: {{.*}} (a5) if r0 < 0xa goto pc-8") > +__msg(" 9: {{.*}} (95) exit") > +__naked void loop_three_levels(void) > +{ > + asm volatile (" \ > + r0 = 0; \ > +1: r0 += 1; \ > + r1 = 0; \ > +2: r1 += 1; \ > + r2 = 0; \ > +3: r2 += 1; \ > + if r2 < 3 goto 3b; \ > + if r1 < 5 goto 2b; \ > + if r0 < 10 goto 1b; \ > + exit; \ > +" ::: __clobber_all); > +} > + > +/* Loop with an if-else body (forward branch inside loop, Case C). */ > +SEC("socket") > +__success > +__log_level(2) > +__msg("Program dump") > +__msg(" 0: {{.*}} (b7) r0 = 0") > +__msg(" 1 1: {{.*}} (07) r0 += 1") > +__msg(" 1 1 2: {{.*}} (bf) r1 = r0") > +__msg(" 1 1 3: {{.*}} (25) if r1 > 0x5 goto pc+1") > +__msg(" 1 1 4: {{.*}} (b7) r1 = 1") > +__msg(" 1 1 5: {{.*}} (0f) r0 += r1") > +__msg(" 1 1 6: {{.*}} (a5) if r0 < 0x64 goto pc-6") > +__msg(" 7: {{.*}} (95) exit") > +__naked void loop_with_if_else(void) > +{ > + asm volatile (" \ > + r0 = 0; \ > +1: r0 += 1; \ > + r1 = r0; \ > + if r1 > 5 goto 2f; \ > + r1 = 1; \ > +2: r0 += r1; \ > + if r0 < 100 goto 1b; \ > + exit; \ > +" ::: __clobber_all); > +} > + > +/* Case E: irreducible loop. */ > +SEC("socket") > +__success > +__log_level(2) > +__msg("Program dump") > +__msg(" 0: {{.*}} (85) call bpf_get_prandom_u32") > +__msg(" 1: {{.*}} (b7) r1 = 0") > +__msg(" 2: {{.*}} (25) if r0 > 0x5 goto pc+2") > +__msg(" 1 3: {{.*}} (b7) r1 = 1") > +__msg(" 1 3 4: {{.*}} (05) goto pc+1") > +__msg(" 5: {{.*}} (b7) r1 = 2") > +__msg(" 1 3 6: {{.*}} (0f) r0 += r1") > +__msg(" 1 3 7: {{.*}} (a5) if r0 < 0x10 goto pc-5") > +__msg(" 8: {{.*}} (95) exit") > +__naked void loop_irreducible(void) > +{ > + asm volatile (" \ > + call %[bpf_get_prandom_u32]; \ > + r1 = 0; \ > + if r0 > 5 goto 2f; \ > +1: r1 = 1; \ > + goto 3f; \ > +2: r1 = 2; \ > +3: r0 += r1; \ > + if r0 < 16 goto 1b; \ > + exit; \ > +" : > + : __imm(bpf_get_prandom_u32) > + : __clobber_all); > +} > + > +char _license[] SEC("license") = "GPL"; ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH RFC bpf-next v3 3/6] selftests/bpf: test cases for loop hierarchy computation 2026-06-01 19:37 ` Emil Tsalapatis @ 2026-06-01 19:44 ` Eduard Zingerman 0 siblings, 0 replies; 22+ messages in thread From: Eduard Zingerman @ 2026-06-01 19:44 UTC (permalink / raw) To: Emil Tsalapatis, bpf, ast Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song On Mon, 2026-06-01 at 15:37 -0400, Emil Tsalapatis wrote: > On Wed May 27, 2026 at 3:29 AM EDT, Eduard Zingerman wrote: > > Test cases covering the following branches in bpf_compute_loops(): > > - Case B: simple backedge creating a loop (loop_single) > > - Case B: two independent loops (loop_two_independent) > > - Case B + D: nested loops where inner header's loop_header points to > > outer (loop_nested) > > - Case C: diamond CFG with no loops (fwd_edges_no_loop) > > - Case D: sibling inner loops within one outer loop > > (loop_nested_siblings) > > - Three levels of loop nesting (loop_three_levels) > > - Loop with if-else body containing forward branches > > (loop_with_if_else) > > > > Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> > > Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> > > I was debating if it's worth adding the edge case where the instruction is > its own loop header. It is so obviously wrong that the only issue would > be the verifier handling it gracefully, and we can probably skip it. I'll add the this test case, just to make sure that the algorithm behaves correctly. [...] ^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH RFC bpf-next v3 4/6] bpf: report hot simulated callchains when 1M instructions limit is met 2026-05-27 7:29 [PATCH RFC bpf-next v3 0/6] bpf: better error reporting when verifier hits 1M instructions limit Eduard Zingerman ` (2 preceding siblings ...) 2026-05-27 7:29 ` [PATCH RFC bpf-next v3 3/6] selftests/bpf: test cases for loop hierarchy computation Eduard Zingerman @ 2026-05-27 7:29 ` Eduard Zingerman 2026-05-29 10:23 ` Jiri Olsa 2026-06-01 19:50 ` Emil Tsalapatis 2026-05-27 7:29 ` [PATCH RFC bpf-next v3 5/6] bpf: report register diff summary for hot callchains Eduard Zingerman 2026-05-27 7:29 ` [PATCH RFC bpf-next v3 6/6] selftests/bpf: test budget exhaustion profiling report Eduard Zingerman 5 siblings, 2 replies; 22+ messages in thread From: Eduard Zingerman @ 2026-05-27 7:29 UTC (permalink / raw) To: bpf, ast Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song, Eduard Zingerman When the verifier exceeds the 1M instruction budget, the current error output shows a random execution trace that happens to be active when the limit is hit, which is not very helpful for debugging. To rectify this, employ a profiler-inspired technique: collect and count "callchain" stack traces that the verifier visits during program validation and report the top 3 hot traces. Sample callchains at instructions marked as force checkpoints and loop header entries. When 1M instructions is met, print a report containing: - callchains verifier visited most often; - disassembly of the subprograms referenced from the printed callchains. Here is an example of the report: budget_nested_call_loop(): ; asm volatile (" \ @ verifier_budget_report.c:44 0: (b7) r0 = 0 1: (85) call pc+1 2: (95) exit budget_nested_call_loop__inner(): ; asm volatile (" \ @ verifier_budget_report.c:54 3: (b7) r0 = 0 4: (07) r0 += 1 5: (05) goto pc-2 #1 most visited simulated stacktrace (visited 499999 times): budget_nested_call_loop/1 (.../verifier_budget_report.c:44) budget_nested_call_loop__inner/4 (.../verifier_budget_report.c:54) Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> --- include/linux/bpf_verifier.h | 14 ++++ kernel/bpf/verifier.c | 155 +++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 169 insertions(+) diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h index d0a1fb135cbf..347a155d8e21 100644 --- a/include/linux/bpf_verifier.h +++ b/include/linux/bpf_verifier.h @@ -9,6 +9,8 @@ #include <linux/filter.h> /* for MAX_BPF_STACK */ #include <linux/tnum.h> #include <linux/cnum.h> +#include <linux/hashtable.h> +#include <linux/jhash.h> /* Maximum variable offset umax_value permitted when resolving memory accesses. * In practice this is far bigger than any realistic pointer offset; this limit @@ -660,6 +662,17 @@ struct bpf_loop { bool irreducible; }; +struct bpf_callchain { + u32 insn_idx[MAX_CALL_FRAMES]; + u32 curframe; +}; + +struct bpf_callchain_entry { + struct hlist_node node; + struct bpf_callchain cc; + u64 count; +}; + struct bpf_insn_aux_data { union { enum bpf_reg_type ptr_type; /* pointer type for load/store insns */ @@ -1043,6 +1056,7 @@ struct bpf_verifier_env { u32 scc_cnt; struct bpf_iarray *succ; struct bpf_iarray *gotox_tmp_buf; + DECLARE_HASHTABLE(callchain_htab, 8); }; static inline struct bpf_func_info_aux *subprog_aux(struct bpf_verifier_env *env, int subprog) diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 2b3584712ad2..54b7ad65b7fc 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -17247,6 +17247,152 @@ static int check_indirect_jump(struct bpf_verifier_env *env, struct bpf_insn *in return INSN_IDX_UPDATED; } +static void compute_callchain(struct bpf_verifier_state *st, struct bpf_callchain *cc) +{ + int i; + + cc->curframe = st->curframe; + for (i = 0; i <= st->curframe; i++) + cc->insn_idx[i] = bpf_frame_insn_idx(st, i); +} + +static int update_callchain_profile(struct bpf_verifier_env *env, struct bpf_verifier_state *st) +{ + struct bpf_callchain_entry *entry; + struct bpf_callchain cc = {}; + u32 hash; + + compute_callchain(st, &cc); + hash = jhash2(cc.insn_idx, cc.curframe + 1, cc.curframe + 1); + hash_for_each_possible(env->callchain_htab, entry, node, hash) { + if (entry->cc.curframe == cc.curframe && + !memcmp(entry->cc.insn_idx, cc.insn_idx, (cc.curframe + 1) * sizeof(u32))) { + entry->count++; + return 0; + } + } + entry = kzalloc(sizeof(*entry), GFP_KERNEL_ACCOUNT); + if (!entry) + return -ENOMEM; + entry->cc = cc; + entry->count = 1; + hash_add(env->callchain_htab, &entry->node, hash); + return 0; +} + +static void free_callchain_profile(struct bpf_verifier_env *env) +{ + struct bpf_callchain_entry *entry; + struct hlist_node *tmp; + int bkt; + + hash_for_each_safe(env->callchain_htab, bkt, tmp, entry, node) { + hash_del(&entry->node); + kfree(entry); + } +} + +static void print_callchain_entry(struct bpf_verifier_env *env, + struct bpf_callchain_entry *entry, int idx) +{ + struct bpf_callchain *cc = &entry->cc; + const struct bpf_line_info *linfo; + struct bpf_subprog_info *sub; + int i, insn_idx; + + verbose(env, "#%d most visited simulated stacktrace (visited %llu times):\n", + idx, entry->count); + for (i = 0; i <= cc->curframe; i++) { + insn_idx = cc->insn_idx[i]; + sub = bpf_find_containing_subprog(env, insn_idx); + linfo = bpf_find_linfo(env->prog, insn_idx); + if (sub->name) + verbose(env, " %s/%d", sub->name, insn_idx); + else + verbose(env, " subprog#%td/%d", sub - env->subprog_info, insn_idx); + if (linfo) + verbose(env, " (%s:%u)", + btf_name_by_offset(env->prog->aux->btf, linfo->file_name_off), + BPF_LINE_INFO_LINE_NUM(linfo->line_col)); + verbose(env, "\n"); + } +} + +static void disasm_subprog(struct bpf_verifier_env *env, struct bpf_subprog_info *sub) +{ + u32 i, end = (sub + 1)->start; + + if (sub->name) + verbose(env, "%s():\n", sub->name); + else + verbose(env, "subprog#%td:\n", sub - env->subprog_info); + env->prev_linfo = NULL; + for (i = sub->start; i < end; i++) { + verbose_linfo(env, i, " ; "); + verbose(env, " %d: ", i); + bpf_verbose_insn(env, &env->prog->insnsi[i]); + if (bpf_is_ldimm64(&env->prog->insnsi[i])) + i++; + } +} + +/* + * Print several most visited simulated stack traces, + * and a disasembly of related subprograms. + */ +static void print_hotspots(struct bpf_verifier_env *env) +{ + DECLARE_BITMAP(printed_subs, BPF_MAX_SUBPROGS + 2) = {}; + struct bpf_callchain_entry *top[3] = {}; + struct bpf_callchain_entry *entry; + struct bpf_subprog_info *sub; + int i, j, bkt, nr_top = 0; + + if (!(env->log.level & BPF_LOG_LEVEL)) + return; + + /* Collect the hottest callchains */ + hash_for_each(env->callchain_htab, bkt, entry, node) { + for (i = 0; i < 3; i++) { + if (!top[i] || entry->count > top[i]->count) { + memmove(&top[i + 1], &top[i], (2 - i) * sizeof(top[0])); + top[i] = entry; + break; + } + } + } + + for (i = 0; i < 3 && top[i]; i++) + nr_top++; + + if (!nr_top) + return; + + if (!(env->log.level & BPF_LOG_LEVEL2)) + bpf_vlog_reset(&env->log, 0); + + /* Identify callchain subprograms and print disasm of those */ + for (i = 0; i < nr_top; i++) { + struct bpf_callchain *cc = &top[i]->cc; + + for (j = 0; j <= cc->curframe; j++) { + sub = bpf_find_containing_subprog(env, cc->insn_idx[j]); + __set_bit(sub - env->subprog_info, printed_subs); + } + } + + for_each_set_bit(i, printed_subs, env->subprog_cnt) { + disasm_subprog(env, &env->subprog_info[i]); + verbose(env, "\n"); + } + + /* Print the hot callchains */ + for (i = 0; i < nr_top; i++) { + print_callchain_entry(env, top[i], i + 1); + verbose(env, "\n"); + } +} + static int do_check_insn(struct bpf_verifier_env *env, bool *do_print_state) { int err; @@ -17381,6 +17527,7 @@ static int do_check(struct bpf_verifier_env *env) insn_aux = &env->insn_aux_data[env->insn_idx]; if (++env->insn_processed > BPF_COMPLEXITY_LIMIT_INSNS) { + print_hotspots(env); verbose(env, "BPF program is too large. Processed %d insn\n", env->insn_processed); @@ -17390,6 +17537,12 @@ static int do_check(struct bpf_verifier_env *env) state->last_insn_idx = env->prev_insn_idx; state->insn_idx = env->insn_idx; + if (bpf_is_force_checkpoint(env, env->insn_idx) || insn_aux->loop) { + err = update_callchain_profile(env, state); + if (err) + return err; + } + if (bpf_is_prune_point(env, env->insn_idx)) { err = bpf_is_state_visited(env, env->insn_idx); if (err < 0) @@ -19673,6 +19826,7 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, return -ENOMEM; env->bt.env = env; + hash_init(env->callchain_htab); len = (*prog)->len; env->insn_aux_data = @@ -19945,6 +20099,7 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, bpf_clear_insn_aux_data(env, 0, env->prog->len); vfree(env->insn_aux_data); err_free_env: + free_callchain_profile(env); bpf_stack_liveness_free(env); kvfree(env->cfg.insn_postorder); kvfree(env->scc_info); -- 2.53.0 ^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [PATCH RFC bpf-next v3 4/6] bpf: report hot simulated callchains when 1M instructions limit is met 2026-05-27 7:29 ` [PATCH RFC bpf-next v3 4/6] bpf: report hot simulated callchains when 1M instructions limit is met Eduard Zingerman @ 2026-05-29 10:23 ` Jiri Olsa 2026-05-29 18:44 ` Eduard Zingerman 2026-06-01 19:50 ` Emil Tsalapatis 1 sibling, 1 reply; 22+ messages in thread From: Jiri Olsa @ 2026-05-29 10:23 UTC (permalink / raw) To: Eduard Zingerman Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song On Wed, May 27, 2026 at 12:29:52AM -0700, Eduard Zingerman wrote: SNIP > + > static int do_check_insn(struct bpf_verifier_env *env, bool *do_print_state) > { > int err; > @@ -17381,6 +17527,7 @@ static int do_check(struct bpf_verifier_env *env) > insn_aux = &env->insn_aux_data[env->insn_idx]; > > if (++env->insn_processed > BPF_COMPLEXITY_LIMIT_INSNS) { > + print_hotspots(env); hi, this is really cool feature, I've already used it to fix issues :) Would it be possible to make this configurable? like to be able to get this output for program that passed and display more than 3 hottest callchains? thanks, jirka > verbose(env, > "BPF program is too large. Processed %d insn\n", > env->insn_processed); > @@ -17390,6 +17537,12 @@ static int do_check(struct bpf_verifier_env *env) > state->last_insn_idx = env->prev_insn_idx; > state->insn_idx = env->insn_idx; > > + if (bpf_is_force_checkpoint(env, env->insn_idx) || insn_aux->loop) { > + err = update_callchain_profile(env, state); > + if (err) > + return err; > + } > + > if (bpf_is_prune_point(env, env->insn_idx)) { > err = bpf_is_state_visited(env, env->insn_idx); > if (err < 0) > @@ -19673,6 +19826,7 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, > return -ENOMEM; > > env->bt.env = env; > + hash_init(env->callchain_htab); > > len = (*prog)->len; > env->insn_aux_data = > @@ -19945,6 +20099,7 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, > bpf_clear_insn_aux_data(env, 0, env->prog->len); > vfree(env->insn_aux_data); > err_free_env: > + free_callchain_profile(env); > bpf_stack_liveness_free(env); > kvfree(env->cfg.insn_postorder); > kvfree(env->scc_info); > > -- > 2.53.0 > ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH RFC bpf-next v3 4/6] bpf: report hot simulated callchains when 1M instructions limit is met 2026-05-29 10:23 ` Jiri Olsa @ 2026-05-29 18:44 ` Eduard Zingerman 2026-05-30 9:34 ` Jiri Olsa 0 siblings, 1 reply; 22+ messages in thread From: Eduard Zingerman @ 2026-05-29 18:44 UTC (permalink / raw) To: Jiri Olsa Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song On Fri, 2026-05-29 at 12:23 +0200, Jiri Olsa wrote: > On Wed, May 27, 2026 at 12:29:52AM -0700, Eduard Zingerman wrote: > > SNIP > > > + > > static int do_check_insn(struct bpf_verifier_env *env, bool *do_print_state) > > { > > int err; > > @@ -17381,6 +17527,7 @@ static int do_check(struct bpf_verifier_env *env) > > insn_aux = &env->insn_aux_data[env->insn_idx]; > > > > if (++env->insn_processed > BPF_COMPLEXITY_LIMIT_INSNS) { > > + print_hotspots(env); > > hi, > this is really cool feature, I've already used it to fix issues :) Hi Jiri, thank you for the feedback! Anything you find lacking? After discussion with Alexei (who is somewhat skeptical of the whole idea), I plan to add: - samples for values residing in the diverging registers - walk back from iter_next call using jump history and print 5-10 instructions modifying the register in question (in a manner similar to backtracking done for mark chain precision). > Would it be possible to make this configurable? like to be able > to get this output for program that passed and display more than > 3 hottest callchains? For sure, there are several options for a way to toggle this print out: - unconditionally make it a part of v1 log; - add it as a bit in bpf_attr->log_level. I'd say that v1 log is a better option here. As for number of callchains printed, I'd like to avoid tweaking bpf_attr. So this leaves several options: - make it top 5-7-10, just hardcode a bigger number. - add a threshold value specifying how many times a callchain should be visited to get printed. E.g. only print callchains visited more than 1000 times, if such don't exist then put threshold to 100, if such don't exist, print top 3. Wdyt? ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH RFC bpf-next v3 4/6] bpf: report hot simulated callchains when 1M instructions limit is met 2026-05-29 18:44 ` Eduard Zingerman @ 2026-05-30 9:34 ` Jiri Olsa 0 siblings, 0 replies; 22+ messages in thread From: Jiri Olsa @ 2026-05-30 9:34 UTC (permalink / raw) To: Eduard Zingerman Cc: Jiri Olsa, bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song On Fri, May 29, 2026 at 11:44:39AM -0700, Eduard Zingerman wrote: > On Fri, 2026-05-29 at 12:23 +0200, Jiri Olsa wrote: > > On Wed, May 27, 2026 at 12:29:52AM -0700, Eduard Zingerman wrote: > > > > SNIP > > > > > + > > > static int do_check_insn(struct bpf_verifier_env *env, bool *do_print_state) > > > { > > > int err; > > > @@ -17381,6 +17527,7 @@ static int do_check(struct bpf_verifier_env *env) > > > insn_aux = &env->insn_aux_data[env->insn_idx]; > > > > > > if (++env->insn_processed > BPF_COMPLEXITY_LIMIT_INSNS) { > > > + print_hotspots(env); > > > > hi, > > this is really cool feature, I've already used it to fix issues :) > > Hi Jiri, thank you for the feedback! > Anything you find lacking? so in my case the first 2 stacktraces identified data copy deep in the code, which was wrongly done with standard loop instead of bpf_iter, so it led me directly to the problem In another run I got the 'Most varying ...' output and it wasn't that obvious for me how helpful that could be, but I did not check on that in detail yet > > After discussion with Alexei (who is somewhat skeptical of the whole idea), > I plan to add: > - samples for values residing in the diverging registers > - walk back from iter_next call using jump history and > print 5-10 instructions modifying the register in question > (in a manner similar to backtracking done for mark chain precision). > > > Would it be possible to make this configurable? like to be able > > to get this output for program that passed and display more than > > 3 hottest callchains? > > For sure, there are several options for a way to toggle this print out: > - unconditionally make it a part of v1 log; > - add it as a bit in bpf_attr->log_level. > > I'd say that v1 log is a better option here. sounds good, I think it's negligible compared to the rest of the log > > As for number of callchains printed, I'd like to avoid tweaking I guess it's better to have it hardcoded, no need to configure that > bpf_attr. So this leaves several options: > - make it top 5-7-10, just hardcode a bigger number. > - add a threshold value specifying how many times a callchain should > be visited to get printed. E.g. only print callchains visited more > than 1000 times, if such don't exist then put threshold to 100, > if such don't exist, print top 3. +1 to just print top X stacks thanks, jirka ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH RFC bpf-next v3 4/6] bpf: report hot simulated callchains when 1M instructions limit is met 2026-05-27 7:29 ` [PATCH RFC bpf-next v3 4/6] bpf: report hot simulated callchains when 1M instructions limit is met Eduard Zingerman 2026-05-29 10:23 ` Jiri Olsa @ 2026-06-01 19:50 ` Emil Tsalapatis 2026-06-01 19:55 ` Eduard Zingerman 1 sibling, 1 reply; 22+ messages in thread From: Emil Tsalapatis @ 2026-06-01 19:50 UTC (permalink / raw) To: Eduard Zingerman, bpf, ast Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song On Wed May 27, 2026 at 3:29 AM EDT, Eduard Zingerman wrote: > When the verifier exceeds the 1M instruction budget, the current error > output shows a random execution trace that happens to be active when > the limit is hit, which is not very helpful for debugging. > > To rectify this, employ a profiler-inspired technique: > collect and count "callchain" stack traces that the verifier visits > during program validation and report the top 3 hot traces. Sample > callchains at instructions marked as force checkpoints and loop header > entries. > > When 1M instructions is met, print a report containing: > - callchains verifier visited most often; > - disassembly of the subprograms referenced from the printed > callchains. > > Here is an example of the report: > > budget_nested_call_loop(): > ; asm volatile (" \ @ verifier_budget_report.c:44 > 0: (b7) r0 = 0 > 1: (85) call pc+1 > 2: (95) exit > > budget_nested_call_loop__inner(): > ; asm volatile (" \ @ verifier_budget_report.c:54 > 3: (b7) r0 = 0 > 4: (07) r0 += 1 > 5: (05) goto pc-2 > > #1 most visited simulated stacktrace (visited 499999 times): > budget_nested_call_loop/1 (.../verifier_budget_report.c:44) > budget_nested_call_loop__inner/4 (.../verifier_budget_report.c:54) > > Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> Minor suggestion: Can we make the N of top-N a #define for clarity and to make it easy to change during hacking? > --- > include/linux/bpf_verifier.h | 14 ++++ > kernel/bpf/verifier.c | 155 +++++++++++++++++++++++++++++++++++++++++++ > 2 files changed, 169 insertions(+) > > diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h > index d0a1fb135cbf..347a155d8e21 100644 > --- a/include/linux/bpf_verifier.h > +++ b/include/linux/bpf_verifier.h > @@ -9,6 +9,8 @@ > #include <linux/filter.h> /* for MAX_BPF_STACK */ > #include <linux/tnum.h> > #include <linux/cnum.h> > +#include <linux/hashtable.h> > +#include <linux/jhash.h> > > /* Maximum variable offset umax_value permitted when resolving memory accesses. > * In practice this is far bigger than any realistic pointer offset; this limit > @@ -660,6 +662,17 @@ struct bpf_loop { > bool irreducible; > }; > > +struct bpf_callchain { > + u32 insn_idx[MAX_CALL_FRAMES]; > + u32 curframe; > +}; > + > +struct bpf_callchain_entry { > + struct hlist_node node; > + struct bpf_callchain cc; > + u64 count; > +}; > + > struct bpf_insn_aux_data { > union { > enum bpf_reg_type ptr_type; /* pointer type for load/store insns */ > @@ -1043,6 +1056,7 @@ struct bpf_verifier_env { > u32 scc_cnt; > struct bpf_iarray *succ; > struct bpf_iarray *gotox_tmp_buf; > + DECLARE_HASHTABLE(callchain_htab, 8); > }; > > static inline struct bpf_func_info_aux *subprog_aux(struct bpf_verifier_env *env, int subprog) > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > index 2b3584712ad2..54b7ad65b7fc 100644 > --- a/kernel/bpf/verifier.c > +++ b/kernel/bpf/verifier.c > @@ -17247,6 +17247,152 @@ static int check_indirect_jump(struct bpf_verifier_env *env, struct bpf_insn *in > return INSN_IDX_UPDATED; > } > > +static void compute_callchain(struct bpf_verifier_state *st, struct bpf_callchain *cc) > +{ > + int i; > + > + cc->curframe = st->curframe; > + for (i = 0; i <= st->curframe; i++) > + cc->insn_idx[i] = bpf_frame_insn_idx(st, i); > +} > + > +static int update_callchain_profile(struct bpf_verifier_env *env, struct bpf_verifier_state *st) > +{ > + struct bpf_callchain_entry *entry; > + struct bpf_callchain cc = {}; > + u32 hash; > + > + compute_callchain(st, &cc); > + hash = jhash2(cc.insn_idx, cc.curframe + 1, cc.curframe + 1); > + hash_for_each_possible(env->callchain_htab, entry, node, hash) { > + if (entry->cc.curframe == cc.curframe && > + !memcmp(entry->cc.insn_idx, cc.insn_idx, (cc.curframe + 1) * sizeof(u32))) { > + entry->count++; > + return 0; > + } > + } > + entry = kzalloc(sizeof(*entry), GFP_KERNEL_ACCOUNT); > + if (!entry) > + return -ENOMEM; > + entry->cc = cc; > + entry->count = 1; > + hash_add(env->callchain_htab, &entry->node, hash); > + return 0; > +} > + > +static void free_callchain_profile(struct bpf_verifier_env *env) > +{ > + struct bpf_callchain_entry *entry; > + struct hlist_node *tmp; > + int bkt; > + > + hash_for_each_safe(env->callchain_htab, bkt, tmp, entry, node) { > + hash_del(&entry->node); > + kfree(entry); > + } > +} > + > +static void print_callchain_entry(struct bpf_verifier_env *env, > + struct bpf_callchain_entry *entry, int idx) > +{ > + struct bpf_callchain *cc = &entry->cc; > + const struct bpf_line_info *linfo; > + struct bpf_subprog_info *sub; > + int i, insn_idx; > + > + verbose(env, "#%d most visited simulated stacktrace (visited %llu times):\n", > + idx, entry->count); > + for (i = 0; i <= cc->curframe; i++) { > + insn_idx = cc->insn_idx[i]; > + sub = bpf_find_containing_subprog(env, insn_idx); > + linfo = bpf_find_linfo(env->prog, insn_idx); > + if (sub->name) > + verbose(env, " %s/%d", sub->name, insn_idx); > + else > + verbose(env, " subprog#%td/%d", sub - env->subprog_info, insn_idx); > + if (linfo) > + verbose(env, " (%s:%u)", > + btf_name_by_offset(env->prog->aux->btf, linfo->file_name_off), > + BPF_LINE_INFO_LINE_NUM(linfo->line_col)); > + verbose(env, "\n"); > + } > +} > + > +static void disasm_subprog(struct bpf_verifier_env *env, struct bpf_subprog_info *sub) > +{ > + u32 i, end = (sub + 1)->start; > + > + if (sub->name) > + verbose(env, "%s():\n", sub->name); > + else > + verbose(env, "subprog#%td:\n", sub - env->subprog_info); > + env->prev_linfo = NULL; > + for (i = sub->start; i < end; i++) { > + verbose_linfo(env, i, " ; "); > + verbose(env, " %d: ", i); > + bpf_verbose_insn(env, &env->prog->insnsi[i]); > + if (bpf_is_ldimm64(&env->prog->insnsi[i])) > + i++; > + } > +} > + > +/* > + * Print several most visited simulated stack traces, > + * and a disasembly of related subprograms. > + */ > +static void print_hotspots(struct bpf_verifier_env *env) > +{ > + DECLARE_BITMAP(printed_subs, BPF_MAX_SUBPROGS + 2) = {}; > + struct bpf_callchain_entry *top[3] = {}; > + struct bpf_callchain_entry *entry; > + struct bpf_subprog_info *sub; > + int i, j, bkt, nr_top = 0; > + > + if (!(env->log.level & BPF_LOG_LEVEL)) > + return; > + > + /* Collect the hottest callchains */ > + hash_for_each(env->callchain_htab, bkt, entry, node) { > + for (i = 0; i < 3; i++) { > + if (!top[i] || entry->count > top[i]->count) { > + memmove(&top[i + 1], &top[i], (2 - i) * sizeof(top[0])); > + top[i] = entry; > + break; > + } > + } > + } > + > + for (i = 0; i < 3 && top[i]; i++) > + nr_top++; > + > + if (!nr_top) > + return; > + > + if (!(env->log.level & BPF_LOG_LEVEL2)) > + bpf_vlog_reset(&env->log, 0); > + > + /* Identify callchain subprograms and print disasm of those */ > + for (i = 0; i < nr_top; i++) { > + struct bpf_callchain *cc = &top[i]->cc; > + > + for (j = 0; j <= cc->curframe; j++) { > + sub = bpf_find_containing_subprog(env, cc->insn_idx[j]); > + __set_bit(sub - env->subprog_info, printed_subs); > + } > + } > + > + for_each_set_bit(i, printed_subs, env->subprog_cnt) { > + disasm_subprog(env, &env->subprog_info[i]); > + verbose(env, "\n"); > + } > + > + /* Print the hot callchains */ > + for (i = 0; i < nr_top; i++) { > + print_callchain_entry(env, top[i], i + 1); > + verbose(env, "\n"); > + } > +} > + > static int do_check_insn(struct bpf_verifier_env *env, bool *do_print_state) > { > int err; > @@ -17381,6 +17527,7 @@ static int do_check(struct bpf_verifier_env *env) > insn_aux = &env->insn_aux_data[env->insn_idx]; > > if (++env->insn_processed > BPF_COMPLEXITY_LIMIT_INSNS) { > + print_hotspots(env); > verbose(env, > "BPF program is too large. Processed %d insn\n", > env->insn_processed); > @@ -17390,6 +17537,12 @@ static int do_check(struct bpf_verifier_env *env) > state->last_insn_idx = env->prev_insn_idx; > state->insn_idx = env->insn_idx; > > + if (bpf_is_force_checkpoint(env, env->insn_idx) || insn_aux->loop) { > + err = update_callchain_profile(env, state); > + if (err) > + return err; > + } > + > if (bpf_is_prune_point(env, env->insn_idx)) { > err = bpf_is_state_visited(env, env->insn_idx); > if (err < 0) > @@ -19673,6 +19826,7 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, > return -ENOMEM; > > env->bt.env = env; > + hash_init(env->callchain_htab); > > len = (*prog)->len; > env->insn_aux_data = > @@ -19945,6 +20099,7 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, > bpf_clear_insn_aux_data(env, 0, env->prog->len); > vfree(env->insn_aux_data); > err_free_env: > + free_callchain_profile(env); > bpf_stack_liveness_free(env); > kvfree(env->cfg.insn_postorder); > kvfree(env->scc_info); ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH RFC bpf-next v3 4/6] bpf: report hot simulated callchains when 1M instructions limit is met 2026-06-01 19:50 ` Emil Tsalapatis @ 2026-06-01 19:55 ` Eduard Zingerman 0 siblings, 0 replies; 22+ messages in thread From: Eduard Zingerman @ 2026-06-01 19:55 UTC (permalink / raw) To: Emil Tsalapatis, bpf, ast Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song On Mon, 2026-06-01 at 15:50 -0400, Emil Tsalapatis wrote: > On Wed May 27, 2026 at 3:29 AM EDT, Eduard Zingerman wrote: > > When the verifier exceeds the 1M instruction budget, the current error > > output shows a random execution trace that happens to be active when > > the limit is hit, which is not very helpful for debugging. > > > > To rectify this, employ a profiler-inspired technique: > > collect and count "callchain" stack traces that the verifier visits > > during program validation and report the top 3 hot traces. Sample > > callchains at instructions marked as force checkpoints and loop header > > entries. > > > > When 1M instructions is met, print a report containing: > > - callchains verifier visited most often; > > - disassembly of the subprograms referenced from the printed > > callchains. > > > > Here is an example of the report: > > > > budget_nested_call_loop(): > > ; asm volatile (" \ @ verifier_budget_report.c:44 > > 0: (b7) r0 = 0 > > 1: (85) call pc+1 > > 2: (95) exit > > > > budget_nested_call_loop__inner(): > > ; asm volatile (" \ @ verifier_budget_report.c:54 > > 3: (b7) r0 = 0 > > 4: (07) r0 += 1 > > 5: (05) goto pc-2 > > > > #1 most visited simulated stacktrace (visited 499999 times): > > budget_nested_call_loop/1 (.../verifier_budget_report.c:44) > > budget_nested_call_loop__inner/4 (.../verifier_budget_report.c:54) > > > > Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> > > Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> > > Minor suggestion: Can we make the N of top-N a #define for clarity and > to make it easy to change during hacking? Yes, makes sense. [...] ^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH RFC bpf-next v3 5/6] bpf: report register diff summary for hot callchains 2026-05-27 7:29 [PATCH RFC bpf-next v3 0/6] bpf: better error reporting when verifier hits 1M instructions limit Eduard Zingerman ` (3 preceding siblings ...) 2026-05-27 7:29 ` [PATCH RFC bpf-next v3 4/6] bpf: report hot simulated callchains when 1M instructions limit is met Eduard Zingerman @ 2026-05-27 7:29 ` Eduard Zingerman 2026-06-01 21:29 ` Emil Tsalapatis 2026-05-27 7:29 ` [PATCH RFC bpf-next v3 6/6] selftests/bpf: test budget exhaustion profiling report Eduard Zingerman 5 siblings, 1 reply; 22+ messages in thread From: Eduard Zingerman @ 2026-05-27 7:29 UTC (permalink / raw) To: bpf, ast Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song, Eduard Zingerman When a hot callchain ends at a iter_next, may_goto, or callback-calling instruction, compare cached states from explored_states to identify which registers or stack slots most frequently differ between states. Report the top 3 most varying locations. The full states diff computation would require some statistical analysis similar to k-means. To keep things simple, approximate this by modifying states_equal() to return the first location where old and current states differ, and counting most frequent diff locations. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> --- include/linux/bpf_verifier.h | 15 ++++ kernel/bpf/states.c | 184 +++++++++++++++++++++++++++++++++++++------ kernel/bpf/verifier.c | 57 +++++++++++--- 3 files changed, 225 insertions(+), 31 deletions(-) diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h index 347a155d8e21..cc83359774f5 100644 --- a/include/linux/bpf_verifier.h +++ b/include/linux/bpf_verifier.h @@ -662,6 +662,17 @@ struct bpf_loop { bool irreducible; }; +struct bpf_state_diff { + u8 slot; + u8 frame; + enum { + DIFF_OTHER, + DIFF_REG, + DIFF_STACK, + DIFF_ARG, + } kind; +}; + struct bpf_callchain { u32 insn_idx[MAX_CALL_FRAMES]; u32 curframe; @@ -1631,5 +1642,9 @@ int bpf_fixup_call_args(struct bpf_verifier_env *env); int bpf_do_misc_fixups(struct bpf_verifier_env *env); int bpf_compute_loops(struct bpf_verifier_env *env); +int bpf_sample_state_diffs(struct bpf_verifier_env *env, + struct bpf_callchain *cc, + struct bpf_state_diff *top_diffs, + int *nr_diffs); #endif /* _LINUX_BPF_VERIFIER_H */ diff --git a/kernel/bpf/states.c b/kernel/bpf/states.c index 877338136009..b1256658189d 100644 --- a/kernel/bpf/states.c +++ b/kernel/bpf/states.c @@ -4,6 +4,7 @@ #include <linux/bpf_verifier.h> #include <linux/cnum.h> #include <linux/filter.h> +#include <linux/sort.h> #define verbose(env, fmt, args...) bpf_verifier_log_write(env, fmt, ##args) @@ -696,7 +697,7 @@ static struct bpf_reg_state *scalar_reg_for_stack(struct bpf_verifier_env *env, static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old, struct bpf_func_state *cur, struct bpf_idmap *idmap, - enum exact_level exact) + enum exact_level exact, struct bpf_state_diff *diff) { int i, spi; @@ -720,8 +721,11 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old, old_type = STACK_INVALID; if (cur_type == STACK_POISON) cur_type = STACK_INVALID; - if (i >= cur->allocated_stack || old_type != cur_type) + if (i >= cur->allocated_stack || old_type != cur_type) { + diff->slot = spi; + diff->kind = DIFF_STACK; return false; + } } if (old->stack[spi].slot_type[i % BPF_REG_SIZE] == STACK_INVALID || @@ -735,8 +739,11 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old, /* explored stack has more populated slots than current stack * and these slots were used */ - if (i >= cur->allocated_stack) + if (i >= cur->allocated_stack) { + diff->slot = spi; + diff->kind = DIFF_STACK; return false; + } /* * 64 and 32-bit scalar spills vs MISC/INVALID slots and vice versa. @@ -748,8 +755,11 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old, old_reg = scalar_reg_for_stack(env, &old->stack[spi], im); cur_reg = scalar_reg_for_stack(env, &cur->stack[spi], im); if (old_reg && cur_reg) { - if (!regsafe(env, old_reg, cur_reg, idmap, exact)) + if (!regsafe(env, old_reg, cur_reg, idmap, exact)) { + diff->slot = spi; + diff->kind = DIFF_STACK; return false; + } i += (im == 0 ? BPF_REG_SIZE - 1 : 3); continue; } @@ -763,13 +773,16 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old, cur->stack[spi].slot_type[i % BPF_REG_SIZE] == STACK_ZERO) continue; if (old->stack[spi].slot_type[i % BPF_REG_SIZE] != - cur->stack[spi].slot_type[i % BPF_REG_SIZE]) + cur->stack[spi].slot_type[i % BPF_REG_SIZE]) { /* Ex: old explored (safe) state has STACK_SPILL in * this stack slot, but current has STACK_MISC -> * this verifier states are not equivalent, * return false to continue verification of this path */ + diff->slot = spi; + diff->kind = DIFF_STACK; return false; + } if (i % BPF_REG_SIZE != BPF_REG_SIZE - 1) continue; /* Both old and cur are having same slot_type */ @@ -786,16 +799,22 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old, * return false to continue verification of this path */ if (!regsafe(env, &old->stack[spi].spilled_ptr, - &cur->stack[spi].spilled_ptr, idmap, exact)) + &cur->stack[spi].spilled_ptr, idmap, exact)) { + diff->slot = spi; + diff->kind = DIFF_STACK; return false; + } break; case STACK_DYNPTR: old_reg = &old->stack[spi].spilled_ptr; cur_reg = &cur->stack[spi].spilled_ptr; if (old_reg->dynptr.type != cur_reg->dynptr.type || old_reg->dynptr.first_slot != cur_reg->dynptr.first_slot || - !check_ids(old_reg->ref_obj_id, cur_reg->ref_obj_id, idmap)) + !check_ids(old_reg->ref_obj_id, cur_reg->ref_obj_id, idmap)) { + diff->slot = spi; + diff->kind = DIFF_STACK; return false; + } break; case STACK_ITER: old_reg = &old->stack[spi].spilled_ptr; @@ -810,15 +829,21 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old, old_reg->iter.btf_id != cur_reg->iter.btf_id || old_reg->iter.state != cur_reg->iter.state || /* ignore {old_reg,cur_reg}->iter.depth, see above */ - !check_ids(old_reg->ref_obj_id, cur_reg->ref_obj_id, idmap)) + !check_ids(old_reg->ref_obj_id, cur_reg->ref_obj_id, idmap)) { + diff->slot = spi; + diff->kind = DIFF_STACK; return false; + } break; case STACK_IRQ_FLAG: old_reg = &old->stack[spi].spilled_ptr; cur_reg = &cur->stack[spi].spilled_ptr; if (!check_ids(old_reg->ref_obj_id, cur_reg->ref_obj_id, idmap) || - old_reg->irq.kfunc_class != cur_reg->irq.kfunc_class) + old_reg->irq.kfunc_class != cur_reg->irq.kfunc_class) { + diff->slot = spi; + diff->kind = DIFF_STACK; return false; + } break; case STACK_MISC: case STACK_ZERO: @@ -827,6 +852,8 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old, continue; /* Ensure that new unhandled slot types return false by default */ default: + diff->slot = spi; + diff->kind = DIFF_STACK; return false; } } @@ -839,7 +866,7 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old, */ static bool stack_arg_safe(struct bpf_verifier_env *env, struct bpf_func_state *old, struct bpf_func_state *cur, struct bpf_idmap *idmap, - enum exact_level exact) + enum exact_level exact, struct bpf_state_diff *diff) { int i, nslots; @@ -852,8 +879,11 @@ static bool stack_arg_safe(struct bpf_verifier_env *env, struct bpf_func_state * &old->stack_arg_regs[i] : ¬_init; cur_arg = i < cur->out_stack_arg_cnt ? &cur->stack_arg_regs[i] : ¬_init; - if (!regsafe(env, old_arg, cur_arg, idmap, exact)) + if (!regsafe(env, old_arg, cur_arg, idmap, exact)) { + diff->slot = i; + diff->kind = DIFF_ARG; return false; + } } return true; @@ -933,7 +963,8 @@ static bool refsafe(struct bpf_verifier_state *old, struct bpf_verifier_state *c * the current state will reach 'bpf_exit' instruction safely */ static bool func_states_equal(struct bpf_verifier_env *env, struct bpf_func_state *old, - struct bpf_func_state *cur, u32 insn_idx, enum exact_level exact) + struct bpf_func_state *cur, u32 insn_idx, + enum exact_level exact, struct bpf_state_diff *diff) { u16 live_regs = env->insn_aux_data[insn_idx].live_regs_before; u16 i; @@ -947,13 +978,16 @@ static bool func_states_equal(struct bpf_verifier_env *env, struct bpf_func_stat for (i = 0; i < MAX_BPF_REG; i++) if (((1 << i) & live_regs) && !regsafe(env, &old->regs[i], &cur->regs[i], - &env->idmap_scratch, exact)) + &env->idmap_scratch, exact)) { + diff->slot = i; + diff->kind = DIFF_REG; return false; + } - if (!stacksafe(env, old, cur, &env->idmap_scratch, exact)) + if (!stacksafe(env, old, cur, &env->idmap_scratch, exact, diff)) return false; - if (!stack_arg_safe(env, old, cur, &env->idmap_scratch, exact)) + if (!stack_arg_safe(env, old, cur, &env->idmap_scratch, exact, diff)) return false; return true; @@ -970,11 +1004,14 @@ static void reset_idmap_scratch(struct bpf_verifier_env *env) static bool states_equal(struct bpf_verifier_env *env, struct bpf_verifier_state *old, struct bpf_verifier_state *cur, - enum exact_level exact) + enum exact_level exact, + struct bpf_state_diff *diff) { u32 insn_idx; int i; + diff->kind = DIFF_OTHER; + if (old->curframe != cur->curframe) return false; @@ -999,8 +1036,11 @@ static bool states_equal(struct bpf_verifier_env *env, insn_idx = bpf_frame_insn_idx(old, i); if (old->frame[i]->callsite != cur->frame[i]->callsite) return false; - if (!func_states_equal(env, old->frame[i], cur->frame[i], insn_idx, exact)) + if (!func_states_equal(env, old->frame[i], cur->frame[i], + insn_idx, exact, diff)) { + diff->frame = i; return false; + } } return true; } @@ -1231,6 +1271,7 @@ int bpf_is_state_visited(struct bpf_verifier_env *env, int insn_idx) struct bpf_verifier_state_list *new_sl; struct bpf_verifier_state_list *sl; struct bpf_verifier_state *cur = env->cur_state, *new; + struct bpf_state_diff diff = {}; bool force_new_state, add_new_state, loop; int n, err, states_cnt = 0; struct list_head *pos, *tmp, *head; @@ -1320,7 +1361,7 @@ int bpf_is_state_visited(struct bpf_verifier_env *env, int insn_idx) * => unsafe memory access at 11 would not be caught. */ if (is_iter_next_insn(env, insn_idx)) { - if (states_equal(env, &sl->state, cur, RANGE_WITHIN)) { + if (states_equal(env, &sl->state, cur, RANGE_WITHIN, &diff)) { struct bpf_func_state *cur_frame; struct bpf_reg_state *iter_state, *iter_reg; int spi; @@ -1345,13 +1386,13 @@ int bpf_is_state_visited(struct bpf_verifier_env *env, int insn_idx) } if (is_may_goto_insn_at(env, insn_idx)) { if (sl->state.may_goto_depth != cur->may_goto_depth && - states_equal(env, &sl->state, cur, RANGE_WITHIN)) { + states_equal(env, &sl->state, cur, RANGE_WITHIN, &diff)) { loop = true; goto hit; } } if (bpf_calls_callback(env, insn_idx)) { - if (states_equal(env, &sl->state, cur, RANGE_WITHIN)) { + if (states_equal(env, &sl->state, cur, RANGE_WITHIN, &diff)) { loop = true; goto hit; } @@ -1359,7 +1400,7 @@ int bpf_is_state_visited(struct bpf_verifier_env *env, int insn_idx) } /* attempt to detect infinite loop to avoid unnecessary doomed work */ if (states_maybe_looping(&sl->state, cur) && - states_equal(env, &sl->state, cur, EXACT) && + states_equal(env, &sl->state, cur, EXACT, &diff) && !iter_active_depths_differ(&sl->state, cur) && sl->state.may_goto_depth == cur->may_goto_depth && sl->state.callback_unroll_depth == cur->callback_unroll_depth) { @@ -1392,7 +1433,7 @@ int bpf_is_state_visited(struct bpf_verifier_env *env, int insn_idx) } /* See comments for mark_all_regs_read_and_precise() */ loop = incomplete_read_marks(env, &sl->state); - if (states_equal(env, &sl->state, cur, loop ? RANGE_WITHIN : NOT_EXACT)) { + if (states_equal(env, &sl->state, cur, loop ? RANGE_WITHIN : NOT_EXACT, &diff)) { hit: sl->hit_cnt++; @@ -1588,3 +1629,102 @@ int bpf_is_state_visited(struct bpf_verifier_env *env, int insn_idx) list_add(&new_sl->node, head); return 0; } + +static bool callchain_matches_state(struct bpf_callchain *cc, + struct bpf_verifier_state *st) +{ + int i; + + if (st->curframe != cc->curframe) + return false; + for (i = 0; i < (int)cc->curframe; i++) + if (st->frame[i + 1]->callsite != cc->insn_idx[i]) + return false; + return st->insn_idx == cc->insn_idx[cc->curframe]; +} + +struct state_diff_cnt { + struct bpf_state_diff diff; + u32 cnt; +}; + +static int state_diff_cmp(const void *a, const void *b) +{ + return cmp_int(((struct state_diff_cnt *)b)->cnt, ((struct state_diff_cnt *)a)->cnt); +} + +static bool state_diff_eq(struct bpf_state_diff *a, struct bpf_state_diff *b) +{ + return a->frame == b->frame && a->slot == b->slot && a->kind == b->kind; +} + +int bpf_sample_state_diffs(struct bpf_verifier_env *env, + struct bpf_callchain *cc, + struct bpf_state_diff *top_diffs, + int *nr_diffs) +{ + struct bpf_verifier_state_list *sl_i, *sl_j; + struct state_diff_cnt *diff_cnts = NULL; + struct list_head *pos_i, *pos_j, *head; + u32 leaf_insn, callsite, hash_idx; + int i, cap = 0, nr_locs = 0; + + leaf_insn = cc->insn_idx[cc->curframe]; + callsite = cc->curframe > 0 ? cc->insn_idx[cc->curframe - 1] : BPF_MAIN_FUNC; + hash_idx = (leaf_insn ^ callsite) % env->prog->len; + head = &env->explored_states[hash_idx]; + + /* + * Single bucket accumulates up to 64 states, no cond_resched() necessary. + * See limits in is_state_visited(). + */ + list_for_each(pos_i, head) { + sl_i = container_of(pos_i, struct bpf_verifier_state_list, node); + if (!callchain_matches_state(cc, &sl_i->state)) + continue; + list_for_each(pos_j, head) { + struct bpf_state_diff diff = {}; + + if (pos_i == pos_j) + continue; + sl_j = container_of(pos_j, struct bpf_verifier_state_list, node); + if (!callchain_matches_state(cc, &sl_j->state)) + continue; + if (states_equal(env, &sl_i->state, &sl_j->state, NOT_EXACT, &diff)) + continue; + if (diff.kind == DIFF_OTHER) + continue; + for (i = 0; i < nr_locs; i++) { + if (state_diff_eq(&diff_cnts[i].diff, &diff)) { + diff_cnts[i].cnt++; + goto next; + } + } + if (nr_locs == cap) { + int new_cap = cap ? cap * 2 : 16; + struct state_diff_cnt *new; + + new = kvrealloc(diff_cnts, new_cap * sizeof(*new), + GFP_KERNEL_ACCOUNT); + if (!new) { + kvfree(diff_cnts); + return -ENOMEM; + } + memset(new + cap, 0, (new_cap - cap) * sizeof(*new)); + diff_cnts = new; + cap = new_cap; + } + diff_cnts[nr_locs].diff = diff; + diff_cnts[nr_locs].cnt = 1; + nr_locs++; +next:; + } + } + + sort(diff_cnts, nr_locs, sizeof(*diff_cnts), state_diff_cmp, NULL); + *nr_diffs = min(nr_locs, *nr_diffs); + for (i = 0; i < *nr_diffs; i++) + top_diffs[i] = diff_cnts[i].diff; + kvfree(diff_cnts); + return 0; +} diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 54b7ad65b7fc..d09c014462f1 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -17292,13 +17292,13 @@ static void free_callchain_profile(struct bpf_verifier_env *env) } } -static void print_callchain_entry(struct bpf_verifier_env *env, - struct bpf_callchain_entry *entry, int idx) +static int print_callchain_entry(struct bpf_verifier_env *env, + struct bpf_callchain_entry *entry, int idx) { struct bpf_callchain *cc = &entry->cc; const struct bpf_line_info *linfo; struct bpf_subprog_info *sub; - int i, insn_idx; + int i, err, insn_idx; verbose(env, "#%d most visited simulated stacktrace (visited %llu times):\n", idx, entry->count); @@ -17316,6 +17316,38 @@ static void print_callchain_entry(struct bpf_verifier_env *env, BPF_LINE_INFO_LINE_NUM(linfo->line_col)); verbose(env, "\n"); } + + insn_idx = cc->insn_idx[cc->curframe]; + if (bpf_is_force_checkpoint(env, insn_idx)) { + struct bpf_state_diff top_diffs[3]; + int nr_diffs = ARRAY_SIZE(top_diffs); + + err = bpf_sample_state_diffs(env, cc, top_diffs, &nr_diffs); + if (err) + return err; + for (i = 0; i < nr_diffs; i++) { + struct bpf_state_diff *d = &top_diffs[i]; + + switch (d->kind) { + case DIFF_REG: + verbose(env, " Most varying: R%d (frame %d)\n", + d->slot, d->frame); + break; + case DIFF_STACK: + verbose(env, " Most varying: fp-%d (frame %d)\n", + (d->slot + 1) * BPF_REG_SIZE, d->frame); + break; + case DIFF_ARG: + verbose(env, " Most varying: arg#%d (frame %d)\n", + d->slot, d->frame); + break; + default: + /* shouldn't really happen */ + continue; + } + } + } + return 0; } static void disasm_subprog(struct bpf_verifier_env *env, struct bpf_subprog_info *sub) @@ -17340,16 +17372,16 @@ static void disasm_subprog(struct bpf_verifier_env *env, struct bpf_subprog_info * Print several most visited simulated stack traces, * and a disasembly of related subprograms. */ -static void print_hotspots(struct bpf_verifier_env *env) +static int print_hotspots(struct bpf_verifier_env *env) { DECLARE_BITMAP(printed_subs, BPF_MAX_SUBPROGS + 2) = {}; struct bpf_callchain_entry *top[3] = {}; struct bpf_callchain_entry *entry; struct bpf_subprog_info *sub; - int i, j, bkt, nr_top = 0; + int i, j, err, bkt, nr_top = 0; if (!(env->log.level & BPF_LOG_LEVEL)) - return; + return 0; /* Collect the hottest callchains */ hash_for_each(env->callchain_htab, bkt, entry, node) { @@ -17366,7 +17398,7 @@ static void print_hotspots(struct bpf_verifier_env *env) nr_top++; if (!nr_top) - return; + return 0; if (!(env->log.level & BPF_LOG_LEVEL2)) bpf_vlog_reset(&env->log, 0); @@ -17388,9 +17420,14 @@ static void print_hotspots(struct bpf_verifier_env *env) /* Print the hot callchains */ for (i = 0; i < nr_top; i++) { - print_callchain_entry(env, top[i], i + 1); + err = print_callchain_entry(env, top[i], i + 1); + if (err) + return err; + verbose(env, "\n"); } + + return 0; } static int do_check_insn(struct bpf_verifier_env *env, bool *do_print_state) @@ -17527,7 +17564,9 @@ static int do_check(struct bpf_verifier_env *env) insn_aux = &env->insn_aux_data[env->insn_idx]; if (++env->insn_processed > BPF_COMPLEXITY_LIMIT_INSNS) { - print_hotspots(env); + err = print_hotspots(env); + if (err) + return err; verbose(env, "BPF program is too large. Processed %d insn\n", env->insn_processed); -- 2.53.0 ^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [PATCH RFC bpf-next v3 5/6] bpf: report register diff summary for hot callchains 2026-05-27 7:29 ` [PATCH RFC bpf-next v3 5/6] bpf: report register diff summary for hot callchains Eduard Zingerman @ 2026-06-01 21:29 ` Emil Tsalapatis 2026-06-01 21:38 ` Eduard Zingerman 0 siblings, 1 reply; 22+ messages in thread From: Emil Tsalapatis @ 2026-06-01 21:29 UTC (permalink / raw) To: Eduard Zingerman, bpf, ast Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song On Wed May 27, 2026 at 3:29 AM EDT, Eduard Zingerman wrote: > When a hot callchain ends at a iter_next, may_goto, or > callback-calling instruction, compare cached states from > explored_states to identify which registers or stack slots most > frequently differ between states. Report the top 3 most varying > locations. > > The full states diff computation would require some statistical > analysis similar to k-means. To keep things simple, approximate this > by modifying states_equal() to return the first location where old and > current states differ, and counting most frequent diff locations. > > Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> One nit below. > --- > include/linux/bpf_verifier.h | 15 ++++ > kernel/bpf/states.c | 184 +++++++++++++++++++++++++++++++++++++------ > kernel/bpf/verifier.c | 57 +++++++++++--- > 3 files changed, 225 insertions(+), 31 deletions(-) > > diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h > index 347a155d8e21..cc83359774f5 100644 > --- a/include/linux/bpf_verifier.h > +++ b/include/linux/bpf_verifier.h > @@ -662,6 +662,17 @@ struct bpf_loop { > bool irreducible; > }; > > +struct bpf_state_diff { > + u8 slot; > + u8 frame; > + enum { > + DIFF_OTHER, > + DIFF_REG, > + DIFF_STACK, > + DIFF_ARG, > + } kind; > +}; > + > struct bpf_callchain { > u32 insn_idx[MAX_CALL_FRAMES]; > u32 curframe; > @@ -1631,5 +1642,9 @@ int bpf_fixup_call_args(struct bpf_verifier_env *env); > int bpf_do_misc_fixups(struct bpf_verifier_env *env); > > int bpf_compute_loops(struct bpf_verifier_env *env); > +int bpf_sample_state_diffs(struct bpf_verifier_env *env, > + struct bpf_callchain *cc, > + struct bpf_state_diff *top_diffs, > + int *nr_diffs); > > #endif /* _LINUX_BPF_VERIFIER_H */ > diff --git a/kernel/bpf/states.c b/kernel/bpf/states.c > index 877338136009..b1256658189d 100644 > --- a/kernel/bpf/states.c > +++ b/kernel/bpf/states.c > @@ -4,6 +4,7 @@ > #include <linux/bpf_verifier.h> > #include <linux/cnum.h> > #include <linux/filter.h> > +#include <linux/sort.h> > > #define verbose(env, fmt, args...) bpf_verifier_log_write(env, fmt, ##args) > > @@ -696,7 +697,7 @@ static struct bpf_reg_state *scalar_reg_for_stack(struct bpf_verifier_env *env, > > static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old, > struct bpf_func_state *cur, struct bpf_idmap *idmap, > - enum exact_level exact) > + enum exact_level exact, struct bpf_state_diff *diff) > { > int i, spi; > > @@ -720,8 +721,11 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old, > old_type = STACK_INVALID; > if (cur_type == STACK_POISON) > cur_type = STACK_INVALID; > - if (i >= cur->allocated_stack || old_type != cur_type) > + if (i >= cur->allocated_stack || old_type != cur_type) { This new logging of divergent stack slots is identical throughout stack safe. Can we factor it out into an error label at the very end? > + diff->slot = spi; > + diff->kind = DIFF_STACK; > return false; > + } > } > > if (old->stack[spi].slot_type[i % BPF_REG_SIZE] == STACK_INVALID || > @@ -735,8 +739,11 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old, > /* explored stack has more populated slots than current stack > * and these slots were used > */ > - if (i >= cur->allocated_stack) > + if (i >= cur->allocated_stack) { > + diff->slot = spi; > + diff->kind = DIFF_STACK; > return false; > + } > > /* > * 64 and 32-bit scalar spills vs MISC/INVALID slots and vice versa. > @@ -748,8 +755,11 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old, > old_reg = scalar_reg_for_stack(env, &old->stack[spi], im); > cur_reg = scalar_reg_for_stack(env, &cur->stack[spi], im); > if (old_reg && cur_reg) { > - if (!regsafe(env, old_reg, cur_reg, idmap, exact)) > + if (!regsafe(env, old_reg, cur_reg, idmap, exact)) { > + diff->slot = spi; > + diff->kind = DIFF_STACK; > return false; > + } > i += (im == 0 ? BPF_REG_SIZE - 1 : 3); > continue; > } > @@ -763,13 +773,16 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old, > cur->stack[spi].slot_type[i % BPF_REG_SIZE] == STACK_ZERO) > continue; > if (old->stack[spi].slot_type[i % BPF_REG_SIZE] != > - cur->stack[spi].slot_type[i % BPF_REG_SIZE]) > + cur->stack[spi].slot_type[i % BPF_REG_SIZE]) { > /* Ex: old explored (safe) state has STACK_SPILL in > * this stack slot, but current has STACK_MISC -> > * this verifier states are not equivalent, > * return false to continue verification of this path > */ > + diff->slot = spi; > + diff->kind = DIFF_STACK; > return false; > + } > if (i % BPF_REG_SIZE != BPF_REG_SIZE - 1) > continue; > /* Both old and cur are having same slot_type */ > @@ -786,16 +799,22 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old, > * return false to continue verification of this path > */ > if (!regsafe(env, &old->stack[spi].spilled_ptr, > - &cur->stack[spi].spilled_ptr, idmap, exact)) > + &cur->stack[spi].spilled_ptr, idmap, exact)) { > + diff->slot = spi; > + diff->kind = DIFF_STACK; > return false; > + } > break; > case STACK_DYNPTR: > old_reg = &old->stack[spi].spilled_ptr; > cur_reg = &cur->stack[spi].spilled_ptr; > if (old_reg->dynptr.type != cur_reg->dynptr.type || > old_reg->dynptr.first_slot != cur_reg->dynptr.first_slot || > - !check_ids(old_reg->ref_obj_id, cur_reg->ref_obj_id, idmap)) > + !check_ids(old_reg->ref_obj_id, cur_reg->ref_obj_id, idmap)) { > + diff->slot = spi; > + diff->kind = DIFF_STACK; > return false; > + } > break; > case STACK_ITER: > old_reg = &old->stack[spi].spilled_ptr; > @@ -810,15 +829,21 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old, > old_reg->iter.btf_id != cur_reg->iter.btf_id || > old_reg->iter.state != cur_reg->iter.state || > /* ignore {old_reg,cur_reg}->iter.depth, see above */ > - !check_ids(old_reg->ref_obj_id, cur_reg->ref_obj_id, idmap)) > + !check_ids(old_reg->ref_obj_id, cur_reg->ref_obj_id, idmap)) { > + diff->slot = spi; > + diff->kind = DIFF_STACK; > return false; > + } > break; > case STACK_IRQ_FLAG: > old_reg = &old->stack[spi].spilled_ptr; > cur_reg = &cur->stack[spi].spilled_ptr; > if (!check_ids(old_reg->ref_obj_id, cur_reg->ref_obj_id, idmap) || > - old_reg->irq.kfunc_class != cur_reg->irq.kfunc_class) > + old_reg->irq.kfunc_class != cur_reg->irq.kfunc_class) { > + diff->slot = spi; > + diff->kind = DIFF_STACK; > return false; > + } > break; > case STACK_MISC: > case STACK_ZERO: > @@ -827,6 +852,8 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old, > continue; > /* Ensure that new unhandled slot types return false by default */ > default: > + diff->slot = spi; > + diff->kind = DIFF_STACK; > return false; > } > } > @@ -839,7 +866,7 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old, > */ > static bool stack_arg_safe(struct bpf_verifier_env *env, struct bpf_func_state *old, > struct bpf_func_state *cur, struct bpf_idmap *idmap, > - enum exact_level exact) > + enum exact_level exact, struct bpf_state_diff *diff) > { > int i, nslots; > > @@ -852,8 +879,11 @@ static bool stack_arg_safe(struct bpf_verifier_env *env, struct bpf_func_state * > &old->stack_arg_regs[i] : ¬_init; > cur_arg = i < cur->out_stack_arg_cnt ? > &cur->stack_arg_regs[i] : ¬_init; > - if (!regsafe(env, old_arg, cur_arg, idmap, exact)) > + if (!regsafe(env, old_arg, cur_arg, idmap, exact)) { > + diff->slot = i; > + diff->kind = DIFF_ARG; > return false; > + } > } > > return true; > @@ -933,7 +963,8 @@ static bool refsafe(struct bpf_verifier_state *old, struct bpf_verifier_state *c > * the current state will reach 'bpf_exit' instruction safely > */ > static bool func_states_equal(struct bpf_verifier_env *env, struct bpf_func_state *old, > - struct bpf_func_state *cur, u32 insn_idx, enum exact_level exact) > + struct bpf_func_state *cur, u32 insn_idx, > + enum exact_level exact, struct bpf_state_diff *diff) > { > u16 live_regs = env->insn_aux_data[insn_idx].live_regs_before; > u16 i; > @@ -947,13 +978,16 @@ static bool func_states_equal(struct bpf_verifier_env *env, struct bpf_func_stat > for (i = 0; i < MAX_BPF_REG; i++) > if (((1 << i) & live_regs) && > !regsafe(env, &old->regs[i], &cur->regs[i], > - &env->idmap_scratch, exact)) > + &env->idmap_scratch, exact)) { > + diff->slot = i; > + diff->kind = DIFF_REG; > return false; > + } > > - if (!stacksafe(env, old, cur, &env->idmap_scratch, exact)) > + if (!stacksafe(env, old, cur, &env->idmap_scratch, exact, diff)) > return false; > > - if (!stack_arg_safe(env, old, cur, &env->idmap_scratch, exact)) > + if (!stack_arg_safe(env, old, cur, &env->idmap_scratch, exact, diff)) > return false; > > return true; > @@ -970,11 +1004,14 @@ static void reset_idmap_scratch(struct bpf_verifier_env *env) > static bool states_equal(struct bpf_verifier_env *env, > struct bpf_verifier_state *old, > struct bpf_verifier_state *cur, > - enum exact_level exact) > + enum exact_level exact, > + struct bpf_state_diff *diff) > { > u32 insn_idx; > int i; > > + diff->kind = DIFF_OTHER; > + > if (old->curframe != cur->curframe) > return false; > > @@ -999,8 +1036,11 @@ static bool states_equal(struct bpf_verifier_env *env, > insn_idx = bpf_frame_insn_idx(old, i); > if (old->frame[i]->callsite != cur->frame[i]->callsite) > return false; > - if (!func_states_equal(env, old->frame[i], cur->frame[i], insn_idx, exact)) > + if (!func_states_equal(env, old->frame[i], cur->frame[i], > + insn_idx, exact, diff)) { > + diff->frame = i; > return false; > + } > } > return true; > } > @@ -1231,6 +1271,7 @@ int bpf_is_state_visited(struct bpf_verifier_env *env, int insn_idx) > struct bpf_verifier_state_list *new_sl; > struct bpf_verifier_state_list *sl; > struct bpf_verifier_state *cur = env->cur_state, *new; > + struct bpf_state_diff diff = {}; > bool force_new_state, add_new_state, loop; > int n, err, states_cnt = 0; > struct list_head *pos, *tmp, *head; > @@ -1320,7 +1361,7 @@ int bpf_is_state_visited(struct bpf_verifier_env *env, int insn_idx) > * => unsafe memory access at 11 would not be caught. > */ > if (is_iter_next_insn(env, insn_idx)) { > - if (states_equal(env, &sl->state, cur, RANGE_WITHIN)) { > + if (states_equal(env, &sl->state, cur, RANGE_WITHIN, &diff)) { > struct bpf_func_state *cur_frame; > struct bpf_reg_state *iter_state, *iter_reg; > int spi; > @@ -1345,13 +1386,13 @@ int bpf_is_state_visited(struct bpf_verifier_env *env, int insn_idx) > } > if (is_may_goto_insn_at(env, insn_idx)) { > if (sl->state.may_goto_depth != cur->may_goto_depth && > - states_equal(env, &sl->state, cur, RANGE_WITHIN)) { > + states_equal(env, &sl->state, cur, RANGE_WITHIN, &diff)) { > loop = true; > goto hit; > } > } > if (bpf_calls_callback(env, insn_idx)) { > - if (states_equal(env, &sl->state, cur, RANGE_WITHIN)) { > + if (states_equal(env, &sl->state, cur, RANGE_WITHIN, &diff)) { > loop = true; > goto hit; > } > @@ -1359,7 +1400,7 @@ int bpf_is_state_visited(struct bpf_verifier_env *env, int insn_idx) > } > /* attempt to detect infinite loop to avoid unnecessary doomed work */ > if (states_maybe_looping(&sl->state, cur) && > - states_equal(env, &sl->state, cur, EXACT) && > + states_equal(env, &sl->state, cur, EXACT, &diff) && > !iter_active_depths_differ(&sl->state, cur) && > sl->state.may_goto_depth == cur->may_goto_depth && > sl->state.callback_unroll_depth == cur->callback_unroll_depth) { > @@ -1392,7 +1433,7 @@ int bpf_is_state_visited(struct bpf_verifier_env *env, int insn_idx) > } > /* See comments for mark_all_regs_read_and_precise() */ > loop = incomplete_read_marks(env, &sl->state); > - if (states_equal(env, &sl->state, cur, loop ? RANGE_WITHIN : NOT_EXACT)) { > + if (states_equal(env, &sl->state, cur, loop ? RANGE_WITHIN : NOT_EXACT, &diff)) { > hit: > sl->hit_cnt++; > > @@ -1588,3 +1629,102 @@ int bpf_is_state_visited(struct bpf_verifier_env *env, int insn_idx) > list_add(&new_sl->node, head); > return 0; > } > + > +static bool callchain_matches_state(struct bpf_callchain *cc, > + struct bpf_verifier_state *st) > +{ > + int i; > + > + if (st->curframe != cc->curframe) > + return false; > + for (i = 0; i < (int)cc->curframe; i++) > + if (st->frame[i + 1]->callsite != cc->insn_idx[i]) > + return false; > + return st->insn_idx == cc->insn_idx[cc->curframe]; > +} > + > +struct state_diff_cnt { > + struct bpf_state_diff diff; > + u32 cnt; > +}; > + > +static int state_diff_cmp(const void *a, const void *b) > +{ > + return cmp_int(((struct state_diff_cnt *)b)->cnt, ((struct state_diff_cnt *)a)->cnt); > +} > + > +static bool state_diff_eq(struct bpf_state_diff *a, struct bpf_state_diff *b) > +{ > + return a->frame == b->frame && a->slot == b->slot && a->kind == b->kind; > +} > + > +int bpf_sample_state_diffs(struct bpf_verifier_env *env, > + struct bpf_callchain *cc, > + struct bpf_state_diff *top_diffs, > + int *nr_diffs) > +{ > + struct bpf_verifier_state_list *sl_i, *sl_j; > + struct state_diff_cnt *diff_cnts = NULL; > + struct list_head *pos_i, *pos_j, *head; > + u32 leaf_insn, callsite, hash_idx; > + int i, cap = 0, nr_locs = 0; > + > + leaf_insn = cc->insn_idx[cc->curframe]; > + callsite = cc->curframe > 0 ? cc->insn_idx[cc->curframe - 1] : BPF_MAIN_FUNC; > + hash_idx = (leaf_insn ^ callsite) % env->prog->len; > + head = &env->explored_states[hash_idx]; > + > + /* > + * Single bucket accumulates up to 64 states, no cond_resched() necessary. > + * See limits in is_state_visited(). > + */ > + list_for_each(pos_i, head) { > + sl_i = container_of(pos_i, struct bpf_verifier_state_list, node); > + if (!callchain_matches_state(cc, &sl_i->state)) > + continue; > + list_for_each(pos_j, head) { > + struct bpf_state_diff diff = {}; > + > + if (pos_i == pos_j) > + continue; > + sl_j = container_of(pos_j, struct bpf_verifier_state_list, node); > + if (!callchain_matches_state(cc, &sl_j->state)) > + continue; > + if (states_equal(env, &sl_i->state, &sl_j->state, NOT_EXACT, &diff)) > + continue; > + if (diff.kind == DIFF_OTHER) > + continue; > + for (i = 0; i < nr_locs; i++) { > + if (state_diff_eq(&diff_cnts[i].diff, &diff)) { > + diff_cnts[i].cnt++; > + goto next; > + } > + } > + if (nr_locs == cap) { > + int new_cap = cap ? cap * 2 : 16; > + struct state_diff_cnt *new; > + > + new = kvrealloc(diff_cnts, new_cap * sizeof(*new), > + GFP_KERNEL_ACCOUNT); > + if (!new) { > + kvfree(diff_cnts); > + return -ENOMEM; > + } > + memset(new + cap, 0, (new_cap - cap) * sizeof(*new)); > + diff_cnts = new; > + cap = new_cap; > + } > + diff_cnts[nr_locs].diff = diff; > + diff_cnts[nr_locs].cnt = 1; > + nr_locs++; > +next:; > + } > + } > + > + sort(diff_cnts, nr_locs, sizeof(*diff_cnts), state_diff_cmp, NULL); > + *nr_diffs = min(nr_locs, *nr_diffs); > + for (i = 0; i < *nr_diffs; i++) > + top_diffs[i] = diff_cnts[i].diff; > + kvfree(diff_cnts); > + return 0; > +} > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > index 54b7ad65b7fc..d09c014462f1 100644 > --- a/kernel/bpf/verifier.c > +++ b/kernel/bpf/verifier.c > @@ -17292,13 +17292,13 @@ static void free_callchain_profile(struct bpf_verifier_env *env) > } > } > > -static void print_callchain_entry(struct bpf_verifier_env *env, > - struct bpf_callchain_entry *entry, int idx) > +static int print_callchain_entry(struct bpf_verifier_env *env, > + struct bpf_callchain_entry *entry, int idx) > { > struct bpf_callchain *cc = &entry->cc; > const struct bpf_line_info *linfo; > struct bpf_subprog_info *sub; > - int i, insn_idx; > + int i, err, insn_idx; > > verbose(env, "#%d most visited simulated stacktrace (visited %llu times):\n", > idx, entry->count); > @@ -17316,6 +17316,38 @@ static void print_callchain_entry(struct bpf_verifier_env *env, > BPF_LINE_INFO_LINE_NUM(linfo->line_col)); > verbose(env, "\n"); > } > + > + insn_idx = cc->insn_idx[cc->curframe]; > + if (bpf_is_force_checkpoint(env, insn_idx)) { > + struct bpf_state_diff top_diffs[3]; > + int nr_diffs = ARRAY_SIZE(top_diffs); > + > + err = bpf_sample_state_diffs(env, cc, top_diffs, &nr_diffs); > + if (err) > + return err; > + for (i = 0; i < nr_diffs; i++) { > + struct bpf_state_diff *d = &top_diffs[i]; > + > + switch (d->kind) { > + case DIFF_REG: > + verbose(env, " Most varying: R%d (frame %d)\n", > + d->slot, d->frame); > + break; > + case DIFF_STACK: > + verbose(env, " Most varying: fp-%d (frame %d)\n", > + (d->slot + 1) * BPF_REG_SIZE, d->frame); > + break; > + case DIFF_ARG: > + verbose(env, " Most varying: arg#%d (frame %d)\n", > + d->slot, d->frame); > + break; > + default: > + /* shouldn't really happen */ > + continue; > + } > + } > + } > + return 0; > } > > static void disasm_subprog(struct bpf_verifier_env *env, struct bpf_subprog_info *sub) > @@ -17340,16 +17372,16 @@ static void disasm_subprog(struct bpf_verifier_env *env, struct bpf_subprog_info > * Print several most visited simulated stack traces, > * and a disasembly of related subprograms. > */ > -static void print_hotspots(struct bpf_verifier_env *env) > +static int print_hotspots(struct bpf_verifier_env *env) > { > DECLARE_BITMAP(printed_subs, BPF_MAX_SUBPROGS + 2) = {}; > struct bpf_callchain_entry *top[3] = {}; > struct bpf_callchain_entry *entry; > struct bpf_subprog_info *sub; > - int i, j, bkt, nr_top = 0; > + int i, j, err, bkt, nr_top = 0; > > if (!(env->log.level & BPF_LOG_LEVEL)) > - return; > + return 0; > > /* Collect the hottest callchains */ > hash_for_each(env->callchain_htab, bkt, entry, node) { > @@ -17366,7 +17398,7 @@ static void print_hotspots(struct bpf_verifier_env *env) > nr_top++; > > if (!nr_top) > - return; > + return 0; > > if (!(env->log.level & BPF_LOG_LEVEL2)) > bpf_vlog_reset(&env->log, 0); > @@ -17388,9 +17420,14 @@ static void print_hotspots(struct bpf_verifier_env *env) > > /* Print the hot callchains */ > for (i = 0; i < nr_top; i++) { > - print_callchain_entry(env, top[i], i + 1); > + err = print_callchain_entry(env, top[i], i + 1); > + if (err) > + return err; > + > verbose(env, "\n"); > } > + > + return 0; > } > > static int do_check_insn(struct bpf_verifier_env *env, bool *do_print_state) > @@ -17527,7 +17564,9 @@ static int do_check(struct bpf_verifier_env *env) > insn_aux = &env->insn_aux_data[env->insn_idx]; > > if (++env->insn_processed > BPF_COMPLEXITY_LIMIT_INSNS) { > - print_hotspots(env); > + err = print_hotspots(env); > + if (err) > + return err; > verbose(env, > "BPF program is too large. Processed %d insn\n", > env->insn_processed); ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH RFC bpf-next v3 5/6] bpf: report register diff summary for hot callchains 2026-06-01 21:29 ` Emil Tsalapatis @ 2026-06-01 21:38 ` Eduard Zingerman 0 siblings, 0 replies; 22+ messages in thread From: Eduard Zingerman @ 2026-06-01 21:38 UTC (permalink / raw) To: Emil Tsalapatis, bpf, ast Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song On Mon, 2026-06-01 at 17:29 -0400, Emil Tsalapatis wrote: [...] > > @@ -720,8 +721,11 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old, > > old_type = STACK_INVALID; > > if (cur_type == STACK_POISON) > > cur_type = STACK_INVALID; > > - if (i >= cur->allocated_stack || old_type != cur_type) > > + if (i >= cur->allocated_stack || old_type != cur_type) { > > This new logging of divergent stack slots is identical throughout stack safe. Can we factor > it out into an error label at the very end? Yes, makes sense. > > + diff->slot = spi; > > + diff->kind = DIFF_STACK; > > return false; > > + } > > } > > > > if (old->stack[spi].slot_type[i % BPF_REG_SIZE] == STACK_INVALID || [...[ ^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH RFC bpf-next v3 6/6] selftests/bpf: test budget exhaustion profiling report 2026-05-27 7:29 [PATCH RFC bpf-next v3 0/6] bpf: better error reporting when verifier hits 1M instructions limit Eduard Zingerman ` (4 preceding siblings ...) 2026-05-27 7:29 ` [PATCH RFC bpf-next v3 5/6] bpf: report register diff summary for hot callchains Eduard Zingerman @ 2026-05-27 7:29 ` Eduard Zingerman 2026-06-01 19:55 ` Emil Tsalapatis 5 siblings, 1 reply; 22+ messages in thread From: Eduard Zingerman @ 2026-05-27 7:29 UTC (permalink / raw) To: bpf, ast Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song, Eduard Zingerman Add verifier selftests that exhaust the 1M instruction budget and verify that the new profiling report is produced: - nested function calls with inner loop, check that stack trace is print correctly; - two loops on the same level, check that both are reported; - iterator-based loop with non-converging register, check that register is in the report; - iterator-based loop with non-converging stack slot, check that stack slot is in the report. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> --- tools/testing/selftests/bpf/prog_tests/verifier.c | 2 + .../selftests/bpf/progs/verifier_budget_report.c | 175 +++++++++++++++++++++ 2 files changed, 177 insertions(+) diff --git a/tools/testing/selftests/bpf/prog_tests/verifier.c b/tools/testing/selftests/bpf/prog_tests/verifier.c index 35d92d176c1f..61a732a499b0 100644 --- a/tools/testing/selftests/bpf/prog_tests/verifier.c +++ b/tools/testing/selftests/bpf/prog_tests/verifier.c @@ -14,6 +14,7 @@ #include "verifier_basic_stack.skel.h" #include "verifier_bitfield_write.skel.h" #include "verifier_bounds.skel.h" +#include "verifier_budget_report.skel.h" #include "verifier_bounds_deduction.skel.h" #include "verifier_bounds_deduction_non_const.skel.h" #include "verifier_bounds_mix_sign_unsign.skel.h" @@ -169,6 +170,7 @@ void test_verifier_bounds(void) { RUN(verifier_bounds); } void test_verifier_bounds_deduction(void) { RUN(verifier_bounds_deduction); } void test_verifier_bounds_deduction_non_const(void) { RUN(verifier_bounds_deduction_non_const); } void test_verifier_bounds_mix_sign_unsign(void) { RUN(verifier_bounds_mix_sign_unsign); } +void test_verifier_budget_report(void) { RUN(verifier_budget_report); } void test_verifier_bpf_get_stack(void) { RUN(verifier_bpf_get_stack); } void test_verifier_bpf_trap(void) { RUN(verifier_bpf_trap); } void test_verifier_bswap(void) { RUN(verifier_bswap); } diff --git a/tools/testing/selftests/bpf/progs/verifier_budget_report.c b/tools/testing/selftests/bpf/progs/verifier_budget_report.c new file mode 100644 index 000000000000..7b87e53831c2 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/verifier_budget_report.c @@ -0,0 +1,175 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2026 Meta Platforms, Inc. and affiliates. */ + +#include <linux/bpf.h> +#include <bpf/bpf_helpers.h> +#include "bpf_misc.h" + +SEC("tracepoint") +__failure +__log_level(1) +__msg("two_level_calls():") +__msg(" 0: (85) call pc+1") +__msg(" 1: (95) exit") +__msg("two_level_calls__l1():") +__msg(" 2: (85) call pc+1") +__msg(" 3: (95) exit") +__msg("two_level_calls__l2():") +__msg(" 4: (b7) r0 = 0") +__msg(" 5: (07) r0 += 1") +__msg(" 6: (05) goto pc-2") +__msg("#1 most visited simulated stacktrace (visited 499999 times):") +__msg(" two_level_calls/0 ({{.*}}/verifier_budget_report.c:{{[0-9]+}})") +__msg(" two_level_calls__l1/2 ({{.*}}/verifier_budget_report.c:{{[0-9]+}})") +__msg(" two_level_calls__l2/5 ({{.*}}/verifier_budget_report.c:{{[0-9]+}})") +__msg("BPF program is too large. Processed 1000001 insn") +__naked void two_level_calls(void) +{ + asm volatile (" \ + call two_level_calls__l1; \ + exit; \ +" ::: __clobber_all); +} + +__used +static __naked void two_level_calls__l1(void) +{ + asm volatile (" \ + call two_level_calls__l2; \ + exit; \ +" ::: __clobber_all); +} + +__used +static __naked void two_level_calls__l2(void) +{ + asm volatile (" \ + r0 = 0; \ +1: r0 += 1; \ + goto 1b; \ +" ::: __clobber_all); +} + +SEC("tracepoint") +__failure +__log_level(1) +__msg("two_loops():") +__msg(" 0: (b7) r6 = 250000") +__msg(" 1: (b7) r0 = 0") +__msg(" 2: (07) r0 += 1") +__msg(" 3: (ad) if r0 < r6 goto pc-2") +__msg(" 4: (b7) r0 = 0") +__msg(" 5: (07) r0 += 1") +__msg(" 6: (05) goto pc-2") +__msg("#1 most visited simulated stacktrace (visited 250000 times):") +__msg(" two_loops/2 ({{.*}}/verifier_budget_report.c:{{[0-9]+}})") +__msg("#2 most visited simulated stacktrace (visited 249999 times):") +__msg(" two_loops/5 ({{.*}}/verifier_budget_report.c:{{[0-9]+}})") +__msg("BPF program is too large") +__naked void two_loops(void) +{ + asm volatile (" \ + r6 = 250000; \ + r0 = 0; \ +1: r0 += 1; \ + if r0 < r6 goto 1b; \ + r0 = 0; \ +2: r0 += 1; \ + goto 2b; \ +" ::: __clobber_all); +} + +SEC("socket") +__failure +__log_level(1) +__flag(BPF_F_TEST_STATE_FREQ) +__msg("iter_loop():") +__msg(" 8: (bf) r1 = r6") +__msg(" 9: (85) call bpf_iter_num_next") +__msg(" 10: (15) if r0 == 0x0 goto pc+3") +__msg(" 11: (07) r7 += 1") +__msg(" 12: (25) if r7 > 0xf4240 goto pc+1") +__msg(" 13: (05) goto pc-6") +__msg("#1 most visited simulated stacktrace (visited 142856 times):") +__msg(" iter_loop/8 ({{.*}}/verifier_budget_report.c:{{[0-9]+}})") +__msg("#2 most visited simulated stacktrace (visited 142856 times):") +__msg(" iter_loop/9 ({{.*}}/verifier_budget_report.c:{{[0-9]+}})") +__msg(" Most varying: R7 (frame 0)") +__msg("BPF program is too large. Processed 1000001 insn") +__naked void iter_loop(void) +{ + asm volatile (" \ + r1 = r10; \ + r1 += -16; \ + w2 = 0; \ + w3 = 1000000; \ + call %[bpf_iter_num_new]; \ + r7 = 0; \ + r6 = r10; \ + r6 += -16; \ +1: r1 = r6; \ + call %[bpf_iter_num_next]; \ + if r0 == 0 goto 2f; \ + r7 += 1; \ + if r7 > 1000000 goto 2f; \ + goto 1b; \ +2: r1 = r6; \ + call %[bpf_iter_num_destroy]; \ + r0 = r7; \ + exit; \ +" : + : __imm(bpf_iter_num_new), + __imm(bpf_iter_num_next), + __imm(bpf_iter_num_destroy) + : __clobber_all); +} + +SEC("socket") +__failure +__log_level(1) +__flag(BPF_F_TEST_STATE_FREQ) +__msg("iter_stack_varying():") +__msg(" 10: (85) call bpf_iter_num_next") +__msg("#1 most visited simulated stacktrace (visited 111110 times):") +__msg(" iter_stack_varying/10 ({{.*}}/verifier_budget_report.c:{{[0-9]+}})") +__msg(" Most varying: fp-24 (frame 0)") +__msg("BPF program is too large. Processed 1000001 insn") +__naked void iter_stack_varying(void) +{ + asm volatile (" \ + r1 = r10; \ + r1 += -16; \ + w2 = 0; \ + w3 = 1000000; \ + call %[bpf_iter_num_new]; \ + r8 = 0; \ + *(u64 *)(r10 - 24) = r8; \ + r6 = r10; \ + r6 += -16; \ +1: r1 = r6; \ + call %[bpf_iter_num_next]; \ + if r0 == 0 goto 2f; \ + r8 = *(u64 *)(r10 - 24); \ + r8 += 1; \ + *(u64 *)(r10 - 24) = r8; \ + if r8 > 1000000 goto 2f; \ + goto 1b; \ +2: r1 = r6; \ + call %[bpf_iter_num_destroy]; \ + r0 = *(u64 *)(r10 - 24); \ + exit; \ +" : + : __imm(bpf_iter_num_new), + __imm(bpf_iter_num_next), + __imm(bpf_iter_num_destroy) + : __clobber_all); +} + +void iter_num_btf_fixup(void) +{ + bpf_iter_num_new(0, 0, 0); + bpf_iter_num_next(0); + bpf_iter_num_destroy(0); +} + +char _license[] SEC("license") = "GPL"; -- 2.53.0 ^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [PATCH RFC bpf-next v3 6/6] selftests/bpf: test budget exhaustion profiling report 2026-05-27 7:29 ` [PATCH RFC bpf-next v3 6/6] selftests/bpf: test budget exhaustion profiling report Eduard Zingerman @ 2026-06-01 19:55 ` Emil Tsalapatis 0 siblings, 0 replies; 22+ messages in thread From: Emil Tsalapatis @ 2026-06-01 19:55 UTC (permalink / raw) To: Eduard Zingerman, bpf, ast Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song On Wed May 27, 2026 at 3:29 AM EDT, Eduard Zingerman wrote: > Add verifier selftests that exhaust the 1M instruction budget and > verify that the new profiling report is produced: > - nested function calls with inner loop, check that stack trace is > print correctly; > - two loops on the same level, check that both are reported; > - iterator-based loop with non-converging register, check that > register is in the report; > - iterator-based loop with non-converging stack slot, check that stack > slot is in the report. > > Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> > --- > tools/testing/selftests/bpf/prog_tests/verifier.c | 2 + > .../selftests/bpf/progs/verifier_budget_report.c | 175 +++++++++++++++++++++ > 2 files changed, 177 insertions(+) > > diff --git a/tools/testing/selftests/bpf/prog_tests/verifier.c b/tools/testing/selftests/bpf/prog_tests/verifier.c > index 35d92d176c1f..61a732a499b0 100644 > --- a/tools/testing/selftests/bpf/prog_tests/verifier.c > +++ b/tools/testing/selftests/bpf/prog_tests/verifier.c > @@ -14,6 +14,7 @@ > #include "verifier_basic_stack.skel.h" > #include "verifier_bitfield_write.skel.h" > #include "verifier_bounds.skel.h" > +#include "verifier_budget_report.skel.h" > #include "verifier_bounds_deduction.skel.h" > #include "verifier_bounds_deduction_non_const.skel.h" > #include "verifier_bounds_mix_sign_unsign.skel.h" > @@ -169,6 +170,7 @@ void test_verifier_bounds(void) { RUN(verifier_bounds); } > void test_verifier_bounds_deduction(void) { RUN(verifier_bounds_deduction); } > void test_verifier_bounds_deduction_non_const(void) { RUN(verifier_bounds_deduction_non_const); } > void test_verifier_bounds_mix_sign_unsign(void) { RUN(verifier_bounds_mix_sign_unsign); } > +void test_verifier_budget_report(void) { RUN(verifier_budget_report); } > void test_verifier_bpf_get_stack(void) { RUN(verifier_bpf_get_stack); } > void test_verifier_bpf_trap(void) { RUN(verifier_bpf_trap); } > void test_verifier_bswap(void) { RUN(verifier_bswap); } > diff --git a/tools/testing/selftests/bpf/progs/verifier_budget_report.c b/tools/testing/selftests/bpf/progs/verifier_budget_report.c > new file mode 100644 > index 000000000000..7b87e53831c2 > --- /dev/null > +++ b/tools/testing/selftests/bpf/progs/verifier_budget_report.c > @@ -0,0 +1,175 @@ > +// SPDX-License-Identifier: GPL-2.0 > +/* Copyright (c) 2026 Meta Platforms, Inc. and affiliates. */ > + > +#include <linux/bpf.h> > +#include <bpf/bpf_helpers.h> > +#include "bpf_misc.h" > + > +SEC("tracepoint") > +__failure > +__log_level(1) > +__msg("two_level_calls():") > +__msg(" 0: (85) call pc+1") > +__msg(" 1: (95) exit") > +__msg("two_level_calls__l1():") > +__msg(" 2: (85) call pc+1") > +__msg(" 3: (95) exit") > +__msg("two_level_calls__l2():") > +__msg(" 4: (b7) r0 = 0") > +__msg(" 5: (07) r0 += 1") > +__msg(" 6: (05) goto pc-2") > +__msg("#1 most visited simulated stacktrace (visited 499999 times):") > +__msg(" two_level_calls/0 ({{.*}}/verifier_budget_report.c:{{[0-9]+}})") > +__msg(" two_level_calls__l1/2 ({{.*}}/verifier_budget_report.c:{{[0-9]+}})") > +__msg(" two_level_calls__l2/5 ({{.*}}/verifier_budget_report.c:{{[0-9]+}})") > +__msg("BPF program is too large. Processed 1000001 insn") > +__naked void two_level_calls(void) > +{ > + asm volatile (" \ > + call two_level_calls__l1; \ > + exit; \ > +" ::: __clobber_all); > +} > + > +__used > +static __naked void two_level_calls__l1(void) > +{ > + asm volatile (" \ > + call two_level_calls__l2; \ > + exit; \ > +" ::: __clobber_all); > +} > + > +__used > +static __naked void two_level_calls__l2(void) > +{ > + asm volatile (" \ > + r0 = 0; \ > +1: r0 += 1; \ > + goto 1b; \ > +" ::: __clobber_all); > +} > + > +SEC("tracepoint") > +__failure > +__log_level(1) > +__msg("two_loops():") > +__msg(" 0: (b7) r6 = 250000") > +__msg(" 1: (b7) r0 = 0") > +__msg(" 2: (07) r0 += 1") > +__msg(" 3: (ad) if r0 < r6 goto pc-2") > +__msg(" 4: (b7) r0 = 0") > +__msg(" 5: (07) r0 += 1") > +__msg(" 6: (05) goto pc-2") > +__msg("#1 most visited simulated stacktrace (visited 250000 times):") > +__msg(" two_loops/2 ({{.*}}/verifier_budget_report.c:{{[0-9]+}})") > +__msg("#2 most visited simulated stacktrace (visited 249999 times):") > +__msg(" two_loops/5 ({{.*}}/verifier_budget_report.c:{{[0-9]+}})") > +__msg("BPF program is too large") > +__naked void two_loops(void) > +{ > + asm volatile (" \ > + r6 = 250000; \ > + r0 = 0; \ > +1: r0 += 1; \ > + if r0 < r6 goto 1b; \ > + r0 = 0; \ > +2: r0 += 1; \ > + goto 2b; \ > +" ::: __clobber_all); > +} > + > +SEC("socket") > +__failure > +__log_level(1) > +__flag(BPF_F_TEST_STATE_FREQ) > +__msg("iter_loop():") > +__msg(" 8: (bf) r1 = r6") > +__msg(" 9: (85) call bpf_iter_num_next") > +__msg(" 10: (15) if r0 == 0x0 goto pc+3") > +__msg(" 11: (07) r7 += 1") > +__msg(" 12: (25) if r7 > 0xf4240 goto pc+1") > +__msg(" 13: (05) goto pc-6") > +__msg("#1 most visited simulated stacktrace (visited 142856 times):") > +__msg(" iter_loop/8 ({{.*}}/verifier_budget_report.c:{{[0-9]+}})") > +__msg("#2 most visited simulated stacktrace (visited 142856 times):") > +__msg(" iter_loop/9 ({{.*}}/verifier_budget_report.c:{{[0-9]+}})") > +__msg(" Most varying: R7 (frame 0)") > +__msg("BPF program is too large. Processed 1000001 insn") > +__naked void iter_loop(void) > +{ > + asm volatile (" \ > + r1 = r10; \ > + r1 += -16; \ > + w2 = 0; \ > + w3 = 1000000; \ > + call %[bpf_iter_num_new]; \ > + r7 = 0; \ > + r6 = r10; \ > + r6 += -16; \ > +1: r1 = r6; \ > + call %[bpf_iter_num_next]; \ > + if r0 == 0 goto 2f; \ > + r7 += 1; \ > + if r7 > 1000000 goto 2f; \ > + goto 1b; \ > +2: r1 = r6; \ > + call %[bpf_iter_num_destroy]; \ > + r0 = r7; \ > + exit; \ > +" : > + : __imm(bpf_iter_num_new), > + __imm(bpf_iter_num_next), > + __imm(bpf_iter_num_destroy) > + : __clobber_all); > +} > + > +SEC("socket") > +__failure > +__log_level(1) > +__flag(BPF_F_TEST_STATE_FREQ) > +__msg("iter_stack_varying():") > +__msg(" 10: (85) call bpf_iter_num_next") > +__msg("#1 most visited simulated stacktrace (visited 111110 times):") > +__msg(" iter_stack_varying/10 ({{.*}}/verifier_budget_report.c:{{[0-9]+}})") > +__msg(" Most varying: fp-24 (frame 0)") > +__msg("BPF program is too large. Processed 1000001 insn") > +__naked void iter_stack_varying(void) > +{ > + asm volatile (" \ > + r1 = r10; \ > + r1 += -16; \ > + w2 = 0; \ > + w3 = 1000000; \ > + call %[bpf_iter_num_new]; \ > + r8 = 0; \ > + *(u64 *)(r10 - 24) = r8; \ > + r6 = r10; \ > + r6 += -16; \ > +1: r1 = r6; \ > + call %[bpf_iter_num_next]; \ > + if r0 == 0 goto 2f; \ > + r8 = *(u64 *)(r10 - 24); \ > + r8 += 1; \ > + *(u64 *)(r10 - 24) = r8; \ > + if r8 > 1000000 goto 2f; \ > + goto 1b; \ > +2: r1 = r6; \ > + call %[bpf_iter_num_destroy]; \ > + r0 = *(u64 *)(r10 - 24); \ > + exit; \ > +" : > + : __imm(bpf_iter_num_new), > + __imm(bpf_iter_num_next), > + __imm(bpf_iter_num_destroy) > + : __clobber_all); > +} > + > +void iter_num_btf_fixup(void) > +{ > + bpf_iter_num_new(0, 0, 0); > + bpf_iter_num_next(0); > + bpf_iter_num_destroy(0); > +} > + > +char _license[] SEC("license") = "GPL"; ^ permalink raw reply [flat|nested] 22+ messages in thread
end of thread, other threads:[~2026-06-01 21:38 UTC | newest] Thread overview: 22+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2026-05-27 7:29 [PATCH RFC bpf-next v3 0/6] bpf: better error reporting when verifier hits 1M instructions limit Eduard Zingerman 2026-05-27 7:29 ` [PATCH RFC bpf-next v3 1/6] bpf: move live registers and scc printout to a standalone function Eduard Zingerman 2026-06-01 5:50 ` Emil Tsalapatis 2026-05-27 7:29 ` [PATCH RFC bpf-next v3 2/6] bpf: compute loops hierarchy Eduard Zingerman 2026-06-01 19:12 ` Emil Tsalapatis 2026-06-01 19:22 ` Eduard Zingerman 2026-06-01 19:29 ` Emil Tsalapatis 2026-05-27 7:29 ` [PATCH RFC bpf-next v3 3/6] selftests/bpf: test cases for loop hierarchy computation Eduard Zingerman 2026-05-27 8:50 ` sashiko-bot 2026-06-01 19:37 ` Emil Tsalapatis 2026-06-01 19:44 ` Eduard Zingerman 2026-05-27 7:29 ` [PATCH RFC bpf-next v3 4/6] bpf: report hot simulated callchains when 1M instructions limit is met Eduard Zingerman 2026-05-29 10:23 ` Jiri Olsa 2026-05-29 18:44 ` Eduard Zingerman 2026-05-30 9:34 ` Jiri Olsa 2026-06-01 19:50 ` Emil Tsalapatis 2026-06-01 19:55 ` Eduard Zingerman 2026-05-27 7:29 ` [PATCH RFC bpf-next v3 5/6] bpf: report register diff summary for hot callchains Eduard Zingerman 2026-06-01 21:29 ` Emil Tsalapatis 2026-06-01 21:38 ` Eduard Zingerman 2026-05-27 7:29 ` [PATCH RFC bpf-next v3 6/6] selftests/bpf: test budget exhaustion profiling report Eduard Zingerman 2026-06-01 19:55 ` Emil Tsalapatis
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox