[PATCH RFC bpf-next v3 0/6] bpf: better error reporting when verifier hits 1M instructions limit

BPF List
 help / color / mirror / Atom feed

* [PATCH RFC bpf-next v3 0/6] bpf: better error reporting when verifier hits 1M instructions limit
@ 2026-05-27  7:29 Eduard Zingerman
  2026-05-27  7:29 ` [PATCH RFC bpf-next v3 1/6] bpf: move live registers and scc printout to a standalone function Eduard Zingerman
                   ` (5 more replies)
  0 siblings, 6 replies; 22+ messages in thread
From: Eduard Zingerman @ 2026-05-27  7:29 UTC (permalink / raw)
  To: bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song,
	Eduard Zingerman

When the BPF verifier exceeds the 1M instruction budget, the current
error output shows a random execution trace that happens to be active
at the moment, which is not very helpful for debugging.

This series improves the error report using a profiler-inspired
approach: collect and count "callchain" stack traces that the verifier
visits during program validation, and report the top 3 hottest traces
when the budget is exhausted. To minimize performance an memory impact
of such profiling, only collect samples when verifier visits loop
headers, iterator next, may_goto and callback-calling instructions.

For callchains ending at iterator next, may_goto, or callback-calling
instructions, identify which registers or stack slots most frequently
differ between cached and current states.

Here is an example of the report for scx lavd_dispatch (an old
version), with verifier limited to 200K instructions to trigger the
error:

  lavd_dispatch():
    ; void BPF_STRUCT_OPS(lavd_dispatch, s32 cpu, struct task_struct *prev) @ main.bpf.c:889
    ... disassembly ...

  consume_task():
    ; bool consume_task(u64 cpu_dsq_id, u64 cpdom_dsq_id) @ balance.bpf.c:410
    ... disassembly ...

  #1 most visited simulated stacktrace (visited 1807 times):
    lavd_dispatch/124 (.../scx/scheds/rust/scx_lavd/src/bpf/main.bpf.c:1107)
    consume_task/2715 (.../scx/scheds/rust/scx_lavd/src/bpf/balance.bpf.c:316)

  #2 most visited simulated stacktrace (visited 1682 times):
    lavd_dispatch/124 (.../scx/scheds/rust/scx_lavd/src/bpf/main.bpf.c:1107)
    consume_task/2994 (.../scx/scheds/rust/scx_lavd/src/bpf/balance.bpf.c:386)

  #3 most visited simulated stacktrace (visited 8 times):
    lavd_dispatch/255 (.../scx/scheds/rust/scx_lavd/src/bpf/main.bpf.c:1022)
      Most varying: R7 (frame 0)

  BPF program is too large. Processed 200001 insn

Changelog:
v2 -> v3 (bots):
  - Compare leaf instruction index in callchain_matches_state().
  - Use cmp_int() in stat_diff_cmp() to avoid overflow warnings.
  - Add a comment in bpf_sample_state_diffs() on why it can't stale.
  - Add cond_resched() call in bpf_compute_loops() to guard against
    pathological inputs.
v1 -> v2 (bots):
  - Use kvfree() in bpf_compute_loops().
  - Adjust fwd_edges_no_loop test case to avoid dead code elimination
    converting 'if' to 'goto'.
  - Use GFP_KERNEL_ACCOUNT for callchain entry allocation in
    update_callchain_profile().
  - Zero-initialize 'cc' in update_callchain_profile() to avoid
    copying uninitialized stack memory to the heap.
  - Use %td instead of %ld for ptrdiff_t format specifier in
    print_callchain_entry() and disasm_subprog().
  - Size printed_subs bitmap as BPF_MAX_SUBPROGS + 2 to account for
    fake and exception subprograms.
  - Fix bpf_sample_state_diffs() inner loop to iterate from head
    instead of pos_i, avoiding container_of() on the dummy list head.
  - Add DIFF_OTHER to distinguish states that differ because of idmap
    or other inconsistencies.

v1: https://lore.kernel.org/bpf/20260526-better-1m-reporting-v1-0-51e4f2c59780@gmail.com/T/
v2: https://lore.kernel.org/bpf/20260526-better-1m-reporting-v2-0-e7ec61c45d41@gmail.com/T/
---
Eduard Zingerman (6):
      bpf: move live registers and scc printout to a standalone function
      bpf: compute loops hierarchy
      selftests/bpf: test cases for loop hierarchy computation
      bpf: report hot simulated callchains when 1M instructions limit is met
      bpf: report register diff summary for hot callchains
      selftests/bpf: test budget exhaustion profiling report

 include/linux/bpf_verifier.h                       |  39 ++++
 kernel/bpf/Makefile                                |   2 +-
 kernel/bpf/fixups.c                                |   5 +
 kernel/bpf/liveness.c                              |  22 +-
 kernel/bpf/loops.c                                 | 196 +++++++++++++++++
 kernel/bpf/states.c                                | 184 ++++++++++++++--
 kernel/bpf/verifier.c                              | 233 +++++++++++++++++++++
 tools/testing/selftests/bpf/prog_tests/verifier.c  |   4 +
 .../selftests/bpf/progs/verifier_budget_report.c   | 175 ++++++++++++++++
 .../selftests/bpf/progs/verifier_live_stack.c      |   2 +-
 .../selftests/bpf/progs/verifier_loop_hierarchy.c  | 233 +++++++++++++++++++++
 11 files changed, 1050 insertions(+), 45 deletions(-)
---
base-commit: 8496d9020ff37a33c2a7b2fc84350fd03ffbde78
change-id: 20260525-better-1m-reporting-1d795a21cf72

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH RFC bpf-next v3 1/6] bpf: move live registers and scc printout to a standalone function
  2026-05-27  7:29 [PATCH RFC bpf-next v3 0/6] bpf: better error reporting when verifier hits 1M instructions limit Eduard Zingerman
@ 2026-05-27  7:29 ` Eduard Zingerman
  2026-06-01  5:50   ` Emil Tsalapatis
  2026-05-27  7:29 ` [PATCH RFC bpf-next v3 2/6] bpf: compute loops hierarchy Eduard Zingerman
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 22+ messages in thread
From: Eduard Zingerman @ 2026-05-27  7:29 UTC (permalink / raw)
  To: bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song,
	Eduard Zingerman

Tests for multiple analyses performed by the verifier need the
verifier log to contain analysis results alongside the program
disassembly. This patch moves log level 2 program dump from
bpf_compute_live_registers() to a standalone function called from
bpf_check(), in order to provide a common logging function for such
analyses (and thus avoid printing program disassembly multiple times).

This is a preparatory refactoring, subsequent commit extends this log
with loop hierarchy.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 kernel/bpf/liveness.c                              | 22 +--------------
 kernel/bpf/verifier.c                              | 31 ++++++++++++++++++++++
 .../selftests/bpf/progs/verifier_live_stack.c      |  2 +-
 3 files changed, 33 insertions(+), 22 deletions(-)

diff --git a/kernel/bpf/liveness.c b/kernel/bpf/liveness.c
index 0aadfbae0acc..16d030fdd83a 100644
--- a/kernel/bpf/liveness.c
+++ b/kernel/bpf/liveness.c
@@ -2209,7 +2209,7 @@ int bpf_compute_live_registers(struct bpf_verifier_env *env)
 	struct bpf_insn *insns = env->prog->insnsi;
 	struct insn_live_regs *state;
 	int insn_cnt = env->prog->len;
-	int err = 0, i, j;
+	int err = 0, i;
 	bool changed;
 
 	/* Use the following algorithm:
@@ -2270,26 +2270,6 @@ int bpf_compute_live_registers(struct bpf_verifier_env *env)
 	for (i = 0; i < insn_cnt; ++i)
 		insn_aux[i].live_regs_before = state[i].in;
 
-	if (env->log.level & BPF_LOG_LEVEL2) {
-		verbose(env, "Live regs before insn:\n");
-		for (i = 0; i < insn_cnt; ++i) {
-			if (env->insn_aux_data[i].scc)
-				verbose(env, "%3d ", env->insn_aux_data[i].scc);
-			else
-				verbose(env, "    ");
-			verbose(env, "%3d: ", i);
-			for (j = BPF_REG_0; j < BPF_REG_10; ++j)
-				if (insn_aux[i].live_regs_before & BIT(j))
-					verbose(env, "%d", j);
-				else
-					verbose(env, ".");
-			verbose(env, " ");
-			bpf_verbose_insn(env, &insns[i]);
-			if (bpf_is_ldimm64(&insns[i]))
-				i++;
-		}
-	}
-
 out:
 	kvfree(state);
 	return err;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index c8d980fdd709..3d569e36128c 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -19619,6 +19619,34 @@ int bpf_fixup_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
 	return 0;
 }
 
+/* Various log level 2 information about the program */
+static void log_program(struct bpf_verifier_env *env)
+{
+	struct bpf_insn_aux_data *insn_aux = env->insn_aux_data;
+	struct bpf_insn *insns = env->prog->insnsi;
+	u32 insn_cnt = env->prog->len;
+	u32 i, j;
+
+	verbose(env, "Program dump (scc? insn#: live_regs_before):\n");
+	for (i = 0; i < insn_cnt; ++i) {
+		verbose_linfo(env, i, "    ; ");
+		if (env->insn_aux_data[i].scc)
+			verbose(env, "%3d ", env->insn_aux_data[i].scc);
+		else
+			verbose(env, "    ");
+		verbose(env, "%3d: ", i);
+		for (j = BPF_REG_0; j < BPF_REG_10; ++j)
+			if (insn_aux[i].live_regs_before & BIT(j))
+				verbose(env, "%d", j);
+			else
+				verbose(env, ".");
+		verbose(env, " ");
+		bpf_verbose_insn(env, &insns[i]);
+		if (bpf_is_ldimm64(&insns[i]))
+			i++;
+	}
+}
+
 int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr,
 	      struct bpf_log_attr *attr_log)
 {
@@ -19771,6 +19799,9 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr,
 	if (ret < 0)
 		goto skip_full_check;
 
+	if (env->log.level & BPF_LOG_LEVEL2)
+		log_program(env);
+
 	ret = mark_fastcall_patterns(env);
 	if (ret < 0)
 		goto skip_full_check;
diff --git a/tools/testing/selftests/bpf/progs/verifier_live_stack.c b/tools/testing/selftests/bpf/progs/verifier_live_stack.c
index 401152b2b64f..923572df4bbf 100644
--- a/tools/testing/selftests/bpf/progs/verifier_live_stack.c
+++ b/tools/testing/selftests/bpf/progs/verifier_live_stack.c
@@ -64,7 +64,7 @@ SEC("socket")
 __log_level(2)
 __msg("stack use/def subprog#0 must_write_not_same_slot (d0,cs0):")
 __msg("6: (7b) *(u64 *)(r2 +0) = r0{{$}}")
-__msg("Live regs before insn:")
+__msg("Program dump")
 __naked void must_write_not_same_slot(void)
 {
 	asm volatile (

-- 
2.53.0

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH RFC bpf-next v3 1/6] bpf: move live registers and scc printout to a standalone function
  2026-05-27  7:29 ` [PATCH RFC bpf-next v3 1/6] bpf: move live registers and scc printout to a standalone function Eduard Zingerman
@ 2026-06-01  5:50   ` Emil Tsalapatis
  0 siblings, 0 replies; 22+ messages in thread
From: Emil Tsalapatis @ 2026-06-01  5:50 UTC (permalink / raw)
  To: Eduard Zingerman, bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song

On Wed May 27, 2026 at 3:29 AM EDT, Eduard Zingerman wrote:
> Tests for multiple analyses performed by the verifier need the
> verifier log to contain analysis results alongside the program
> disassembly. This patch moves log level 2 program dump from
> bpf_compute_live_registers() to a standalone function called from
> bpf_check(), in order to provide a common logging function for such
> analyses (and thus avoid printing program disassembly multiple times).
>
> This is a preparatory refactoring, subsequent commit extends this log
> with loop hierarchy.
>
> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>

Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>

> ---
>  kernel/bpf/liveness.c                              | 22 +--------------
>  kernel/bpf/verifier.c                              | 31 ++++++++++++++++++++++
>  .../selftests/bpf/progs/verifier_live_stack.c      |  2 +-
>  3 files changed, 33 insertions(+), 22 deletions(-)
>
> diff --git a/kernel/bpf/liveness.c b/kernel/bpf/liveness.c
> index 0aadfbae0acc..16d030fdd83a 100644
> --- a/kernel/bpf/liveness.c
> +++ b/kernel/bpf/liveness.c
> @@ -2209,7 +2209,7 @@ int bpf_compute_live_registers(struct bpf_verifier_env *env)
>  	struct bpf_insn *insns = env->prog->insnsi;
>  	struct insn_live_regs *state;
>  	int insn_cnt = env->prog->len;
> -	int err = 0, i, j;
> +	int err = 0, i;
>  	bool changed;
>  
>  	/* Use the following algorithm:
> @@ -2270,26 +2270,6 @@ int bpf_compute_live_registers(struct bpf_verifier_env *env)
>  	for (i = 0; i < insn_cnt; ++i)
>  		insn_aux[i].live_regs_before = state[i].in;
>  
> -	if (env->log.level & BPF_LOG_LEVEL2) {
> -		verbose(env, "Live regs before insn:\n");
> -		for (i = 0; i < insn_cnt; ++i) {
> -			if (env->insn_aux_data[i].scc)
> -				verbose(env, "%3d ", env->insn_aux_data[i].scc);
> -			else
> -				verbose(env, "    ");
> -			verbose(env, "%3d: ", i);
> -			for (j = BPF_REG_0; j < BPF_REG_10; ++j)
> -				if (insn_aux[i].live_regs_before & BIT(j))
> -					verbose(env, "%d", j);
> -				else
> -					verbose(env, ".");
> -			verbose(env, " ");
> -			bpf_verbose_insn(env, &insns[i]);
> -			if (bpf_is_ldimm64(&insns[i]))
> -				i++;
> -		}
> -	}
> -
>  out:
>  	kvfree(state);
>  	return err;
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index c8d980fdd709..3d569e36128c 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -19619,6 +19619,34 @@ int bpf_fixup_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
>  	return 0;
>  }
>  
> +/* Various log level 2 information about the program */
> +static void log_program(struct bpf_verifier_env *env)
> +{
> +	struct bpf_insn_aux_data *insn_aux = env->insn_aux_data;
> +	struct bpf_insn *insns = env->prog->insnsi;
> +	u32 insn_cnt = env->prog->len;
> +	u32 i, j;
> +
> +	verbose(env, "Program dump (scc? insn#: live_regs_before):\n");
> +	for (i = 0; i < insn_cnt; ++i) {
> +		verbose_linfo(env, i, "    ; ");
> +		if (env->insn_aux_data[i].scc)
> +			verbose(env, "%3d ", env->insn_aux_data[i].scc);
> +		else
> +			verbose(env, "    ");
> +		verbose(env, "%3d: ", i);
> +		for (j = BPF_REG_0; j < BPF_REG_10; ++j)
> +			if (insn_aux[i].live_regs_before & BIT(j))
> +				verbose(env, "%d", j);
> +			else
> +				verbose(env, ".");
> +		verbose(env, " ");
> +		bpf_verbose_insn(env, &insns[i]);
> +		if (bpf_is_ldimm64(&insns[i]))
> +			i++;
> +	}
> +}
> +
>  int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr,
>  	      struct bpf_log_attr *attr_log)
>  {
> @@ -19771,6 +19799,9 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr,
>  	if (ret < 0)
>  		goto skip_full_check;
>  
> +	if (env->log.level & BPF_LOG_LEVEL2)
> +		log_program(env);
> +
>  	ret = mark_fastcall_patterns(env);
>  	if (ret < 0)
>  		goto skip_full_check;
> diff --git a/tools/testing/selftests/bpf/progs/verifier_live_stack.c b/tools/testing/selftests/bpf/progs/verifier_live_stack.c
> index 401152b2b64f..923572df4bbf 100644
> --- a/tools/testing/selftests/bpf/progs/verifier_live_stack.c
> +++ b/tools/testing/selftests/bpf/progs/verifier_live_stack.c
> @@ -64,7 +64,7 @@ SEC("socket")
>  __log_level(2)
>  __msg("stack use/def subprog#0 must_write_not_same_slot (d0,cs0):")
>  __msg("6: (7b) *(u64 *)(r2 +0) = r0{{$}}")
> -__msg("Live regs before insn:")
> +__msg("Program dump")
>  __naked void must_write_not_same_slot(void)
>  {
>  	asm volatile (


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH RFC bpf-next v3 2/6] bpf: compute loops hierarchy
  2026-05-27  7:29 [PATCH RFC bpf-next v3 0/6] bpf: better error reporting when verifier hits 1M instructions limit Eduard Zingerman
  2026-05-27  7:29 ` [PATCH RFC bpf-next v3 1/6] bpf: move live registers and scc printout to a standalone function Eduard Zingerman
@ 2026-05-27  7:29 ` Eduard Zingerman
  2026-06-01 19:12   ` Emil Tsalapatis
  2026-05-27  7:29 ` [PATCH RFC bpf-next v3 3/6] selftests/bpf: test cases for loop hierarchy computation Eduard Zingerman
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 22+ messages in thread
From: Eduard Zingerman @ 2026-05-27  7:29 UTC (permalink / raw)
  To: bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song,
	Eduard Zingerman

Add an analysis phase computing the loops hierarchy to the bpf
verifier. The algorithm is based on the paper:
"A New Algorithm for Identifying Loops in Decompilation" by Wei et al,
adapted to be non-recursive.

Record the loops hierarchy per instruction:
- The field insn_aux_data->loop_header records the innermost loop
  header containing it.
- The field insn_aux_data->loop records additional information about
  the loop if this instruction is a loop header (at the moment it only
  tracks irreduciblity of the loop, additional information will be
  added later).

Subsequent patches use loop header entries as sampling points for
callchain profiling, for better error reports when the 1M instruction
budget is exhausted.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 include/linux/bpf_verifier.h |  10 +++
 kernel/bpf/Makefile          |   2 +-
 kernel/bpf/fixups.c          |   5 ++
 kernel/bpf/loops.c           | 196 +++++++++++++++++++++++++++++++++++++++++++
 kernel/bpf/verifier.c        |  10 ++-
 5 files changed, 221 insertions(+), 2 deletions(-)

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 5cbad3b64130..d0a1fb135cbf 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -656,6 +656,10 @@ struct bpf_iarray {
 	u32 items[];
 };
 
+struct bpf_loop {
+	bool irreducible;
+};
+
 struct bpf_insn_aux_data {
 	union {
 		enum bpf_reg_type ptr_type;	/* pointer type for load/store insns */
@@ -742,6 +746,10 @@ struct bpf_insn_aux_data {
 	u16 const_reg_map_mask;
 	u16 const_reg_subprog_mask;
 	u32 const_reg_vals[10];
+	/* index of a loop header of the innermost loop containing this instruction, -1 if none */
+	s32 loop_header;
+	/* additional information about the loop if this instruction is a loop header */
+	struct bpf_loop *loop;
 };
 
 #define MAX_USED_MAPS 64 /* max number of maps accessed by one eBPF program */
@@ -1608,4 +1616,6 @@ int bpf_jit_subprogs(struct bpf_verifier_env *env);
 int bpf_fixup_call_args(struct bpf_verifier_env *env);
 int bpf_do_misc_fixups(struct bpf_verifier_env *env);
 
+int bpf_compute_loops(struct bpf_verifier_env *env);
+
 #endif /* _LINUX_BPF_VERIFIER_H */
diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile
index 4dc41bf5780c..787d183ba5b4 100644
--- a/kernel/bpf/Makefile
+++ b/kernel/bpf/Makefile
@@ -6,7 +6,7 @@ cflags-nogcse-$(CONFIG_X86)$(CONFIG_CC_IS_GCC) := -fno-gcse
 endif
 CFLAGS_core.o += -Wno-override-init $(cflags-nogcse-yy)
 
-obj-$(CONFIG_BPF_SYSCALL) += syscall.o verifier.o inode.o helpers.o tnum.o cnum.o log.o token.o liveness.o const_fold.o
+obj-$(CONFIG_BPF_SYSCALL) += syscall.o verifier.o inode.o helpers.o tnum.o cnum.o log.o token.o liveness.o const_fold.o loops.o
 obj-$(CONFIG_BPF_SYSCALL) += bpf_iter.o map_iter.o task_iter.o prog_iter.o link_iter.o
 obj-$(CONFIG_BPF_SYSCALL) += hashtab.o arraymap.o percpu_freelist.o bpf_lru_list.o lpm_trie.o map_in_map.o bloom_filter.o
 obj-$(CONFIG_BPF_SYSCALL) += local_storage.o queue_stack_maps.o ringbuf.o bpf_insn_array.o
diff --git a/kernel/bpf/fixups.c b/kernel/bpf/fixups.c
index 12739add2dda..57294524dab8 100644
--- a/kernel/bpf/fixups.c
+++ b/kernel/bpf/fixups.c
@@ -450,6 +450,11 @@ void bpf_clear_insn_aux_data(struct bpf_verifier_env *env, int start, int len)
 			aux_data[i].jt = NULL;
 		}
 
+		if (aux_data[i].loop) {
+			kvfree(aux_data[i].loop);
+			aux_data[i].loop = NULL;
+		}
+
 		if (bpf_is_ldimm64(&insns[i]))
 			i++;
 	}
diff --git a/kernel/bpf/loops.c b/kernel/bpf/loops.c
new file mode 100644
index 000000000000..aec11248d048
--- /dev/null
+++ b/kernel/bpf/loops.c
@@ -0,0 +1,196 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright (c) 2026 Meta Platforms, Inc. and affiliates. */
+
+#include <linux/slab.h>
+#include <linux/bpf_verifier.h>
+
+struct dfs_state {
+	u32 traversed:1;
+	u32 next_succ:31;
+};
+
+struct loops_dfs {
+	struct dfs_state *state;
+	int *dfs_pos;
+	int *stack;
+};
+
+static void mark_irreducible(struct bpf_verifier_env *env, int h)
+{
+	env->insn_aux_data[h].loop->irreducible = true;
+}
+
+static int mark_header(struct bpf_verifier_env *env, int h)
+{
+	struct bpf_insn_aux_data *aux = env->insn_aux_data;
+
+	if (!aux[h].loop) {
+		aux[h].loop = kvzalloc_obj(struct bpf_loop, GFP_KERNEL_ACCOUNT);
+		if (!aux[h].loop)
+			return -ENOMEM;
+	}
+	return 0;
+}
+
+static int assign_header(struct bpf_verifier_env *env, struct loops_dfs *dfs, int n, int h)
+{
+	struct bpf_insn_aux_data *aux = env->insn_aux_data;
+	int *dfs_pos = dfs->dfs_pos;
+	int err, nh;
+
+	err = mark_header(env, h);
+	if (err)
+		return err;
+
+	/* Don't encode self-loops, otherwise can't reflect loops nesting structure. */
+	if (n == h)
+		return 0;
+
+	/* Make sure that loop headers up the chain are sorted by dfs_pos. */
+	while (aux[n].loop_header != -1) {
+		nh = aux[n].loop_header;
+		if (nh == h)
+			return 0;
+		if (dfs_pos[nh] < dfs_pos[h]) {
+			aux[n].loop_header = h;
+			n = h;
+			h = nh;
+		} else {
+			n = nh;
+		}
+	}
+	aux[n].loop_header = h;
+	return 0;
+}
+
+/*
+ * As described in "A New Algorithm for Identifying Loops in Decompilation" by Wei et al,
+ * adapted to be non-recursive.
+ */
+static int compute_loops_in_subprog(struct bpf_verifier_env *env, struct loops_dfs *dfs,
+				    int subprog_idx)
+{
+	struct bpf_insn_aux_data *aux = env->insn_aux_data;
+	struct dfs_state *state = dfs->state;
+	int start = env->subprog_info[subprog_idx].start;
+	int *dfs_pos = dfs->dfs_pos;
+	int *stack = dfs->stack;
+	int i, s, h, err, cur, stack_sz;
+	struct bpf_iarray *succ;
+
+	stack[0] = start;
+	state[start].traversed = true;
+	state[start].next_succ = 0;
+	dfs_pos[start] = 1;
+	stack_sz = 1;
+	i = 0;
+	do {
+		/*
+		 * The algorithm should be very fast in practice,
+		 * guard against pathological inputs, just in case.
+		 */
+		if (i++ == 1024) {
+			i = 0;
+			if (signal_pending(current))
+				return -EAGAIN;
+			cond_resched();
+		}
+
+		cur = stack[stack_sz - 1];
+		succ = bpf_insn_successors(env, cur);
+		if (state[cur].next_succ == succ->cnt) {
+			dfs_pos[cur] = 0;
+			stack_sz--;
+			continue;
+		}
+		s = succ->items[state[cur].next_succ];
+		if (!state[s].traversed) {
+			/* Case A:  start -> ... -> cur -> s [unxplored] */
+			state[s].traversed = true;
+			state[s].next_succ = 0;
+			stack[stack_sz] = s;
+			dfs_pos[s] = stack_sz + 1;
+			stack_sz++;
+			continue;
+		}
+		/* 's' is fully explored at this point */
+		if (dfs_pos[s]) {
+			/*
+			 * start -> ... -> s -> cur --.
+			 *                 ^          |
+			 *                 '----------'
+			 * Case B: 's' is in the current DFS path.
+			 */
+			err = assign_header(env, dfs, cur, s);
+			if (err)
+				return err;
+		} else if (aux[s].loop_header == -1) {
+			/*
+			 * start -> ... -> ... -> s -> ... -> end
+			 *           |            ^
+			 *           '---> cur ---'
+			 * Case C: 's' is explored, not in the current DFS path,
+			 * and not a part of any loop.
+			 */
+		} else if (dfs_pos[aux[s].loop_header]) {
+			/*
+			 *                 .----------------------.
+			 *                 v                      |
+			 * start -> ... -> h -> ... -> ... -> s --'
+			 *                       |            ^
+			 *	                 '---> cur ---'
+			 * Case D: 's' is explored, not in current DFS path,
+			 * but it's innermost loop header is.
+			 */
+			err = assign_header(env, dfs, cur, aux[s].loop_header);
+			if (err)
+				return err;
+		} else {
+			/* case E */
+			h = aux[s].loop_header;
+			mark_irreducible(env, h);
+			/* can also mark 's' as reentry, but no need for now */
+			while (aux[h].loop_header != -1) {
+				h = aux[h].loop_header;
+				if (dfs_pos[h]) {
+					err = assign_header(env, dfs, cur, h);
+					if (err)
+						return err;
+					break;
+				}
+				mark_irreducible(env, h);
+			}
+		}
+		state[cur].next_succ++;
+	} while (stack_sz);
+
+	return 0;
+}
+
+int bpf_compute_loops(struct bpf_verifier_env *env)
+{
+	struct bpf_insn_aux_data *aux = env->insn_aux_data;
+	int i, err = 0, len = env->prog->len;
+	struct loops_dfs dfs = {};
+
+	dfs.dfs_pos = kvcalloc(len, sizeof(int), GFP_KERNEL_ACCOUNT);
+	dfs.state = kvcalloc(len, sizeof(struct dfs_state), GFP_KERNEL_ACCOUNT);
+	dfs.stack = kvcalloc(len, sizeof(int), GFP_KERNEL_ACCOUNT);
+	if (!dfs.dfs_pos || !dfs.state || !dfs.stack) {
+		err = -ENOMEM;
+		goto out;
+	}
+	for (i = 0; i < len; i++)
+		aux[i].loop_header = -1;
+	for (i = 0; i < env->subprog_cnt; i++) {
+		err = compute_loops_in_subprog(env, &dfs, i);
+		if (err)
+			goto out;
+	}
+
+out:
+	kvfree(dfs.dfs_pos);
+	kvfree(dfs.stack);
+	kvfree(dfs.state);
+	return err;
+}
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 3d569e36128c..2b3584712ad2 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -19627,13 +19627,17 @@ static void log_program(struct bpf_verifier_env *env)
 	u32 insn_cnt = env->prog->len;
 	u32 i, j;
 
-	verbose(env, "Program dump (scc? insn#: live_regs_before):\n");
+	verbose(env, "Program dump (scc? loop_header? insn#: live_regs_before):\n");
 	for (i = 0; i < insn_cnt; ++i) {
 		verbose_linfo(env, i, "    ; ");
 		if (env->insn_aux_data[i].scc)
 			verbose(env, "%3d ", env->insn_aux_data[i].scc);
 		else
 			verbose(env, "    ");
+		if (env->insn_aux_data[i].loop_header >= 0)
+			verbose(env, "%3d ", env->insn_aux_data[i].loop_header);
+		else
+			verbose(env, "    ");
 		verbose(env, "%3d: ", i);
 		for (j = BPF_REG_0; j < BPF_REG_10; ++j)
 			if (insn_aux[i].live_regs_before & BIT(j))
@@ -19791,6 +19795,10 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr,
 	if (ret < 0)
 		goto skip_full_check;
 
+	ret = bpf_compute_loops(env);
+	if (ret < 0)
+		goto skip_full_check;
+
 	ret = bpf_compute_scc(env);
 	if (ret < 0)
 		goto skip_full_check;

-- 
2.53.0

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH RFC bpf-next v3 2/6] bpf: compute loops hierarchy
  2026-05-27  7:29 ` [PATCH RFC bpf-next v3 2/6] bpf: compute loops hierarchy Eduard Zingerman
@ 2026-06-01 19:12   ` Emil Tsalapatis
  2026-06-01 19:22     ` Eduard Zingerman
  0 siblings, 1 reply; 22+ messages in thread
From: Emil Tsalapatis @ 2026-06-01 19:12 UTC (permalink / raw)
  To: Eduard Zingerman, bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song

On Wed May 27, 2026 at 3:29 AM EDT, Eduard Zingerman wrote:
> Add an analysis phase computing the loops hierarchy to the bpf
> verifier. The algorithm is based on the paper:
> "A New Algorithm for Identifying Loops in Decompilation" by Wei et al,
> adapted to be non-recursive.
>
> Record the loops hierarchy per instruction:
> - The field insn_aux_data->loop_header records the innermost loop
>   header containing it.
> - The field insn_aux_data->loop records additional information about
>   the loop if this instruction is a loop header (at the moment it only
>   tracks irreduciblity of the loop, additional information will be
>   added later).
>
> Subsequent patches use loop header entries as sampling points for
> callchain profiling, for better error reports when the 1M instruction
> budget is exhausted.
>
> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>

Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>

Nits below.
> ---
>  include/linux/bpf_verifier.h |  10 +++
>  kernel/bpf/Makefile          |   2 +-
>  kernel/bpf/fixups.c          |   5 ++
>  kernel/bpf/loops.c           | 196 +++++++++++++++++++++++++++++++++++++++++++
>  kernel/bpf/verifier.c        |  10 ++-
>  5 files changed, 221 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> index 5cbad3b64130..d0a1fb135cbf 100644
> --- a/include/linux/bpf_verifier.h
> +++ b/include/linux/bpf_verifier.h
> @@ -656,6 +656,10 @@ struct bpf_iarray {
>  	u32 items[];
>  };
>  
> +struct bpf_loop {
> +	bool irreducible;
> +};
> +
>  struct bpf_insn_aux_data {
>  	union {
>  		enum bpf_reg_type ptr_type;	/* pointer type for load/store insns */
> @@ -742,6 +746,10 @@ struct bpf_insn_aux_data {
>  	u16 const_reg_map_mask;
>  	u16 const_reg_subprog_mask;
>  	u32 const_reg_vals[10];
> +	/* index of a loop header of the innermost loop containing this instruction, -1 if none */
> +	s32 loop_header;
> +	/* additional information about the loop if this instruction is a loop header */
> +	struct bpf_loop *loop;
>  };
>  
>  #define MAX_USED_MAPS 64 /* max number of maps accessed by one eBPF program */
> @@ -1608,4 +1616,6 @@ int bpf_jit_subprogs(struct bpf_verifier_env *env);
>  int bpf_fixup_call_args(struct bpf_verifier_env *env);
>  int bpf_do_misc_fixups(struct bpf_verifier_env *env);
>  
> +int bpf_compute_loops(struct bpf_verifier_env *env);
> +
>  #endif /* _LINUX_BPF_VERIFIER_H */
> diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile
> index 4dc41bf5780c..787d183ba5b4 100644
> --- a/kernel/bpf/Makefile
> +++ b/kernel/bpf/Makefile
> @@ -6,7 +6,7 @@ cflags-nogcse-$(CONFIG_X86)$(CONFIG_CC_IS_GCC) := -fno-gcse
>  endif
>  CFLAGS_core.o += -Wno-override-init $(cflags-nogcse-yy)
>  
> -obj-$(CONFIG_BPF_SYSCALL) += syscall.o verifier.o inode.o helpers.o tnum.o cnum.o log.o token.o liveness.o const_fold.o
> +obj-$(CONFIG_BPF_SYSCALL) += syscall.o verifier.o inode.o helpers.o tnum.o cnum.o log.o token.o liveness.o const_fold.o loops.o
>  obj-$(CONFIG_BPF_SYSCALL) += bpf_iter.o map_iter.o task_iter.o prog_iter.o link_iter.o
>  obj-$(CONFIG_BPF_SYSCALL) += hashtab.o arraymap.o percpu_freelist.o bpf_lru_list.o lpm_trie.o map_in_map.o bloom_filter.o
>  obj-$(CONFIG_BPF_SYSCALL) += local_storage.o queue_stack_maps.o ringbuf.o bpf_insn_array.o
> diff --git a/kernel/bpf/fixups.c b/kernel/bpf/fixups.c
> index 12739add2dda..57294524dab8 100644
> --- a/kernel/bpf/fixups.c
> +++ b/kernel/bpf/fixups.c
> @@ -450,6 +450,11 @@ void bpf_clear_insn_aux_data(struct bpf_verifier_env *env, int start, int len)
>  			aux_data[i].jt = NULL;
>  		}
>  
> +		if (aux_data[i].loop) {
> +			kvfree(aux_data[i].loop);
> +			aux_data[i].loop = NULL;
> +		}
> +
>  		if (bpf_is_ldimm64(&insns[i]))
>  			i++;
>  	}
> diff --git a/kernel/bpf/loops.c b/kernel/bpf/loops.c
> new file mode 100644
> index 000000000000..aec11248d048
> --- /dev/null
> +++ b/kernel/bpf/loops.c
> @@ -0,0 +1,196 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/* Copyright (c) 2026 Meta Platforms, Inc. and affiliates. */
> +
> +#include <linux/slab.h>
> +#include <linux/bpf_verifier.h>
> +
> +struct dfs_state {
> +	u32 traversed:1;
> +	u32 next_succ:31;
> +};
> +
> +struct loops_dfs {
> +	struct dfs_state *state;
> +	int *dfs_pos;
> +	int *stack;
> +};
> +
> +static void mark_irreducible(struct bpf_verifier_env *env, int h)
> +{
> +	env->insn_aux_data[h].loop->irreducible = true;
> +}
> +
> +static int mark_header(struct bpf_verifier_env *env, int h)

Maybe a more descriptive name? Not sure what we are marking the header
as.

> +{
> +	struct bpf_insn_aux_data *aux = env->insn_aux_data;
> +
> +	if (!aux[h].loop) {
> +		aux[h].loop = kvzalloc_obj(struct bpf_loop, GFP_KERNEL_ACCOUNT);
> +		if (!aux[h].loop)
> +			return -ENOMEM;
> +	}
> +	return 0;
> +}
> +
> +static int assign_header(struct bpf_verifier_env *env, struct loops_dfs *dfs, int n, int h)
> +{
> +	struct bpf_insn_aux_data *aux = env->insn_aux_data;
> +	int *dfs_pos = dfs->dfs_pos;
> +	int err, nh;
> +
> +	err = mark_header(env, h);
> +	if (err)
> +		return err;
> +
> +	/* Don't encode self-loops, otherwise can't reflect loops nesting structure. */
> +	if (n == h)
> +		return 0;
> +
> +	/* Make sure that loop headers up the chain are sorted by dfs_pos. */
> +	while (aux[n].loop_header != -1) {
> +		nh = aux[n].loop_header;
> +		if (nh == h)
> +			return 0;
> +		if (dfs_pos[nh] < dfs_pos[h]) {
> +			aux[n].loop_header = h;
> +			n = h;
> +			h = nh;
> +		} else {
> +			n = nh;
> +		}
> +	}
> +	aux[n].loop_header = h;
> +	return 0;
> +}
> +
> +/*
> + * As described in "A New Algorithm for Identifying Loops in Decompilation" by Wei et al,
> + * adapted to be non-recursive.
> + */
> +static int compute_loops_in_subprog(struct bpf_verifier_env *env, struct loops_dfs *dfs,
> +				    int subprog_idx)
> +{
> +	struct bpf_insn_aux_data *aux = env->insn_aux_data;
> +	struct dfs_state *state = dfs->state;
> +	int start = env->subprog_info[subprog_idx].start;
> +	int *dfs_pos = dfs->dfs_pos;
> +	int *stack = dfs->stack;
> +	int i, s, h, err, cur, stack_sz;
> +	struct bpf_iarray *succ;
> +
> +	stack[0] = start;
> +	state[start].traversed = true;
> +	state[start].next_succ = 0;
> +	dfs_pos[start] = 1;
> +	stack_sz = 1;
> +	i = 0;
> +	do {
> +		/*
> +		 * The algorithm should be very fast in practice,
> +		 * guard against pathological inputs, just in case.
> +		 */
> +		if (i++ == 1024) {
> +			i = 0;
Nit: Maybe 

if (!(++i % 1024))

?

> +			if (signal_pending(current))
> +				return -EAGAIN;
> +			cond_resched();
> +		}
> +
> +		cur = stack[stack_sz - 1];
> +		succ = bpf_insn_successors(env, cur);
> +		if (state[cur].next_succ == succ->cnt) {
> +			dfs_pos[cur] = 0;
> +			stack_sz--;
> +			continue;
> +		}
> +		s = succ->items[state[cur].next_succ];
> +		if (!state[s].traversed) {
> +			/* Case A:  start -> ... -> cur -> s [unxplored] */
Typo: unxplored -> unexplored
> +			state[s].traversed = true;
> +			state[s].next_succ = 0;
> +			stack[stack_sz] = s;
> +			dfs_pos[s] = stack_sz + 1;
> +			stack_sz++;
> +			continue;
> +		}
> +		/* 's' is fully explored at this point */
> +		if (dfs_pos[s]) {
> +			/*
> +			 * start -> ... -> s -> cur --.
> +			 *                 ^          |
> +			 *                 '----------'
> +			 * Case B: 's' is in the current DFS path.
> +			 */
> +			err = assign_header(env, dfs, cur, s);
> +			if (err)
> +				return err;
> +		} else if (aux[s].loop_header == -1) {
> +			/*
> +			 * start -> ... -> ... -> s -> ... -> end
> +			 *           |            ^
> +			 *           '---> cur ---'
> +			 * Case C: 's' is explored, not in the current DFS path,
> +			 * and not a part of any loop.
> +			 */
> +		} else if (dfs_pos[aux[s].loop_header]) {
> +			/*
> +			 *                 .----------------------.
> +			 *                 v                      |
> +			 * start -> ... -> h -> ... -> ... -> s --'
> +			 *                       |            ^
> +			 *	                 '---> cur ---'
> +			 * Case D: 's' is explored, not in current DFS path,
> +			 * but it's innermost loop header is.
Typo: it's -> its
> +			 */
> +			err = assign_header(env, dfs, cur, aux[s].loop_header);
> +			if (err)
> +				return err;
> +		} else {
> +			/* case E */
> +			h = aux[s].loop_header;
> +			mark_irreducible(env, h);
> +			/* can also mark 's' as reentry, but no need for now */
> +			while (aux[h].loop_header != -1) {
> +				h = aux[h].loop_header;
> +				if (dfs_pos[h]) {
> +					err = assign_header(env, dfs, cur, h);
> +					if (err)
> +						return err;
> +					break;
> +				}
> +				mark_irreducible(env, h);
> +			}
> +		}
> +		state[cur].next_succ++;
> +	} while (stack_sz);
> +
> +	return 0;
> +}
> +
> +int bpf_compute_loops(struct bpf_verifier_env *env)
> +{
> +	struct bpf_insn_aux_data *aux = env->insn_aux_data;
> +	int i, err = 0, len = env->prog->len;
> +	struct loops_dfs dfs = {};
> +
> +	dfs.dfs_pos = kvcalloc(len, sizeof(int), GFP_KERNEL_ACCOUNT);
> +	dfs.state = kvcalloc(len, sizeof(struct dfs_state), GFP_KERNEL_ACCOUNT);
> +	dfs.stack = kvcalloc(len, sizeof(int), GFP_KERNEL_ACCOUNT);
> +	if (!dfs.dfs_pos || !dfs.state || !dfs.stack) {
> +		err = -ENOMEM;
> +		goto out;
> +	}
> +	for (i = 0; i < len; i++)
> +		aux[i].loop_header = -1;
> +	for (i = 0; i < env->subprog_cnt; i++) {
> +		err = compute_loops_in_subprog(env, &dfs, i);
> +		if (err)
> +			goto out;
> +	}
> +
> +out:
> +	kvfree(dfs.dfs_pos);
> +	kvfree(dfs.stack);
> +	kvfree(dfs.state);
> +	return err;
> +}
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 3d569e36128c..2b3584712ad2 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -19627,13 +19627,17 @@ static void log_program(struct bpf_verifier_env *env)
>  	u32 insn_cnt = env->prog->len;
>  	u32 i, j;
>  
> -	verbose(env, "Program dump (scc? insn#: live_regs_before):\n");
> +	verbose(env, "Program dump (scc? loop_header? insn#: live_regs_before):\n");
>  	for (i = 0; i < insn_cnt; ++i) {
>  		verbose_linfo(env, i, "    ; ");
>  		if (env->insn_aux_data[i].scc)
>  			verbose(env, "%3d ", env->insn_aux_data[i].scc);
>  		else
>  			verbose(env, "    ");
> +		if (env->insn_aux_data[i].loop_header >= 0)
> +			verbose(env, "%3d ", env->insn_aux_data[i].loop_header);
> +		else
> +			verbose(env, "    ");
>  		verbose(env, "%3d: ", i);
>  		for (j = BPF_REG_0; j < BPF_REG_10; ++j)
>  			if (insn_aux[i].live_regs_before & BIT(j))
> @@ -19791,6 +19795,10 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr,
>  	if (ret < 0)
>  		goto skip_full_check;
>  
> +	ret = bpf_compute_loops(env);
> +	if (ret < 0)
> +		goto skip_full_check;
> +
>  	ret = bpf_compute_scc(env);
>  	if (ret < 0)
>  		goto skip_full_check;


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH RFC bpf-next v3 2/6] bpf: compute loops hierarchy
  2026-06-01 19:12   ` Emil Tsalapatis
@ 2026-06-01 19:22     ` Eduard Zingerman
  2026-06-01 19:29       ` Emil Tsalapatis
  0 siblings, 1 reply; 22+ messages in thread
From: Eduard Zingerman @ 2026-06-01 19:22 UTC (permalink / raw)
  To: Emil Tsalapatis, bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song

On Mon, 2026-06-01 at 15:12 -0400, Emil Tsalapatis wrote:

[...]

> > +static int mark_header(struct bpf_verifier_env *env, int h)
> 
> Maybe a more descriptive name? Not sure what we are marking the header
> as.

The idea is to mark instruction at index 'h' as a header, hence mark_header().
Can change to 'mark_as_header()', open to other suggestions.
I'll add a comment on how this is different from assign_header()
(which restores hierarchy, as far as I remember).

> > +{
> > +	struct bpf_insn_aux_data *aux = env->insn_aux_data;
> > +
> > +	if (!aux[h].loop) {
> > +		aux[h].loop = kvzalloc_obj(struct bpf_loop, GFP_KERNEL_ACCOUNT);
> > +		if (!aux[h].loop)
> > +			return -ENOMEM;
> > +	}
> > +	return 0;
> > +}
> > +
> > +static int assign_header(struct bpf_verifier_env *env, struct loops_dfs *dfs, int n, int h)
> > +{
> > +	struct bpf_insn_aux_data *aux = env->insn_aux_data;
> > +	int *dfs_pos = dfs->dfs_pos;
> > +	int err, nh;
> > +
> > +	err = mark_header(env, h);
> > +	if (err)
> > +		return err;
> > +
> > +	/* Don't encode self-loops, otherwise can't reflect loops nesting structure. */
> > +	if (n == h)
> > +		return 0;
> > +
> > +	/* Make sure that loop headers up the chain are sorted by dfs_pos. */
> > +	while (aux[n].loop_header != -1) {
> > +		nh = aux[n].loop_header;
> > +		if (nh == h)
> > +			return 0;
> > +		if (dfs_pos[nh] < dfs_pos[h]) {
> > +			aux[n].loop_header = h;
> > +			n = h;
> > +			h = nh;
> > +		} else {
> > +			n = nh;
> > +		}
> > +	}
> > +	aux[n].loop_header = h;
> > +	return 0;
> > +}
> > +
> > +/*
> > + * As described in "A New Algorithm for Identifying Loops in Decompilation" by Wei et al,
> > + * adapted to be non-recursive.
> > + */
> > +static int compute_loops_in_subprog(struct bpf_verifier_env *env, struct loops_dfs *dfs,
> > +				    int subprog_idx)
> > +{
> > +	struct bpf_insn_aux_data *aux = env->insn_aux_data;
> > +	struct dfs_state *state = dfs->state;
> > +	int start = env->subprog_info[subprog_idx].start;
> > +	int *dfs_pos = dfs->dfs_pos;
> > +	int *stack = dfs->stack;
> > +	int i, s, h, err, cur, stack_sz;
> > +	struct bpf_iarray *succ;
> > +
> > +	stack[0] = start;
> > +	state[start].traversed = true;
> > +	state[start].next_succ = 0;
> > +	dfs_pos[start] = 1;
> > +	stack_sz = 1;
> > +	i = 0;
> > +	do {
> > +		/*
> > +		 * The algorithm should be very fast in practice,
> > +		 * guard against pathological inputs, just in case.
> > +		 */
> > +		if (i++ == 1024) {
> > +			i = 0;
> Nit: Maybe 
> 
> if (!(++i % 1024))
> 
> ?

Ack.

> > +			if (signal_pending(current))
> > +				return -EAGAIN;
> > +			cond_resched();
> > +		}
> > +
> > +		cur = stack[stack_sz - 1];
> > +		succ = bpf_insn_successors(env, cur);
> > +		if (state[cur].next_succ == succ->cnt) {
> > +			dfs_pos[cur] = 0;
> > +			stack_sz--;
> > +			continue;
> > +		}
> > +		s = succ->items[state[cur].next_succ];
> > +		if (!state[s].traversed) {
> > +			/* Case A:  start -> ... -> cur -> s [unxplored] */
> Typo: unxplored -> unexplored

Ack.

> > +			state[s].traversed = true;
> > +			state[s].next_succ = 0;
> > +			stack[stack_sz] = s;
> > +			dfs_pos[s] = stack_sz + 1;
> > +			stack_sz++;
> > +			continue;
> > +		}
> > +		/* 's' is fully explored at this point */
> > +		if (dfs_pos[s]) {
> > +			/*
> > +			 * start -> ... -> s -> cur --.
> > +			 *                 ^          |
> > +			 *                 '----------'
> > +			 * Case B: 's' is in the current DFS path.
> > +			 */
> > +			err = assign_header(env, dfs, cur, s);
> > +			if (err)
> > +				return err;
> > +		} else if (aux[s].loop_header == -1) {
> > +			/*
> > +			 * start -> ... -> ... -> s -> ... -> end
> > +			 *           |            ^
> > +			 *           '---> cur ---'
> > +			 * Case C: 's' is explored, not in the current DFS path,
> > +			 * and not a part of any loop.
> > +			 */
> > +		} else if (dfs_pos[aux[s].loop_header]) {
> > +			/*
> > +			 *                 .----------------------.
> > +			 *                 v                      |
> > +			 * start -> ... -> h -> ... -> ... -> s --'
> > +			 *                       |            ^
> > +			 *	                 '---> cur ---'
> > +			 * Case D: 's' is explored, not in current DFS path,
> > +			 * but it's innermost loop header is.
> Typo: it's -> its

Ack.

[...]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH RFC bpf-next v3 2/6] bpf: compute loops hierarchy
  2026-06-01 19:22     ` Eduard Zingerman
@ 2026-06-01 19:29       ` Emil Tsalapatis
  0 siblings, 0 replies; 22+ messages in thread
From: Emil Tsalapatis @ 2026-06-01 19:29 UTC (permalink / raw)
  To: Eduard Zingerman, Emil Tsalapatis, bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song

On Mon Jun 1, 2026 at 3:22 PM EDT, Eduard Zingerman wrote:
> On Mon, 2026-06-01 at 15:12 -0400, Emil Tsalapatis wrote:
>
> [...]
>
>> > +static int mark_header(struct bpf_verifier_env *env, int h)
>> 
>> Maybe a more descriptive name? Not sure what we are marking the header
>> as.
>
> The idea is to mark instruction at index 'h' as a header, hence mark_header().
> Can change to 'mark_as_header()', open to other suggestions.
> I'll add a comment on how this is different from assign_header()
> (which restores hierarchy, as far as I remember).
>

mark_as_header() sounds much better, thanks.

>> > +{
>> > +	struct bpf_insn_aux_data *aux = env->insn_aux_data;
>> > +
>> > +	if (!aux[h].loop) {
>> > +		aux[h].loop = kvzalloc_obj(struct bpf_loop, GFP_KERNEL_ACCOUNT);
>> > +		if (!aux[h].loop)
>> > +			return -ENOMEM;
>> > +	}
>> > +	return 0;
>> > +}
>> > +
>> > +static int assign_header(struct bpf_verifier_env *env, struct loops_dfs *dfs, int n, int h)
>> > +{
>> > +	struct bpf_insn_aux_data *aux = env->insn_aux_data;
>> > +	int *dfs_pos = dfs->dfs_pos;
>> > +	int err, nh;
>> > +
>> > +	err = mark_header(env, h);
>> > +	if (err)
>> > +		return err;
>> > +
>> > +	/* Don't encode self-loops, otherwise can't reflect loops nesting structure. */
>> > +	if (n == h)
>> > +		return 0;
>> > +
>> > +	/* Make sure that loop headers up the chain are sorted by dfs_pos. */
>> > +	while (aux[n].loop_header != -1) {
>> > +		nh = aux[n].loop_header;
>> > +		if (nh == h)
>> > +			return 0;
>> > +		if (dfs_pos[nh] < dfs_pos[h]) {
>> > +			aux[n].loop_header = h;
>> > +			n = h;
>> > +			h = nh;
>> > +		} else {
>> > +			n = nh;
>> > +		}
>> > +	}
>> > +	aux[n].loop_header = h;
>> > +	return 0;
>> > +}
>> > +
>> > +/*
>> > + * As described in "A New Algorithm for Identifying Loops in Decompilation" by Wei et al,
>> > + * adapted to be non-recursive.
>> > + */
>> > +static int compute_loops_in_subprog(struct bpf_verifier_env *env, struct loops_dfs *dfs,
>> > +				    int subprog_idx)
>> > +{
>> > +	struct bpf_insn_aux_data *aux = env->insn_aux_data;
>> > +	struct dfs_state *state = dfs->state;
>> > +	int start = env->subprog_info[subprog_idx].start;
>> > +	int *dfs_pos = dfs->dfs_pos;
>> > +	int *stack = dfs->stack;
>> > +	int i, s, h, err, cur, stack_sz;
>> > +	struct bpf_iarray *succ;
>> > +
>> > +	stack[0] = start;
>> > +	state[start].traversed = true;
>> > +	state[start].next_succ = 0;
>> > +	dfs_pos[start] = 1;
>> > +	stack_sz = 1;
>> > +	i = 0;
>> > +	do {
>> > +		/*
>> > +		 * The algorithm should be very fast in practice,
>> > +		 * guard against pathological inputs, just in case.
>> > +		 */
>> > +		if (i++ == 1024) {
>> > +			i = 0;
>> Nit: Maybe 
>> 
>> if (!(++i % 1024))
>> 
>> ?
>
> Ack.
>
>> > +			if (signal_pending(current))
>> > +				return -EAGAIN;
>> > +			cond_resched();
>> > +		}
>> > +
>> > +		cur = stack[stack_sz - 1];
>> > +		succ = bpf_insn_successors(env, cur);
>> > +		if (state[cur].next_succ == succ->cnt) {
>> > +			dfs_pos[cur] = 0;
>> > +			stack_sz--;
>> > +			continue;
>> > +		}
>> > +		s = succ->items[state[cur].next_succ];
>> > +		if (!state[s].traversed) {
>> > +			/* Case A:  start -> ... -> cur -> s [unxplored] */
>> Typo: unxplored -> unexplored
>
> Ack.
>
>> > +			state[s].traversed = true;
>> > +			state[s].next_succ = 0;
>> > +			stack[stack_sz] = s;
>> > +			dfs_pos[s] = stack_sz + 1;
>> > +			stack_sz++;
>> > +			continue;
>> > +		}
>> > +		/* 's' is fully explored at this point */
>> > +		if (dfs_pos[s]) {
>> > +			/*
>> > +			 * start -> ... -> s -> cur --.
>> > +			 *                 ^          |
>> > +			 *                 '----------'
>> > +			 * Case B: 's' is in the current DFS path.
>> > +			 */
>> > +			err = assign_header(env, dfs, cur, s);
>> > +			if (err)
>> > +				return err;
>> > +		} else if (aux[s].loop_header == -1) {
>> > +			/*
>> > +			 * start -> ... -> ... -> s -> ... -> end
>> > +			 *           |            ^
>> > +			 *           '---> cur ---'
>> > +			 * Case C: 's' is explored, not in the current DFS path,
>> > +			 * and not a part of any loop.
>> > +			 */
>> > +		} else if (dfs_pos[aux[s].loop_header]) {
>> > +			/*
>> > +			 *                 .----------------------.
>> > +			 *                 v                      |
>> > +			 * start -> ... -> h -> ... -> ... -> s --'
>> > +			 *                       |            ^
>> > +			 *	                 '---> cur ---'
>> > +			 * Case D: 's' is explored, not in current DFS path,
>> > +			 * but it's innermost loop header is.
>> Typo: it's -> its
>
> Ack.
>
> [...]


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH RFC bpf-next v3 3/6] selftests/bpf: test cases for loop hierarchy computation
  2026-05-27  7:29 [PATCH RFC bpf-next v3 0/6] bpf: better error reporting when verifier hits 1M instructions limit Eduard Zingerman
  2026-05-27  7:29 ` [PATCH RFC bpf-next v3 1/6] bpf: move live registers and scc printout to a standalone function Eduard Zingerman
  2026-05-27  7:29 ` [PATCH RFC bpf-next v3 2/6] bpf: compute loops hierarchy Eduard Zingerman
@ 2026-05-27  7:29 ` Eduard Zingerman
  2026-05-27  8:50   ` sashiko-bot
  2026-06-01 19:37   ` Emil Tsalapatis
  2026-05-27  7:29 ` [PATCH RFC bpf-next v3 4/6] bpf: report hot simulated callchains when 1M instructions limit is met Eduard Zingerman
                   ` (2 subsequent siblings)
  5 siblings, 2 replies; 22+ messages in thread
From: Eduard Zingerman @ 2026-05-27  7:29 UTC (permalink / raw)
  To: bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song,
	Eduard Zingerman

Test cases covering the following branches in bpf_compute_loops():
- Case B: simple backedge creating a loop (loop_single)
- Case B: two independent loops (loop_two_independent)
- Case B + D: nested loops where inner header's loop_header points to
  outer (loop_nested)
- Case C: diamond CFG with no loops (fwd_edges_no_loop)
- Case D: sibling inner loops within one outer loop
  (loop_nested_siblings)
- Three levels of loop nesting (loop_three_levels)
- Loop with if-else body containing forward branches
  (loop_with_if_else)

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 tools/testing/selftests/bpf/prog_tests/verifier.c  |   2 +
 .../selftests/bpf/progs/verifier_loop_hierarchy.c  | 233 +++++++++++++++++++++
 2 files changed, 235 insertions(+)

diff --git a/tools/testing/selftests/bpf/prog_tests/verifier.c b/tools/testing/selftests/bpf/prog_tests/verifier.c
index 219ff2969868..35d92d176c1f 100644
--- a/tools/testing/selftests/bpf/prog_tests/verifier.c
+++ b/tools/testing/selftests/bpf/prog_tests/verifier.c
@@ -57,6 +57,7 @@
 #include "verifier_live_stack.skel.h"
 #include "verifier_liveness_exp.skel.h"
 #include "verifier_load_acquire.skel.h"
+#include "verifier_loop_hierarchy.skel.h"
 #include "verifier_loops1.skel.h"
 #include "verifier_lwt.skel.h"
 #include "verifier_map_in_map.skel.h"
@@ -208,6 +209,7 @@ void test_verifier_leak_ptr(void)             { RUN(verifier_leak_ptr); }
 void test_verifier_linked_scalars(void)       { RUN(verifier_linked_scalars); }
 void test_verifier_live_stack(void)           { RUN(verifier_live_stack); }
 void test_verifier_liveness_exp(void)         { RUN(verifier_liveness_exp); }
+void test_verifier_loop_hierarchy(void)       { RUN(verifier_loop_hierarchy); }
 void test_verifier_loops1(void)               { RUN(verifier_loops1); }
 void test_verifier_lwt(void)                  { RUN(verifier_lwt); }
 void test_verifier_map_in_map(void)           { RUN(verifier_map_in_map); }
diff --git a/tools/testing/selftests/bpf/progs/verifier_loop_hierarchy.c b/tools/testing/selftests/bpf/progs/verifier_loop_hierarchy.c
new file mode 100644
index 000000000000..84cabfa42485
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/verifier_loop_hierarchy.c
@@ -0,0 +1,233 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2026 Meta Platforms, Inc. and affiliates. */
+
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+#include "bpf_misc.h"
+
+/*
+ * kernel/bpf/loops.c:compute_loops() distinguish between
+ * the following cases:
+ * - B: backedge -> simple loop
+ * - C: cross edge to non-loop node -> no-op
+ * - D: edge to node whose header is in DFS path -> nested loop
+ * - E: edge to node whose header is NOT in DFS path -> irreducible
+ *
+ * Below test cases cover the above branches in various combinations.
+ */
+
+/* Case B: single bounded loop. */
+SEC("socket")
+__success
+__log_level(2)
+__msg("Program dump")
+__msg("          0: {{.*}} (b7) r0 = 0")
+__msg("  1       1: {{.*}} (07) r0 += 1")
+__msg("  1   1   2: {{.*}} (a5) if r0 < 0xa goto pc-2")
+__msg("          3: {{.*}} (95) exit")
+__naked void loop_single(void)
+{
+	asm volatile ("					\
+	r0 = 0;						\
+1:	r0 += 1;					\
+	if r0 < 10 goto 1b;				\
+	exit;						\
+"	::: __clobber_all);
+}
+
+/* Case B: two independent loops at the same nesting level. */
+SEC("socket")
+__success
+__log_level(2)
+__msg("Program dump")
+__msg("          0: {{.*}} (b7) r0 = 0")
+__msg("  2       1: {{.*}} (07) r0 += 1")
+__msg("  2   1   2: {{.*}} (a5) if r0 < 0xa goto pc-2")
+__msg("          3: {{.*}} (b7) r1 = 0")
+__msg("  1       4: {{.*}} (07) r1 += 1")
+__msg("  1   4   5: {{.*}} (a5) if r1 < 0xa goto pc-2")
+__msg("          6: {{.*}} (95) exit")
+__naked void loop_two_independent(void)
+{
+	asm volatile ("					\
+	r0 = 0;						\
+1:	r0 += 1;					\
+	if r0 < 10 goto 1b;				\
+	r1 = 0;						\
+2:	r1 += 1;					\
+	if r1 < 10 goto 2b;				\
+	exit;						\
+"	::: __clobber_all);
+}
+
+/* Case B + D: nested loops. */
+SEC("socket")
+__success
+__log_level(2)
+__msg("Program dump")
+__msg("          0: {{.*}} (b7) r0 = 0")
+__msg("  1       1: {{.*}} (07) r0 += 1")		/* outer loop header */
+__msg("  1   1   2: {{.*}} (b7) r1 = 0")		/* outer loop insn */
+__msg("  1   1   3: {{.*}} (07) r1 += 1")		/* inner loop header */
+__msg("  1   3   4: {{.*}} (a5) if r1 < 0x5 goto pc-2") /* inner loop insn */
+__msg("  1   1   5: {{.*}} (a5) if r0 < 0xa goto pc-5") /* outer loop insn */
+__msg("          6: {{.*}} (95) exit")
+__naked void loop_nested(void)
+{
+	asm volatile ("					\
+	r0 = 0;						\
+1:	r0 += 1;					\
+	r1 = 0;						\
+2:	r1 += 1;					\
+	if r1 < 5 goto 2b;				\
+	if r0 < 10 goto 1b;				\
+	exit;						\
+"	::: __clobber_all);
+}
+
+/* Case C: forward edges, no loops. */
+SEC("socket")
+__success
+__log_level(2)
+__msg("Program dump")
+__msg("          0: {{.*}} (85) call bpf_get_prandom_u32")
+__msg("          1: {{.*}} (25) if r0 > 0x0 goto pc+2")
+__msg("          2: {{.*}} (b7) r0 = 2")
+__msg("          3: {{.*}} (05) goto pc+1")
+__msg("          4: {{.*}} (b7) r0 = 3")
+__msg("          5: {{.*}} (95) exit")
+__naked void fwd_edges_no_loop(void)
+{
+	asm volatile ("					\
+	call %[bpf_get_prandom_u32];			\
+	if r0 > 0 goto 1f;				\
+	r0 = 2;						\
+	goto 2f;					\
+1:	r0 = 3;						\
+2:	exit;						\
+"	:
+	: __imm(bpf_get_prandom_u32)
+	: __clobber_all);
+}
+
+/* Case B + D: two sibling inner loops within one outer loop. */
+SEC("socket")
+__success
+__log_level(2)
+__msg("Program dump")
+__msg("          0: {{.*}} (b7) r0 = 0")
+__msg("  1       1: {{.*}} (07) r0 += 1")
+__msg("  1   1   2: {{.*}} (b7) r1 = 0")
+__msg("  1   1   3: {{.*}} (07) r1 += 1")
+__msg("  1   3   4: {{.*}} (a5) if r1 < 0x5 goto pc-2")
+__msg("  1   1   5: {{.*}} (b7) r2 = 0")
+__msg("  1   1   6: {{.*}} (07) r2 += 1")
+__msg("  1   6   7: {{.*}} (a5) if r2 < 0x5 goto pc-2")
+__msg("  1   1   8: {{.*}} (a5) if r0 < 0xa goto pc-8")
+__msg("          9: {{.*}} (95) exit")
+__naked void loop_nested_siblings(void)
+{
+	asm volatile ("					\
+	r0 = 0;						\
+1:	r0 += 1;					\
+	r1 = 0;						\
+2:	r1 += 1;					\
+	if r1 < 5 goto 2b;				\
+	r2 = 0;						\
+3:	r2 += 1;					\
+	if r2 < 5 goto 3b;				\
+	if r0 < 10 goto 1b;				\
+	exit;						\
+"	::: __clobber_all);
+}
+
+/* Three levels of nesting. */
+SEC("socket")
+__success
+__log_level(2)
+__msg("Program dump")
+__msg("          0: {{.*}} (b7) r0 = 0")
+__msg("  1       1: {{.*}} (07) r0 += 1")
+__msg("  1   1   2: {{.*}} (b7) r1 = 0")
+__msg("  1   1   3: {{.*}} (07) r1 += 1")
+__msg("  1   3   4: {{.*}} (b7) r2 = 0")
+__msg("  1   3   5: {{.*}} (07) r2 += 1")
+__msg("  1   5   6: {{.*}} (a5) if r2 < 0x3 goto pc-2")
+__msg("  1   3   7: {{.*}} (a5) if r1 < 0x5 goto pc-5")
+__msg("  1   1   8: {{.*}} (a5) if r0 < 0xa goto pc-8")
+__msg("          9: {{.*}} (95) exit")
+__naked void loop_three_levels(void)
+{
+	asm volatile ("					\
+	r0 = 0;						\
+1:	r0 += 1;					\
+	r1 = 0;						\
+2:	r1 += 1;					\
+	r2 = 0;						\
+3:	r2 += 1;					\
+	if r2 < 3 goto 3b;				\
+	if r1 < 5 goto 2b;				\
+	if r0 < 10 goto 1b;				\
+	exit;						\
+"	::: __clobber_all);
+}
+
+/* Loop with an if-else body (forward branch inside loop, Case C). */
+SEC("socket")
+__success
+__log_level(2)
+__msg("Program dump")
+__msg("          0: {{.*}} (b7) r0 = 0")
+__msg("  1       1: {{.*}} (07) r0 += 1")
+__msg("  1   1   2: {{.*}} (bf) r1 = r0")
+__msg("  1   1   3: {{.*}} (25) if r1 > 0x5 goto pc+1")
+__msg("  1   1   4: {{.*}} (b7) r1 = 1")
+__msg("  1   1   5: {{.*}} (0f) r0 += r1")
+__msg("  1   1   6: {{.*}} (a5) if r0 < 0x64 goto pc-6")
+__msg("          7: {{.*}} (95) exit")
+__naked void loop_with_if_else(void)
+{
+	asm volatile ("					\
+	r0 = 0;						\
+1:	r0 += 1;					\
+	r1 = r0;					\
+	if r1 > 5 goto 2f;				\
+	r1 = 1;						\
+2:	r0 += r1;					\
+	if r0 < 100 goto 1b;				\
+	exit;						\
+"	::: __clobber_all);
+}
+
+/* Case E: irreducible loop. */
+SEC("socket")
+__success
+__log_level(2)
+__msg("Program dump")
+__msg("          0: {{.*}} (85) call bpf_get_prandom_u32")
+__msg("          1: {{.*}} (b7) r1 = 0")
+__msg("          2: {{.*}} (25) if r0 > 0x5 goto pc+2")
+__msg("  1       3: {{.*}} (b7) r1 = 1")
+__msg("  1   3   4: {{.*}} (05) goto pc+1")
+__msg("          5: {{.*}} (b7) r1 = 2")
+__msg("  1   3   6: {{.*}} (0f) r0 += r1")
+__msg("  1   3   7: {{.*}} (a5) if r0 < 0x10 goto pc-5")
+__msg("          8: {{.*}} (95) exit")
+__naked void loop_irreducible(void)
+{
+	asm volatile ("					\
+	call %[bpf_get_prandom_u32];			\
+	r1 = 0;						\
+	if r0 > 5 goto 2f;				\
+1:	r1 = 1;						\
+	goto 3f;					\
+2:	r1 = 2;						\
+3:	r0 += r1;					\
+	if r0 < 16 goto 1b;				\
+	exit;						\
+"	:
+	: __imm(bpf_get_prandom_u32)
+	: __clobber_all);
+}
+
+char _license[] SEC("license") = "GPL";

-- 
2.53.0

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH RFC bpf-next v3 3/6] selftests/bpf: test cases for loop hierarchy computation
  2026-05-27  7:29 ` [PATCH RFC bpf-next v3 3/6] selftests/bpf: test cases for loop hierarchy computation Eduard Zingerman
@ 2026-05-27  8:50   ` sashiko-bot
  2026-06-01 19:37   ` Emil Tsalapatis
  1 sibling, 0 replies; 22+ messages in thread
From: sashiko-bot @ 2026-05-27  8:50 UTC (permalink / raw)
  To: Eduard Zingerman; +Cc: bpf

Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [Low] The commit message omits the test case for irreducible loops (Case E), which is implemented in the patch.
--

commit 0c9a2a0bab3cd380638bc61a05140fb934767741
Author: Eduard Zingerman <eddyz87@gmail.com>

selftests/bpf: test cases for loop hierarchy computation

This commit adds test cases for loop hierarchy computation to the BPF
selftests. It covers various branches in bpf_compute_loops() to ensure
correct handling of single, nested, independent, and irreducible loops.

> Test cases covering the following branches in bpf_compute_loops():
> - Case B: simple backedge creating a loop (loop_single)
> - Case B: two independent loops (loop_two_independent)
> - Case B + D: nested loops where inner header's loop_header points to
>   outer (loop_nested)
> - Case C: diamond CFG with no loops (fwd_edges_no_loop)
> - Case D: sibling inner loops within one outer loop
>   (loop_nested_siblings)
> - Three levels of loop nesting (loop_three_levels)
> - Loop with if-else body containing forward branches
>   (loop_with_if_else)

[Severity: Low]
This isn't a bug, but could the commit message also mention the test case
for irreducible loops? The patch includes the implementation for
loop_irreducible() (Case E) in the code, but it appears to be omitted from
this list.

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260527-better-1m-reporting-v3-0-b3ede0588a75@gmail.com?part=3

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH RFC bpf-next v3 3/6] selftests/bpf: test cases for loop hierarchy computation
  2026-05-27  7:29 ` [PATCH RFC bpf-next v3 3/6] selftests/bpf: test cases for loop hierarchy computation Eduard Zingerman
  2026-05-27  8:50   ` sashiko-bot
@ 2026-06-01 19:37   ` Emil Tsalapatis
  2026-06-01 19:44     ` Eduard Zingerman
  1 sibling, 1 reply; 22+ messages in thread
From: Emil Tsalapatis @ 2026-06-01 19:37 UTC (permalink / raw)
  To: Eduard Zingerman, bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song

On Wed May 27, 2026 at 3:29 AM EDT, Eduard Zingerman wrote:
> Test cases covering the following branches in bpf_compute_loops():
> - Case B: simple backedge creating a loop (loop_single)
> - Case B: two independent loops (loop_two_independent)
> - Case B + D: nested loops where inner header's loop_header points to
>   outer (loop_nested)
> - Case C: diamond CFG with no loops (fwd_edges_no_loop)
> - Case D: sibling inner loops within one outer loop
>   (loop_nested_siblings)
> - Three levels of loop nesting (loop_three_levels)
> - Loop with if-else body containing forward branches
>   (loop_with_if_else)
>
> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>

Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>

I was debating if it's worth adding the edge case where the instruction is
its own loop header. It is so obviously wrong that the only issue would
be the verifier handling it gracefully, and we can probably skip it.

> ---
>  tools/testing/selftests/bpf/prog_tests/verifier.c  |   2 +
>  .../selftests/bpf/progs/verifier_loop_hierarchy.c  | 233 +++++++++++++++++++++
>  2 files changed, 235 insertions(+)
>
> diff --git a/tools/testing/selftests/bpf/prog_tests/verifier.c b/tools/testing/selftests/bpf/prog_tests/verifier.c
> index 219ff2969868..35d92d176c1f 100644
> --- a/tools/testing/selftests/bpf/prog_tests/verifier.c
> +++ b/tools/testing/selftests/bpf/prog_tests/verifier.c
> @@ -57,6 +57,7 @@
>  #include "verifier_live_stack.skel.h"
>  #include "verifier_liveness_exp.skel.h"
>  #include "verifier_load_acquire.skel.h"
> +#include "verifier_loop_hierarchy.skel.h"
>  #include "verifier_loops1.skel.h"
>  #include "verifier_lwt.skel.h"
>  #include "verifier_map_in_map.skel.h"
> @@ -208,6 +209,7 @@ void test_verifier_leak_ptr(void)             { RUN(verifier_leak_ptr); }
>  void test_verifier_linked_scalars(void)       { RUN(verifier_linked_scalars); }
>  void test_verifier_live_stack(void)           { RUN(verifier_live_stack); }
>  void test_verifier_liveness_exp(void)         { RUN(verifier_liveness_exp); }
> +void test_verifier_loop_hierarchy(void)       { RUN(verifier_loop_hierarchy); }
>  void test_verifier_loops1(void)               { RUN(verifier_loops1); }
>  void test_verifier_lwt(void)                  { RUN(verifier_lwt); }
>  void test_verifier_map_in_map(void)           { RUN(verifier_map_in_map); }
> diff --git a/tools/testing/selftests/bpf/progs/verifier_loop_hierarchy.c b/tools/testing/selftests/bpf/progs/verifier_loop_hierarchy.c
> new file mode 100644
> index 000000000000..84cabfa42485
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/progs/verifier_loop_hierarchy.c
> @@ -0,0 +1,233 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* Copyright (c) 2026 Meta Platforms, Inc. and affiliates. */
> +
> +#include <linux/bpf.h>
> +#include <bpf/bpf_helpers.h>
> +#include "bpf_misc.h"
> +
> +/*
> + * kernel/bpf/loops.c:compute_loops() distinguish between
> + * the following cases:
> + * - B: backedge -> simple loop
> + * - C: cross edge to non-loop node -> no-op
> + * - D: edge to node whose header is in DFS path -> nested loop
> + * - E: edge to node whose header is NOT in DFS path -> irreducible
> + *
> + * Below test cases cover the above branches in various combinations.
> + */
> +
> +/* Case B: single bounded loop. */
> +SEC("socket")
> +__success
> +__log_level(2)
> +__msg("Program dump")
> +__msg("          0: {{.*}} (b7) r0 = 0")
> +__msg("  1       1: {{.*}} (07) r0 += 1")
> +__msg("  1   1   2: {{.*}} (a5) if r0 < 0xa goto pc-2")
> +__msg("          3: {{.*}} (95) exit")
> +__naked void loop_single(void)
> +{
> +	asm volatile ("					\
> +	r0 = 0;						\
> +1:	r0 += 1;					\
> +	if r0 < 10 goto 1b;				\
> +	exit;						\
> +"	::: __clobber_all);
> +}
> +
> +/* Case B: two independent loops at the same nesting level. */
> +SEC("socket")
> +__success
> +__log_level(2)
> +__msg("Program dump")
> +__msg("          0: {{.*}} (b7) r0 = 0")
> +__msg("  2       1: {{.*}} (07) r0 += 1")
> +__msg("  2   1   2: {{.*}} (a5) if r0 < 0xa goto pc-2")
> +__msg("          3: {{.*}} (b7) r1 = 0")
> +__msg("  1       4: {{.*}} (07) r1 += 1")
> +__msg("  1   4   5: {{.*}} (a5) if r1 < 0xa goto pc-2")
> +__msg("          6: {{.*}} (95) exit")
> +__naked void loop_two_independent(void)
> +{
> +	asm volatile ("					\
> +	r0 = 0;						\
> +1:	r0 += 1;					\
> +	if r0 < 10 goto 1b;				\
> +	r1 = 0;						\
> +2:	r1 += 1;					\
> +	if r1 < 10 goto 2b;				\
> +	exit;						\
> +"	::: __clobber_all);
> +}
> +
> +/* Case B + D: nested loops. */
> +SEC("socket")
> +__success
> +__log_level(2)
> +__msg("Program dump")
> +__msg("          0: {{.*}} (b7) r0 = 0")
> +__msg("  1       1: {{.*}} (07) r0 += 1")		/* outer loop header */
> +__msg("  1   1   2: {{.*}} (b7) r1 = 0")		/* outer loop insn */
> +__msg("  1   1   3: {{.*}} (07) r1 += 1")		/* inner loop header */
> +__msg("  1   3   4: {{.*}} (a5) if r1 < 0x5 goto pc-2") /* inner loop insn */
> +__msg("  1   1   5: {{.*}} (a5) if r0 < 0xa goto pc-5") /* outer loop insn */
> +__msg("          6: {{.*}} (95) exit")
> +__naked void loop_nested(void)
> +{
> +	asm volatile ("					\
> +	r0 = 0;						\
> +1:	r0 += 1;					\
> +	r1 = 0;						\
> +2:	r1 += 1;					\
> +	if r1 < 5 goto 2b;				\
> +	if r0 < 10 goto 1b;				\
> +	exit;						\
> +"	::: __clobber_all);
> +}
> +
> +/* Case C: forward edges, no loops. */
> +SEC("socket")
> +__success
> +__log_level(2)
> +__msg("Program dump")
> +__msg("          0: {{.*}} (85) call bpf_get_prandom_u32")
> +__msg("          1: {{.*}} (25) if r0 > 0x0 goto pc+2")
> +__msg("          2: {{.*}} (b7) r0 = 2")
> +__msg("          3: {{.*}} (05) goto pc+1")
> +__msg("          4: {{.*}} (b7) r0 = 3")
> +__msg("          5: {{.*}} (95) exit")
> +__naked void fwd_edges_no_loop(void)
> +{
> +	asm volatile ("					\
> +	call %[bpf_get_prandom_u32];			\
> +	if r0 > 0 goto 1f;				\
> +	r0 = 2;						\
> +	goto 2f;					\
> +1:	r0 = 3;						\
> +2:	exit;						\
> +"	:
> +	: __imm(bpf_get_prandom_u32)
> +	: __clobber_all);
> +}
> +
> +/* Case B + D: two sibling inner loops within one outer loop. */
> +SEC("socket")
> +__success
> +__log_level(2)
> +__msg("Program dump")
> +__msg("          0: {{.*}} (b7) r0 = 0")
> +__msg("  1       1: {{.*}} (07) r0 += 1")
> +__msg("  1   1   2: {{.*}} (b7) r1 = 0")
> +__msg("  1   1   3: {{.*}} (07) r1 += 1")
> +__msg("  1   3   4: {{.*}} (a5) if r1 < 0x5 goto pc-2")
> +__msg("  1   1   5: {{.*}} (b7) r2 = 0")
> +__msg("  1   1   6: {{.*}} (07) r2 += 1")
> +__msg("  1   6   7: {{.*}} (a5) if r2 < 0x5 goto pc-2")
> +__msg("  1   1   8: {{.*}} (a5) if r0 < 0xa goto pc-8")
> +__msg("          9: {{.*}} (95) exit")
> +__naked void loop_nested_siblings(void)
> +{
> +	asm volatile ("					\
> +	r0 = 0;						\
> +1:	r0 += 1;					\
> +	r1 = 0;						\
> +2:	r1 += 1;					\
> +	if r1 < 5 goto 2b;				\
> +	r2 = 0;						\
> +3:	r2 += 1;					\
> +	if r2 < 5 goto 3b;				\
> +	if r0 < 10 goto 1b;				\
> +	exit;						\
> +"	::: __clobber_all);
> +}
> +
> +/* Three levels of nesting. */
> +SEC("socket")
> +__success
> +__log_level(2)
> +__msg("Program dump")
> +__msg("          0: {{.*}} (b7) r0 = 0")
> +__msg("  1       1: {{.*}} (07) r0 += 1")
> +__msg("  1   1   2: {{.*}} (b7) r1 = 0")
> +__msg("  1   1   3: {{.*}} (07) r1 += 1")
> +__msg("  1   3   4: {{.*}} (b7) r2 = 0")
> +__msg("  1   3   5: {{.*}} (07) r2 += 1")
> +__msg("  1   5   6: {{.*}} (a5) if r2 < 0x3 goto pc-2")
> +__msg("  1   3   7: {{.*}} (a5) if r1 < 0x5 goto pc-5")
> +__msg("  1   1   8: {{.*}} (a5) if r0 < 0xa goto pc-8")
> +__msg("          9: {{.*}} (95) exit")
> +__naked void loop_three_levels(void)
> +{
> +	asm volatile ("					\
> +	r0 = 0;						\
> +1:	r0 += 1;					\
> +	r1 = 0;						\
> +2:	r1 += 1;					\
> +	r2 = 0;						\
> +3:	r2 += 1;					\
> +	if r2 < 3 goto 3b;				\
> +	if r1 < 5 goto 2b;				\
> +	if r0 < 10 goto 1b;				\
> +	exit;						\
> +"	::: __clobber_all);
> +}
> +
> +/* Loop with an if-else body (forward branch inside loop, Case C). */
> +SEC("socket")
> +__success
> +__log_level(2)
> +__msg("Program dump")
> +__msg("          0: {{.*}} (b7) r0 = 0")
> +__msg("  1       1: {{.*}} (07) r0 += 1")
> +__msg("  1   1   2: {{.*}} (bf) r1 = r0")
> +__msg("  1   1   3: {{.*}} (25) if r1 > 0x5 goto pc+1")
> +__msg("  1   1   4: {{.*}} (b7) r1 = 1")
> +__msg("  1   1   5: {{.*}} (0f) r0 += r1")
> +__msg("  1   1   6: {{.*}} (a5) if r0 < 0x64 goto pc-6")
> +__msg("          7: {{.*}} (95) exit")
> +__naked void loop_with_if_else(void)
> +{
> +	asm volatile ("					\
> +	r0 = 0;						\
> +1:	r0 += 1;					\
> +	r1 = r0;					\
> +	if r1 > 5 goto 2f;				\
> +	r1 = 1;						\
> +2:	r0 += r1;					\
> +	if r0 < 100 goto 1b;				\
> +	exit;						\
> +"	::: __clobber_all);
> +}
> +
> +/* Case E: irreducible loop. */
> +SEC("socket")
> +__success
> +__log_level(2)
> +__msg("Program dump")
> +__msg("          0: {{.*}} (85) call bpf_get_prandom_u32")
> +__msg("          1: {{.*}} (b7) r1 = 0")
> +__msg("          2: {{.*}} (25) if r0 > 0x5 goto pc+2")
> +__msg("  1       3: {{.*}} (b7) r1 = 1")
> +__msg("  1   3   4: {{.*}} (05) goto pc+1")
> +__msg("          5: {{.*}} (b7) r1 = 2")
> +__msg("  1   3   6: {{.*}} (0f) r0 += r1")
> +__msg("  1   3   7: {{.*}} (a5) if r0 < 0x10 goto pc-5")
> +__msg("          8: {{.*}} (95) exit")
> +__naked void loop_irreducible(void)
> +{
> +	asm volatile ("					\
> +	call %[bpf_get_prandom_u32];			\
> +	r1 = 0;						\
> +	if r0 > 5 goto 2f;				\
> +1:	r1 = 1;						\
> +	goto 3f;					\
> +2:	r1 = 2;						\
> +3:	r0 += r1;					\
> +	if r0 < 16 goto 1b;				\
> +	exit;						\
> +"	:
> +	: __imm(bpf_get_prandom_u32)
> +	: __clobber_all);
> +}
> +
> +char _license[] SEC("license") = "GPL";


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH RFC bpf-next v3 3/6] selftests/bpf: test cases for loop hierarchy computation
  2026-06-01 19:37   ` Emil Tsalapatis
@ 2026-06-01 19:44     ` Eduard Zingerman
  0 siblings, 0 replies; 22+ messages in thread
From: Eduard Zingerman @ 2026-06-01 19:44 UTC (permalink / raw)
  To: Emil Tsalapatis, bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song

On Mon, 2026-06-01 at 15:37 -0400, Emil Tsalapatis wrote:
> On Wed May 27, 2026 at 3:29 AM EDT, Eduard Zingerman wrote:
> > Test cases covering the following branches in bpf_compute_loops():
> > - Case B: simple backedge creating a loop (loop_single)
> > - Case B: two independent loops (loop_two_independent)
> > - Case B + D: nested loops where inner header's loop_header points to
> >   outer (loop_nested)
> > - Case C: diamond CFG with no loops (fwd_edges_no_loop)
> > - Case D: sibling inner loops within one outer loop
> >   (loop_nested_siblings)
> > - Three levels of loop nesting (loop_three_levels)
> > - Loop with if-else body containing forward branches
> >   (loop_with_if_else)
> > 
> > Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
> 
> Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
> 
> I was debating if it's worth adding the edge case where the instruction is
> its own loop header. It is so obviously wrong that the only issue would
> be the verifier handling it gracefully, and we can probably skip it.

I'll add the this test case, just to make sure that the algorithm
behaves correctly.

[...]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH RFC bpf-next v3 4/6] bpf: report hot simulated callchains when 1M instructions limit is met
  2026-05-27  7:29 [PATCH RFC bpf-next v3 0/6] bpf: better error reporting when verifier hits 1M instructions limit Eduard Zingerman
                   ` (2 preceding siblings ...)
  2026-05-27  7:29 ` [PATCH RFC bpf-next v3 3/6] selftests/bpf: test cases for loop hierarchy computation Eduard Zingerman
@ 2026-05-27  7:29 ` Eduard Zingerman
  2026-05-29 10:23   ` Jiri Olsa
  2026-06-01 19:50   ` Emil Tsalapatis
  2026-05-27  7:29 ` [PATCH RFC bpf-next v3 5/6] bpf: report register diff summary for hot callchains Eduard Zingerman
  2026-05-27  7:29 ` [PATCH RFC bpf-next v3 6/6] selftests/bpf: test budget exhaustion profiling report Eduard Zingerman
  5 siblings, 2 replies; 22+ messages in thread
From: Eduard Zingerman @ 2026-05-27  7:29 UTC (permalink / raw)
  To: bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song,
	Eduard Zingerman

When the verifier exceeds the 1M instruction budget, the current error
output shows a random execution trace that happens to be active when
the limit is hit, which is not very helpful for debugging.

To rectify this, employ a profiler-inspired technique:
collect and count "callchain" stack traces that the verifier visits
during program validation and report the top 3 hot traces. Sample
callchains at instructions marked as force checkpoints and loop header
entries.

When 1M instructions is met, print a report containing:
- callchains verifier visited most often;
- disassembly of the subprograms referenced from the printed
  callchains.

Here is an example of the report:

  budget_nested_call_loop():
    ; asm volatile ("                                     \ @ verifier_budget_report.c:44
    0: (b7) r0 = 0
    1: (85) call pc+1
    2: (95) exit

  budget_nested_call_loop__inner():
    ; asm volatile ("                                     \ @ verifier_budget_report.c:54
    3: (b7) r0 = 0
    4: (07) r0 += 1
    5: (05) goto pc-2

  #1 most visited simulated stacktrace (visited 499999 times):
    budget_nested_call_loop/1 (.../verifier_budget_report.c:44)
    budget_nested_call_loop__inner/4 (.../verifier_budget_report.c:54)

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 include/linux/bpf_verifier.h |  14 ++++
 kernel/bpf/verifier.c        | 155 +++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 169 insertions(+)

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index d0a1fb135cbf..347a155d8e21 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -9,6 +9,8 @@
 #include <linux/filter.h> /* for MAX_BPF_STACK */
 #include <linux/tnum.h>
 #include <linux/cnum.h>
+#include <linux/hashtable.h>
+#include <linux/jhash.h>
 
 /* Maximum variable offset umax_value permitted when resolving memory accesses.
  * In practice this is far bigger than any realistic pointer offset; this limit
@@ -660,6 +662,17 @@ struct bpf_loop {
 	bool irreducible;
 };
 
+struct bpf_callchain {
+	u32 insn_idx[MAX_CALL_FRAMES];
+	u32 curframe;
+};
+
+struct bpf_callchain_entry {
+	struct hlist_node node;
+	struct bpf_callchain cc;
+	u64 count;
+};
+
 struct bpf_insn_aux_data {
 	union {
 		enum bpf_reg_type ptr_type;	/* pointer type for load/store insns */
@@ -1043,6 +1056,7 @@ struct bpf_verifier_env {
 	u32 scc_cnt;
 	struct bpf_iarray *succ;
 	struct bpf_iarray *gotox_tmp_buf;
+	DECLARE_HASHTABLE(callchain_htab, 8);
 };
 
 static inline struct bpf_func_info_aux *subprog_aux(struct bpf_verifier_env *env, int subprog)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 2b3584712ad2..54b7ad65b7fc 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -17247,6 +17247,152 @@ static int check_indirect_jump(struct bpf_verifier_env *env, struct bpf_insn *in
 	return INSN_IDX_UPDATED;
 }
 
+static void compute_callchain(struct bpf_verifier_state *st, struct bpf_callchain *cc)
+{
+	int i;
+
+	cc->curframe = st->curframe;
+	for (i = 0; i <= st->curframe; i++)
+		cc->insn_idx[i] = bpf_frame_insn_idx(st, i);
+}
+
+static int update_callchain_profile(struct bpf_verifier_env *env, struct bpf_verifier_state *st)
+{
+	struct bpf_callchain_entry *entry;
+	struct bpf_callchain cc = {};
+	u32 hash;
+
+	compute_callchain(st, &cc);
+	hash = jhash2(cc.insn_idx, cc.curframe + 1, cc.curframe + 1);
+	hash_for_each_possible(env->callchain_htab, entry, node, hash) {
+		if (entry->cc.curframe == cc.curframe &&
+		    !memcmp(entry->cc.insn_idx, cc.insn_idx, (cc.curframe + 1) * sizeof(u32))) {
+			entry->count++;
+			return 0;
+		}
+	}
+	entry = kzalloc(sizeof(*entry), GFP_KERNEL_ACCOUNT);
+	if (!entry)
+		return -ENOMEM;
+	entry->cc = cc;
+	entry->count = 1;
+	hash_add(env->callchain_htab, &entry->node, hash);
+	return 0;
+}
+
+static void free_callchain_profile(struct bpf_verifier_env *env)
+{
+	struct bpf_callchain_entry *entry;
+	struct hlist_node *tmp;
+	int bkt;
+
+	hash_for_each_safe(env->callchain_htab, bkt, tmp, entry, node) {
+		hash_del(&entry->node);
+		kfree(entry);
+	}
+}
+
+static void print_callchain_entry(struct bpf_verifier_env *env,
+				  struct bpf_callchain_entry *entry, int idx)
+{
+	struct bpf_callchain *cc = &entry->cc;
+	const struct bpf_line_info *linfo;
+	struct bpf_subprog_info *sub;
+	int i, insn_idx;
+
+	verbose(env, "#%d most visited simulated stacktrace (visited %llu times):\n",
+		idx, entry->count);
+	for (i = 0; i <= cc->curframe; i++) {
+		insn_idx = cc->insn_idx[i];
+		sub = bpf_find_containing_subprog(env, insn_idx);
+		linfo = bpf_find_linfo(env->prog, insn_idx);
+		if (sub->name)
+			verbose(env, "  %s/%d", sub->name, insn_idx);
+		else
+			verbose(env, "  subprog#%td/%d", sub - env->subprog_info, insn_idx);
+		if (linfo)
+			verbose(env, " (%s:%u)",
+				btf_name_by_offset(env->prog->aux->btf, linfo->file_name_off),
+				BPF_LINE_INFO_LINE_NUM(linfo->line_col));
+		verbose(env, "\n");
+	}
+}
+
+static void disasm_subprog(struct bpf_verifier_env *env, struct bpf_subprog_info *sub)
+{
+	u32 i, end = (sub + 1)->start;
+
+	if (sub->name)
+		verbose(env, "%s():\n", sub->name);
+	else
+		verbose(env, "subprog#%td:\n", sub - env->subprog_info);
+	env->prev_linfo = NULL;
+	for (i = sub->start; i < end; i++) {
+		verbose_linfo(env, i, "  ; ");
+		verbose(env, "  %d: ", i);
+		bpf_verbose_insn(env, &env->prog->insnsi[i]);
+		if (bpf_is_ldimm64(&env->prog->insnsi[i]))
+			i++;
+	}
+}
+
+/*
+ * Print several most visited simulated stack traces,
+ * and a disasembly of related subprograms.
+ */
+static void print_hotspots(struct bpf_verifier_env *env)
+{
+	DECLARE_BITMAP(printed_subs, BPF_MAX_SUBPROGS + 2) = {};
+	struct bpf_callchain_entry *top[3] = {};
+	struct bpf_callchain_entry *entry;
+	struct bpf_subprog_info *sub;
+	int i, j, bkt, nr_top = 0;
+
+	if (!(env->log.level & BPF_LOG_LEVEL))
+		return;
+
+	/* Collect the hottest callchains */
+	hash_for_each(env->callchain_htab, bkt, entry, node) {
+		for (i = 0; i < 3; i++) {
+			if (!top[i] || entry->count > top[i]->count) {
+				memmove(&top[i + 1], &top[i], (2 - i) * sizeof(top[0]));
+				top[i] = entry;
+				break;
+			}
+		}
+	}
+
+	for (i = 0; i < 3 && top[i]; i++)
+		nr_top++;
+
+	if (!nr_top)
+		return;
+
+	if (!(env->log.level & BPF_LOG_LEVEL2))
+		bpf_vlog_reset(&env->log, 0);
+
+	/* Identify callchain subprograms and print disasm of those */
+	for (i = 0; i < nr_top; i++) {
+		struct bpf_callchain *cc = &top[i]->cc;
+
+		for (j = 0; j <= cc->curframe; j++) {
+			sub = bpf_find_containing_subprog(env, cc->insn_idx[j]);
+			__set_bit(sub - env->subprog_info, printed_subs);
+		}
+	}
+
+	for_each_set_bit(i, printed_subs, env->subprog_cnt) {
+		disasm_subprog(env, &env->subprog_info[i]);
+		verbose(env, "\n");
+	}
+
+	/* Print the hot callchains */
+	for (i = 0; i < nr_top; i++) {
+		print_callchain_entry(env, top[i], i + 1);
+		verbose(env, "\n");
+	}
+}
+
 static int do_check_insn(struct bpf_verifier_env *env, bool *do_print_state)
 {
 	int err;
@@ -17381,6 +17527,7 @@ static int do_check(struct bpf_verifier_env *env)
 		insn_aux = &env->insn_aux_data[env->insn_idx];
 
 		if (++env->insn_processed > BPF_COMPLEXITY_LIMIT_INSNS) {
+			print_hotspots(env);
 			verbose(env,
 				"BPF program is too large. Processed %d insn\n",
 				env->insn_processed);
@@ -17390,6 +17537,12 @@ static int do_check(struct bpf_verifier_env *env)
 		state->last_insn_idx = env->prev_insn_idx;
 		state->insn_idx = env->insn_idx;
 
+		if (bpf_is_force_checkpoint(env, env->insn_idx) || insn_aux->loop) {
+			err = update_callchain_profile(env, state);
+			if (err)
+				return err;
+		}
+
 		if (bpf_is_prune_point(env, env->insn_idx)) {
 			err = bpf_is_state_visited(env, env->insn_idx);
 			if (err < 0)
@@ -19673,6 +19826,7 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr,
 		return -ENOMEM;
 
 	env->bt.env = env;
+	hash_init(env->callchain_htab);
 
 	len = (*prog)->len;
 	env->insn_aux_data =
@@ -19945,6 +20099,7 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr,
 	bpf_clear_insn_aux_data(env, 0, env->prog->len);
 	vfree(env->insn_aux_data);
 err_free_env:
+	free_callchain_profile(env);
 	bpf_stack_liveness_free(env);
 	kvfree(env->cfg.insn_postorder);
 	kvfree(env->scc_info);

-- 
2.53.0

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH RFC bpf-next v3 4/6] bpf: report hot simulated callchains when 1M instructions limit is met
  2026-05-27  7:29 ` [PATCH RFC bpf-next v3 4/6] bpf: report hot simulated callchains when 1M instructions limit is met Eduard Zingerman
@ 2026-05-29 10:23   ` Jiri Olsa
  2026-05-29 18:44     ` Eduard Zingerman
  2026-06-01 19:50   ` Emil Tsalapatis
  1 sibling, 1 reply; 22+ messages in thread
From: Jiri Olsa @ 2026-05-29 10:23 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song

On Wed, May 27, 2026 at 12:29:52AM -0700, Eduard Zingerman wrote:

SNIP

> +
>  static int do_check_insn(struct bpf_verifier_env *env, bool *do_print_state)
>  {
>  	int err;
> @@ -17381,6 +17527,7 @@ static int do_check(struct bpf_verifier_env *env)
>  		insn_aux = &env->insn_aux_data[env->insn_idx];
>  
>  		if (++env->insn_processed > BPF_COMPLEXITY_LIMIT_INSNS) {
> +			print_hotspots(env);

hi,
this is really cool feature, I've already used it to fix issues :)

Would it be possible to make this configurable? like to be able
to get this output for program that passed and display more than
3 hottest callchains?

thanks,
jirka


>  			verbose(env,
>  				"BPF program is too large. Processed %d insn\n",
>  				env->insn_processed);
> @@ -17390,6 +17537,12 @@ static int do_check(struct bpf_verifier_env *env)
>  		state->last_insn_idx = env->prev_insn_idx;
>  		state->insn_idx = env->insn_idx;
>  
> +		if (bpf_is_force_checkpoint(env, env->insn_idx) || insn_aux->loop) {
> +			err = update_callchain_profile(env, state);
> +			if (err)
> +				return err;
> +		}
> +
>  		if (bpf_is_prune_point(env, env->insn_idx)) {
>  			err = bpf_is_state_visited(env, env->insn_idx);
>  			if (err < 0)
> @@ -19673,6 +19826,7 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr,
>  		return -ENOMEM;
>  
>  	env->bt.env = env;
> +	hash_init(env->callchain_htab);
>  
>  	len = (*prog)->len;
>  	env->insn_aux_data =
> @@ -19945,6 +20099,7 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr,
>  	bpf_clear_insn_aux_data(env, 0, env->prog->len);
>  	vfree(env->insn_aux_data);
>  err_free_env:
> +	free_callchain_profile(env);
>  	bpf_stack_liveness_free(env);
>  	kvfree(env->cfg.insn_postorder);
>  	kvfree(env->scc_info);
> 
> -- 
> 2.53.0
> 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH RFC bpf-next v3 4/6] bpf: report hot simulated callchains when 1M instructions limit is met
  2026-05-29 10:23   ` Jiri Olsa
@ 2026-05-29 18:44     ` Eduard Zingerman
  2026-05-30  9:34       ` Jiri Olsa
  0 siblings, 1 reply; 22+ messages in thread
From: Eduard Zingerman @ 2026-05-29 18:44 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song

On Fri, 2026-05-29 at 12:23 +0200, Jiri Olsa wrote:
> On Wed, May 27, 2026 at 12:29:52AM -0700, Eduard Zingerman wrote:
> 
> SNIP
> 
> > +
> >  static int do_check_insn(struct bpf_verifier_env *env, bool *do_print_state)
> >  {
> >  	int err;
> > @@ -17381,6 +17527,7 @@ static int do_check(struct bpf_verifier_env *env)
> >  		insn_aux = &env->insn_aux_data[env->insn_idx];
> >  
> >  		if (++env->insn_processed > BPF_COMPLEXITY_LIMIT_INSNS) {
> > +			print_hotspots(env);
> 
> hi,
> this is really cool feature, I've already used it to fix issues :)

Hi Jiri, thank you for the feedback!
Anything you find lacking?

After discussion with Alexei (who is somewhat skeptical of the whole idea),
I plan to add:
- samples for values residing in the diverging registers
- walk back from iter_next call using jump history and
  print 5-10 instructions modifying the register in question
  (in a manner similar to backtracking done for mark chain precision).

> Would it be possible to make this configurable? like to be able
> to get this output for program that passed and display more than
> 3 hottest callchains?

For sure, there are several options for a way to toggle this print out:
- unconditionally make it a part of v1 log;
- add it as a bit in bpf_attr->log_level.

I'd say that v1 log is a better option here.

As for number of callchains printed, I'd like to avoid tweaking
bpf_attr. So this leaves several options:
- make it top 5-7-10, just hardcode a bigger number.
- add a threshold value specifying how many times a callchain should
  be visited to get printed. E.g. only print callchains visited more
  than 1000 times, if such don't exist then put threshold to 100,
  if such don't exist, print top 3.

Wdyt?

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH RFC bpf-next v3 4/6] bpf: report hot simulated callchains when 1M instructions limit is met
  2026-05-29 18:44     ` Eduard Zingerman
@ 2026-05-30  9:34       ` Jiri Olsa
  0 siblings, 0 replies; 22+ messages in thread
From: Jiri Olsa @ 2026-05-30  9:34 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: Jiri Olsa, bpf, ast, andrii, daniel, martin.lau, kernel-team,
	yonghong.song

On Fri, May 29, 2026 at 11:44:39AM -0700, Eduard Zingerman wrote:
> On Fri, 2026-05-29 at 12:23 +0200, Jiri Olsa wrote:
> > On Wed, May 27, 2026 at 12:29:52AM -0700, Eduard Zingerman wrote:
> > 
> > SNIP
> > 
> > > +
> > >  static int do_check_insn(struct bpf_verifier_env *env, bool *do_print_state)
> > >  {
> > >  	int err;
> > > @@ -17381,6 +17527,7 @@ static int do_check(struct bpf_verifier_env *env)
> > >  		insn_aux = &env->insn_aux_data[env->insn_idx];
> > >  
> > >  		if (++env->insn_processed > BPF_COMPLEXITY_LIMIT_INSNS) {
> > > +			print_hotspots(env);
> > 
> > hi,
> > this is really cool feature, I've already used it to fix issues :)
> 
> Hi Jiri, thank you for the feedback!
> Anything you find lacking?

so in my case the first 2 stacktraces identified data copy deep in
the code, which was wrongly done with standard loop instead of bpf_iter,
so it led me directly to the problem

In another run I got the 'Most varying ...' output and it wasn't
that obvious for me how helpful that could be, but I did not check
on that in detail yet

> 
> After discussion with Alexei (who is somewhat skeptical of the whole idea),
> I plan to add:
> - samples for values residing in the diverging registers
> - walk back from iter_next call using jump history and
>   print 5-10 instructions modifying the register in question
>   (in a manner similar to backtracking done for mark chain precision).
> 
> > Would it be possible to make this configurable? like to be able
> > to get this output for program that passed and display more than
> > 3 hottest callchains?
> 
> For sure, there are several options for a way to toggle this print out:
> - unconditionally make it a part of v1 log;
> - add it as a bit in bpf_attr->log_level.
> 
> I'd say that v1 log is a better option here.

sounds good, I think it's negligible compared to the rest of the log

> 
> As for number of callchains printed, I'd like to avoid tweaking

I guess it's better to have it hardcoded, no need to configure that

> bpf_attr. So this leaves several options:
> - make it top 5-7-10, just hardcode a bigger number.
> - add a threshold value specifying how many times a callchain should
>   be visited to get printed. E.g. only print callchains visited more
>   than 1000 times, if such don't exist then put threshold to 100,
>   if such don't exist, print top 3.

+1 to just print top X stacks

thanks,
jirka

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH RFC bpf-next v3 4/6] bpf: report hot simulated callchains when 1M instructions limit is met
  2026-05-27  7:29 ` [PATCH RFC bpf-next v3 4/6] bpf: report hot simulated callchains when 1M instructions limit is met Eduard Zingerman
  2026-05-29 10:23   ` Jiri Olsa
@ 2026-06-01 19:50   ` Emil Tsalapatis
  2026-06-01 19:55     ` Eduard Zingerman
  1 sibling, 1 reply; 22+ messages in thread
From: Emil Tsalapatis @ 2026-06-01 19:50 UTC (permalink / raw)
  To: Eduard Zingerman, bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song

On Wed May 27, 2026 at 3:29 AM EDT, Eduard Zingerman wrote:
> When the verifier exceeds the 1M instruction budget, the current error
> output shows a random execution trace that happens to be active when
> the limit is hit, which is not very helpful for debugging.
>
> To rectify this, employ a profiler-inspired technique:
> collect and count "callchain" stack traces that the verifier visits
> during program validation and report the top 3 hot traces. Sample
> callchains at instructions marked as force checkpoints and loop header
> entries.
>
> When 1M instructions is met, print a report containing:
> - callchains verifier visited most often;
> - disassembly of the subprograms referenced from the printed
>   callchains.
>
> Here is an example of the report:
>
>   budget_nested_call_loop():
>     ; asm volatile ("                                     \ @ verifier_budget_report.c:44
>     0: (b7) r0 = 0
>     1: (85) call pc+1
>     2: (95) exit
>
>   budget_nested_call_loop__inner():
>     ; asm volatile ("                                     \ @ verifier_budget_report.c:54
>     3: (b7) r0 = 0
>     4: (07) r0 += 1
>     5: (05) goto pc-2
>
>   #1 most visited simulated stacktrace (visited 499999 times):
>     budget_nested_call_loop/1 (.../verifier_budget_report.c:44)
>     budget_nested_call_loop__inner/4 (.../verifier_budget_report.c:54)
>
> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>

Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>

Minor suggestion: Can we make the N of top-N a #define for clarity and 
to make it easy to change during hacking?

> ---
>  include/linux/bpf_verifier.h |  14 ++++
>  kernel/bpf/verifier.c        | 155 +++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 169 insertions(+)
>
> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> index d0a1fb135cbf..347a155d8e21 100644
> --- a/include/linux/bpf_verifier.h
> +++ b/include/linux/bpf_verifier.h
> @@ -9,6 +9,8 @@
>  #include <linux/filter.h> /* for MAX_BPF_STACK */
>  #include <linux/tnum.h>
>  #include <linux/cnum.h>
> +#include <linux/hashtable.h>
> +#include <linux/jhash.h>
>  
>  /* Maximum variable offset umax_value permitted when resolving memory accesses.
>   * In practice this is far bigger than any realistic pointer offset; this limit
> @@ -660,6 +662,17 @@ struct bpf_loop {
>  	bool irreducible;
>  };
>  
> +struct bpf_callchain {
> +	u32 insn_idx[MAX_CALL_FRAMES];
> +	u32 curframe;
> +};
> +
> +struct bpf_callchain_entry {
> +	struct hlist_node node;
> +	struct bpf_callchain cc;
> +	u64 count;
> +};
> +
>  struct bpf_insn_aux_data {
>  	union {
>  		enum bpf_reg_type ptr_type;	/* pointer type for load/store insns */
> @@ -1043,6 +1056,7 @@ struct bpf_verifier_env {
>  	u32 scc_cnt;
>  	struct bpf_iarray *succ;
>  	struct bpf_iarray *gotox_tmp_buf;
> +	DECLARE_HASHTABLE(callchain_htab, 8);
>  };
>  
>  static inline struct bpf_func_info_aux *subprog_aux(struct bpf_verifier_env *env, int subprog)
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 2b3584712ad2..54b7ad65b7fc 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -17247,6 +17247,152 @@ static int check_indirect_jump(struct bpf_verifier_env *env, struct bpf_insn *in
>  	return INSN_IDX_UPDATED;
>  }
>  
> +static void compute_callchain(struct bpf_verifier_state *st, struct bpf_callchain *cc)
> +{
> +	int i;
> +
> +	cc->curframe = st->curframe;
> +	for (i = 0; i <= st->curframe; i++)
> +		cc->insn_idx[i] = bpf_frame_insn_idx(st, i);
> +}
> +
> +static int update_callchain_profile(struct bpf_verifier_env *env, struct bpf_verifier_state *st)
> +{
> +	struct bpf_callchain_entry *entry;
> +	struct bpf_callchain cc = {};
> +	u32 hash;
> +
> +	compute_callchain(st, &cc);
> +	hash = jhash2(cc.insn_idx, cc.curframe + 1, cc.curframe + 1);
> +	hash_for_each_possible(env->callchain_htab, entry, node, hash) {
> +		if (entry->cc.curframe == cc.curframe &&
> +		    !memcmp(entry->cc.insn_idx, cc.insn_idx, (cc.curframe + 1) * sizeof(u32))) {
> +			entry->count++;
> +			return 0;
> +		}
> +	}
> +	entry = kzalloc(sizeof(*entry), GFP_KERNEL_ACCOUNT);
> +	if (!entry)
> +		return -ENOMEM;
> +	entry->cc = cc;
> +	entry->count = 1;
> +	hash_add(env->callchain_htab, &entry->node, hash);
> +	return 0;
> +}
> +
> +static void free_callchain_profile(struct bpf_verifier_env *env)
> +{
> +	struct bpf_callchain_entry *entry;
> +	struct hlist_node *tmp;
> +	int bkt;
> +
> +	hash_for_each_safe(env->callchain_htab, bkt, tmp, entry, node) {
> +		hash_del(&entry->node);
> +		kfree(entry);
> +	}
> +}
> +
> +static void print_callchain_entry(struct bpf_verifier_env *env,
> +				  struct bpf_callchain_entry *entry, int idx)
> +{
> +	struct bpf_callchain *cc = &entry->cc;
> +	const struct bpf_line_info *linfo;
> +	struct bpf_subprog_info *sub;
> +	int i, insn_idx;
> +
> +	verbose(env, "#%d most visited simulated stacktrace (visited %llu times):\n",
> +		idx, entry->count);
> +	for (i = 0; i <= cc->curframe; i++) {
> +		insn_idx = cc->insn_idx[i];
> +		sub = bpf_find_containing_subprog(env, insn_idx);
> +		linfo = bpf_find_linfo(env->prog, insn_idx);
> +		if (sub->name)
> +			verbose(env, "  %s/%d", sub->name, insn_idx);
> +		else
> +			verbose(env, "  subprog#%td/%d", sub - env->subprog_info, insn_idx);
> +		if (linfo)
> +			verbose(env, " (%s:%u)",
> +				btf_name_by_offset(env->prog->aux->btf, linfo->file_name_off),
> +				BPF_LINE_INFO_LINE_NUM(linfo->line_col));
> +		verbose(env, "\n");
> +	}
> +}
> +
> +static void disasm_subprog(struct bpf_verifier_env *env, struct bpf_subprog_info *sub)
> +{
> +	u32 i, end = (sub + 1)->start;
> +
> +	if (sub->name)
> +		verbose(env, "%s():\n", sub->name);
> +	else
> +		verbose(env, "subprog#%td:\n", sub - env->subprog_info);
> +	env->prev_linfo = NULL;
> +	for (i = sub->start; i < end; i++) {
> +		verbose_linfo(env, i, "  ; ");
> +		verbose(env, "  %d: ", i);
> +		bpf_verbose_insn(env, &env->prog->insnsi[i]);
> +		if (bpf_is_ldimm64(&env->prog->insnsi[i]))
> +			i++;
> +	}
> +}
> +
> +/*
> + * Print several most visited simulated stack traces,
> + * and a disasembly of related subprograms.
> + */
> +static void print_hotspots(struct bpf_verifier_env *env)
> +{
> +	DECLARE_BITMAP(printed_subs, BPF_MAX_SUBPROGS + 2) = {};
> +	struct bpf_callchain_entry *top[3] = {};
> +	struct bpf_callchain_entry *entry;
> +	struct bpf_subprog_info *sub;
> +	int i, j, bkt, nr_top = 0;
> +
> +	if (!(env->log.level & BPF_LOG_LEVEL))
> +		return;
> +
> +	/* Collect the hottest callchains */
> +	hash_for_each(env->callchain_htab, bkt, entry, node) {
> +		for (i = 0; i < 3; i++) {
> +			if (!top[i] || entry->count > top[i]->count) {
> +				memmove(&top[i + 1], &top[i], (2 - i) * sizeof(top[0]));
> +				top[i] = entry;
> +				break;
> +			}
> +		}
> +	}
> +
> +	for (i = 0; i < 3 && top[i]; i++)
> +		nr_top++;
> +
> +	if (!nr_top)
> +		return;
> +
> +	if (!(env->log.level & BPF_LOG_LEVEL2))
> +		bpf_vlog_reset(&env->log, 0);
> +
> +	/* Identify callchain subprograms and print disasm of those */
> +	for (i = 0; i < nr_top; i++) {
> +		struct bpf_callchain *cc = &top[i]->cc;
> +
> +		for (j = 0; j <= cc->curframe; j++) {
> +			sub = bpf_find_containing_subprog(env, cc->insn_idx[j]);
> +			__set_bit(sub - env->subprog_info, printed_subs);
> +		}
> +	}
> +
> +	for_each_set_bit(i, printed_subs, env->subprog_cnt) {
> +		disasm_subprog(env, &env->subprog_info[i]);
> +		verbose(env, "\n");
> +	}
> +
> +	/* Print the hot callchains */
> +	for (i = 0; i < nr_top; i++) {
> +		print_callchain_entry(env, top[i], i + 1);
> +		verbose(env, "\n");
> +	}
> +}
> +
>  static int do_check_insn(struct bpf_verifier_env *env, bool *do_print_state)
>  {
>  	int err;
> @@ -17381,6 +17527,7 @@ static int do_check(struct bpf_verifier_env *env)
>  		insn_aux = &env->insn_aux_data[env->insn_idx];
>  
>  		if (++env->insn_processed > BPF_COMPLEXITY_LIMIT_INSNS) {
> +			print_hotspots(env);
>  			verbose(env,
>  				"BPF program is too large. Processed %d insn\n",
>  				env->insn_processed);
> @@ -17390,6 +17537,12 @@ static int do_check(struct bpf_verifier_env *env)
>  		state->last_insn_idx = env->prev_insn_idx;
>  		state->insn_idx = env->insn_idx;
>  
> +		if (bpf_is_force_checkpoint(env, env->insn_idx) || insn_aux->loop) {
> +			err = update_callchain_profile(env, state);
> +			if (err)
> +				return err;
> +		}
> +
>  		if (bpf_is_prune_point(env, env->insn_idx)) {
>  			err = bpf_is_state_visited(env, env->insn_idx);
>  			if (err < 0)
> @@ -19673,6 +19826,7 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr,
>  		return -ENOMEM;
>  
>  	env->bt.env = env;
> +	hash_init(env->callchain_htab);
>  
>  	len = (*prog)->len;
>  	env->insn_aux_data =
> @@ -19945,6 +20099,7 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr,
>  	bpf_clear_insn_aux_data(env, 0, env->prog->len);
>  	vfree(env->insn_aux_data);
>  err_free_env:
> +	free_callchain_profile(env);
>  	bpf_stack_liveness_free(env);
>  	kvfree(env->cfg.insn_postorder);
>  	kvfree(env->scc_info);


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH RFC bpf-next v3 4/6] bpf: report hot simulated callchains when 1M instructions limit is met
  2026-06-01 19:50   ` Emil Tsalapatis
@ 2026-06-01 19:55     ` Eduard Zingerman
  0 siblings, 0 replies; 22+ messages in thread
From: Eduard Zingerman @ 2026-06-01 19:55 UTC (permalink / raw)
  To: Emil Tsalapatis, bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song

On Mon, 2026-06-01 at 15:50 -0400, Emil Tsalapatis wrote:
> On Wed May 27, 2026 at 3:29 AM EDT, Eduard Zingerman wrote:
> > When the verifier exceeds the 1M instruction budget, the current error
> > output shows a random execution trace that happens to be active when
> > the limit is hit, which is not very helpful for debugging.
> > 
> > To rectify this, employ a profiler-inspired technique:
> > collect and count "callchain" stack traces that the verifier visits
> > during program validation and report the top 3 hot traces. Sample
> > callchains at instructions marked as force checkpoints and loop header
> > entries.
> > 
> > When 1M instructions is met, print a report containing:
> > - callchains verifier visited most often;
> > - disassembly of the subprograms referenced from the printed
> >   callchains.
> > 
> > Here is an example of the report:
> > 
> >   budget_nested_call_loop():
> >     ; asm volatile ("                                     \ @ verifier_budget_report.c:44
> >     0: (b7) r0 = 0
> >     1: (85) call pc+1
> >     2: (95) exit
> > 
> >   budget_nested_call_loop__inner():
> >     ; asm volatile ("                                     \ @ verifier_budget_report.c:54
> >     3: (b7) r0 = 0
> >     4: (07) r0 += 1
> >     5: (05) goto pc-2
> > 
> >   #1 most visited simulated stacktrace (visited 499999 times):
> >     budget_nested_call_loop/1 (.../verifier_budget_report.c:44)
> >     budget_nested_call_loop__inner/4 (.../verifier_budget_report.c:54)
> > 
> > Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
> 
> Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
> 
> Minor suggestion: Can we make the N of top-N a #define for clarity and 
> to make it easy to change during hacking?

Yes, makes sense.

[...]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH RFC bpf-next v3 5/6] bpf: report register diff summary for hot callchains
  2026-05-27  7:29 [PATCH RFC bpf-next v3 0/6] bpf: better error reporting when verifier hits 1M instructions limit Eduard Zingerman
                   ` (3 preceding siblings ...)
  2026-05-27  7:29 ` [PATCH RFC bpf-next v3 4/6] bpf: report hot simulated callchains when 1M instructions limit is met Eduard Zingerman
@ 2026-05-27  7:29 ` Eduard Zingerman
  2026-06-01 21:29   ` Emil Tsalapatis
  2026-05-27  7:29 ` [PATCH RFC bpf-next v3 6/6] selftests/bpf: test budget exhaustion profiling report Eduard Zingerman
  5 siblings, 1 reply; 22+ messages in thread
From: Eduard Zingerman @ 2026-05-27  7:29 UTC (permalink / raw)
  To: bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song,
	Eduard Zingerman

When a hot callchain ends at a iter_next, may_goto, or
callback-calling instruction, compare cached states from
explored_states to identify which registers or stack slots most
frequently differ between states. Report the top 3 most varying
locations.

The full states diff computation would require some statistical
analysis similar to k-means. To keep things simple, approximate this
by modifying states_equal() to return the first location where old and
current states differ, and counting most frequent diff locations.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 include/linux/bpf_verifier.h |  15 ++++
 kernel/bpf/states.c          | 184 +++++++++++++++++++++++++++++++++++++------
 kernel/bpf/verifier.c        |  57 +++++++++++---
 3 files changed, 225 insertions(+), 31 deletions(-)

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 347a155d8e21..cc83359774f5 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -662,6 +662,17 @@ struct bpf_loop {
 	bool irreducible;
 };
 
+struct bpf_state_diff {
+	u8 slot;
+	u8 frame;
+	enum {
+		DIFF_OTHER,
+		DIFF_REG,
+		DIFF_STACK,
+		DIFF_ARG,
+	} kind;
+};
+
 struct bpf_callchain {
 	u32 insn_idx[MAX_CALL_FRAMES];
 	u32 curframe;
@@ -1631,5 +1642,9 @@ int bpf_fixup_call_args(struct bpf_verifier_env *env);
 int bpf_do_misc_fixups(struct bpf_verifier_env *env);
 
 int bpf_compute_loops(struct bpf_verifier_env *env);
+int bpf_sample_state_diffs(struct bpf_verifier_env *env,
+			   struct bpf_callchain *cc,
+			   struct bpf_state_diff *top_diffs,
+			   int *nr_diffs);
 
 #endif /* _LINUX_BPF_VERIFIER_H */
diff --git a/kernel/bpf/states.c b/kernel/bpf/states.c
index 877338136009..b1256658189d 100644
--- a/kernel/bpf/states.c
+++ b/kernel/bpf/states.c
@@ -4,6 +4,7 @@
 #include <linux/bpf_verifier.h>
 #include <linux/cnum.h>
 #include <linux/filter.h>
+#include <linux/sort.h>
 
 #define verbose(env, fmt, args...) bpf_verifier_log_write(env, fmt, ##args)
 
@@ -696,7 +697,7 @@ static struct bpf_reg_state *scalar_reg_for_stack(struct bpf_verifier_env *env,
 
 static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old,
 		      struct bpf_func_state *cur, struct bpf_idmap *idmap,
-		      enum exact_level exact)
+		      enum exact_level exact, struct bpf_state_diff *diff)
 {
 	int i, spi;
 
@@ -720,8 +721,11 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old,
 				old_type = STACK_INVALID;
 			if (cur_type == STACK_POISON)
 				cur_type = STACK_INVALID;
-			if (i >= cur->allocated_stack || old_type != cur_type)
+			if (i >= cur->allocated_stack || old_type != cur_type) {
+				diff->slot = spi;
+				diff->kind = DIFF_STACK;
 				return false;
+			}
 		}
 
 		if (old->stack[spi].slot_type[i % BPF_REG_SIZE] == STACK_INVALID ||
@@ -735,8 +739,11 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old,
 		/* explored stack has more populated slots than current stack
 		 * and these slots were used
 		 */
-		if (i >= cur->allocated_stack)
+		if (i >= cur->allocated_stack) {
+			diff->slot = spi;
+			diff->kind = DIFF_STACK;
 			return false;
+		}
 
 		/*
 		 * 64 and 32-bit scalar spills vs MISC/INVALID slots and vice versa.
@@ -748,8 +755,11 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old,
 			old_reg = scalar_reg_for_stack(env, &old->stack[spi], im);
 			cur_reg = scalar_reg_for_stack(env, &cur->stack[spi], im);
 			if (old_reg && cur_reg) {
-				if (!regsafe(env, old_reg, cur_reg, idmap, exact))
+				if (!regsafe(env, old_reg, cur_reg, idmap, exact)) {
+					diff->slot = spi;
+					diff->kind = DIFF_STACK;
 					return false;
+				}
 				i += (im == 0 ? BPF_REG_SIZE - 1 : 3);
 				continue;
 			}
@@ -763,13 +773,16 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old,
 		    cur->stack[spi].slot_type[i % BPF_REG_SIZE] == STACK_ZERO)
 			continue;
 		if (old->stack[spi].slot_type[i % BPF_REG_SIZE] !=
-		    cur->stack[spi].slot_type[i % BPF_REG_SIZE])
+		    cur->stack[spi].slot_type[i % BPF_REG_SIZE]) {
 			/* Ex: old explored (safe) state has STACK_SPILL in
 			 * this stack slot, but current has STACK_MISC ->
 			 * this verifier states are not equivalent,
 			 * return false to continue verification of this path
 			 */
+			diff->slot = spi;
+			diff->kind = DIFF_STACK;
 			return false;
+		}
 		if (i % BPF_REG_SIZE != BPF_REG_SIZE - 1)
 			continue;
 		/* Both old and cur are having same slot_type */
@@ -786,16 +799,22 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old,
 			 * return false to continue verification of this path
 			 */
 			if (!regsafe(env, &old->stack[spi].spilled_ptr,
-				     &cur->stack[spi].spilled_ptr, idmap, exact))
+				     &cur->stack[spi].spilled_ptr, idmap, exact)) {
+				diff->slot = spi;
+				diff->kind = DIFF_STACK;
 				return false;
+			}
 			break;
 		case STACK_DYNPTR:
 			old_reg = &old->stack[spi].spilled_ptr;
 			cur_reg = &cur->stack[spi].spilled_ptr;
 			if (old_reg->dynptr.type != cur_reg->dynptr.type ||
 			    old_reg->dynptr.first_slot != cur_reg->dynptr.first_slot ||
-			    !check_ids(old_reg->ref_obj_id, cur_reg->ref_obj_id, idmap))
+			    !check_ids(old_reg->ref_obj_id, cur_reg->ref_obj_id, idmap)) {
+				diff->slot = spi;
+				diff->kind = DIFF_STACK;
 				return false;
+			}
 			break;
 		case STACK_ITER:
 			old_reg = &old->stack[spi].spilled_ptr;
@@ -810,15 +829,21 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old,
 			    old_reg->iter.btf_id != cur_reg->iter.btf_id ||
 			    old_reg->iter.state != cur_reg->iter.state ||
 			    /* ignore {old_reg,cur_reg}->iter.depth, see above */
-			    !check_ids(old_reg->ref_obj_id, cur_reg->ref_obj_id, idmap))
+			    !check_ids(old_reg->ref_obj_id, cur_reg->ref_obj_id, idmap)) {
+				diff->slot = spi;
+				diff->kind = DIFF_STACK;
 				return false;
+			}
 			break;
 		case STACK_IRQ_FLAG:
 			old_reg = &old->stack[spi].spilled_ptr;
 			cur_reg = &cur->stack[spi].spilled_ptr;
 			if (!check_ids(old_reg->ref_obj_id, cur_reg->ref_obj_id, idmap) ||
-			    old_reg->irq.kfunc_class != cur_reg->irq.kfunc_class)
+			    old_reg->irq.kfunc_class != cur_reg->irq.kfunc_class) {
+				diff->slot = spi;
+				diff->kind = DIFF_STACK;
 				return false;
+			}
 			break;
 		case STACK_MISC:
 		case STACK_ZERO:
@@ -827,6 +852,8 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old,
 			continue;
 		/* Ensure that new unhandled slot types return false by default */
 		default:
+			diff->slot = spi;
+			diff->kind = DIFF_STACK;
 			return false;
 		}
 	}
@@ -839,7 +866,7 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old,
  */
 static bool stack_arg_safe(struct bpf_verifier_env *env, struct bpf_func_state *old,
 			   struct bpf_func_state *cur, struct bpf_idmap *idmap,
-			   enum exact_level exact)
+			   enum exact_level exact, struct bpf_state_diff *diff)
 {
 	int i, nslots;
 
@@ -852,8 +879,11 @@ static bool stack_arg_safe(struct bpf_verifier_env *env, struct bpf_func_state *
 			  &old->stack_arg_regs[i] : &not_init;
 		cur_arg = i < cur->out_stack_arg_cnt ?
 			  &cur->stack_arg_regs[i] : &not_init;
-		if (!regsafe(env, old_arg, cur_arg, idmap, exact))
+		if (!regsafe(env, old_arg, cur_arg, idmap, exact)) {
+			diff->slot = i;
+			diff->kind = DIFF_ARG;
 			return false;
+		}
 	}
 
 	return true;
@@ -933,7 +963,8 @@ static bool refsafe(struct bpf_verifier_state *old, struct bpf_verifier_state *c
  * the current state will reach 'bpf_exit' instruction safely
  */
 static bool func_states_equal(struct bpf_verifier_env *env, struct bpf_func_state *old,
-			      struct bpf_func_state *cur, u32 insn_idx, enum exact_level exact)
+			      struct bpf_func_state *cur, u32 insn_idx,
+			      enum exact_level exact, struct bpf_state_diff *diff)
 {
 	u16 live_regs = env->insn_aux_data[insn_idx].live_regs_before;
 	u16 i;
@@ -947,13 +978,16 @@ static bool func_states_equal(struct bpf_verifier_env *env, struct bpf_func_stat
 	for (i = 0; i < MAX_BPF_REG; i++)
 		if (((1 << i) & live_regs) &&
 		    !regsafe(env, &old->regs[i], &cur->regs[i],
-			     &env->idmap_scratch, exact))
+			     &env->idmap_scratch, exact)) {
+			diff->slot = i;
+			diff->kind = DIFF_REG;
 			return false;
+		}
 
-	if (!stacksafe(env, old, cur, &env->idmap_scratch, exact))
+	if (!stacksafe(env, old, cur, &env->idmap_scratch, exact, diff))
 		return false;
 
-	if (!stack_arg_safe(env, old, cur, &env->idmap_scratch, exact))
+	if (!stack_arg_safe(env, old, cur, &env->idmap_scratch, exact, diff))
 		return false;
 
 	return true;
@@ -970,11 +1004,14 @@ static void reset_idmap_scratch(struct bpf_verifier_env *env)
 static bool states_equal(struct bpf_verifier_env *env,
 			 struct bpf_verifier_state *old,
 			 struct bpf_verifier_state *cur,
-			 enum exact_level exact)
+			 enum exact_level exact,
+			 struct bpf_state_diff *diff)
 {
 	u32 insn_idx;
 	int i;
 
+	diff->kind = DIFF_OTHER;
+
 	if (old->curframe != cur->curframe)
 		return false;
 
@@ -999,8 +1036,11 @@ static bool states_equal(struct bpf_verifier_env *env,
 		insn_idx = bpf_frame_insn_idx(old, i);
 		if (old->frame[i]->callsite != cur->frame[i]->callsite)
 			return false;
-		if (!func_states_equal(env, old->frame[i], cur->frame[i], insn_idx, exact))
+		if (!func_states_equal(env, old->frame[i], cur->frame[i],
+				       insn_idx, exact, diff)) {
+			diff->frame = i;
 			return false;
+		}
 	}
 	return true;
 }
@@ -1231,6 +1271,7 @@ int bpf_is_state_visited(struct bpf_verifier_env *env, int insn_idx)
 	struct bpf_verifier_state_list *new_sl;
 	struct bpf_verifier_state_list *sl;
 	struct bpf_verifier_state *cur = env->cur_state, *new;
+	struct bpf_state_diff diff = {};
 	bool force_new_state, add_new_state, loop;
 	int n, err, states_cnt = 0;
 	struct list_head *pos, *tmp, *head;
@@ -1320,7 +1361,7 @@ int bpf_is_state_visited(struct bpf_verifier_env *env, int insn_idx)
 			 * => unsafe memory access at 11 would not be caught.
 			 */
 			if (is_iter_next_insn(env, insn_idx)) {
-				if (states_equal(env, &sl->state, cur, RANGE_WITHIN)) {
+				if (states_equal(env, &sl->state, cur, RANGE_WITHIN, &diff)) {
 					struct bpf_func_state *cur_frame;
 					struct bpf_reg_state *iter_state, *iter_reg;
 					int spi;
@@ -1345,13 +1386,13 @@ int bpf_is_state_visited(struct bpf_verifier_env *env, int insn_idx)
 			}
 			if (is_may_goto_insn_at(env, insn_idx)) {
 				if (sl->state.may_goto_depth != cur->may_goto_depth &&
-				    states_equal(env, &sl->state, cur, RANGE_WITHIN)) {
+				    states_equal(env, &sl->state, cur, RANGE_WITHIN, &diff)) {
 					loop = true;
 					goto hit;
 				}
 			}
 			if (bpf_calls_callback(env, insn_idx)) {
-				if (states_equal(env, &sl->state, cur, RANGE_WITHIN)) {
+				if (states_equal(env, &sl->state, cur, RANGE_WITHIN, &diff)) {
 					loop = true;
 					goto hit;
 				}
@@ -1359,7 +1400,7 @@ int bpf_is_state_visited(struct bpf_verifier_env *env, int insn_idx)
 			}
 			/* attempt to detect infinite loop to avoid unnecessary doomed work */
 			if (states_maybe_looping(&sl->state, cur) &&
-			    states_equal(env, &sl->state, cur, EXACT) &&
+			    states_equal(env, &sl->state, cur, EXACT, &diff) &&
 			    !iter_active_depths_differ(&sl->state, cur) &&
 			    sl->state.may_goto_depth == cur->may_goto_depth &&
 			    sl->state.callback_unroll_depth == cur->callback_unroll_depth) {
@@ -1392,7 +1433,7 @@ int bpf_is_state_visited(struct bpf_verifier_env *env, int insn_idx)
 		}
 		/* See comments for mark_all_regs_read_and_precise() */
 		loop = incomplete_read_marks(env, &sl->state);
-		if (states_equal(env, &sl->state, cur, loop ? RANGE_WITHIN : NOT_EXACT)) {
+		if (states_equal(env, &sl->state, cur, loop ? RANGE_WITHIN : NOT_EXACT, &diff)) {
 hit:
 			sl->hit_cnt++;
 
@@ -1588,3 +1629,102 @@ int bpf_is_state_visited(struct bpf_verifier_env *env, int insn_idx)
 	list_add(&new_sl->node, head);
 	return 0;
 }
+
+static bool callchain_matches_state(struct bpf_callchain *cc,
+				    struct bpf_verifier_state *st)
+{
+	int i;
+
+	if (st->curframe != cc->curframe)
+		return false;
+	for (i = 0; i < (int)cc->curframe; i++)
+		if (st->frame[i + 1]->callsite != cc->insn_idx[i])
+			return false;
+	return st->insn_idx == cc->insn_idx[cc->curframe];
+}
+
+struct state_diff_cnt {
+	struct bpf_state_diff diff;
+	u32 cnt;
+};
+
+static int state_diff_cmp(const void *a, const void *b)
+{
+	return cmp_int(((struct state_diff_cnt *)b)->cnt, ((struct state_diff_cnt *)a)->cnt);
+}
+
+static bool state_diff_eq(struct bpf_state_diff *a, struct bpf_state_diff *b)
+{
+	return a->frame == b->frame && a->slot == b->slot && a->kind == b->kind;
+}
+
+int bpf_sample_state_diffs(struct bpf_verifier_env *env,
+			   struct bpf_callchain *cc,
+			   struct bpf_state_diff *top_diffs,
+			   int *nr_diffs)
+{
+	struct bpf_verifier_state_list *sl_i, *sl_j;
+	struct state_diff_cnt *diff_cnts = NULL;
+	struct list_head *pos_i, *pos_j, *head;
+	u32 leaf_insn, callsite, hash_idx;
+	int i, cap = 0, nr_locs = 0;
+
+	leaf_insn = cc->insn_idx[cc->curframe];
+	callsite = cc->curframe > 0 ? cc->insn_idx[cc->curframe - 1] : BPF_MAIN_FUNC;
+	hash_idx = (leaf_insn ^ callsite) % env->prog->len;
+	head = &env->explored_states[hash_idx];
+
+	/*
+	 * Single bucket accumulates up to 64 states, no cond_resched() necessary.
+	 * See limits in is_state_visited().
+	 */
+	list_for_each(pos_i, head) {
+		sl_i = container_of(pos_i, struct bpf_verifier_state_list, node);
+		if (!callchain_matches_state(cc, &sl_i->state))
+			continue;
+		list_for_each(pos_j, head) {
+			struct bpf_state_diff diff = {};
+
+			if (pos_i == pos_j)
+				continue;
+			sl_j = container_of(pos_j, struct bpf_verifier_state_list, node);
+			if (!callchain_matches_state(cc, &sl_j->state))
+				continue;
+			if (states_equal(env, &sl_i->state, &sl_j->state, NOT_EXACT, &diff))
+				continue;
+			if (diff.kind == DIFF_OTHER)
+				continue;
+			for (i = 0; i < nr_locs; i++) {
+				if (state_diff_eq(&diff_cnts[i].diff, &diff)) {
+					diff_cnts[i].cnt++;
+					goto next;
+				}
+			}
+			if (nr_locs == cap) {
+				int new_cap = cap ? cap * 2 : 16;
+				struct state_diff_cnt *new;
+
+				new = kvrealloc(diff_cnts, new_cap * sizeof(*new),
+						GFP_KERNEL_ACCOUNT);
+				if (!new) {
+					kvfree(diff_cnts);
+					return -ENOMEM;
+				}
+				memset(new + cap, 0, (new_cap - cap) * sizeof(*new));
+				diff_cnts = new;
+				cap = new_cap;
+			}
+			diff_cnts[nr_locs].diff = diff;
+			diff_cnts[nr_locs].cnt = 1;
+			nr_locs++;
+next:;
+		}
+	}
+
+	sort(diff_cnts, nr_locs, sizeof(*diff_cnts), state_diff_cmp, NULL);
+	*nr_diffs = min(nr_locs, *nr_diffs);
+	for (i = 0; i < *nr_diffs; i++)
+		top_diffs[i] = diff_cnts[i].diff;
+	kvfree(diff_cnts);
+	return 0;
+}
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 54b7ad65b7fc..d09c014462f1 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -17292,13 +17292,13 @@ static void free_callchain_profile(struct bpf_verifier_env *env)
 	}
 }
 
-static void print_callchain_entry(struct bpf_verifier_env *env,
-				  struct bpf_callchain_entry *entry, int idx)
+static int print_callchain_entry(struct bpf_verifier_env *env,
+				 struct bpf_callchain_entry *entry, int idx)
 {
 	struct bpf_callchain *cc = &entry->cc;
 	const struct bpf_line_info *linfo;
 	struct bpf_subprog_info *sub;
-	int i, insn_idx;
+	int i, err, insn_idx;
 
 	verbose(env, "#%d most visited simulated stacktrace (visited %llu times):\n",
 		idx, entry->count);
@@ -17316,6 +17316,38 @@ static void print_callchain_entry(struct bpf_verifier_env *env,
 				BPF_LINE_INFO_LINE_NUM(linfo->line_col));
 		verbose(env, "\n");
 	}
+
+	insn_idx = cc->insn_idx[cc->curframe];
+	if (bpf_is_force_checkpoint(env, insn_idx)) {
+		struct bpf_state_diff top_diffs[3];
+		int nr_diffs = ARRAY_SIZE(top_diffs);
+
+		err = bpf_sample_state_diffs(env, cc, top_diffs, &nr_diffs);
+		if (err)
+			return err;
+		for (i = 0; i < nr_diffs; i++) {
+			struct bpf_state_diff *d = &top_diffs[i];
+
+			switch (d->kind) {
+			case DIFF_REG:
+				verbose(env, "    Most varying: R%d (frame %d)\n",
+					d->slot, d->frame);
+				break;
+			case DIFF_STACK:
+				verbose(env, "    Most varying: fp-%d (frame %d)\n",
+					(d->slot + 1) * BPF_REG_SIZE, d->frame);
+				break;
+			case DIFF_ARG:
+				verbose(env, "    Most varying: arg#%d (frame %d)\n",
+					d->slot, d->frame);
+				break;
+			default:
+				/* shouldn't really happen */
+				continue;
+			}
+		}
+	}
+	return 0;
 }
 
 static void disasm_subprog(struct bpf_verifier_env *env, struct bpf_subprog_info *sub)
@@ -17340,16 +17372,16 @@ static void disasm_subprog(struct bpf_verifier_env *env, struct bpf_subprog_info
  * Print several most visited simulated stack traces,
  * and a disasembly of related subprograms.
  */
-static void print_hotspots(struct bpf_verifier_env *env)
+static int print_hotspots(struct bpf_verifier_env *env)
 {
 	DECLARE_BITMAP(printed_subs, BPF_MAX_SUBPROGS + 2) = {};
 	struct bpf_callchain_entry *top[3] = {};
 	struct bpf_callchain_entry *entry;
 	struct bpf_subprog_info *sub;
-	int i, j, bkt, nr_top = 0;
+	int i, j, err, bkt, nr_top = 0;
 
 	if (!(env->log.level & BPF_LOG_LEVEL))
-		return;
+		return 0;
 
 	/* Collect the hottest callchains */
 	hash_for_each(env->callchain_htab, bkt, entry, node) {
@@ -17366,7 +17398,7 @@ static void print_hotspots(struct bpf_verifier_env *env)
 		nr_top++;
 
 	if (!nr_top)
-		return;
+		return 0;
 
 	if (!(env->log.level & BPF_LOG_LEVEL2))
 		bpf_vlog_reset(&env->log, 0);
@@ -17388,9 +17420,14 @@ static void print_hotspots(struct bpf_verifier_env *env)
 
 	/* Print the hot callchains */
 	for (i = 0; i < nr_top; i++) {
-		print_callchain_entry(env, top[i], i + 1);
+		err = print_callchain_entry(env, top[i], i + 1);
+		if (err)
+			return err;
+
 		verbose(env, "\n");
 	}
+
+	return 0;
 }
 
 static int do_check_insn(struct bpf_verifier_env *env, bool *do_print_state)
@@ -17527,7 +17564,9 @@ static int do_check(struct bpf_verifier_env *env)
 		insn_aux = &env->insn_aux_data[env->insn_idx];
 
 		if (++env->insn_processed > BPF_COMPLEXITY_LIMIT_INSNS) {
-			print_hotspots(env);
+			err = print_hotspots(env);
+			if (err)
+				return err;
 			verbose(env,
 				"BPF program is too large. Processed %d insn\n",
 				env->insn_processed);

-- 
2.53.0

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH RFC bpf-next v3 5/6] bpf: report register diff summary for hot callchains
  2026-05-27  7:29 ` [PATCH RFC bpf-next v3 5/6] bpf: report register diff summary for hot callchains Eduard Zingerman
@ 2026-06-01 21:29   ` Emil Tsalapatis
  2026-06-01 21:38     ` Eduard Zingerman
  0 siblings, 1 reply; 22+ messages in thread
From: Emil Tsalapatis @ 2026-06-01 21:29 UTC (permalink / raw)
  To: Eduard Zingerman, bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song

On Wed May 27, 2026 at 3:29 AM EDT, Eduard Zingerman wrote:
> When a hot callchain ends at a iter_next, may_goto, or
> callback-calling instruction, compare cached states from
> explored_states to identify which registers or stack slots most
> frequently differ between states. Report the top 3 most varying
> locations.
>
> The full states diff computation would require some statistical
> analysis similar to k-means. To keep things simple, approximate this
> by modifying states_equal() to return the first location where old and
> current states differ, and counting most frequent diff locations.
>
> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>

Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>

One nit below.

> ---
>  include/linux/bpf_verifier.h |  15 ++++
>  kernel/bpf/states.c          | 184 +++++++++++++++++++++++++++++++++++++------
>  kernel/bpf/verifier.c        |  57 +++++++++++---
>  3 files changed, 225 insertions(+), 31 deletions(-)
>
> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> index 347a155d8e21..cc83359774f5 100644
> --- a/include/linux/bpf_verifier.h
> +++ b/include/linux/bpf_verifier.h
> @@ -662,6 +662,17 @@ struct bpf_loop {
>  	bool irreducible;
>  };
>  
> +struct bpf_state_diff {
> +	u8 slot;
> +	u8 frame;
> +	enum {
> +		DIFF_OTHER,
> +		DIFF_REG,
> +		DIFF_STACK,
> +		DIFF_ARG,
> +	} kind;
> +};
> +
>  struct bpf_callchain {
>  	u32 insn_idx[MAX_CALL_FRAMES];
>  	u32 curframe;
> @@ -1631,5 +1642,9 @@ int bpf_fixup_call_args(struct bpf_verifier_env *env);
>  int bpf_do_misc_fixups(struct bpf_verifier_env *env);
>  
>  int bpf_compute_loops(struct bpf_verifier_env *env);
> +int bpf_sample_state_diffs(struct bpf_verifier_env *env,
> +			   struct bpf_callchain *cc,
> +			   struct bpf_state_diff *top_diffs,
> +			   int *nr_diffs);
>  
>  #endif /* _LINUX_BPF_VERIFIER_H */
> diff --git a/kernel/bpf/states.c b/kernel/bpf/states.c
> index 877338136009..b1256658189d 100644
> --- a/kernel/bpf/states.c
> +++ b/kernel/bpf/states.c
> @@ -4,6 +4,7 @@
>  #include <linux/bpf_verifier.h>
>  #include <linux/cnum.h>
>  #include <linux/filter.h>
> +#include <linux/sort.h>
>  
>  #define verbose(env, fmt, args...) bpf_verifier_log_write(env, fmt, ##args)
>  
> @@ -696,7 +697,7 @@ static struct bpf_reg_state *scalar_reg_for_stack(struct bpf_verifier_env *env,
>  
>  static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old,
>  		      struct bpf_func_state *cur, struct bpf_idmap *idmap,
> -		      enum exact_level exact)
> +		      enum exact_level exact, struct bpf_state_diff *diff)
>  {
>  	int i, spi;
>  
> @@ -720,8 +721,11 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old,
>  				old_type = STACK_INVALID;
>  			if (cur_type == STACK_POISON)
>  				cur_type = STACK_INVALID;
> -			if (i >= cur->allocated_stack || old_type != cur_type)
> +			if (i >= cur->allocated_stack || old_type != cur_type) {

This new logging of divergent stack slots is identical throughout stack safe. Can we factor
it out into an error label at the very end?

> +				diff->slot = spi;
> +				diff->kind = DIFF_STACK;
>  				return false;
> +			}
>  		}
>  
>  		if (old->stack[spi].slot_type[i % BPF_REG_SIZE] == STACK_INVALID ||
> @@ -735,8 +739,11 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old,
>  		/* explored stack has more populated slots than current stack
>  		 * and these slots were used
>  		 */
> -		if (i >= cur->allocated_stack)
> +		if (i >= cur->allocated_stack) {
> +			diff->slot = spi;
> +			diff->kind = DIFF_STACK;
>  			return false;
> +		}
>  
>  		/*
>  		 * 64 and 32-bit scalar spills vs MISC/INVALID slots and vice versa.
> @@ -748,8 +755,11 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old,
>  			old_reg = scalar_reg_for_stack(env, &old->stack[spi], im);
>  			cur_reg = scalar_reg_for_stack(env, &cur->stack[spi], im);
>  			if (old_reg && cur_reg) {
> -				if (!regsafe(env, old_reg, cur_reg, idmap, exact))
> +				if (!regsafe(env, old_reg, cur_reg, idmap, exact)) {
> +					diff->slot = spi;
> +					diff->kind = DIFF_STACK;
>  					return false;
> +				}
>  				i += (im == 0 ? BPF_REG_SIZE - 1 : 3);
>  				continue;
>  			}
> @@ -763,13 +773,16 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old,
>  		    cur->stack[spi].slot_type[i % BPF_REG_SIZE] == STACK_ZERO)
>  			continue;
>  		if (old->stack[spi].slot_type[i % BPF_REG_SIZE] !=
> -		    cur->stack[spi].slot_type[i % BPF_REG_SIZE])
> +		    cur->stack[spi].slot_type[i % BPF_REG_SIZE]) {
>  			/* Ex: old explored (safe) state has STACK_SPILL in
>  			 * this stack slot, but current has STACK_MISC ->
>  			 * this verifier states are not equivalent,
>  			 * return false to continue verification of this path
>  			 */
> +			diff->slot = spi;
> +			diff->kind = DIFF_STACK;
>  			return false;
> +		}
>  		if (i % BPF_REG_SIZE != BPF_REG_SIZE - 1)
>  			continue;
>  		/* Both old and cur are having same slot_type */
> @@ -786,16 +799,22 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old,
>  			 * return false to continue verification of this path
>  			 */
>  			if (!regsafe(env, &old->stack[spi].spilled_ptr,
> -				     &cur->stack[spi].spilled_ptr, idmap, exact))
> +				     &cur->stack[spi].spilled_ptr, idmap, exact)) {
> +				diff->slot = spi;
> +				diff->kind = DIFF_STACK;
>  				return false;
> +			}
>  			break;
>  		case STACK_DYNPTR:
>  			old_reg = &old->stack[spi].spilled_ptr;
>  			cur_reg = &cur->stack[spi].spilled_ptr;
>  			if (old_reg->dynptr.type != cur_reg->dynptr.type ||
>  			    old_reg->dynptr.first_slot != cur_reg->dynptr.first_slot ||
> -			    !check_ids(old_reg->ref_obj_id, cur_reg->ref_obj_id, idmap))
> +			    !check_ids(old_reg->ref_obj_id, cur_reg->ref_obj_id, idmap)) {
> +				diff->slot = spi;
> +				diff->kind = DIFF_STACK;
>  				return false;
> +			}
>  			break;
>  		case STACK_ITER:
>  			old_reg = &old->stack[spi].spilled_ptr;
> @@ -810,15 +829,21 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old,
>  			    old_reg->iter.btf_id != cur_reg->iter.btf_id ||
>  			    old_reg->iter.state != cur_reg->iter.state ||
>  			    /* ignore {old_reg,cur_reg}->iter.depth, see above */
> -			    !check_ids(old_reg->ref_obj_id, cur_reg->ref_obj_id, idmap))
> +			    !check_ids(old_reg->ref_obj_id, cur_reg->ref_obj_id, idmap)) {
> +				diff->slot = spi;
> +				diff->kind = DIFF_STACK;
>  				return false;
> +			}
>  			break;
>  		case STACK_IRQ_FLAG:
>  			old_reg = &old->stack[spi].spilled_ptr;
>  			cur_reg = &cur->stack[spi].spilled_ptr;
>  			if (!check_ids(old_reg->ref_obj_id, cur_reg->ref_obj_id, idmap) ||
> -			    old_reg->irq.kfunc_class != cur_reg->irq.kfunc_class)
> +			    old_reg->irq.kfunc_class != cur_reg->irq.kfunc_class) {
> +				diff->slot = spi;
> +				diff->kind = DIFF_STACK;
>  				return false;
> +			}
>  			break;
>  		case STACK_MISC:
>  		case STACK_ZERO:
> @@ -827,6 +852,8 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old,
>  			continue;
>  		/* Ensure that new unhandled slot types return false by default */
>  		default:
> +			diff->slot = spi;
> +			diff->kind = DIFF_STACK;
>  			return false;
>  		}
>  	}
> @@ -839,7 +866,7 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old,
>   */
>  static bool stack_arg_safe(struct bpf_verifier_env *env, struct bpf_func_state *old,
>  			   struct bpf_func_state *cur, struct bpf_idmap *idmap,
> -			   enum exact_level exact)
> +			   enum exact_level exact, struct bpf_state_diff *diff)
>  {
>  	int i, nslots;
>  
> @@ -852,8 +879,11 @@ static bool stack_arg_safe(struct bpf_verifier_env *env, struct bpf_func_state *
>  			  &old->stack_arg_regs[i] : &not_init;
>  		cur_arg = i < cur->out_stack_arg_cnt ?
>  			  &cur->stack_arg_regs[i] : &not_init;
> -		if (!regsafe(env, old_arg, cur_arg, idmap, exact))
> +		if (!regsafe(env, old_arg, cur_arg, idmap, exact)) {
> +			diff->slot = i;
> +			diff->kind = DIFF_ARG;
>  			return false;
> +		}
>  	}
>  
>  	return true;
> @@ -933,7 +963,8 @@ static bool refsafe(struct bpf_verifier_state *old, struct bpf_verifier_state *c
>   * the current state will reach 'bpf_exit' instruction safely
>   */
>  static bool func_states_equal(struct bpf_verifier_env *env, struct bpf_func_state *old,
> -			      struct bpf_func_state *cur, u32 insn_idx, enum exact_level exact)
> +			      struct bpf_func_state *cur, u32 insn_idx,
> +			      enum exact_level exact, struct bpf_state_diff *diff)
>  {
>  	u16 live_regs = env->insn_aux_data[insn_idx].live_regs_before;
>  	u16 i;
> @@ -947,13 +978,16 @@ static bool func_states_equal(struct bpf_verifier_env *env, struct bpf_func_stat
>  	for (i = 0; i < MAX_BPF_REG; i++)
>  		if (((1 << i) & live_regs) &&
>  		    !regsafe(env, &old->regs[i], &cur->regs[i],
> -			     &env->idmap_scratch, exact))
> +			     &env->idmap_scratch, exact)) {
> +			diff->slot = i;
> +			diff->kind = DIFF_REG;
>  			return false;
> +		}
>  
> -	if (!stacksafe(env, old, cur, &env->idmap_scratch, exact))
> +	if (!stacksafe(env, old, cur, &env->idmap_scratch, exact, diff))
>  		return false;
>  
> -	if (!stack_arg_safe(env, old, cur, &env->idmap_scratch, exact))
> +	if (!stack_arg_safe(env, old, cur, &env->idmap_scratch, exact, diff))
>  		return false;
>  
>  	return true;
> @@ -970,11 +1004,14 @@ static void reset_idmap_scratch(struct bpf_verifier_env *env)
>  static bool states_equal(struct bpf_verifier_env *env,
>  			 struct bpf_verifier_state *old,
>  			 struct bpf_verifier_state *cur,
> -			 enum exact_level exact)
> +			 enum exact_level exact,
> +			 struct bpf_state_diff *diff)
>  {
>  	u32 insn_idx;
>  	int i;
>  
> +	diff->kind = DIFF_OTHER;
> +
>  	if (old->curframe != cur->curframe)
>  		return false;
>  
> @@ -999,8 +1036,11 @@ static bool states_equal(struct bpf_verifier_env *env,
>  		insn_idx = bpf_frame_insn_idx(old, i);
>  		if (old->frame[i]->callsite != cur->frame[i]->callsite)
>  			return false;
> -		if (!func_states_equal(env, old->frame[i], cur->frame[i], insn_idx, exact))
> +		if (!func_states_equal(env, old->frame[i], cur->frame[i],
> +				       insn_idx, exact, diff)) {
> +			diff->frame = i;
>  			return false;
> +		}
>  	}
>  	return true;
>  }
> @@ -1231,6 +1271,7 @@ int bpf_is_state_visited(struct bpf_verifier_env *env, int insn_idx)
>  	struct bpf_verifier_state_list *new_sl;
>  	struct bpf_verifier_state_list *sl;
>  	struct bpf_verifier_state *cur = env->cur_state, *new;
> +	struct bpf_state_diff diff = {};
>  	bool force_new_state, add_new_state, loop;
>  	int n, err, states_cnt = 0;
>  	struct list_head *pos, *tmp, *head;
> @@ -1320,7 +1361,7 @@ int bpf_is_state_visited(struct bpf_verifier_env *env, int insn_idx)
>  			 * => unsafe memory access at 11 would not be caught.
>  			 */
>  			if (is_iter_next_insn(env, insn_idx)) {
> -				if (states_equal(env, &sl->state, cur, RANGE_WITHIN)) {
> +				if (states_equal(env, &sl->state, cur, RANGE_WITHIN, &diff)) {
>  					struct bpf_func_state *cur_frame;
>  					struct bpf_reg_state *iter_state, *iter_reg;
>  					int spi;
> @@ -1345,13 +1386,13 @@ int bpf_is_state_visited(struct bpf_verifier_env *env, int insn_idx)
>  			}
>  			if (is_may_goto_insn_at(env, insn_idx)) {
>  				if (sl->state.may_goto_depth != cur->may_goto_depth &&
> -				    states_equal(env, &sl->state, cur, RANGE_WITHIN)) {
> +				    states_equal(env, &sl->state, cur, RANGE_WITHIN, &diff)) {
>  					loop = true;
>  					goto hit;
>  				}
>  			}
>  			if (bpf_calls_callback(env, insn_idx)) {
> -				if (states_equal(env, &sl->state, cur, RANGE_WITHIN)) {
> +				if (states_equal(env, &sl->state, cur, RANGE_WITHIN, &diff)) {
>  					loop = true;
>  					goto hit;
>  				}
> @@ -1359,7 +1400,7 @@ int bpf_is_state_visited(struct bpf_verifier_env *env, int insn_idx)
>  			}
>  			/* attempt to detect infinite loop to avoid unnecessary doomed work */
>  			if (states_maybe_looping(&sl->state, cur) &&
> -			    states_equal(env, &sl->state, cur, EXACT) &&
> +			    states_equal(env, &sl->state, cur, EXACT, &diff) &&
>  			    !iter_active_depths_differ(&sl->state, cur) &&
>  			    sl->state.may_goto_depth == cur->may_goto_depth &&
>  			    sl->state.callback_unroll_depth == cur->callback_unroll_depth) {
> @@ -1392,7 +1433,7 @@ int bpf_is_state_visited(struct bpf_verifier_env *env, int insn_idx)
>  		}
>  		/* See comments for mark_all_regs_read_and_precise() */
>  		loop = incomplete_read_marks(env, &sl->state);
> -		if (states_equal(env, &sl->state, cur, loop ? RANGE_WITHIN : NOT_EXACT)) {
> +		if (states_equal(env, &sl->state, cur, loop ? RANGE_WITHIN : NOT_EXACT, &diff)) {
>  hit:
>  			sl->hit_cnt++;
>  
> @@ -1588,3 +1629,102 @@ int bpf_is_state_visited(struct bpf_verifier_env *env, int insn_idx)
>  	list_add(&new_sl->node, head);
>  	return 0;
>  }
> +
> +static bool callchain_matches_state(struct bpf_callchain *cc,
> +				    struct bpf_verifier_state *st)
> +{
> +	int i;
> +
> +	if (st->curframe != cc->curframe)
> +		return false;
> +	for (i = 0; i < (int)cc->curframe; i++)
> +		if (st->frame[i + 1]->callsite != cc->insn_idx[i])
> +			return false;
> +	return st->insn_idx == cc->insn_idx[cc->curframe];
> +}
> +
> +struct state_diff_cnt {
> +	struct bpf_state_diff diff;
> +	u32 cnt;
> +};
> +
> +static int state_diff_cmp(const void *a, const void *b)
> +{
> +	return cmp_int(((struct state_diff_cnt *)b)->cnt, ((struct state_diff_cnt *)a)->cnt);
> +}
> +
> +static bool state_diff_eq(struct bpf_state_diff *a, struct bpf_state_diff *b)
> +{
> +	return a->frame == b->frame && a->slot == b->slot && a->kind == b->kind;
> +}
> +
> +int bpf_sample_state_diffs(struct bpf_verifier_env *env,
> +			   struct bpf_callchain *cc,
> +			   struct bpf_state_diff *top_diffs,
> +			   int *nr_diffs)
> +{
> +	struct bpf_verifier_state_list *sl_i, *sl_j;
> +	struct state_diff_cnt *diff_cnts = NULL;
> +	struct list_head *pos_i, *pos_j, *head;
> +	u32 leaf_insn, callsite, hash_idx;
> +	int i, cap = 0, nr_locs = 0;
> +
> +	leaf_insn = cc->insn_idx[cc->curframe];
> +	callsite = cc->curframe > 0 ? cc->insn_idx[cc->curframe - 1] : BPF_MAIN_FUNC;
> +	hash_idx = (leaf_insn ^ callsite) % env->prog->len;
> +	head = &env->explored_states[hash_idx];
> +
> +	/*
> +	 * Single bucket accumulates up to 64 states, no cond_resched() necessary.
> +	 * See limits in is_state_visited().
> +	 */
> +	list_for_each(pos_i, head) {
> +		sl_i = container_of(pos_i, struct bpf_verifier_state_list, node);
> +		if (!callchain_matches_state(cc, &sl_i->state))
> +			continue;
> +		list_for_each(pos_j, head) {
> +			struct bpf_state_diff diff = {};
> +
> +			if (pos_i == pos_j)
> +				continue;
> +			sl_j = container_of(pos_j, struct bpf_verifier_state_list, node);
> +			if (!callchain_matches_state(cc, &sl_j->state))
> +				continue;
> +			if (states_equal(env, &sl_i->state, &sl_j->state, NOT_EXACT, &diff))
> +				continue;
> +			if (diff.kind == DIFF_OTHER)
> +				continue;
> +			for (i = 0; i < nr_locs; i++) {
> +				if (state_diff_eq(&diff_cnts[i].diff, &diff)) {
> +					diff_cnts[i].cnt++;
> +					goto next;
> +				}
> +			}
> +			if (nr_locs == cap) {
> +				int new_cap = cap ? cap * 2 : 16;
> +				struct state_diff_cnt *new;
> +
> +				new = kvrealloc(diff_cnts, new_cap * sizeof(*new),
> +						GFP_KERNEL_ACCOUNT);
> +				if (!new) {
> +					kvfree(diff_cnts);
> +					return -ENOMEM;
> +				}
> +				memset(new + cap, 0, (new_cap - cap) * sizeof(*new));
> +				diff_cnts = new;
> +				cap = new_cap;
> +			}
> +			diff_cnts[nr_locs].diff = diff;
> +			diff_cnts[nr_locs].cnt = 1;
> +			nr_locs++;
> +next:;
> +		}
> +	}
> +
> +	sort(diff_cnts, nr_locs, sizeof(*diff_cnts), state_diff_cmp, NULL);
> +	*nr_diffs = min(nr_locs, *nr_diffs);
> +	for (i = 0; i < *nr_diffs; i++)
> +		top_diffs[i] = diff_cnts[i].diff;
> +	kvfree(diff_cnts);
> +	return 0;
> +}
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 54b7ad65b7fc..d09c014462f1 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -17292,13 +17292,13 @@ static void free_callchain_profile(struct bpf_verifier_env *env)
>  	}
>  }
>  
> -static void print_callchain_entry(struct bpf_verifier_env *env,
> -				  struct bpf_callchain_entry *entry, int idx)
> +static int print_callchain_entry(struct bpf_verifier_env *env,
> +				 struct bpf_callchain_entry *entry, int idx)
>  {
>  	struct bpf_callchain *cc = &entry->cc;
>  	const struct bpf_line_info *linfo;
>  	struct bpf_subprog_info *sub;
> -	int i, insn_idx;
> +	int i, err, insn_idx;
>  
>  	verbose(env, "#%d most visited simulated stacktrace (visited %llu times):\n",
>  		idx, entry->count);
> @@ -17316,6 +17316,38 @@ static void print_callchain_entry(struct bpf_verifier_env *env,
>  				BPF_LINE_INFO_LINE_NUM(linfo->line_col));
>  		verbose(env, "\n");
>  	}
> +
> +	insn_idx = cc->insn_idx[cc->curframe];
> +	if (bpf_is_force_checkpoint(env, insn_idx)) {
> +		struct bpf_state_diff top_diffs[3];
> +		int nr_diffs = ARRAY_SIZE(top_diffs);
> +
> +		err = bpf_sample_state_diffs(env, cc, top_diffs, &nr_diffs);
> +		if (err)
> +			return err;
> +		for (i = 0; i < nr_diffs; i++) {
> +			struct bpf_state_diff *d = &top_diffs[i];
> +
> +			switch (d->kind) {
> +			case DIFF_REG:
> +				verbose(env, "    Most varying: R%d (frame %d)\n",
> +					d->slot, d->frame);
> +				break;
> +			case DIFF_STACK:
> +				verbose(env, "    Most varying: fp-%d (frame %d)\n",
> +					(d->slot + 1) * BPF_REG_SIZE, d->frame);
> +				break;
> +			case DIFF_ARG:
> +				verbose(env, "    Most varying: arg#%d (frame %d)\n",
> +					d->slot, d->frame);
> +				break;
> +			default:
> +				/* shouldn't really happen */
> +				continue;
> +			}
> +		}
> +	}
> +	return 0;
>  }
>  
>  static void disasm_subprog(struct bpf_verifier_env *env, struct bpf_subprog_info *sub)
> @@ -17340,16 +17372,16 @@ static void disasm_subprog(struct bpf_verifier_env *env, struct bpf_subprog_info
>   * Print several most visited simulated stack traces,
>   * and a disasembly of related subprograms.
>   */
> -static void print_hotspots(struct bpf_verifier_env *env)
> +static int print_hotspots(struct bpf_verifier_env *env)
>  {
>  	DECLARE_BITMAP(printed_subs, BPF_MAX_SUBPROGS + 2) = {};
>  	struct bpf_callchain_entry *top[3] = {};
>  	struct bpf_callchain_entry *entry;
>  	struct bpf_subprog_info *sub;
> -	int i, j, bkt, nr_top = 0;
> +	int i, j, err, bkt, nr_top = 0;
>  
>  	if (!(env->log.level & BPF_LOG_LEVEL))
> -		return;
> +		return 0;
>  
>  	/* Collect the hottest callchains */
>  	hash_for_each(env->callchain_htab, bkt, entry, node) {
> @@ -17366,7 +17398,7 @@ static void print_hotspots(struct bpf_verifier_env *env)
>  		nr_top++;
>  
>  	if (!nr_top)
> -		return;
> +		return 0;
>  
>  	if (!(env->log.level & BPF_LOG_LEVEL2))
>  		bpf_vlog_reset(&env->log, 0);
> @@ -17388,9 +17420,14 @@ static void print_hotspots(struct bpf_verifier_env *env)
>  
>  	/* Print the hot callchains */
>  	for (i = 0; i < nr_top; i++) {
> -		print_callchain_entry(env, top[i], i + 1);
> +		err = print_callchain_entry(env, top[i], i + 1);
> +		if (err)
> +			return err;
> +
>  		verbose(env, "\n");
>  	}
> +
> +	return 0;
>  }
>  
>  static int do_check_insn(struct bpf_verifier_env *env, bool *do_print_state)
> @@ -17527,7 +17564,9 @@ static int do_check(struct bpf_verifier_env *env)
>  		insn_aux = &env->insn_aux_data[env->insn_idx];
>  
>  		if (++env->insn_processed > BPF_COMPLEXITY_LIMIT_INSNS) {
> -			print_hotspots(env);
> +			err = print_hotspots(env);
> +			if (err)
> +				return err;
>  			verbose(env,
>  				"BPF program is too large. Processed %d insn\n",
>  				env->insn_processed);


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH RFC bpf-next v3 5/6] bpf: report register diff summary for hot callchains
  2026-06-01 21:29   ` Emil Tsalapatis
@ 2026-06-01 21:38     ` Eduard Zingerman
  0 siblings, 0 replies; 22+ messages in thread
From: Eduard Zingerman @ 2026-06-01 21:38 UTC (permalink / raw)
  To: Emil Tsalapatis, bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song

On Mon, 2026-06-01 at 17:29 -0400, Emil Tsalapatis wrote:

[...]

> > @@ -720,8 +721,11 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old,
> >  				old_type = STACK_INVALID;
> >  			if (cur_type == STACK_POISON)
> >  				cur_type = STACK_INVALID;
> > -			if (i >= cur->allocated_stack || old_type != cur_type)
> > +			if (i >= cur->allocated_stack || old_type != cur_type) {
> 
> This new logging of divergent stack slots is identical throughout stack safe. Can we factor
> it out into an error label at the very end?

Yes, makes sense.

> > +				diff->slot = spi;
> > +				diff->kind = DIFF_STACK;
> >  				return false;
> > +			}
> >  		}
> >  
> >  		if (old->stack[spi].slot_type[i % BPF_REG_SIZE] == STACK_INVALID ||

[...[

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH RFC bpf-next v3 6/6] selftests/bpf: test budget exhaustion profiling report
  2026-05-27  7:29 [PATCH RFC bpf-next v3 0/6] bpf: better error reporting when verifier hits 1M instructions limit Eduard Zingerman
                   ` (4 preceding siblings ...)
  2026-05-27  7:29 ` [PATCH RFC bpf-next v3 5/6] bpf: report register diff summary for hot callchains Eduard Zingerman
@ 2026-05-27  7:29 ` Eduard Zingerman
  2026-06-01 19:55   ` Emil Tsalapatis
  5 siblings, 1 reply; 22+ messages in thread
From: Eduard Zingerman @ 2026-05-27  7:29 UTC (permalink / raw)
  To: bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song,
	Eduard Zingerman

Add verifier selftests that exhaust the 1M instruction budget and
verify that the new profiling report is produced:
- nested function calls with inner loop, check that stack trace is
  print correctly;
- two loops on the same level, check that both are reported;
- iterator-based loop with non-converging register, check that
  register is in the report;
- iterator-based loop with non-converging stack slot, check that stack
  slot is in the report.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 tools/testing/selftests/bpf/prog_tests/verifier.c  |   2 +
 .../selftests/bpf/progs/verifier_budget_report.c   | 175 +++++++++++++++++++++
 2 files changed, 177 insertions(+)

diff --git a/tools/testing/selftests/bpf/prog_tests/verifier.c b/tools/testing/selftests/bpf/prog_tests/verifier.c
index 35d92d176c1f..61a732a499b0 100644
--- a/tools/testing/selftests/bpf/prog_tests/verifier.c
+++ b/tools/testing/selftests/bpf/prog_tests/verifier.c
@@ -14,6 +14,7 @@
 #include "verifier_basic_stack.skel.h"
 #include "verifier_bitfield_write.skel.h"
 #include "verifier_bounds.skel.h"
+#include "verifier_budget_report.skel.h"
 #include "verifier_bounds_deduction.skel.h"
 #include "verifier_bounds_deduction_non_const.skel.h"
 #include "verifier_bounds_mix_sign_unsign.skel.h"
@@ -169,6 +170,7 @@ void test_verifier_bounds(void)               { RUN(verifier_bounds); }
 void test_verifier_bounds_deduction(void)     { RUN(verifier_bounds_deduction); }
 void test_verifier_bounds_deduction_non_const(void)     { RUN(verifier_bounds_deduction_non_const); }
 void test_verifier_bounds_mix_sign_unsign(void) { RUN(verifier_bounds_mix_sign_unsign); }
+void test_verifier_budget_report(void)        { RUN(verifier_budget_report); }
 void test_verifier_bpf_get_stack(void)        { RUN(verifier_bpf_get_stack); }
 void test_verifier_bpf_trap(void)             { RUN(verifier_bpf_trap); }
 void test_verifier_bswap(void)                { RUN(verifier_bswap); }
diff --git a/tools/testing/selftests/bpf/progs/verifier_budget_report.c b/tools/testing/selftests/bpf/progs/verifier_budget_report.c
new file mode 100644
index 000000000000..7b87e53831c2
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/verifier_budget_report.c
@@ -0,0 +1,175 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2026 Meta Platforms, Inc. and affiliates. */
+
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+#include "bpf_misc.h"
+
+SEC("tracepoint")
+__failure
+__log_level(1)
+__msg("two_level_calls():")
+__msg("  0: (85) call pc+1")
+__msg("  1: (95) exit")
+__msg("two_level_calls__l1():")
+__msg("  2: (85) call pc+1")
+__msg("  3: (95) exit")
+__msg("two_level_calls__l2():")
+__msg("  4: (b7) r0 = 0")
+__msg("  5: (07) r0 += 1")
+__msg("  6: (05) goto pc-2")
+__msg("#1 most visited simulated stacktrace (visited 499999 times):")
+__msg("  two_level_calls/0 ({{.*}}/verifier_budget_report.c:{{[0-9]+}})")
+__msg("  two_level_calls__l1/2 ({{.*}}/verifier_budget_report.c:{{[0-9]+}})")
+__msg("  two_level_calls__l2/5 ({{.*}}/verifier_budget_report.c:{{[0-9]+}})")
+__msg("BPF program is too large. Processed 1000001 insn")
+__naked void two_level_calls(void)
+{
+	asm volatile ("					\
+	call two_level_calls__l1;		\
+	exit;						\
+"	::: __clobber_all);
+}
+
+__used
+static __naked void two_level_calls__l1(void)
+{
+	asm volatile ("					\
+	call two_level_calls__l2;		\
+	exit;						\
+"	::: __clobber_all);
+}
+
+__used
+static __naked void two_level_calls__l2(void)
+{
+	asm volatile ("					\
+	r0 = 0;						\
+1:	r0 += 1;					\
+	goto 1b;					\
+"	::: __clobber_all);
+}
+
+SEC("tracepoint")
+__failure
+__log_level(1)
+__msg("two_loops():")
+__msg("  0: (b7) r6 = 250000")
+__msg("  1: (b7) r0 = 0")
+__msg("  2: (07) r0 += 1")
+__msg("  3: (ad) if r0 < r6 goto pc-2")
+__msg("  4: (b7) r0 = 0")
+__msg("  5: (07) r0 += 1")
+__msg("  6: (05) goto pc-2")
+__msg("#1 most visited simulated stacktrace (visited 250000 times):")
+__msg("  two_loops/2 ({{.*}}/verifier_budget_report.c:{{[0-9]+}})")
+__msg("#2 most visited simulated stacktrace (visited 249999 times):")
+__msg("  two_loops/5 ({{.*}}/verifier_budget_report.c:{{[0-9]+}})")
+__msg("BPF program is too large")
+__naked void two_loops(void)
+{
+	asm volatile ("					\
+	r6 = 250000;					\
+	r0 = 0;						\
+1:	r0 += 1;					\
+	if r0 < r6 goto 1b;				\
+	r0 = 0;						\
+2:	r0 += 1;					\
+	goto 2b;					\
+"	::: __clobber_all);
+}
+
+SEC("socket")
+__failure
+__log_level(1)
+__flag(BPF_F_TEST_STATE_FREQ)
+__msg("iter_loop():")
+__msg("  8: (bf) r1 = r6")
+__msg("  9: (85) call bpf_iter_num_next")
+__msg("  10: (15) if r0 == 0x0 goto pc+3")
+__msg("  11: (07) r7 += 1")
+__msg("  12: (25) if r7 > 0xf4240 goto pc+1")
+__msg("  13: (05) goto pc-6")
+__msg("#1 most visited simulated stacktrace (visited 142856 times):")
+__msg("  iter_loop/8 ({{.*}}/verifier_budget_report.c:{{[0-9]+}})")
+__msg("#2 most visited simulated stacktrace (visited 142856 times):")
+__msg("  iter_loop/9 ({{.*}}/verifier_budget_report.c:{{[0-9]+}})")
+__msg("    Most varying: R7 (frame 0)")
+__msg("BPF program is too large. Processed 1000001 insn")
+__naked void iter_loop(void)
+{
+	asm volatile ("					\
+	r1 = r10;					\
+	r1 += -16;					\
+	w2 = 0;						\
+	w3 = 1000000;					\
+	call %[bpf_iter_num_new];			\
+	r7 = 0;						\
+	r6 = r10;					\
+	r6 += -16;					\
+1:	r1 = r6;					\
+	call %[bpf_iter_num_next];			\
+	if r0 == 0 goto 2f;				\
+	r7 += 1;					\
+	if r7 > 1000000 goto 2f;			\
+	goto 1b;					\
+2:	r1 = r6;					\
+	call %[bpf_iter_num_destroy];			\
+	r0 = r7;					\
+	exit;						\
+"	:
+	: __imm(bpf_iter_num_new),
+	  __imm(bpf_iter_num_next),
+	  __imm(bpf_iter_num_destroy)
+	: __clobber_all);
+}
+
+SEC("socket")
+__failure
+__log_level(1)
+__flag(BPF_F_TEST_STATE_FREQ)
+__msg("iter_stack_varying():")
+__msg("  10: (85) call bpf_iter_num_next")
+__msg("#1 most visited simulated stacktrace (visited 111110 times):")
+__msg("  iter_stack_varying/10 ({{.*}}/verifier_budget_report.c:{{[0-9]+}})")
+__msg("    Most varying: fp-24 (frame 0)")
+__msg("BPF program is too large. Processed 1000001 insn")
+__naked void iter_stack_varying(void)
+{
+	asm volatile ("					\
+	r1 = r10;					\
+	r1 += -16;					\
+	w2 = 0;						\
+	w3 = 1000000;					\
+	call %[bpf_iter_num_new];			\
+	r8 = 0;						\
+	*(u64 *)(r10 - 24) = r8;			\
+	r6 = r10;					\
+	r6 += -16;					\
+1:	r1 = r6;					\
+	call %[bpf_iter_num_next];			\
+	if r0 == 0 goto 2f;				\
+	r8 = *(u64 *)(r10 - 24);			\
+	r8 += 1;					\
+	*(u64 *)(r10 - 24) = r8;			\
+	if r8 > 1000000 goto 2f;			\
+	goto 1b;					\
+2:	r1 = r6;					\
+	call %[bpf_iter_num_destroy];			\
+	r0 = *(u64 *)(r10 - 24);			\
+	exit;						\
+"	:
+	: __imm(bpf_iter_num_new),
+	  __imm(bpf_iter_num_next),
+	  __imm(bpf_iter_num_destroy)
+	: __clobber_all);
+}
+
+void iter_num_btf_fixup(void)
+{
+	bpf_iter_num_new(0, 0, 0);
+	bpf_iter_num_next(0);
+	bpf_iter_num_destroy(0);
+}
+
+char _license[] SEC("license") = "GPL";

-- 
2.53.0

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH RFC bpf-next v3 6/6] selftests/bpf: test budget exhaustion profiling report
  2026-05-27  7:29 ` [PATCH RFC bpf-next v3 6/6] selftests/bpf: test budget exhaustion profiling report Eduard Zingerman
@ 2026-06-01 19:55   ` Emil Tsalapatis
  0 siblings, 0 replies; 22+ messages in thread
From: Emil Tsalapatis @ 2026-06-01 19:55 UTC (permalink / raw)
  To: Eduard Zingerman, bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song

On Wed May 27, 2026 at 3:29 AM EDT, Eduard Zingerman wrote:
> Add verifier selftests that exhaust the 1M instruction budget and
> verify that the new profiling report is produced:
> - nested function calls with inner loop, check that stack trace is
>   print correctly;
> - two loops on the same level, check that both are reported;
> - iterator-based loop with non-converging register, check that
>   register is in the report;
> - iterator-based loop with non-converging stack slot, check that stack
>   slot is in the report.
>
> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>

Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>

> ---
>  tools/testing/selftests/bpf/prog_tests/verifier.c  |   2 +
>  .../selftests/bpf/progs/verifier_budget_report.c   | 175 +++++++++++++++++++++
>  2 files changed, 177 insertions(+)
>
> diff --git a/tools/testing/selftests/bpf/prog_tests/verifier.c b/tools/testing/selftests/bpf/prog_tests/verifier.c
> index 35d92d176c1f..61a732a499b0 100644
> --- a/tools/testing/selftests/bpf/prog_tests/verifier.c
> +++ b/tools/testing/selftests/bpf/prog_tests/verifier.c
> @@ -14,6 +14,7 @@
>  #include "verifier_basic_stack.skel.h"
>  #include "verifier_bitfield_write.skel.h"
>  #include "verifier_bounds.skel.h"
> +#include "verifier_budget_report.skel.h"
>  #include "verifier_bounds_deduction.skel.h"
>  #include "verifier_bounds_deduction_non_const.skel.h"
>  #include "verifier_bounds_mix_sign_unsign.skel.h"
> @@ -169,6 +170,7 @@ void test_verifier_bounds(void)               { RUN(verifier_bounds); }
>  void test_verifier_bounds_deduction(void)     { RUN(verifier_bounds_deduction); }
>  void test_verifier_bounds_deduction_non_const(void)     { RUN(verifier_bounds_deduction_non_const); }
>  void test_verifier_bounds_mix_sign_unsign(void) { RUN(verifier_bounds_mix_sign_unsign); }
> +void test_verifier_budget_report(void)        { RUN(verifier_budget_report); }
>  void test_verifier_bpf_get_stack(void)        { RUN(verifier_bpf_get_stack); }
>  void test_verifier_bpf_trap(void)             { RUN(verifier_bpf_trap); }
>  void test_verifier_bswap(void)                { RUN(verifier_bswap); }
> diff --git a/tools/testing/selftests/bpf/progs/verifier_budget_report.c b/tools/testing/selftests/bpf/progs/verifier_budget_report.c
> new file mode 100644
> index 000000000000..7b87e53831c2
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/progs/verifier_budget_report.c
> @@ -0,0 +1,175 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* Copyright (c) 2026 Meta Platforms, Inc. and affiliates. */
> +
> +#include <linux/bpf.h>
> +#include <bpf/bpf_helpers.h>
> +#include "bpf_misc.h"
> +
> +SEC("tracepoint")
> +__failure
> +__log_level(1)
> +__msg("two_level_calls():")
> +__msg("  0: (85) call pc+1")
> +__msg("  1: (95) exit")
> +__msg("two_level_calls__l1():")
> +__msg("  2: (85) call pc+1")
> +__msg("  3: (95) exit")
> +__msg("two_level_calls__l2():")
> +__msg("  4: (b7) r0 = 0")
> +__msg("  5: (07) r0 += 1")
> +__msg("  6: (05) goto pc-2")
> +__msg("#1 most visited simulated stacktrace (visited 499999 times):")
> +__msg("  two_level_calls/0 ({{.*}}/verifier_budget_report.c:{{[0-9]+}})")
> +__msg("  two_level_calls__l1/2 ({{.*}}/verifier_budget_report.c:{{[0-9]+}})")
> +__msg("  two_level_calls__l2/5 ({{.*}}/verifier_budget_report.c:{{[0-9]+}})")
> +__msg("BPF program is too large. Processed 1000001 insn")
> +__naked void two_level_calls(void)
> +{
> +	asm volatile ("					\
> +	call two_level_calls__l1;		\
> +	exit;						\
> +"	::: __clobber_all);
> +}
> +
> +__used
> +static __naked void two_level_calls__l1(void)
> +{
> +	asm volatile ("					\
> +	call two_level_calls__l2;		\
> +	exit;						\
> +"	::: __clobber_all);
> +}
> +
> +__used
> +static __naked void two_level_calls__l2(void)
> +{
> +	asm volatile ("					\
> +	r0 = 0;						\
> +1:	r0 += 1;					\
> +	goto 1b;					\
> +"	::: __clobber_all);
> +}
> +
> +SEC("tracepoint")
> +__failure
> +__log_level(1)
> +__msg("two_loops():")
> +__msg("  0: (b7) r6 = 250000")
> +__msg("  1: (b7) r0 = 0")
> +__msg("  2: (07) r0 += 1")
> +__msg("  3: (ad) if r0 < r6 goto pc-2")
> +__msg("  4: (b7) r0 = 0")
> +__msg("  5: (07) r0 += 1")
> +__msg("  6: (05) goto pc-2")
> +__msg("#1 most visited simulated stacktrace (visited 250000 times):")
> +__msg("  two_loops/2 ({{.*}}/verifier_budget_report.c:{{[0-9]+}})")
> +__msg("#2 most visited simulated stacktrace (visited 249999 times):")
> +__msg("  two_loops/5 ({{.*}}/verifier_budget_report.c:{{[0-9]+}})")
> +__msg("BPF program is too large")
> +__naked void two_loops(void)
> +{
> +	asm volatile ("					\
> +	r6 = 250000;					\
> +	r0 = 0;						\
> +1:	r0 += 1;					\
> +	if r0 < r6 goto 1b;				\
> +	r0 = 0;						\
> +2:	r0 += 1;					\
> +	goto 2b;					\
> +"	::: __clobber_all);
> +}
> +
> +SEC("socket")
> +__failure
> +__log_level(1)
> +__flag(BPF_F_TEST_STATE_FREQ)
> +__msg("iter_loop():")
> +__msg("  8: (bf) r1 = r6")
> +__msg("  9: (85) call bpf_iter_num_next")
> +__msg("  10: (15) if r0 == 0x0 goto pc+3")
> +__msg("  11: (07) r7 += 1")
> +__msg("  12: (25) if r7 > 0xf4240 goto pc+1")
> +__msg("  13: (05) goto pc-6")
> +__msg("#1 most visited simulated stacktrace (visited 142856 times):")
> +__msg("  iter_loop/8 ({{.*}}/verifier_budget_report.c:{{[0-9]+}})")
> +__msg("#2 most visited simulated stacktrace (visited 142856 times):")
> +__msg("  iter_loop/9 ({{.*}}/verifier_budget_report.c:{{[0-9]+}})")
> +__msg("    Most varying: R7 (frame 0)")
> +__msg("BPF program is too large. Processed 1000001 insn")
> +__naked void iter_loop(void)
> +{
> +	asm volatile ("					\
> +	r1 = r10;					\
> +	r1 += -16;					\
> +	w2 = 0;						\
> +	w3 = 1000000;					\
> +	call %[bpf_iter_num_new];			\
> +	r7 = 0;						\
> +	r6 = r10;					\
> +	r6 += -16;					\
> +1:	r1 = r6;					\
> +	call %[bpf_iter_num_next];			\
> +	if r0 == 0 goto 2f;				\
> +	r7 += 1;					\
> +	if r7 > 1000000 goto 2f;			\
> +	goto 1b;					\
> +2:	r1 = r6;					\
> +	call %[bpf_iter_num_destroy];			\
> +	r0 = r7;					\
> +	exit;						\
> +"	:
> +	: __imm(bpf_iter_num_new),
> +	  __imm(bpf_iter_num_next),
> +	  __imm(bpf_iter_num_destroy)
> +	: __clobber_all);
> +}
> +
> +SEC("socket")
> +__failure
> +__log_level(1)
> +__flag(BPF_F_TEST_STATE_FREQ)
> +__msg("iter_stack_varying():")
> +__msg("  10: (85) call bpf_iter_num_next")
> +__msg("#1 most visited simulated stacktrace (visited 111110 times):")
> +__msg("  iter_stack_varying/10 ({{.*}}/verifier_budget_report.c:{{[0-9]+}})")
> +__msg("    Most varying: fp-24 (frame 0)")
> +__msg("BPF program is too large. Processed 1000001 insn")
> +__naked void iter_stack_varying(void)
> +{
> +	asm volatile ("					\
> +	r1 = r10;					\
> +	r1 += -16;					\
> +	w2 = 0;						\
> +	w3 = 1000000;					\
> +	call %[bpf_iter_num_new];			\
> +	r8 = 0;						\
> +	*(u64 *)(r10 - 24) = r8;			\
> +	r6 = r10;					\
> +	r6 += -16;					\
> +1:	r1 = r6;					\
> +	call %[bpf_iter_num_next];			\
> +	if r0 == 0 goto 2f;				\
> +	r8 = *(u64 *)(r10 - 24);			\
> +	r8 += 1;					\
> +	*(u64 *)(r10 - 24) = r8;			\
> +	if r8 > 1000000 goto 2f;			\
> +	goto 1b;					\
> +2:	r1 = r6;					\
> +	call %[bpf_iter_num_destroy];			\
> +	r0 = *(u64 *)(r10 - 24);			\
> +	exit;						\
> +"	:
> +	: __imm(bpf_iter_num_new),
> +	  __imm(bpf_iter_num_next),
> +	  __imm(bpf_iter_num_destroy)
> +	: __clobber_all);
> +}
> +
> +void iter_num_btf_fixup(void)
> +{
> +	bpf_iter_num_new(0, 0, 0);
> +	bpf_iter_num_next(0);
> +	bpf_iter_num_destroy(0);
> +}
> +
> +char _license[] SEC("license") = "GPL";


^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2026-06-01 21:38 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-27  7:29 [PATCH RFC bpf-next v3 0/6] bpf: better error reporting when verifier hits 1M instructions limit Eduard Zingerman
2026-05-27  7:29 ` [PATCH RFC bpf-next v3 1/6] bpf: move live registers and scc printout to a standalone function Eduard Zingerman
2026-06-01  5:50   ` Emil Tsalapatis
2026-05-27  7:29 ` [PATCH RFC bpf-next v3 2/6] bpf: compute loops hierarchy Eduard Zingerman
2026-06-01 19:12   ` Emil Tsalapatis
2026-06-01 19:22     ` Eduard Zingerman
2026-06-01 19:29       ` Emil Tsalapatis
2026-05-27  7:29 ` [PATCH RFC bpf-next v3 3/6] selftests/bpf: test cases for loop hierarchy computation Eduard Zingerman
2026-05-27  8:50   ` sashiko-bot
2026-06-01 19:37   ` Emil Tsalapatis
2026-06-01 19:44     ` Eduard Zingerman
2026-05-27  7:29 ` [PATCH RFC bpf-next v3 4/6] bpf: report hot simulated callchains when 1M instructions limit is met Eduard Zingerman
2026-05-29 10:23   ` Jiri Olsa
2026-05-29 18:44     ` Eduard Zingerman
2026-05-30  9:34       ` Jiri Olsa
2026-06-01 19:50   ` Emil Tsalapatis
2026-06-01 19:55     ` Eduard Zingerman
2026-05-27  7:29 ` [PATCH RFC bpf-next v3 5/6] bpf: report register diff summary for hot callchains Eduard Zingerman
2026-06-01 21:29   ` Emil Tsalapatis
2026-06-01 21:38     ` Eduard Zingerman
2026-05-27  7:29 ` [PATCH RFC bpf-next v3 6/6] selftests/bpf: test budget exhaustion profiling report Eduard Zingerman
2026-06-01 19:55   ` Emil Tsalapatis

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox