BPF List
* [PATCH RFC bpf-next v3 0/6] bpf: better error reporting when verifier hits 1M instructions limit
@ 2026-05-27  7:29 Eduard Zingerman
From: Eduard Zingerman @ 2026-05-27  7:29 UTC
  To: bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song,
	Eduard Zingerman

When the BPF verifier exceeds the 1M instruction budget, the current
error output shows whichever execution trace happens to be active at
that moment. This trace is arbitrary and rarely helps with debugging.

This series improves the error report using a profiler-inspired
approach: collect and count "callchain" stack traces that the verifier
visits during program validation, and report the top 3 hottest traces
when the budget is exhausted. To minimize the performance and memory impact
of such profiling, samples are collected only when the verifier visits loop
headers, iterator next, may_goto, and callback-calling instructions.
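
As a rough illustration of the counting idea above, here is a minimal
userspace sketch. It is not the code from this series: the names
(struct callchain, profile_sample(), profile_top()) and the fixed-size
table are assumptions made for the example. The actual patches allocate
entries dynamically and key them on verifier state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MAX_DEPTH  8   /* hypothetical cap on call depth */
#define MAX_CHAINS 64  /* hypothetical cap on distinct callchains */
#define TOP_N      3   /* number of hottest callchains to report */

/* One sampled callchain: instruction indices, outermost frame first. */
struct callchain {
	int insn_idx[MAX_DEPTH];
	int depth;
	unsigned long hits;
};

struct profile {
	struct callchain chains[MAX_CHAINS];
	int cnt;
};

/* Record one visit of a callchain. Returns 0 on success, -1 if the table is full. */
static int profile_sample(struct profile *p, const int *insn_idx, int depth)
{
	struct callchain *cc;
	int i;

	assert(depth > 0 && depth <= MAX_DEPTH);
	for (i = 0; i < p->cnt; i++) {
		cc = &p->chains[i];
		if (cc->depth == depth &&
		    memcmp(cc->insn_idx, insn_idx, depth * sizeof(*insn_idx)) == 0) {
			cc->hits++;
			return 0;
		}
	}
	if (p->cnt == MAX_CHAINS)
		return -1;
	/* Zero the new entry first so unused slots hold no stale data. */
	cc = &p->chains[p->cnt++];
	memset(cc, 0, sizeof(*cc));
	memcpy(cc->insn_idx, insn_idx, depth * sizeof(*insn_idx));
	cc->depth = depth;
	cc->hits = 1;
	return 0;
}

/* Order by hit count, hottest first, without subtracting the counters. */
static int cmp_hits_desc(const void *a, const void *b)
{
	const struct callchain *x = a, *y = b;

	return (x->hits < y->hits) - (x->hits > y->hits);
}

/* Sort the table in place; returns how many leading entries to report. */
static int profile_top(struct profile *p)
{
	qsort(p->chains, p->cnt, sizeof(p->chains[0]), cmp_hits_desc);
	return p->cnt < TOP_N ? p->cnt : TOP_N;
}
```

A caller would invoke profile_sample() at each sampled instruction and
call profile_top() once, when the budget is exhausted, then print the
first entries of the sorted table.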

For callchains ending at iterator next, may_goto, or callback-calling
instructions, identify which registers or stack slots most frequently
differ between cached and current states.
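
The "most varying" idea can be sketched as a per-register counter of
mismatches between a cached state and the current one. This is a
simplification under stated assumptions: registers are modeled as plain
64-bit values, whereas the verifier compares abstract register states,
and the names reg_diff_stats, diff_sample() and most_varying() are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_REGS 11 /* BPF registers R0..R10 */

/* How often each register differed between cached and current state. */
struct reg_diff_stats {
	unsigned long differs[NR_REGS];
};

/* Compare one cached/current pair and bump counters for mismatches. */
static void diff_sample(struct reg_diff_stats *st,
			const uint64_t *cached, const uint64_t *cur)
{
	int r;

	assert(st != NULL && cached != NULL && cur != NULL);
	for (r = 0; r < NR_REGS; r++)
		if (cached[r] != cur[r])
			st->differs[r]++;
}

/* Index of the register that differed most often, or -1 if none ever did. */
static int most_varying(const struct reg_diff_stats *st)
{
	unsigned long best_cnt = 0;
	int r, best = -1;

	for (r = 0; r < NR_REGS; r++) {
		if (st->differs[r] > best_cnt) {
			best_cnt = st->differs[r];
			best = r;
		}
	}
	return best;
}
```

On ties this sketch reports the lowest-numbered register; how the series
breaks ties is not stated here.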

Here is an example of the report for scx lavd_dispatch (an old
version), with the verifier limited to 200K instructions to trigger the
error:

  lavd_dispatch():
    ; void BPF_STRUCT_OPS(lavd_dispatch, s32 cpu, struct task_struct *prev) @ main.bpf.c:889
    ... disassembly ...

  consume_task():
    ; bool consume_task(u64 cpu_dsq_id, u64 cpdom_dsq_id) @ balance.bpf.c:410
    ... disassembly ...

  #1 most visited simulated stacktrace (visited 1807 times):
    lavd_dispatch/124 (.../scx/scheds/rust/scx_lavd/src/bpf/main.bpf.c:1107)
    consume_task/2715 (.../scx/scheds/rust/scx_lavd/src/bpf/balance.bpf.c:316)

  #2 most visited simulated stacktrace (visited 1682 times):
    lavd_dispatch/124 (.../scx/scheds/rust/scx_lavd/src/bpf/main.bpf.c:1107)
    consume_task/2994 (.../scx/scheds/rust/scx_lavd/src/bpf/balance.bpf.c:386)

  #3 most visited simulated stacktrace (visited 8 times):
    lavd_dispatch/255 (.../scx/scheds/rust/scx_lavd/src/bpf/main.bpf.c:1022)
      Most varying: R7 (frame 0)

  BPF program is too large. Processed 200001 insn

Changelog:
v2 -> v3 (bots):
  - Compare leaf instruction index in callchain_matches_state().
  - Use cmp_int() in stat_diff_cmp() to avoid overflow warnings.
  - Add a comment in bpf_sample_state_diffs() on why it can't stale.
  - Add cond_resched() call in bpf_compute_loops() to guard against
    pathological inputs.
v1 -> v2 (bots):
  - Use kvfree() in bpf_compute_loops().
  - Adjust fwd_edges_no_loop test case to avoid dead code elimination
    converting 'if' to 'goto'.
  - Use GFP_KERNEL_ACCOUNT for callchain entry allocation in
    update_callchain_profile().
  - Zero-initialize 'cc' in update_callchain_profile() to avoid
    copying uninitialized stack memory to the heap.
  - Use %td instead of %ld for ptrdiff_t format specifier in
    print_callchain_entry() and disasm_subprog().
  - Size printed_subs bitmap as BPF_MAX_SUBPROGS + 2 to account for
    fake and exception subprograms.
  - Fix bpf_sample_state_diffs() inner loop to iterate from head
    instead of pos_i, avoiding container_of() on the dummy list head.
  - Add DIFF_OTHER to distinguish states that differ because of idmap
    or other inconsistencies.
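
Two of the changelog items above follow common C patterns, shown here in
a userspace sketch. A comparator that returns the difference of two ints
can overflow, so a three-way comparison is used instead, which is
presumably what cmp_int() provides in the kernel. The %td conversion is
the standard printf specifier for ptrdiff_t. The helpers
three_way_cmp(), stat_cmp_desc() and format_index() are hypothetical
names for this example, not functions from the series.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns -1, 0 or 1. Unlike "l - r" this cannot overflow. */
static int three_way_cmp(int l, int r)
{
	return (l > r) - (l < r);
}

struct stat_entry {
	int count;
};

/* qsort() comparator ordering entries by count, largest first. */
static int stat_cmp_desc(const void *a, const void *b)
{
	const struct stat_entry *x = a, *y = b;

	return three_way_cmp(y->count, x->count);
}

/* Format the distance between two pointers into the same array. */
static int format_index(char *buf, size_t size, const int *base, const int *pos)
{
	ptrdiff_t idx = pos - base;

	/* %td matches ptrdiff_t on every ABI, while %ld does not. */
	return snprintf(buf, size, "%td", idx);
}
```

With counts of INT_MIN and INT_MAX, a subtraction-based comparator would
invoke signed overflow, while the version above sorts them correctly.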

v1: https://lore.kernel.org/bpf/20260526-better-1m-reporting-v1-0-51e4f2c59780@gmail.com/T/
v2: https://lore.kernel.org/bpf/20260526-better-1m-reporting-v2-0-e7ec61c45d41@gmail.com/T/
---
Eduard Zingerman (6):
      bpf: move live registers and scc printout to a standalone function
      bpf: compute loops hierarchy
      selftests/bpf: test cases for loop hierarchy computation
      bpf: report hot simulated callchains when 1M instructions limit is met
      bpf: report register diff summary for hot callchains
      selftests/bpf: test budget exhaustion profiling report

 include/linux/bpf_verifier.h                       |  39 ++++
 kernel/bpf/Makefile                                |   2 +-
 kernel/bpf/fixups.c                                |   5 +
 kernel/bpf/liveness.c                              |  22 +-
 kernel/bpf/loops.c                                 | 196 +++++++++++++++++
 kernel/bpf/states.c                                | 184 ++++++++++++++--
 kernel/bpf/verifier.c                              | 233 +++++++++++++++++++++
 tools/testing/selftests/bpf/prog_tests/verifier.c  |   4 +
 .../selftests/bpf/progs/verifier_budget_report.c   | 175 ++++++++++++++++
 .../selftests/bpf/progs/verifier_live_stack.c      |   2 +-
 .../selftests/bpf/progs/verifier_loop_hierarchy.c  | 233 +++++++++++++++++++++
 11 files changed, 1050 insertions(+), 45 deletions(-)
---
base-commit: 8496d9020ff37a33c2a7b2fc84350fd03ffbde78
change-id: 20260525-better-1m-reporting-1d795a21cf72


Thread overview: 22+ messages
2026-05-27  7:29 [PATCH RFC bpf-next v3 0/6] bpf: better error reporting when verifier hits 1M instructions limit Eduard Zingerman
2026-05-27  7:29 ` [PATCH RFC bpf-next v3 1/6] bpf: move live registers and scc printout to a standalone function Eduard Zingerman
2026-06-01  5:50   ` Emil Tsalapatis
2026-05-27  7:29 ` [PATCH RFC bpf-next v3 2/6] bpf: compute loops hierarchy Eduard Zingerman
2026-06-01 19:12   ` Emil Tsalapatis
2026-06-01 19:22     ` Eduard Zingerman
2026-06-01 19:29       ` Emil Tsalapatis
2026-05-27  7:29 ` [PATCH RFC bpf-next v3 3/6] selftests/bpf: test cases for loop hierarchy computation Eduard Zingerman
2026-05-27  8:50   ` sashiko-bot
2026-06-01 19:37   ` Emil Tsalapatis
2026-06-01 19:44     ` Eduard Zingerman
2026-05-27  7:29 ` [PATCH RFC bpf-next v3 4/6] bpf: report hot simulated callchains when 1M instructions limit is met Eduard Zingerman
2026-05-29 10:23   ` Jiri Olsa
2026-05-29 18:44     ` Eduard Zingerman
2026-05-30  9:34       ` Jiri Olsa
2026-06-01 19:50   ` Emil Tsalapatis
2026-06-01 19:55     ` Eduard Zingerman
2026-05-27  7:29 ` [PATCH RFC bpf-next v3 5/6] bpf: report register diff summary for hot callchains Eduard Zingerman
2026-06-01 21:29   ` Emil Tsalapatis
2026-06-01 21:38     ` Eduard Zingerman
2026-05-27  7:29 ` [PATCH RFC bpf-next v3 6/6] selftests/bpf: test budget exhaustion profiling report Eduard Zingerman
2026-06-01 19:55   ` Emil Tsalapatis
