[PATCH bpf-next v2 00/11] bpf: propagate read/precision marks over state graph backedges

BPF List
 help / color / mirror / Atom feed

From: Eduard Zingerman <eddyz87@gmail.com>
To: bpf@vger.kernel.org, ast@kernel.org
Cc: andrii@kernel.org, daniel@iogearbox.net, martin.lau@linux.dev,
	kernel-team@fb.com, yonghong.song@linux.dev,
	Eduard Zingerman <eddyz87@gmail.com>
Subject: [PATCH bpf-next v2 00/11] bpf: propagate read/precision marks over state graph backedges
Date: Fri,  6 Jun 2025 14:03:41 -0700	[thread overview]
Message-ID: <20250606210352.1692944-1-eddyz87@gmail.com> (raw)

Current loop_entry-based states comparison logic does not handle the
following case:

 .-> A --.  Assume the states are visited in the order A, B, C.
 |   |   |  Assume that state B reaches a state equivalent to state A.
 |   v   v  At this point, state C is not processed yet, so state A
 '-- B   C  has not received any read or precision marks from C.
            As a result, these marks won't be propagated to B.

If B has incomplete marks, it is unsafe to use it in states_equal()
checks. This issue was first reported in [1].

This patch-set
--------------

Here is the gist of the algorithm implemented by this patch-set:
- Compute strongly connected components (SCCs) in the program CFG.
- When a verifier state enters an SCC, that state is recorded as the
  SCC's entry point.
- When a verifier state is found to be equivalent to another
  (e.g., B to A in the example above), it is recorded as a
  states-graph backedge.
- Backedges are accumulated per SCC (*).
- When an SCC entry state reaches `branches == 0`, propagate read and
  precision marks through the backedges until a fixed point is reached
  (e.g., from A to B, from C to A, and then again from A to B).

(*) This is an oversimplification, see patch #8 for details.

Unfortunately, this means that commit [2] needs to be reverted,
as precision propagation requires access to jump history,
and backedges represent history not belonging to `env->cur_state`.

Details are provided in patch #8; a comment in `is_state_visited()`
explains most of the mechanics.

Patch #2 adds a `compute_scc()` function, which computes SCCs in the
program CFG. This function was tested using property-based testing in
[3], but it is not included in selftests.

Previous attempt
----------------

A previous attempt to fix this is described in [4]:
1. Within the states loop, `states_equal(... RANGE_WITHIN)` ignores
   read and precision marks.
2. For states outside the loop, all registers for states within the
   loop are marked as read and precise.

This approach led to an 86x regression on the `cond_break1` selftest.
In that test, one loop was followed by another, and a certain variable
was incremented in the second loop. This variable was marked as
precise due to rule (2), which hindered convergence in the first loop.

After some off-list discussion, it was decided that this might be a
typical case and such regressions are undesirable.

This patch-set avoids such eager precision markings.

Alternatives
------------

Another option is to associate a mask of read/written/precise stack
slots with each instruction. This mask can be populated during
verifier states exploration. Upon reaching an `EXIT` instruction or an
equivalent state, the accumulated masks can be used to propagate
read/written/precise bits across the program's control flow graph
using an analysis similar to use-def.

Unfortunately, a naive implementation of this approach [5] results in
a 10x regression in `veristat` for some `sched_ext` programs due to
the inability to express the must-write property. This issue requires
further investigation.

Changes in verification performance
-----------------------------------

There are some veristat regressions when comparing with master using
selftests and sched_ext BPF binaries. The comparison is done using
master from [6] and this patch-set from [7] where memory accounting
logic is added to veristat.

========= selftests: master vs patch-set =========

File                             Program                              Insns                          Peak memory (KiB)
-------------------------------  -----------------------------------  -----  ----  ----------------  ----  -----  ----------------
arena_list.bpf.o                 arena_list_add                         374   406      +32 (+8.56%)     0      0       +0 (+0.00%)
dynptr_success.bpf.o             test_copy_from_user_str_dynptr         268   284      +16 (+5.97%)   768   1024    +256 (+33.33%)
dynptr_success.bpf.o             test_probe_read_kernel_dynptr          994  1101    +107 (+10.76%)  1024   1240    +216 (+21.09%)
dynptr_success.bpf.o             test_probe_read_user_dynptr           1000  1107    +107 (+10.70%)  1024   1240    +216 (+21.09%)
iters.bpf.o                      checkpoint_states_deletion            1211  1216       +5 (+0.41%)   512   1288   +776 (+151.56%)
iters.bpf.o                      clean_live_states                      588   620      +32 (+5.44%)   256    764   +508 (+198.44%)
pyperf600_iter.bpf.o             on_event                              2591  5929  +3338 (+128.83%)  4648  11320  +6672 (+143.55%)

Total progs: 3600
Old success: 2084
New success: 2084

total_insns diff min     :    0.00 %
total_insns diff max     :  128.83 %
total_insns abs max old  : 837,487
total_insns abs max new  : 837,487
   0 .. 5    %: 3592
   5 .. 15   %: 6
  20 .. 30   %: 1
 125 .. 130  %: 1

mem_peak diff min     : -100.00 %
mem_peak diff max     :  198.44 %
mem_peak abs max old  : 269,312 KiB
mem_peak abs max new  : 269,312 KiB
-100 .. -95  %: 63
 -60 .. -50  %: 1
 -10 .. 0    %: 20
   0 .. 5    %: 3413
   5 .. 15   %: 6
  20 .. 30   %: 4
  30 .. 40   %: 7
  40 .. 50   %: 1
  50 .. 60   %: 3
  60 .. 70   %: 1
 140 .. 150  %: 1
 150 .. 160  %: 1
 195 .. 200  %: 1
0 -> something: 78

========= scx: master vs patch-set =========

Program                   Insns                          Peak memory (KiB)
------------------------  -----  -----  ---------------  -----  -----  -----------------
arena_topology_node_init   2129   2391   +262 (+12.31%)    768    768        +0 (+0.00%)
arena_topology_node_init   2129   2391   +262 (+12.31%)    768    768        +0 (+0.00%)
arena_topology_print        591    826   +235 (+39.76%)    256    256        +0 (+0.00%)
arena_topology_print        591    826   +235 (+39.76%)    256    256        +0 (+0.00%)
chaos_init                 4261   5090   +829 (+19.46%)   2540   3032     +492 (+19.37%)
lavd_cpu_offline           5074   5706   +632 (+12.46%)   3940   6304    +2364 (+60.00%)
lavd_cpu_online            5074   5706   +632 (+12.46%)   3940   6304    +2364 (+60.00%)
lavd_dispatch             41769  47578  +5809 (+13.91%)   6208  29200  +22992 (+370.36%)
lavd_enqueue              24190  27749  +3559 (+14.71%)  22740  42872   +20132 (+88.53%)
lavd_init                  6748   7474   +726 (+10.76%)   5096   6864    +1768 (+34.69%)
lavd_select_cpu           27243  30802  +3559 (+13.06%)  26056  45048   +18992 (+72.89%)
layered_dispatch           8909  13295  +4386 (+49.23%)   7976  18028  +10052 (+126.03%)
layered_dump               1890   2097   +207 (+10.95%)   2036   3036    +1000 (+49.12%)
layered_init               4149   4531    +382 (+9.21%)   2716   2896      +180 (+6.63%)
layered_runnable           2566   2601     +35 (+1.36%)    748   1244     +496 (+66.31%)
p2dq_init                  3640   4474   +834 (+22.91%)   2004   2504     +500 (+24.95%)
refresh_layer_cpumasks      735   1048   +313 (+42.59%)    256    756    +500 (+195.31%)
rusty_init_task           31104  31104      +0 (+0.00%)   2164   2368      +204 (+9.43%)
rusty_select_cpu           1110   1110      +0 (+0.00%)    768   1004     +236 (+30.73%)
tp_cgroup_attach_task       149    203    +54 (+36.24%)      0      0        +0 (+0.00%)

Total progs: 147
Old success: 134
New success: 134
total_insns diff min     :    0.00 %
total_insns diff max     :   49.23 %
total_insns abs max old  :  72,434
total_insns abs max new  :  72,434
   0 .. 5    %: 132
   5 .. 15   %: 9
  15 .. 25   %: 2
  35 .. 45   %: 3
  45 .. 50   %: 1

mem_peak diff min     :   -0.95 %
mem_peak diff max     :  370.36 %
mem_peak abs max old  :  26,056 KiB
mem_peak abs max new  :  45,048 KiB
  -5 .. 0    %: 4
   0 .. 5    %: 128
   5 .. 15   %: 2
  15 .. 25   %: 2
  30 .. 40   %: 2
  45 .. 55   %: 1
  60 .. 70   %: 3
  70 .. 80   %: 1
  85 .. 95   %: 1
 125 .. 135  %: 1
 195 .. 205  %: 1
 370 .. 375  %: 1

Changelog
---------

v1: https://lore.kernel.org/bpf/20250524191932.389444-1-eddyz87@gmail.com/
v1 -> v2:
- Rebase
- added mem_peak statistics (Alexei)
- selftests: fixed comments and removed useless r7 assignments (Yonghong)

Links
-----

[1] https://lore.kernel.org/bpf/20250312031344.3735498-1-eddyz87@gmail.com/
[2] commit 96a30e469ca1 ("bpf: use common instruction history across all states")
[3] https://github.com/eddyz87/scc-test
[4] https://lore.kernel.org/bpf/20250426104634.744077-1-eddyz87@gmail.com/
[5] https://github.com/eddyz87/bpf/tree/propagate-read-and-precision-in-cfg
[6] https://github.com/eddyz87/bpf/tree/veristat-memory-accounting
[7] https://github.com/eddyz87/bpf/tree/scc-accumulate-backedges

Eduard Zingerman (11):
  Revert "bpf: use common instruction history across all states"
  bpf: compute SCCs in program control flow graph
  bpf: frame_insn_idx() utility function
  bpf: starting_state parameter for __mark_chain_precision()
  bpf: set 'changed' status if propagate_precision() did any updates
  bpf: set 'changed' status if propagate_liveness() did any updates
  bpf: move REG_LIVE_DONE check to clean_live_states()
  bpf: propagate read/precision marks over state graph backedges
  bpf: remove {update,get}_loop_entry functions
  bpf: include backedges in peak_states stat
  selftests/bpf: tests with a loop state missing read/precision mark

 include/linux/bpf_verifier.h              |  77 +-
 kernel/bpf/verifier.c                     | 970 +++++++++++++++-------
 tools/testing/selftests/bpf/progs/iters.c | 277 ++++++
 3 files changed, 998 insertions(+), 326 deletions(-)

-- 
2.48.1

next             reply	other threads:[~2025-06-06 21:04 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-06-06 21:03 Eduard Zingerman [this message]
2025-06-06 21:03 ` [PATCH bpf-next v2 01/11] Revert "bpf: use common instruction history across all states" Eduard Zingerman
2025-06-06 21:03 ` [PATCH bpf-next v2 02/11] bpf: compute SCCs in program control flow graph Eduard Zingerman
2025-06-06 21:03 ` [PATCH bpf-next v2 03/11] bpf: frame_insn_idx() utility function Eduard Zingerman
2025-06-06 21:03 ` [PATCH bpf-next v2 04/11] bpf: starting_state parameter for __mark_chain_precision() Eduard Zingerman
2025-06-06 21:03 ` [PATCH bpf-next v2 05/11] bpf: set 'changed' status if propagate_precision() did any updates Eduard Zingerman
2025-06-06 21:03 ` [PATCH bpf-next v2 06/11] bpf: set 'changed' status if propagate_liveness() " Eduard Zingerman
2025-06-06 21:03 ` [PATCH bpf-next v2 07/11] bpf: move REG_LIVE_DONE check to clean_live_states() Eduard Zingerman
2025-06-06 21:03 ` [PATCH bpf-next v2 08/11] bpf: propagate read/precision marks over state graph backedges Eduard Zingerman
2025-06-06 21:03 ` [PATCH bpf-next v2 09/11] bpf: remove {update,get}_loop_entry functions Eduard Zingerman
2025-06-06 21:03 ` [PATCH bpf-next v2 10/11] bpf: include backedges in peak_states stat Eduard Zingerman
2025-06-06 21:03 ` [PATCH bpf-next v2 11/11] selftests/bpf: tests with a loop state missing read/precision mark Eduard Zingerman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250606210352.1692944-1-eddyz87@gmail.com \
    --to=eddyz87@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=kernel-team@fb.com \
    --cc=martin.lau@linux.dev \
    --cc=yonghong.song@linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox