BPF List
 help / color / mirror / Atom feed
* [RFC bpf-next v1 0/7] bpf: improvements for iterator-based loops convergence
@ 2025-01-22 12:04 Eduard Zingerman
  2025-01-22 12:04 ` [RFC bpf-next v1 1/7] bpf: copy_verifier_state() should copy 'loop_entry' field Eduard Zingerman
                   ` (6 more replies)
  0 siblings, 7 replies; 9+ messages in thread
From: Eduard Zingerman @ 2025-01-22 12:04 UTC (permalink / raw)
  To: bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song,
	Eduard Zingerman

This RFC consists of a bug fix and two improvements for
iterator/callback based loops handling:
- Patch #1 fixes a bug in copy_verifier_state(), where field
  'loop_entry' was not copied. The fix has negative impact on
  verification performance.
- Patch #3 mitigates negative impact from patch #1 by avoiding
  clean_live_states() for states that have loop_entry that is still
  being verified. This reduces amount of processed instructions for
  sched_ext by 28-92% for some programs.
- Patches #5,6 introduce a simple live registers DFA analysis.
  Results of this analysis are used in func_states_equal() and further
  reduce amount of processed instructions for sched_ext by 17-30% for
  some programs.

Below are veristat results comparing master to this RFC.

selftests:

File                   Program                     Insns      (DIFF)  States    (DIFF)
---------------------  --------------------------  -----------------  ----------------
arena_htab.bpf.o       arena_htab_llvm                -294 (-41.00%)     -20 (-35.09%)
arena_htab_asm.bpf.o   arena_htab_asm                 -152 (-25.46%)     -10 (-21.28%)
arena_list.bpf.o       arena_list_add                 +329 (+22.04%)      +7 (+23.33%)
arena_list.bpf.o       arena_list_del                 -161 (-52.10%)     -12 (-52.17%)
iters.bpf.o            checkpoint_states_deletion   -16914 (-93.32%)    -778 (-95.11%)
iters.bpf.o            iter_nested_deeply_iters       -293 (-49.41%)     -30 (-44.78%)
iters.bpf.o            iter_nested_iters              -181 (-22.26%)     -17 (-21.52%)
iters.bpf.o            iter_subprog_iters             -430 (-39.31%)     -29 (-32.95%)
iters.bpf.o            loop_state_deps2               -158 (-32.99%)     -14 (-30.43%)
pyperf600_iter.bpf.o   on_event                      -8633 (-69.97%)    -189 (-42.86%)

sched_ext:

Program                 Insns      (DIFF)  States    (DIFF)
----------------------  -----------------  ----------------
lavd_dispatch            -34018 (-22.00%)   -1885 (-21.06%)
layered_dispatch          -5378 (-46.81%)    -313 (-36.91%)
layered_dump              -3689 (-49.71%)    -455 (-66.81%)
layered_enqueue           -4038 (-30.90%)    -351 (-29.80%)
layered_init            -995483 (-99.55%)  -84233 (-99.48%)
layered_runnable          -1587 (-49.32%)    -172 (-58.31%)
refresh_layer_cpumasks   -15597 (-94.60%)   -1684 (-95.14%)
rustland_init               -85 (-17.07%)      -7 (-17.07%)
rustland_init               -85 (-17.07%)      -7 (-17.07%)
rusty_select_cpu           -706 (-33.65%)     -55 (-30.39%)
rusty_set_cpumask        -15934 (-78.67%)   -1359 (-81.62%)
tp_cgroup_attach_task       -57 (-27.67%)      -4 (-22.22%)
central_dispatch            -87 (-13.68%)      -9 (-14.29%)
central_init               -476 (-52.19%)     -12 (-25.00%)
pair_dispatch           -998423 (-99.84%)  -58264 (-99.78%)
qmap_dispatch              -581 (-24.28%)     -48 (-24.49%)
userland_dispatch           -36 (-22.78%)      -4 (-23.53%)

('layered_init' and 'pair_dispatch' hit 1M on master,
 but are verified ok with this patch-set).

sched_ext used for testing:
commit 2b7f3bba928f ("Merge pull request #1197 from devnexen/code_simpl4")

Eduard Zingerman (7):
  bpf: copy_verifier_state() should copy 'loop_entry' field
  selftests/bpf: test correct loop_entry update in copy_verifier_state
  bpf: don't do clean_live_states when state->loop_entry->branches > 0
  selftests/bpf: check states pruning for deeply nested iterator
  bpf: DFA-based liveness analysis for program registers
  bpf: use register liveness information for func_states_equal
  selftests/bpf: test cases for compute_live_registers()

 include/linux/bpf_verifier.h                  |   6 +
 kernel/bpf/verifier.c                         | 372 ++++++++++++++++--
 .../testing/selftests/bpf/prog_tests/align.c  |  11 +-
 .../bpf/prog_tests/compute_live_registers.c   |   9 +
 tools/testing/selftests/bpf/progs/bpf_misc.h  |  12 +
 .../bpf/progs/compute_live_registers.c        | 363 +++++++++++++++++
 tools/testing/selftests/bpf/progs/iters.c     | 139 +++++++
 .../selftests/bpf/progs/verifier_gotol.c      |   6 +-
 .../bpf/progs/verifier_iterating_callbacks.c  |   6 +-
 9 files changed, 886 insertions(+), 38 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/compute_live_registers.c
 create mode 100644 tools/testing/selftests/bpf/progs/compute_live_registers.c

-- 
2.47.1


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [RFC bpf-next v1 1/7] bpf: copy_verifier_state() should copy 'loop_entry' field
  2025-01-22 12:04 [RFC bpf-next v1 0/7] bpf: improvements for iterator-based loops convergence Eduard Zingerman
@ 2025-01-22 12:04 ` Eduard Zingerman
  2025-01-22 12:04 ` [RFC bpf-next v1 2/7] selftests/bpf: test correct loop_entry update in copy_verifier_state Eduard Zingerman
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 9+ messages in thread
From: Eduard Zingerman @ 2025-01-22 12:04 UTC (permalink / raw)
  To: bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song,
	Eduard Zingerman

The bpf_verifier_state.loop_entry state should be copied by
copy_verifier_state(). Otherwise, .loop_entry values from unrelated
states would poison env->cur_state.

Additionally, env->stack should not contain any states with
.loop_entry != NULL. The states in env->stack are yet to be verified,
while .loop_entry is set for states that reached an equivalent state.
This means that env->cur_state->loop_entry should always be NULL after
pop_stack().

See the selftest in the next commit for an example of the program that
is not safe yet is accepted by verifier w/o this fix.

This change has some verification performance impact for selftests:

Program                       Insns   (DIFF)  States (DIFF)
----------------------------  --------------  -------------
arena_htab_llvm               -291 (-40.59%)  -20 (-35.09%)
arena_htab_asm                -152 (-25.46%)  -10 (-21.28%)
arena_list_del                  -30 (-9.71%)   -9 (-39.13%)
checkpoint_states_deletion       -5 (-0.03%)    -1 (-0.12%)
iter_nested_deeply_iters        -26 (-4.38%)    -4 (-5.97%)
iter_subprog_check_stacksafe    -14 (-9.03%)    -1 (-6.67%)
iter_subprog_iters              -91 (-8.32%)    -5 (-5.68%)
loop_state_deps2              +246 (+51.36%)  +17 (+36.96%)
open_coded_iter                  -4 (-6.35%)   -1 (-14.29%)
on_event                       -320 (-2.59%)  -80 (-18.14%)
big_alloc2                       +7 (+0.24%)    +1 (+0.65%)
max_words                        -8 (-8.70%)   -1 (-12.50%)
cond_break2                      -6 (-5.31%)    +0 (+0.00%)

And significant negative impact for sched_ext:

Program                 Insns         (DIFF)  States      (DIFF)
----------------------  --------------------  ------------------
lavd_cpu_offline                 +4 (+0.09%)         -1 (-0.32%)
lavd_cpu_online                  +4 (+0.09%)         -1 (-0.32%)
lavd_enqueue                     -4 (-0.08%)         +0 (+0.00%)
lavd_init                   +7684 (+109.40%)     +649 (+133.26%)
layered_dispatch               -937 (-8.16%)       -86 (-10.14%)
layered_dump            +992580 (+13375.29%)  +30497 (+4478.27%)
layered_enqueue              -2733 (-20.91%)      -260 (-22.07%)
layered_runnable              -435 (-13.52%)       -38 (-12.88%)
refresh_layer_cpumasks   +658273 (+3992.68%)  +63600 (+3593.22%)
rustland_init                   -22 (-4.42%)         -4 (-9.76%)
rustland_init                   -22 (-4.42%)         -4 (-9.76%)
rusty_init                   +1917 (+12.72%)      +175 (+22.76%)
rusty_init_task                +135 (+0.32%)        +10 (+0.45%)
rusty_select_cpu          +75128 (+3580.93%)   +5807 (+3208.29%)
rusty_set_cpumask           -15799 (-78.00%)     -1349 (-81.02%)
central_dispatch            +2051 (+322.48%)     +164 (+260.32%)
nest_init                     +179 (+28.14%)       +13 (+21.67%)
qmap_dispatch                +1187 (+49.60%)       +57 (+29.08%)
qmap_dump                      +85 (+36.48%)        +8 (+36.36%)
qmap_init                     +1069 (+6.53%)       +66 (+10.95%)

Note 'layered_dump' program, which now hits 1M instructions limit.
This impact would be mitigated in the next patch.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 kernel/bpf/verifier.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 74525392714e..c7ceb59d3a19 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -1659,6 +1659,7 @@ static int copy_verifier_state(struct bpf_verifier_state *dst_state,
 	dst_state->callback_unroll_depth = src->callback_unroll_depth;
 	dst_state->used_as_loop_entry = src->used_as_loop_entry;
 	dst_state->may_goto_depth = src->may_goto_depth;
+	dst_state->loop_entry = src->loop_entry;
 	for (i = 0; i <= src->curframe; i++) {
 		dst = dst_state->frame[i];
 		if (!dst) {
@@ -19230,6 +19231,8 @@ static int do_check(struct bpf_verifier_env *env)
 						return err;
 					break;
 				} else {
+					if (WARN_ON_ONCE(env->cur_state->loop_entry))
+						env->cur_state->loop_entry = NULL;
 					do_print_state = true;
 					continue;
 				}
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [RFC bpf-next v1 2/7] selftests/bpf: test correct loop_entry update in copy_verifier_state
  2025-01-22 12:04 [RFC bpf-next v1 0/7] bpf: improvements for iterator-based loops convergence Eduard Zingerman
  2025-01-22 12:04 ` [RFC bpf-next v1 1/7] bpf: copy_verifier_state() should copy 'loop_entry' field Eduard Zingerman
@ 2025-01-22 12:04 ` Eduard Zingerman
  2025-01-22 12:04 ` [RFC bpf-next v1 3/7] bpf: don't do clean_live_states when state->loop_entry->branches > 0 Eduard Zingerman
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 9+ messages in thread
From: Eduard Zingerman @ 2025-01-22 12:04 UTC (permalink / raw)
  To: bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song,
	Eduard Zingerman

A somewhat cumbersome test case sensitive to correct copying of
bpf_verifier_state->loop_entry fields in
verifier.c:copy_verifier_state().
W/o the fix from a previous commit the program is accepted as safe.

     1:  /* poison block */
     2:  if (random() != 24) {       // assume false branch is placed first
     3:    i = iter_new();
     4:    while (iter_next(i));
     5:    iter_destroy(i);
     6:    return;
     7:  }
     8:
     9:  /* dfs_depth block */
    10:  for (i = 10; i > 0; i--);
    11:
    12:  /* main block */
    13:  i = iter_new();             // fp[-16]
    14:  b = -24;                    // r8
    15:  for (;;) {
    16:    if (iter_next(i))
    17:      break;
    18:    if (random() == 77) {     // assume false branch is placed first
    19:      *(u64 *)(r10 + b) = 7;  // this is not safe when b == -25
    20:      iter_destroy(i);
    21:      return;
    22:    }
    23:    if (random() == 42) {     // assume false branch is placed first
    24:      b = -25;
    25:    }
    26:  }
    27:  iter_destroy(i);

The goal of this example is to:
(a) poison env->cur_state->loop_entry with a state S,
    such that S->branches == 0;
(b) set state S as a loop_entry for all checkpoints in
    /* main block */, thus forcing NOT_EXACT states comparisons;
(c) exploit incorrect loop_entry set for checkpoint at line 18
    by first creating a checkpoint with b == -24 and then
    pruning the state with b == -25 using that checkpoint.

The /* poison block */ is responsible for goal (a).
It forces verifier to first validate some unrelated iterator based
loop, which leads to an update_loop_entry() call in is_state_visited(),
which places checkpoint created at line 4 as env->cur_state->loop_entry.
Starting from line 8, the branch count for that checkpoint is 0.

The /* dfs_depth block */ is responsible for goal (b).
It abuses the fact that update_loop_entry(cur, hdr) only updates
cur->loop_entry when hdr->dfs_depth <= cur->dfs_depth.
After line 12 every state has dfs_depth bigger then dfs_depth of
poisoned env->cur_state->loop_entry. Thus the above condition is never
true for lines 12-27.

The /* main block */ is responsible for goal (c).
Verification proceeds as follows:
- checkpoint {b=-24,i=active} created at line 16;
- jump 18->23 is verified first, jump to 19 pushed to stack;
- jump 23->26 is verified first, jump to 24 pushed to stack;
- checkpoint {b=-24,i=active} created at line 15;
- current state is pruned by checkpoint created at line 16,
  this sets branches count for checkpoint at line 15 to 0;
- jump to 24 is popped from stack;
- line 16 is reached in state {b=-25,i=active};
- this is pruned by a previous checkpoint {b=-24,i=active}:
  - checkpoint's loop_entry is poisoned and has branch count of 0,
    hence states are compared using NOT_EXACT rules;
  - b is not marked precise yet.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 tools/testing/selftests/bpf/progs/iters.c | 116 ++++++++++++++++++++++
 1 file changed, 116 insertions(+)

diff --git a/tools/testing/selftests/bpf/progs/iters.c b/tools/testing/selftests/bpf/progs/iters.c
index 190822b2f08b..007831dc8c46 100644
--- a/tools/testing/selftests/bpf/progs/iters.c
+++ b/tools/testing/selftests/bpf/progs/iters.c
@@ -1174,6 +1174,122 @@ __naked int loop_state_deps2(void)
 	);
 }
 
+SEC("?raw_tp")
+__failure
+__msg("math between fp pointer and register with unbounded")
+__flag(BPF_F_TEST_STATE_FREQ)
+__naked int loop_state_deps3(void)
+{
+	/* This is equivalent to a C program below.
+	 *
+	 *   if (random() != 24) {       // assume false branch is placed first
+	 *     i = iter_new();           // fp[-8]
+	 *     while (iter_next(i));
+	 *     iter_destroy(i);
+	 *     return;
+	 *   }
+	 *
+	 *   for (i = 10; i > 0; i--);   // increase dfs_depth for child states
+	 *
+	 *   i = iter_new();             // fp[-8]
+	 *   b = -24;                    // r8
+	 *   for (;;) {                  // checkpoint (L)
+	 *     if (iter_next(i))         // checkpoint (N)
+	 *       break;
+	 *     if (random() == 77) {     // assume false branch is placed first
+	 *       *(u64 *)(r10 + b) = 7;  // this is not safe when b == -25
+	 *       iter_destroy(i);
+	 *       return;
+	 *     }
+	 *     if (random() == 42) {     // assume false branch is placed first
+	 *       b = -25;
+	 *     }
+	 *   }
+	 *   iter_destroy(i);
+	 *
+	 * In case of a buggy verifier first loop might poison
+	 * env->cur_state->loop_entry with a state having 0 branches
+	 * and small dfs_depth. This would trigger NOT_EXACT states
+	 * comparison for some states within second loop.
+	 * Specifically, checkpoint (L) might be problematic if:
+	 * - branch with '*(u64 *)(r10 + b) = 7' is not explored yet;
+	 * - checkpoint (L) is first reached in state {b=-24};
+	 * - traversal is pruned at checkpoint (N) setting checkpoint's (L)
+	 *   branch count to 0, thus making it eligible for use in pruning;
+	 * - checkpoint (L) is next reached in state {b=-25},
+	 *   this would cause NOT_EXACT comparison with a state {b=-24}
+	 *   while 'b' is not marked precise yet.
+	 */
+	asm volatile (
+		"call %[bpf_get_prandom_u32];"
+		"if r0 == 24 goto 2f;"
+		"r1 = r10;"
+		"r1 += -8;"
+		"r2 = 0;"
+		"r3 = 5;"
+		"call %[bpf_iter_num_new];"
+	"1:"
+		"r1 = r10;"
+		"r1 += -8;"
+		"call %[bpf_iter_num_next];"
+		"if r0 != 0 goto 1b;"
+		"r1 = r10;"
+		"r1 += -8;"
+		"call %[bpf_iter_num_destroy];"
+		"r0 = 0;"
+		"exit;"
+	"2:"
+		/* loop to increase dfs_depth */
+		"r0 = 10;"
+	"3:"
+		"r0 -= 1;"
+		"if r0 != 0 goto 3b;"
+		/* end of loop */
+		"r1 = r10;"
+		"r1 += -8;"
+		"r2 = 0;"
+		"r3 = 10;"
+		"call %[bpf_iter_num_new];"
+		"r8 = -24;"
+	"main_loop_%=:"
+		"r1 = r10;"
+		"r1 += -8;"
+		"call %[bpf_iter_num_next];"
+		"if r0 == 0 goto main_loop_end_%=;"
+		/* first if */
+		"call %[bpf_get_prandom_u32];"
+		"if r0 == 77 goto unsafe_write_%=;"
+		/* second if */
+		"call %[bpf_get_prandom_u32];"
+		"if r0 == 42 goto poison_r8_%=;"
+		/* iterate */
+		"goto main_loop_%=;"
+	"main_loop_end_%=:"
+		"r1 = r10;"
+		"r1 += -8;"
+		"call %[bpf_iter_num_destroy];"
+		"r0 = 0;"
+		"exit;"
+
+	"unsafe_write_%=:"
+		"r0 = r10;"
+		"r0 += r8;"
+		"r1 = 7;"
+		"*(u64 *)(r0 + 0) = r1;"
+		"goto main_loop_end_%=;"
+
+	"poison_r8_%=:"
+		"r8 = -25;"
+		"goto main_loop_%=;"
+		:
+		: __imm(bpf_get_prandom_u32),
+		  __imm(bpf_iter_num_new),
+		  __imm(bpf_iter_num_next),
+		  __imm(bpf_iter_num_destroy)
+		: __clobber_all
+	);
+}
+
 SEC("?raw_tp")
 __success
 __naked int triple_continue(void)
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [RFC bpf-next v1 3/7] bpf: don't do clean_live_states when state->loop_entry->branches > 0
  2025-01-22 12:04 [RFC bpf-next v1 0/7] bpf: improvements for iterator-based loops convergence Eduard Zingerman
  2025-01-22 12:04 ` [RFC bpf-next v1 1/7] bpf: copy_verifier_state() should copy 'loop_entry' field Eduard Zingerman
  2025-01-22 12:04 ` [RFC bpf-next v1 2/7] selftests/bpf: test correct loop_entry update in copy_verifier_state Eduard Zingerman
@ 2025-01-22 12:04 ` Eduard Zingerman
  2025-01-22 12:04 ` [RFC bpf-next v1 4/7] selftests/bpf: check states pruning for deeply nested iterator Eduard Zingerman
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 9+ messages in thread
From: Eduard Zingerman @ 2025-01-22 12:04 UTC (permalink / raw)
  To: bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song,
	Eduard Zingerman

verifier.c:is_state_visited() uses RANGE_WITHIN states comparison rules
for cached states that have loop_entry with non-zero branches count
(meaning that loop_entry's verification is not yet done).

The RANGE_WITHIN rules in regsafe()/stacksafe() require register and
stack objects types to be identical in current and old states.

verifier.c:clean_live_states() replaces registers and stack spills
with NOT_INIT/STACK_INVALID marks, if these registers/stack spills are
not read in any child state. This means that clean_live_states() works
against loop convergence logic under some conditions. See selftest in
the next patch for a specific example.

Mitigate this by prohibiting clean_verifier_state() when
state->loop_entry->branches > 0.

This undoes negative verification performance impact of the
copy_verifier_state() fix from the previous patch.
Below is comparison between master and current patch.

selftests:

File                  Program                       Insns      (DIFF)  States    (DIFF)
--------------------  ----------------------------  -----------------  ----------------
arena_htab.bpf.o      arena_htab_llvm                  -294 (-41.00%)     -20 (-35.09%)
arena_htab_asm.bpf.o  arena_htab_asm                   -152 (-25.46%)     -10 (-21.28%)
arena_list.bpf.o      arena_list_add                   +329 (+22.04%)      +7 (+23.33%)
arena_list.bpf.o      arena_list_del                    -51 (-16.50%)      -8 (-34.78%)
iters.bpf.o           checkpoint_states_deletion      -8297 (-45.78%)    -451 (-55.13%)
iters.bpf.o           clean_live_states             -998653 (-99.87%)  -90126 (-99.85%)
iters.bpf.o           iter_nested_deeply_iters         -226 (-38.11%)     -24 (-35.82%)
iters.bpf.o           iter_subprog_check_stacksafe      -20 (-12.90%)       -1 (-6.67%)
iters.bpf.o           iter_subprog_iters               -286 (-26.14%)     -20 (-22.73%)
iters.bpf.o           loop_state_deps2                 -123 (-25.68%)     -11 (-23.91%)
iters.bpf.o           triple_continue                    -4 (-11.43%)       +0 (+0.00%)
mptcp_subflow.bpf.o   _getsockopt_subflow               -55 (-10.98%)       -2 (-8.00%)
pyperf600_iter.bpf.o  on_event                        -6025 (-48.83%)    -160 (-36.28%)

(arena_list_add requires further investigation)

sched_ext:

Program                 Insns      (DIFF)  States    (DIFF)
----------------------  -----------------  ----------------
layered_dispatch          -3570 (-31.08%)    -227 (-26.77%)
layered_dump              -2746 (-37.00%)    -411 (-60.35%)
layered_enqueue           -3781 (-28.93%)    -341 (-28.95%)
layered_init            -994488 (-99.45%)  -84153 (-99.39%)
layered_runnable          -1467 (-45.59%)    -160 (-54.24%)
refresh_layer_cpumasks   -15202 (-92.21%)   -1650 (-93.22%)
rusty_select_cpu           -647 (-30.84%)     -53 (-29.28%)
rusty_set_cpumask        -15934 (-78.67%)   -1359 (-81.62%)
central_init               -330 (-36.18%)     -10 (-20.83%)
pair_dispatch           -998092 (-99.81%)  -58249 (-99.76%)

'layered_init' and 'pair_dispatch' hit 1M on master, but are verified
ok with this patch.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 kernel/bpf/verifier.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index c7ceb59d3a19..1c2199a3f38f 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -17801,12 +17801,16 @@ static void clean_verifier_state(struct bpf_verifier_env *env,
 static void clean_live_states(struct bpf_verifier_env *env, int insn,
 			      struct bpf_verifier_state *cur)
 {
+	struct bpf_verifier_state *loop_entry;
 	struct bpf_verifier_state_list *sl;
 
 	sl = *explored_state(env, insn);
 	while (sl) {
 		if (sl->state.branches)
 			goto next;
+		loop_entry = get_loop_entry(&sl->state);
+		if (loop_entry && loop_entry->branches)
+			goto next;
 		if (sl->state.insn_idx != insn ||
 		    !same_callsites(&sl->state, cur))
 			goto next;
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [RFC bpf-next v1 4/7] selftests/bpf: check states pruning for deeply nested iterator
  2025-01-22 12:04 [RFC bpf-next v1 0/7] bpf: improvements for iterator-based loops convergence Eduard Zingerman
                   ` (2 preceding siblings ...)
  2025-01-22 12:04 ` [RFC bpf-next v1 3/7] bpf: don't do clean_live_states when state->loop_entry->branches > 0 Eduard Zingerman
@ 2025-01-22 12:04 ` Eduard Zingerman
  2025-01-22 12:04 ` [RFC bpf-next v1 5/7] bpf: DFA-based liveness analysis for program registers Eduard Zingerman
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 9+ messages in thread
From: Eduard Zingerman @ 2025-01-22 12:04 UTC (permalink / raw)
  To: bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song,
	Eduard Zingerman

A test case with ridiculously deep bpf_for() nesting and
a conditional update of a stack location.

Consider the innermost loop structure:

	1: bpf_for(o, 0, 10)
	2:	if (unlikely(bpf_get_prandom_u32()))
	3:		buf[0] = 42;
	4: <exit>

Assuming that verifier.c:clean_live_states() operates w/o change from
the previous patch (e.g. as on current master) verification would
proceed as follows:
- at (1) state {buf[0]=?,o=drained}:
  - checkpoint
  - push visit to (2) for later
- at (4) {buf[0]=?,o=drained}
- pop (2) {buf[0]=?,o=active}, push visit to (3) for later
- at (1) {buf[0]=?,o=active}
  - checkpoint
  - push visit to (2) for later
- at (4) {buf[0]=?,o=drained}
- pop (2) {buf[0]=?,o=active}, push visit to (3) for later
- at (1) {buf[0]=?,o=active}:
  - checkpoint reached, checkpoint's branch count becomes 0
  - checkpoint is processed by clean_live_states() and
    becomes {o=active}
- pop (3) {buf[0]=42,o=active}
- at (1), {buf[0]=42,o=active}
  - checkpoint
  - push visit to (2) for later
- at (4) {buf[0]=42,o=drained}
- pop (2) {buf[0]=42,o=active}, push visit to (3) for later
- at (1) {buf[0]=42,o=active}, checkpoint reached
- pop (3) {buf[0]=42,o=active}
- at (1) {buf[0]=42,o=active}:
  - checkpoint reached, checkpoint's branch count becomes 0
  - checkpoint is processed by clean_live_states() and
    becomes {o=active}
- ...

Note how clean_live_states() converted the checkpoint
{buf[0]=42,o=active} to {o=active} and it can no longer be matched
against {buf[0]=<any>,o=active}, because iterator based states
are compared using stacksafe(... RANGE_WITHIN), that requires
stack slots to have same types. At the same time there are
still states {buf[0]=42,o=active} pushed to DFS stack.

This behaviour becomes exacerbated with multiple nesting levels,
here are veristat results:
- nesting level 1: 69 insns
- nesting level 2: 258 insns
- nesting level 3: 900 insns
- nesting level 4: 4754 insns
- nesting level 5: 35944 insns
- nesting level 6: 312558 insns
- nesting level 7: 1M limit

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 tools/testing/selftests/bpf/progs/iters.c | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/tools/testing/selftests/bpf/progs/iters.c b/tools/testing/selftests/bpf/progs/iters.c
index 007831dc8c46..427b72954b87 100644
--- a/tools/testing/selftests/bpf/progs/iters.c
+++ b/tools/testing/selftests/bpf/progs/iters.c
@@ -7,6 +7,8 @@
 #include "bpf_misc.h"
 #include "bpf_compiler.h"
 
+#define unlikely(x)	__builtin_expect(!!(x), 0)
+
 static volatile int zero = 0;
 
 int my_pid;
@@ -1628,4 +1630,25 @@ int iter_destroy_bad_arg(const void *ctx)
 	return 0;
 }
 
+SEC("raw_tp")
+__success
+int clean_live_states(const void *ctx)
+{
+	char buf[1];
+	int i, j, k, l, m, n, o;
+
+	bpf_for(i, 0, 10)
+	bpf_for(j, 0, 10)
+	bpf_for(k, 0, 10)
+	bpf_for(l, 0, 10)
+	bpf_for(m, 0, 10)
+	bpf_for(n, 0, 10)
+	bpf_for(o, 0, 10) {
+		if (unlikely(bpf_get_prandom_u32()))
+			buf[0] = 42;
+		bpf_printk("%s", buf);
+	}
+	return 0;
+}
+
 char _license[] SEC("license") = "GPL";
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [RFC bpf-next v1 5/7] bpf: DFA-based liveness analysis for program registers
  2025-01-22 12:04 [RFC bpf-next v1 0/7] bpf: improvements for iterator-based loops convergence Eduard Zingerman
                   ` (3 preceding siblings ...)
  2025-01-22 12:04 ` [RFC bpf-next v1 4/7] selftests/bpf: check states pruning for deeply nested iterator Eduard Zingerman
@ 2025-01-22 12:04 ` Eduard Zingerman
  2025-01-22 19:45   ` Eduard Zingerman
  2025-01-22 12:04 ` [RFC bpf-next v1 6/7] bpf: use register liveness information for func_states_equal Eduard Zingerman
  2025-01-22 12:04 ` [RFC bpf-next v1 7/7] selftests/bpf: test cases for compute_live_registers() Eduard Zingerman
  6 siblings, 1 reply; 9+ messages in thread
From: Eduard Zingerman @ 2025-01-22 12:04 UTC (permalink / raw)
  To: bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song,
	Eduard Zingerman

Compute may-live registers before each instruction in the program.
The register is live before the instruction I if it is read by I or
some instruction S following I during program execution and is not
overwritten between I and S.

This information would be used in the next patch as a hint in
func_states_equal().

Use a simple algorithm described in [1] to compute this information:
- define the following:
  - I.use : a set of all registers read by instruction I;
  - I.def : a set of all registers written by instruction I;
  - I.in  : a set of all registers that may be alive before I execution;
  - I.out : a set of all registers that may be alive after I execution;
  - I.successors : a set of instructions S that might immediately
                   follow I for some program execution;
- associate separate empty sets 'I.in' and 'I.out' with each instruction;
- visit each instruction in a postorder and update corresponding
  'I.in' and 'I.out' sets as follows:

      I.out = U [S.in for S in I.successors]
      I.in  = (I.out / I.def) U I.use

  (where U stands for set union, / stands for set difference)
- repeat the computation while I.{in,out} changes for any instruction.

On implementation side keep things as simple, as possible:
- check_cfg() already marks instructions EXPLORED in post-order,
  modify it to save the index of each EXPLORED instruction in a vector;
- represent I.{in,out,use,def} as bitmasks;
- don't split the program into basic blocks and don't maintain the
  work queue, instead:
  - do fixed-point computation by visiting each instruction;
  - maintain a simple 'changed' flag if I.{in,out} for any instruction
    change;
  Measurements show that even such simplistic implementation does not
  add measurable verification time overhead (for selftests, at-least).

Note on check_cfg() ex_insn_beg/ex_done change:
To avoid out of bounds access to env->cfg.insn_postorder array,
it should be guaranteed that instruction transitions to EXPLORED state
only once. Previously this was not the fact for incorrect programs
with direct calls to exception callbacks.

The 'align' selftest needs adjustment to skip computed insn/live
registers printout. Otherwise it matches lines from the live registers
printout.

[1] https://en.wikipedia.org/wiki/Live-variable_analysis

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 include/linux/bpf_verifier.h                  |   6 +
 kernel/bpf/verifier.c                         | 351 ++++++++++++++++--
 .../testing/selftests/bpf/prog_tests/align.c  |  11 +-
 3 files changed, 344 insertions(+), 24 deletions(-)

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 32c23f2a3086..2b243a32d846 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -589,6 +589,8 @@ struct bpf_insn_aux_data {
 	 * accepts callback function as a parameter.
 	 */
 	bool calls_callback;
+	/* registers alive before this instruction. */
+	u16 live_regs_before;
 };
 
 #define MAX_USED_MAPS 64 /* max number of maps accessed by one eBPF program */
@@ -742,7 +744,11 @@ struct bpf_verifier_env {
 	struct {
 		int *insn_state;
 		int *insn_stack;
+		/* vector of instruction indexes sorted in post-order */
+		int *insn_postorder;
 		int cur_stack;
+		/* current position in the insn_postorder vector */
+		int cur_postorder;
 	} cfg;
 	struct backtrack_state bt;
 	struct bpf_insn_hist_entry *insn_hist;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 1c2199a3f38f..d36c5a3309e9 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -3268,6 +3268,15 @@ static int add_subprog_and_kfunc(struct bpf_verifier_env *env)
 	return 0;
 }
 
+static int jmp_offset(struct bpf_insn *insn)
+{
+	u8 code = insn->code;
+
+	if (code == (BPF_JMP32 | BPF_JA))
+		return insn->imm;
+	return insn->off;
+}
+
 static int check_subprogs(struct bpf_verifier_env *env)
 {
 	int i, subprog_start, subprog_end, off, cur_subprog = 0;
@@ -3294,10 +3303,7 @@ static int check_subprogs(struct bpf_verifier_env *env)
 			goto next;
 		if (BPF_OP(code) == BPF_EXIT || BPF_OP(code) == BPF_CALL)
 			goto next;
-		if (code == (BPF_JMP32 | BPF_JA))
-			off = i + insn[i].imm + 1;
-		else
-			off = i + insn[i].off + 1;
+		off = i + jmp_offset(&insn[i]) + 1;
 		if (off < subprog_start || off >= subprog_end) {
 			verbose(env, "jump out of range from insn %d to %d\n", i, off);
 			return -EINVAL;
@@ -3827,6 +3833,17 @@ static const char *disasm_kfunc_name(void *data, const struct bpf_insn *insn)
 	return btf_name_by_offset(desc_btf, func->name_off);
 }
 
+static void verbose_insn(struct bpf_verifier_env *env, struct bpf_insn *insn)
+{
+	const struct bpf_insn_cbs cbs = {
+		.cb_call	= disasm_kfunc_name,
+		.cb_print	= verbose,
+		.private_data	= env,
+	};
+
+	print_bpf_insn(&cbs, insn, env->allow_ptr_leaks);
+}
+
 static inline void bt_init(struct backtrack_state *bt, u32 frame)
 {
 	bt->frame = frame;
@@ -4027,11 +4044,6 @@ static bool calls_callback(struct bpf_verifier_env *env, int insn_idx);
 static int backtrack_insn(struct bpf_verifier_env *env, int idx, int subseq_idx,
 			  struct bpf_insn_hist_entry *hist, struct backtrack_state *bt)
 {
-	const struct bpf_insn_cbs cbs = {
-		.cb_call	= disasm_kfunc_name,
-		.cb_print	= verbose,
-		.private_data	= env,
-	};
 	struct bpf_insn *insn = env->prog->insnsi + idx;
 	u8 class = BPF_CLASS(insn->code);
 	u8 opcode = BPF_OP(insn->code);
@@ -4049,7 +4061,7 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx, int subseq_idx,
 		fmt_stack_mask(env->tmp_str_buf, TMP_STR_BUF_LEN, bt_stack_mask(bt));
 		verbose(env, "stack=%s before ", env->tmp_str_buf);
 		verbose(env, "%d: ", idx);
-		print_bpf_insn(&cbs, insn, env->allow_ptr_leaks);
+		verbose_insn(env, insn);
 	}
 
 	/* If there is a history record that some registers gained range at this insn,
@@ -10909,6 +10921,9 @@ static int get_helper_proto(struct bpf_verifier_env *env, int func_id,
 	return *ptr ? 0 : -EINVAL;
 }
 
+/* Bitmask with 1s for all caller saved registers */
+#define ALL_CALLER_SAVED_REGS ((1u << CALLER_SAVED_REGS) - 1)
+
 static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
 			     int *insn_idx_p)
 {
@@ -17111,9 +17126,8 @@ static int visit_insn(int t, struct bpf_verifier_env *env)
 static int check_cfg(struct bpf_verifier_env *env)
 {
 	int insn_cnt = env->prog->len;
-	int *insn_stack, *insn_state;
+	int *insn_stack, *insn_state, *insn_postorder;
 	int ex_insn_beg, i, ret = 0;
-	bool ex_done = false;
 
 	insn_state = env->cfg.insn_state = kvcalloc(insn_cnt, sizeof(int), GFP_KERNEL);
 	if (!insn_state)
@@ -17125,6 +17139,17 @@ static int check_cfg(struct bpf_verifier_env *env)
 		return -ENOMEM;
 	}
 
+	insn_postorder = env->cfg.insn_postorder = kvcalloc(insn_cnt, sizeof(int), GFP_KERNEL);
+	if (!insn_postorder) {
+		kvfree(insn_state);
+		kvfree(insn_stack);
+		return -ENOMEM;
+	}
+
+	ex_insn_beg = env->exception_callback_subprog
+		      ? env->subprog_info[env->exception_callback_subprog].start
+		      : 0;
+
 	insn_state[0] = DISCOVERED; /* mark 1st insn as discovered */
 	insn_stack[0] = 0; /* 0 is the first instruction */
 	env->cfg.cur_stack = 1;
@@ -17138,6 +17163,7 @@ static int check_cfg(struct bpf_verifier_env *env)
 		case DONE_EXPLORING:
 			insn_state[t] = EXPLORED;
 			env->cfg.cur_stack--;
+			insn_postorder[env->cfg.cur_postorder++] = t;
 			break;
 		case KEEP_EXPLORING:
 			break;
@@ -17156,13 +17182,10 @@ static int check_cfg(struct bpf_verifier_env *env)
 		goto err_free;
 	}
 
-	if (env->exception_callback_subprog && !ex_done) {
-		ex_insn_beg = env->subprog_info[env->exception_callback_subprog].start;
-
+	if (ex_insn_beg && insn_state[ex_insn_beg] != EXPLORED) {
 		insn_state[ex_insn_beg] = DISCOVERED;
 		insn_stack[0] = ex_insn_beg;
 		env->cfg.cur_stack = 1;
-		ex_done = true;
 		goto walk_cfg;
 	}
 
@@ -19001,19 +19024,13 @@ static int do_check(struct bpf_verifier_env *env)
 		}
 
 		if (env->log.level & BPF_LOG_LEVEL) {
-			const struct bpf_insn_cbs cbs = {
-				.cb_call	= disasm_kfunc_name,
-				.cb_print	= verbose,
-				.private_data	= env,
-			};
-
 			if (verifier_state_scratched(env))
 				print_insn_state(env, state, state->curframe);
 
 			verbose_linfo(env, env->insn_idx, "; ");
 			env->prev_log_pos = env->log.end_pos;
 			verbose(env, "%d: ", env->insn_idx);
-			print_bpf_insn(&cbs, insn, env->allow_ptr_leaks);
+			verbose_insn(env, insn);
 			env->prev_insn_print_pos = env->log.end_pos - env->prev_log_pos;
 			env->prev_log_pos = env->log.end_pos;
 		}
@@ -23024,6 +23041,289 @@ static int process_fd_array(struct bpf_verifier_env *env, union bpf_attr *attr,
 	return 0;
 }
 
+static bool can_fallthrough(struct bpf_insn *insn)
+{
+	u8 class = BPF_CLASS(insn->code);
+	u8 opcode = BPF_OP(insn->code);
+
+	if (class != BPF_JMP && class != BPF_JMP32)
+		return true;
+
+	if (opcode == BPF_EXIT || opcode == BPF_JA)
+		return false;
+
+	return true;
+}
+
+static bool can_jump(struct bpf_insn *insn)
+{
+	u8 class = BPF_CLASS(insn->code);
+	u8 opcode = BPF_OP(insn->code);
+
+	if (class != BPF_JMP && class != BPF_JMP32)
+		return false;
+
+	switch (opcode) {
+	case BPF_JA:
+	case BPF_JEQ:
+	case BPF_JNE:
+	case BPF_JLT:
+	case BPF_JLE:
+	case BPF_JGT:
+	case BPF_JGE:
+	case BPF_JSGT:
+	case BPF_JSGE:
+	case BPF_JSLT:
+	case BPF_JSLE:
+	case BPF_JCOND:
+		return true;
+	}
+
+	return false;
+}
+
+static int insn_successors(struct bpf_prog *prog, u32 idx, u32 succ[2])
+{
+	struct bpf_insn *insn = &prog->insnsi[idx];
+	int i = 0, insn_sz;
+	u32 dst;
+
+	succ[0] = prog->len;
+	succ[1] = prog->len;
+
+	insn_sz = bpf_is_ldimm64(insn) ? 2 : 1;
+	if (can_fallthrough(insn) && idx + 1 < prog->len)
+		succ[i++] = idx + insn_sz;
+
+	if (can_jump(insn)) {
+		dst = idx + jmp_offset(insn) + 1;
+		if (i == 0 || succ[0] != dst)
+			succ[i++] = dst;
+	}
+
+	return i;
+}
+
+/* Each field is a register bitmask */
+struct insn_live_regs {
+	u16 use;	/* registers read by instruction */
+	u16 def;	/* registers written by instruction */
+	u16 in;		/* registers that may be alive before instruction */
+	u16 out;	/* registers that may be alive after instruction */
+};
+
+/* Compute *use and *def values for the call instruction */
+static void compute_call_live_regs(struct bpf_verifier_env *env,
+				   struct bpf_insn *insn,
+				   u16 *use, u16 *def)
+{
+	*def = ALL_CALLER_SAVED_REGS;
+	*use = *def & ~BIT(BPF_REG_0);
+}
+
+/* Compute info->{use,def} fields for the instruction */
+static void compute_insn_live_regs(struct bpf_verifier_env *env,
+				   struct bpf_insn *insn,
+				   struct insn_live_regs *info)
+{
+	u8 class = BPF_CLASS(insn->code);
+	u8 code = BPF_OP(insn->code);
+	u8 mode = BPF_MODE(insn->code);
+	u16 src = BIT(insn->src_reg);
+	u16 dst = BIT(insn->dst_reg);
+	u16 r0  = BIT(0);
+	u16 def = 0;
+	u16 use = 0xffff;
+
+	switch (class) {
+	case BPF_LD:
+		switch (mode) {
+		case BPF_IMM:
+			if (BPF_SIZE(insn->code) == BPF_DW) {
+				def = dst;
+				use = 0;
+			}
+			break;
+		case BPF_LD | BPF_ABS:
+		case BPF_LD | BPF_IND:
+			/* stick with defaults */
+			break;
+		}
+		break;
+	case BPF_LDX:
+		switch (mode) {
+		case BPF_MEM:
+		case BPF_MEMSX:
+			def = dst;
+			use = src;
+			break;
+		}
+		break;
+	case BPF_ST:
+		switch (mode) {
+		case BPF_MEM:
+			def = 0;
+			use = dst;
+			break;
+		}
+		break;
+	case BPF_STX:
+		switch (mode) {
+		case BPF_MEM:
+			def = 0;
+			use = dst | src;
+			break;
+		case BPF_ATOMIC:
+			use = dst | src;
+			if (insn->imm & BPF_FETCH) {
+				if (insn->imm == BPF_CMPXCHG)
+					def = r0;
+				else
+					def = src;
+			} else {
+				def = 0;
+			}
+			break;
+		}
+		break;
+	case BPF_ALU:
+	case BPF_ALU64:
+		switch (code) {
+		case BPF_END:
+			use = dst;
+			def = dst;
+			break;
+		case BPF_MOV:
+			def = dst;
+			if (BPF_SRC(insn->code) == BPF_K)
+				use = 0;
+			else
+				use = src;
+			break;
+		default:
+			def = dst;
+			if (BPF_SRC(insn->code) == BPF_K)
+				use = dst;
+			else
+				use = dst | src;
+		}
+		break;
+	case BPF_JMP:
+	case BPF_JMP32:
+		switch (code) {
+		case BPF_JA:
+			def = 0;
+			use = 0;
+			break;
+		case BPF_EXIT:
+			def = 0;
+			use = r0;
+			break;
+		case BPF_CALL:
+			compute_call_live_regs(env, insn, &use, &def);
+			break;
+		default:
+			def = 0;
+			if (BPF_SRC(insn->code) == BPF_K)
+				use = dst;
+			else
+				use = dst | src;
+		}
+		break;
+	}
+
+	info->def = def;
+	info->use = use;
+}
+
+/* Compute may-live registers after each instruction in the program.
+ * The register is live after the instruction I if it is read by some
+ * instruction S following I during program execution and is not
+ * overwritten between I and S.
+ *
+ * Store result in env->insn_aux_data[i].live_regs.
+ */
+static int compute_live_registers(struct bpf_verifier_env *env)
+{
+	struct bpf_insn_aux_data *insn_aux = env->insn_aux_data;
+	struct bpf_insn *insns = env->prog->insnsi;
+	struct insn_live_regs *state;
+	int insn_cnt = env->prog->len;
+	int err = 0, i, j;
+	bool changed;
+
+	/* Use simple algorithm desribed in:
+	 * https://en.wikipedia.org/wiki/Live-variable_analysis
+	 *
+	 * - visit each instruction in a postorder and update
+	 *   state[i].in, state[i].out as follows:
+	 *
+	 *       state[i].out = U [state[s].in for S in insn_successors(i)]
+	 *       state[i].in  = (state[i].out / state[i].def) U state[i].use
+	 *
+	 *   (where U stands for set union, / stands for set difference)
+	 * - repeat the computation while {in,out} fields changes for
+	 *   any instruction.
+	 */
+	state = kvcalloc(insn_cnt, sizeof(*state), GFP_KERNEL);
+	if (!state) {
+		err = -ENOMEM;
+		goto out;
+	}
+
+	for (i = 0; i < insn_cnt; ++i)
+		compute_insn_live_regs(env, &insns[i], &state[i]);
+
+	changed = true;
+	while (changed) {
+		changed = false;
+		for (i = 0; i < env->cfg.cur_postorder; ++i) {
+			int insn_idx = env->cfg.insn_postorder[i];
+			struct insn_live_regs *live = &state[insn_idx];
+			int succ_num;
+			u32 succ[2];
+			u16 new_out = 0;
+			u16 new_in = 0;
+
+			succ_num = insn_successors(env->prog, insn_idx, succ);
+			for (int s = 0; s < succ_num; ++s)
+				new_out |= state[succ[s]].in;
+			new_in = (new_out & ~live->def) | live->use;
+			if (new_out != live->out || new_in != live->in) {
+				live->in = new_in;
+				live->out = new_out;
+				changed = true;
+			}
+		}
+	}
+
+	for (i = 0; i < insn_cnt; ++i)
+		insn_aux[i].live_regs_before = state[i].in;
+
+	if (env->log.level & BPF_LOG_LEVEL2) {
+		verbose(env, "Live regs before insn:\n");
+		for (i = 0; i < insn_cnt; ++i) {
+			verbose(env, "%3d: ", i);
+			for (j = BPF_REG_0; j < BPF_REG_10; ++j)
+				if (insn_aux[i].live_regs_before & BIT(j))
+					verbose(env, "%d", j);
+				else
+					verbose(env, ".");
+			verbose(env, " ");
+			verbose_insn(env, &insns[i]);
+			if (bpf_is_ldimm64(&insns[i]))
+				i++;
+		}
+	}
+
+out:
+	kvfree(state);
+	kvfree(env->cfg.insn_postorder);
+	env->cfg.insn_postorder = NULL;
+	env->cfg.cur_postorder = 0;
+	return err;
+}
+
 int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u32 uattr_size)
 {
 	u64 start_time = ktime_get_ns();
@@ -23141,6 +23441,10 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3
 	if (ret)
 		goto skip_full_check;
 
+	ret = compute_live_registers(env);
+	if (ret < 0)
+		goto skip_full_check;
+
 	ret = mark_fastcall_patterns(env);
 	if (ret < 0)
 		goto skip_full_check;
@@ -23279,6 +23583,7 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3
 	vfree(env->insn_aux_data);
 	kvfree(env->insn_hist);
 err_free_env:
+	kvfree(env->cfg.insn_postorder);
 	kvfree(env);
 	return ret;
 }
diff --git a/tools/testing/selftests/bpf/prog_tests/align.c b/tools/testing/selftests/bpf/prog_tests/align.c
index 4ebd0da898f5..1d53a8561ee2 100644
--- a/tools/testing/selftests/bpf/prog_tests/align.c
+++ b/tools/testing/selftests/bpf/prog_tests/align.c
@@ -610,9 +610,11 @@ static int do_test_single(struct bpf_align_test *test)
 		.log_size = sizeof(bpf_vlog),
 		.log_level = 2,
 	);
+	const char *main_pass_start = "0: R1=ctx() R10=fp0";
 	const char *line_ptr;
 	int cur_line = -1;
 	int prog_len, i;
+	char *start;
 	int fd_prog;
 	int ret;
 
@@ -632,7 +634,13 @@ static int do_test_single(struct bpf_align_test *test)
 		ret = 0;
 		/* We make a local copy so that we can strtok() it */
 		strncpy(bpf_vlog_copy, bpf_vlog, sizeof(bpf_vlog_copy));
-		line_ptr = strtok(bpf_vlog_copy, "\n");
+		start = strstr(bpf_vlog_copy, main_pass_start);
+		if (!start) {
+			ret = 1;
+			printf("Can't find initial line '%s'\n", main_pass_start);
+			goto out;
+		}
+		line_ptr = strtok(start, "\n");
 		for (i = 0; i < MAX_MATCHES; i++) {
 			struct bpf_reg_match m = test->matches[i];
 			const char *p;
@@ -682,6 +690,7 @@ static int do_test_single(struct bpf_align_test *test)
 				break;
 			}
 		}
+out:
 		if (fd_prog >= 0)
 			close(fd_prog);
 	}
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [RFC bpf-next v1 6/7] bpf: use register liveness information for func_states_equal
  2025-01-22 12:04 [RFC bpf-next v1 0/7] bpf: improvements for iterator-based loops convergence Eduard Zingerman
                   ` (4 preceding siblings ...)
  2025-01-22 12:04 ` [RFC bpf-next v1 5/7] bpf: DFA-based liveness analysis for program registers Eduard Zingerman
@ 2025-01-22 12:04 ` Eduard Zingerman
  2025-01-22 12:04 ` [RFC bpf-next v1 7/7] selftests/bpf: test cases for compute_live_registers() Eduard Zingerman
  6 siblings, 0 replies; 9+ messages in thread
From: Eduard Zingerman @ 2025-01-22 12:04 UTC (permalink / raw)
  To: bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song,
	Eduard Zingerman

Liveness analysis DFA computes a set of registers live before each
instruction. Leverage this information to skip comparison of dead
registers in func_states_equal(). This helps with convergance of
iterator processing loops, as bpf_reg_state->live marks can't be used
when loops are processed.

This has certain performance impact for selftests, here is a veristat
listing for bigger ones (this patch compared to previous patch):

File                  Program                     Insns    (DIFF)  States  (DIFF)
--------------------  --------------------------  ---------------  --------------
iters.bpf.o           checkpoint_states_deletion  -8617 (-87.68%)  -327 (-89.10%)
iters.bpf.o           iter_nested_iters            -140 (-18.13%)   -10 (-13.89%)
pyperf600_iter.bpf.o  on_event                    -2608 (-41.31%)   -29 (-10.32%)

Impact on sched_ext:

Program                 Insns     (DIFF)  States   (DIFF)
----------------------  ----------------  ---------------
lavd_dispatch           -34018 (-22.00%)  -1885 (-21.06%)
layered_dispatch         -1808 (-22.83%)    -86 (-13.85%)
layered_dump              -943 (-20.17%)    -44 (-16.30%)
layered_init              -995 (-18.05%)    -80 (-15.41%)
refresh_layer_cpumasks    -395 (-30.74%)    -34 (-28.33%)
rustland_init              -63 (-13.24%)      -3 (-8.11%)
rustland_init              -63 (-13.24%)      -3 (-8.11%)
tp_cgroup_attach_task      -53 (-26.24%)     -4 (-22.22%)
central_init              -146 (-25.09%)      -2 (-5.26%)
pair_dispatch             -331 (-17.34%)    -15 (-10.56%)
qmap_dispatch             -375 (-17.15%)    -26 (-14.94%)
userland_dispatch          -34 (-21.79%)     -4 (-23.53%)

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 kernel/bpf/verifier.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index d36c5a3309e9..babc2e179c08 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -18223,15 +18223,17 @@ static bool refsafe(struct bpf_verifier_state *old, struct bpf_verifier_state *c
  * the current state will reach 'bpf_exit' instruction safely
  */
 static bool func_states_equal(struct bpf_verifier_env *env, struct bpf_func_state *old,
-			      struct bpf_func_state *cur, enum exact_level exact)
+			      struct bpf_func_state *cur, u32 insn_idx, enum exact_level exact)
 {
-	int i;
+	u16 live_regs = env->insn_aux_data[insn_idx].live_regs_before;
+	u16 i;
 
 	if (old->callback_depth > cur->callback_depth)
 		return false;
 
 	for (i = 0; i < MAX_BPF_REG; i++)
-		if (!regsafe(env, &old->regs[i], &cur->regs[i],
+		if (((1 << i) & live_regs) &&
+		    !regsafe(env, &old->regs[i], &cur->regs[i],
 			     &env->idmap_scratch, exact))
 			return false;
 
@@ -18252,6 +18254,7 @@ static bool states_equal(struct bpf_verifier_env *env,
 			 struct bpf_verifier_state *cur,
 			 enum exact_level exact)
 {
+	u32 insn_idx;
 	int i;
 
 	if (old->curframe != cur->curframe)
@@ -18275,9 +18278,12 @@ static bool states_equal(struct bpf_verifier_env *env,
 	 * and all frame states need to be equivalent
 	 */
 	for (i = 0; i <= old->curframe; i++) {
+		insn_idx = i == old->curframe
+			   ? env->insn_idx
+			   : old->frame[i + 1]->callsite;
 		if (old->frame[i]->callsite != cur->frame[i]->callsite)
 			return false;
-		if (!func_states_equal(env, old->frame[i], cur->frame[i], exact))
+		if (!func_states_equal(env, old->frame[i], cur->frame[i], insn_idx, exact))
 			return false;
 	}
 	return true;
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [RFC bpf-next v1 7/7] selftests/bpf: test cases for compute_live_registers()
  2025-01-22 12:04 [RFC bpf-next v1 0/7] bpf: improvements for iterator-based loops convergence Eduard Zingerman
                   ` (5 preceding siblings ...)
  2025-01-22 12:04 ` [RFC bpf-next v1 6/7] bpf: use register liveness information for func_states_equal Eduard Zingerman
@ 2025-01-22 12:04 ` Eduard Zingerman
  6 siblings, 0 replies; 9+ messages in thread
From: Eduard Zingerman @ 2025-01-22 12:04 UTC (permalink / raw)
  To: bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song,
	Eduard Zingerman

Cover instructions from each kind:
- assignment
- arithmetic
- store/load
- endian conversion
- atomics
- branches, conditional branches, may_goto, calls
- LD_ABS/LD_IND
- address_space_cast

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 .../bpf/prog_tests/compute_live_registers.c   |   9 +
 tools/testing/selftests/bpf/progs/bpf_misc.h  |  12 +
 .../bpf/progs/compute_live_registers.c        | 363 ++++++++++++++++++
 .../selftests/bpf/progs/verifier_gotol.c      |   6 +-
 .../bpf/progs/verifier_iterating_callbacks.c  |   6 +-
 5 files changed, 386 insertions(+), 10 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/compute_live_registers.c
 create mode 100644 tools/testing/selftests/bpf/progs/compute_live_registers.c

diff --git a/tools/testing/selftests/bpf/prog_tests/compute_live_registers.c b/tools/testing/selftests/bpf/prog_tests/compute_live_registers.c
new file mode 100644
index 000000000000..285f20241fe1
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/compute_live_registers.c
@@ -0,0 +1,9 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include "compute_live_registers.skel.h"
+#include "test_progs.h"
+
+void test_compute_live_registers(void)
+{
+	RUN_TESTS(compute_live_registers);
+}
diff --git a/tools/testing/selftests/bpf/progs/bpf_misc.h b/tools/testing/selftests/bpf/progs/bpf_misc.h
index f45f4352feeb..2b34911439d4 100644
--- a/tools/testing/selftests/bpf/progs/bpf_misc.h
+++ b/tools/testing/selftests/bpf/progs/bpf_misc.h
@@ -208,4 +208,16 @@
 #define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
 #endif
 
+#if (defined(__TARGET_ARCH_arm64) || defined(__TARGET_ARCH_x86) ||	\
+     (defined(__TARGET_ARCH_riscv) && __riscv_xlen == 64) ||		\
+     defined(__TARGET_ARCH_arm) || defined(__TARGET_ARCH_s390) ||	\
+     defined(__TARGET_ARCH_loongarch)) &&				\
+	__clang_major__ >= 18
+#define CAN_USE_GOTOL
+#endif
+
+#if _clang_major__ >= 18
+#define CAN_USE_BPF_ST
+#endif
+
 #endif
diff --git a/tools/testing/selftests/bpf/progs/compute_live_registers.c b/tools/testing/selftests/bpf/progs/compute_live_registers.c
new file mode 100644
index 000000000000..988fdd442f55
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/compute_live_registers.c
@@ -0,0 +1,363 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+#include "../../../include/linux/filter.h"
+#include "bpf_arena_common.h"
+#include "bpf_misc.h"
+
+struct {
+	__uint(type, BPF_MAP_TYPE_ARRAY);
+	__uint(max_entries, 1);
+	__type(key, __u32);
+	__type(value, __u64);
+} test_map SEC(".maps");
+
+struct {
+	__uint(type, BPF_MAP_TYPE_ARENA);
+	__uint(map_flags, BPF_F_MMAPABLE);
+	__uint(max_entries, 1);
+} arena SEC(".maps");
+
+SEC("socket")
+__log_level(2)
+__msg(" 0: .......... (b7) r0 = 42")
+__msg(" 1: 0......... (bf) r1 = r0")
+__msg(" 2: .1........ (bf) r2 = r1")
+__msg(" 3: ..2....... (bf) r3 = r2")
+__msg(" 4: ...3...... (bf) r4 = r3")
+__msg(" 5: ....4..... (bf) r5 = r4")
+__msg(" 6: .....5.... (bf) r6 = r5")
+__msg(" 7: ......6... (bf) r7 = r6")
+__msg(" 8: .......7.. (bf) r8 = r7")
+__msg(" 9: ........8. (bf) r9 = r8")
+__msg("10: .........9 (bf) r0 = r9")
+__msg("11: 0......... (95) exit")
+__naked void assign_chain(void)
+{
+	asm volatile (
+		"r0 = 42;"
+		"r1 = r0;"
+		"r2 = r1;"
+		"r3 = r2;"
+		"r4 = r3;"
+		"r5 = r4;"
+		"r6 = r5;"
+		"r7 = r6;"
+		"r8 = r7;"
+		"r9 = r8;"
+		"r0 = r9;"
+		"exit;"
+		::: __clobber_all);
+}
+
+SEC("socket")
+__log_level(2)
+__msg("0: .......... (b7) r1 = 7")
+__msg("1: .1........ (07) r1 += 7")
+__msg("2: .......... (b7) r2 = 7")
+__msg("3: ..2....... (b7) r3 = 42")
+__msg("4: ..23...... (0f) r2 += r3")
+__msg("5: .......... (b7) r0 = 0")
+__msg("6: 0......... (95) exit")
+__naked void arithmetics(void)
+{
+	asm volatile (
+		"r1 = 7;"
+		"r1 += 7;"
+		"r2 = 7;"
+		"r3 = 42;"
+		"r2 += r3;"
+		"r0 = 0;"
+		"exit;"
+		::: __clobber_all);
+}
+
+#ifdef CAN_USE_BPF_ST
+SEC("socket")
+__log_level(2)
+__msg("  1: .1........ (07) r1 += -8")
+__msg("  2: .1........ (7a) *(u64 *)(r1 +0) = 7")
+__msg("  3: .1........ (b7) r2 = 42")
+__msg("  4: .12....... (7b) *(u64 *)(r1 +0) = r2")
+__msg("  5: .12....... (7b) *(u64 *)(r1 +0) = r2")
+__msg("  6: .......... (b7) r0 = 0")
+__naked void store(void)
+{
+	asm volatile (
+		"r1 = r10;"
+		"r1 += -8;"
+		"*(u64 *)(r1 +0) = 7;"
+		"r2 = 42;"
+		"*(u64 *)(r1 +0) = r2;"
+		"*(u64 *)(r1 +0) = r2;"
+		"r0 = 0;"
+		"exit;"
+		::: __clobber_all);
+}
+#endif
+
+SEC("socket")
+__log_level(2)
+__msg("1: ....4..... (07) r4 += -8")
+__msg("2: ....4..... (79) r5 = *(u64 *)(r4 +0)")
+__msg("3: ....45.... (07) r4 += -8")
+__naked void load(void)
+{
+	asm volatile (
+		"r4 = r10;"
+		"r4 += -8;"
+		"r5 = *(u64 *)(r4 +0);"
+		"r4 += -8;"
+		"r0 = r5;"
+		"exit;"
+		::: __clobber_all);
+}
+
+SEC("socket")
+__log_level(2)
+__msg("0: .1........ (61) r2 = *(u32 *)(r1 +0)")
+__msg("1: ..2....... (d4) r2 = le64 r2")
+__msg("2: ..2....... (bf) r0 = r2")
+__naked void endian(void)
+{
+	asm volatile (
+		"r2 = *(u32 *)(r1 +0);"
+		"r2 = le64 r2;"
+		"r0 = r2;"
+		"exit;"
+		::: __clobber_all);
+}
+
+SEC("socket")
+__log_level(2)
+__msg(" 8: 0......... (b7) r1 = 1")
+__msg(" 9: 01........ (db) r1 = atomic64_fetch_add((u64 *)(r0 +0), r1)")
+__msg("10: 01........ (c3) lock *(u32 *)(r0 +0) += r1")
+__msg("11: 01........ (db) r1 = atomic64_xchg((u64 *)(r0 +0), r1)")
+__msg("12: 01........ (bf) r2 = r0")
+__msg("13: .12....... (bf) r0 = r1")
+__msg("14: .12....... (db) r0 = atomic64_cmpxchg((u64 *)(r2 +0), r0, r1)")
+__naked void atomic(void)
+{
+	asm volatile (
+		"r2 = r10;"
+		"r2 += -8;"
+		"r1 = 0;"
+		"*(u64 *)(r2 +0) = r1;"
+		"r1 = %[test_map] ll;"
+		"call %[bpf_map_lookup_elem];"
+		"if r0 == 0 goto 1f;"
+		"r1 = 1;"
+		"r1 = atomic_fetch_add((u64 *)(r0 +0), r1);"
+		".8byte %[add_nofetch];" /* same as "lock *(u32 *)(r0 +0) += r1;" */
+		"r1 = xchg_64(r0 + 0, r1);"
+		"r2 = r0;"
+		"r0 = r1;"
+		"r0 = cmpxchg_64(r2 + 0, r0, r1);"
+		"1: exit;"
+		:
+		: __imm(bpf_map_lookup_elem),
+		  __imm_addr(test_map),
+		  __imm_insn(add_nofetch, BPF_ATOMIC_OP(BPF_W, BPF_ADD, BPF_REG_0, BPF_REG_1, 0))
+		: __clobber_all);
+}
+
+SEC("socket")
+__log_level(2)
+__msg("4: .12345.7.. (85) call bpf_trace_printk#6")
+__msg("5: 0......7.. (0f) r0 += r7")
+__naked void regular_call(void)
+{
+	asm volatile (
+		"r7 = 1;"
+		"r1 = r10;"
+		"r1 += -8;"
+		"r2 = 1;"
+		"call %[bpf_trace_printk];"
+		"r0 += r7;"
+		"exit;"
+		:
+		: __imm(bpf_trace_printk)
+		: __clobber_all);
+}
+
+SEC("socket")
+__log_level(2)
+__msg("2: 012....... (25) if r1 > 0x7 goto pc+1")
+__msg("3: ..2....... (bf) r0 = r2")
+__naked void if1(void)
+{
+	asm volatile (
+		"r0 = 1;"
+		"r2 = 2;"
+		"if r1 > 0x7 goto +1;"
+		"r0 = r2;"
+		"exit;"
+		::: __clobber_all);
+}
+
+SEC("socket")
+__log_level(2)
+__msg("3: 0123...... (2d) if r1 > r3 goto pc+1")
+__msg("4: ..2....... (bf) r0 = r2")
+__naked void if2(void)
+{
+	asm volatile (
+		"r0 = 1;"
+		"r2 = 2;"
+		"r3 = 7;"
+		"if r1 > r3 goto +1;"
+		"r0 = r2;"
+		"exit;"
+		::: __clobber_all);
+}
+
+SEC("socket")
+__log_level(2)
+__msg("0: .......... (b7) r1 = 0")
+__msg("1: .1........ (b7) r2 = 7")
+__msg("2: .12....... (25) if r1 > 0x7 goto pc+4")
+__msg("3: .12....... (07) r1 += 1")
+__msg("4: .12....... (27) r2 *= 2")
+__msg("5: .12....... (05) goto pc+0")
+__msg("6: .12....... (05) goto pc-5")
+__msg("7: .......... (b7) r0 = 0")
+__msg("8: 0......... (95) exit")
+__naked void loop(void)
+{
+	asm volatile (
+		"r1 = 0;"
+		"r2 = 7;"
+		"if r1 > 0x7 goto +4;"
+		"r1 += 1;"
+		"r2 *= 2;"
+		"goto +0;"
+		"goto -5;"
+		"r0 = 0;"
+		"exit;"
+		:
+		: __imm(bpf_trace_printk)
+		: __clobber_all);
+}
+
+#ifdef CAN_USE_GOTOL
+SEC("socket")
+__log_level(2)
+__msg("2: .123...... (25) if r1 > 0x7 goto pc+2")
+__msg("3: ..2....... (bf) r0 = r2")
+__msg("4: 0......... (06) gotol pc+1")
+__msg("5: ...3...... (bf) r0 = r3")
+__msg("6: 0......... (95) exit")
+__naked void gotol(void)
+{
+	asm volatile (
+		"r2 = 42;"
+		"r3 = 24;"
+		"if r1 > 0x7 goto +2;"
+		"r0 = r2;"
+		"gotol +1;"
+		"r0 = r3;"
+		"exit;"
+		:
+		: __imm(bpf_trace_printk)
+		: __clobber_all);
+}
+#endif
+
+SEC("socket")
+__log_level(2)
+__msg("0: 0......... (b7) r1 = 1")
+__msg("1: 01........ (e5) may_goto pc+1")
+__msg("2: 0......... (05) goto pc-3")
+__msg("3: .1........ (bf) r0 = r1")
+__msg("4: 0......... (95) exit")
+__naked void may_goto(void)
+{
+	asm volatile (
+	"1: r1 = 1;"
+	".8byte %[may_goto];"
+	"goto 1b;"
+	"r0 = r1;"
+	"exit;"
+	:
+	: __imm(bpf_get_smp_processor_id),
+	  __imm_insn(may_goto, BPF_RAW_INSN(BPF_JMP | BPF_JCOND, 0, 0, +1 /* offset */, 0))
+	: __clobber_all);
+}
+
+SEC("socket")
+__log_level(2)
+__msg("1: 0......... (18) r2 = 0x7")
+__msg("3: 0.2....... (0f) r0 += r2")
+__naked void ldimm64(void)
+{
+	asm volatile (
+		"r0 = 0;"
+		"r2 = 0x7 ll;"
+		"r0 += r2;"
+		"exit;"
+		:
+		:: __clobber_all);
+}
+
+/* No rules specific for LD_ABS/LD_IND, default behaviour kicks in */
+SEC("socket")
+__log_level(2)
+__msg("2: 0123456789 (30) r0 = *(u8 *)skb[42]")
+__msg("3: 012.456789 (0f) r7 += r0")
+__msg("4: 012.456789 (b7) r3 = 42")
+__msg("5: 0123456789 (50) r0 = *(u8 *)skb[r3 + 0]")
+__msg("6: 0......7.. (0f) r7 += r0")
+__naked void ldabs(void)
+{
+	asm volatile (
+		"r6 = r1;"
+		"r7 = 0;"
+		"r0 = *(u8 *)skb[42];"
+		"r7 += r0;"
+		"r3 = 42;"
+		".8byte %[ld_ind];" /* same as "r0 = *(u8 *)skb[r3];" */
+		"r7 += r0;"
+		"r0 = r7;"
+		"exit;"
+		:
+		: __imm_insn(ld_ind, BPF_LD_IND(BPF_B, BPF_REG_3, 0))
+		: __clobber_all);
+}
+
+
+#ifdef __BPF_FEATURE_ADDR_SPACE_CAST
+SEC("?fentry.s/" SYS_PREFIX "sys_getpgid")
+__log_level(2)
+__msg(" 6: .12345.... (85) call bpf_arena_alloc_pages")
+__msg(" 7: 0......... (bf) r1 = addr_space_cast(r0, 0, 1)")
+__msg(" 8: .1........ (b7) r2 = 42")
+__naked void addr_space_cast(void)
+{
+	asm volatile (
+		"r1 = %[arena] ll;"
+		"r2 = 0;"
+		"r3 = 1;"
+		"r4 = 0;"
+		"r5 = 0;"
+		"call %[bpf_arena_alloc_pages];"
+		"r1 = addr_space_cast(r0, 0, 1);"
+		"r2 = 42;"
+		"*(u64 *)(r1 +0) = r2;"
+		"r0 = 0;"
+		"exit;"
+		:
+		: __imm(bpf_arena_alloc_pages),
+		  __imm_addr(arena)
+		: __clobber_all);
+}
+#endif
+
+/* to retain debug info for BTF generation */
+void kfunc_root(void)
+{
+	bpf_arena_alloc_pages(0, 0, 0, 0, 0);
+}
+
+char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/progs/verifier_gotol.c b/tools/testing/selftests/bpf/progs/verifier_gotol.c
index 05a329ee45ee..d5d8f24df394 100644
--- a/tools/testing/selftests/bpf/progs/verifier_gotol.c
+++ b/tools/testing/selftests/bpf/progs/verifier_gotol.c
@@ -4,11 +4,7 @@
 #include <bpf/bpf_helpers.h>
 #include "bpf_misc.h"
 
-#if (defined(__TARGET_ARCH_arm64) || defined(__TARGET_ARCH_x86) || \
-	(defined(__TARGET_ARCH_riscv) && __riscv_xlen == 64) || \
-	defined(__TARGET_ARCH_arm) || defined(__TARGET_ARCH_s390) || \
-	defined(__TARGET_ARCH_loongarch)) && \
-	__clang_major__ >= 18
+#ifdef CAN_USE_GOTOL
 
 SEC("socket")
 __description("gotol, small_imm")
diff --git a/tools/testing/selftests/bpf/progs/verifier_iterating_callbacks.c b/tools/testing/selftests/bpf/progs/verifier_iterating_callbacks.c
index e54bb5385bc1..75dd922e4e9f 100644
--- a/tools/testing/selftests/bpf/progs/verifier_iterating_callbacks.c
+++ b/tools/testing/selftests/bpf/progs/verifier_iterating_callbacks.c
@@ -407,11 +407,7 @@ l0_%=:	call %[bpf_jiffies64];		\
 	: __clobber_all);
 }
 
-#if (defined(__TARGET_ARCH_arm64) || defined(__TARGET_ARCH_x86) || \
-	(defined(__TARGET_ARCH_riscv) && __riscv_xlen == 64) || \
-	defined(__TARGET_ARCH_arm) || defined(__TARGET_ARCH_s390) || \
-	defined(__TARGET_ARCH_loongarch)) && \
-	__clang_major__ >= 18
+#ifdef CAN_USE_GOTOL
 SEC("socket")
 __success __retval(0)
 __naked void gotol_and_may_goto(void)
-- 
2.47.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [RFC bpf-next v1 5/7] bpf: DFA-based liveness analysis for program registers
  2025-01-22 12:04 ` [RFC bpf-next v1 5/7] bpf: DFA-based liveness analysis for program registers Eduard Zingerman
@ 2025-01-22 19:45   ` Eduard Zingerman
  0 siblings, 0 replies; 9+ messages in thread
From: Eduard Zingerman @ 2025-01-22 19:45 UTC (permalink / raw)
  To: bpf, ast; +Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song

On Wed, 2025-01-22 at 04:04 -0800, Eduard Zingerman wrote:

[...]

> @@ -23141,6 +23441,10 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3
>  	if (ret)
>  		goto skip_full_check;
>  
> +	ret = compute_live_registers(env);
> +	if (ret < 0)
> +		goto skip_full_check;
> +
>  	ret = mark_fastcall_patterns(env);
>  	if (ret < 0)
>  		goto skip_full_check;

In an off-list discussion Alexei asked to keep this enabled only in
privileged mode.

[...]


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2025-01-22 19:46 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-01-22 12:04 [RFC bpf-next v1 0/7] bpf: improvements for iterator-based loops convergence Eduard Zingerman
2025-01-22 12:04 ` [RFC bpf-next v1 1/7] bpf: copy_verifier_state() should copy 'loop_entry' field Eduard Zingerman
2025-01-22 12:04 ` [RFC bpf-next v1 2/7] selftests/bpf: test correct loop_entry update in copy_verifier_state Eduard Zingerman
2025-01-22 12:04 ` [RFC bpf-next v1 3/7] bpf: don't do clean_live_states when state->loop_entry->branches > 0 Eduard Zingerman
2025-01-22 12:04 ` [RFC bpf-next v1 4/7] selftests/bpf: check states pruning for deeply nested iterator Eduard Zingerman
2025-01-22 12:04 ` [RFC bpf-next v1 5/7] bpf: DFA-based liveness analysis for program registers Eduard Zingerman
2025-01-22 19:45   ` Eduard Zingerman
2025-01-22 12:04 ` [RFC bpf-next v1 6/7] bpf: use register liveness information for func_states_equal Eduard Zingerman
2025-01-22 12:04 ` [RFC bpf-next v1 7/7] selftests/bpf: test cases for compute_live_registers() Eduard Zingerman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox