* [PATCH bpf-next v1 1/2] bpf: force checkpoints at loop back-edges
@ 2024-10-09 2:12 Eduard Zingerman
2024-10-09 2:12 ` [PATCH bpf-next v1 2/2] selftests/bpf: test with a very short loop Eduard Zingerman
` (2 more replies)
0 siblings, 3 replies; 13+ messages in thread
From: Eduard Zingerman @ 2024-10-09 2:12 UTC (permalink / raw)
To: bpf, ast
Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song,
Eduard Zingerman
In [1] syzbot reported an interesting BPF program.
Verification for this program takes a very long time.
[1] https://lore.kernel.org/bpf/670429f6.050a0220.49194.0517.GAE@google.com/
The program could be simplified to the following snippet:
/* Program type is kprobe */
r7 = *(u16 *)(r1 +0);
1: r7 += 0x1ab064b9;
if r7 & 0x702000 goto 1b;
r7 &= 0x1ee60e;
r7 += r1;
if r7 s> 0x37d2 goto +0;
r0 = 0;
exit;
The snippet exhibits the following behaviour depending on
BPF_COMPLEXITY_LIMIT_INSNS:
- at 1,000,000 verification does not finish in 15 minutes;
- at 100,000 verification finishes in 15 seconds;
- at 100 it is possible to get some verifier log.
Perf report collected for verification of the snippet:
99.34% veristat
99.34% [kernel.kallsyms]
71.25% [k] __mark_chain_precision
24.81% [k] bt_sync_linked_regs
...
Fragment of the log collected with instruction limit set to 100:
1: (07) r7 += 447767737 ; R7_w=scalar(...)
2: (45) if r7 & 0x702000 goto pc-2
mark_precise: frame0: last_idx 2 first_idx 2 subseq_idx -1
mark_precise: frame0: regs=r7 stack= before 1: (07) r7 += 447767737
mark_precise: frame0: regs=r7 stack= before 2: (45) if r7 & 0x702000 goto pc-2
... repeats 25 times ...
mark_precise: frame0: regs=r7 stack= before 1: (07) r7 += 447767737
mark_precise: frame0: regs=r7 stack= before 2: (45) if r7 & 0x702000 goto pc-2
mark_precise: frame0: parent state regs= stack=: R1_r=ctx() R7_rw=Pscalar(...) R10=fp0
2: R7_w=scalar(...)
1: (07) r7 += 447767737
This shows very long backtracking walks done by mark_chain_precision()
inside the very small loop. Such backtracking walks terminate at
checkpoint state boundaries, but checkpoints are introduced too
infrequently for this loop.
This patch forcibly enables checkpoints for each loop back-edge.
This helps with the programs in question, as verification of both
syzbot program and reduced snippet finishes in ~2.5 sec.
However, this comes with some cost. Here are veristat results for
selftests listed in veristat.cfg and Cilium BPF programs from [2].
[2] https://github.com/anakryiko/cilium
$ ./veristat -e file,prog,duration,insns,states -f 'states_pct>5' \
-C master-baseline.log this-patch.log
File Program Duration (us) (DIFF) Insns (DIFF) States (DIFF)
-------------------------- ------------------------------------ -------------------- ---------------- -------------------
loop1.bpf.o nested_loops +52014 (+40.34%) +0 (+0.00%) +39649 (+720.37%)
loop2.bpf.o while_true +1308 (+161.48%) +2861 (+160.46%) +316 (+554.39%)
loop3.bpf.o while_true +350359 (+106.88%) +0 (+0.00%) +101448 (+1049.86%)
loop4.bpf.o combinations -88 (-34.78%) -221 (-42.18%) +13 (+72.22%)
loop5.bpf.o while_true +5 (+10.64%) +3 (+3.57%) +2 (+22.22%)
loop6.bpf.o trace_virtqueue_add_sgs -1886 (-34.01%) -3717 (-37.38%) +18 (+8.61%)
netif_receive_skb.bpf.o trace_netif_receive_skb +5 (+0.04%) +0 (+0.00%) +562 (+111.29%)
profiler2.bpf.o kprobe__vfs_link -1132 (-12.50%) -1142 (-8.41%) +158 (+53.20%)
profiler2.bpf.o kprobe__vfs_symlink -1059 (-17.39%) -1173 (-13.63%) +77 (+37.75%)
profiler2.bpf.o kprobe_ret__do_filp_open -465 (-8.40%) -832 (-9.91%) +84 (+40.00%)
profiler2.bpf.o raw_tracepoint__sched_process_exit -524 (-29.91%) -1050 (-33.89%) -5 (-6.02%)
profiler2.bpf.o tracepoint__syscalls__sys_enter_kill -4977 (-22.00%) -4585 (-18.96%) +70 (+6.75%)
pyperf600.bpf.o on_event +16312 (+0.81%) -53870 (-9.81%) -3000 (-10.16%)
strobemeta_nounroll1.bpf.o on_event -16609 (-40.28%) -20944 (-38.87%) -294 (-18.62%)
strobemeta_nounroll2.bpf.o on_event -22022 (-23.60%) -40646 (-34.23%) -578 (-15.41%)
strobemeta_subprogs.bpf.o on_event -17414 (-33.30%) -19758 (-36.47%) -185 (-12.33%)
test_sysctl_loop1.bpf.o sysctl_tcp_mem +76 (+8.05%) +60 (+4.33%) +71 (+273.08%)
test_sysctl_loop2.bpf.o sysctl_tcp_mem +156 (+13.38%) +77 (+5.13%) +82 (+282.76%)
xdp_synproxy_kern.bpf.o syncookie_tc +8711 (+52.72%) +3610 (+26.23%) +209 (+53.87%)
xdp_synproxy_kern.bpf.o syncookie_xdp +8571 (+50.01%) +3610 (+26.10%) +209 (+53.73%)
The test case 'state_loop_first_last_equal' needs an update because
checkpoint placement for it changed. (There is now a forced
checkpoint at 'l0'). The goal of the test is to check that there is a
jump history entry with the same last_idx/first_idx pair.
Update the test by moving 'if' instruction to 'l0',
so that total number of checkpoints in the test remains the same.
(But now interesting last_idx/first_idx pair is 1/1 instead of 4/4).
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
kernel/bpf/verifier.c | 4 +++
.../selftests/bpf/progs/verifier_precision.c | 25 +++++++++++--------
2 files changed, 18 insertions(+), 11 deletions(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 7d9b38ffd220..3fa517b3fb57 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -16055,9 +16055,11 @@ static int push_insn(int t, int w, int e, struct bpf_verifier_env *env)
int *insn_state = env->cfg.insn_state;
if (e == FALLTHROUGH && insn_state[t] >= (DISCOVERED | FALLTHROUGH))
+ /* FALLTHROUGH from t already processed */
return DONE_EXPLORING;
if (e == BRANCH && insn_state[t] >= (DISCOVERED | BRANCH))
+ /* BRANCH from t already processed */
return DONE_EXPLORING;
if (w < 0 || w >= env->prog->len) {
@@ -16081,6 +16083,8 @@ static int push_insn(int t, int w, int e, struct bpf_verifier_env *env)
insn_stack[env->cfg.cur_stack++] = w;
return KEEP_EXPLORING;
} else if ((insn_state[w] & 0xF0) == DISCOVERED) {
+ /* back-edge */
+ mark_force_checkpoint(env, w);
if (env->bpf_capable)
return DONE_EXPLORING;
verbose_linfo(env, t, "%d: ", t);
diff --git a/tools/testing/selftests/bpf/progs/verifier_precision.c b/tools/testing/selftests/bpf/progs/verifier_precision.c
index 6b564d4c0986..7215c0cc0ccb 100644
--- a/tools/testing/selftests/bpf/progs/verifier_precision.c
+++ b/tools/testing/selftests/bpf/progs/verifier_precision.c
@@ -95,34 +95,37 @@ __naked int bpf_end_bswap(void)
SEC("?raw_tp")
__success __log_level(2)
/*
- * Without the bug fix there will be no history between "last_idx 3 first_idx 3"
+ * Without the bug fix there will be no history between "last_idx 1 first_idx 1"
* and "parent state regs=" lines. "R0_w=6" parts are here to help anchor
* expected log messages to the one specific mark_chain_precision operation.
*
* This is quite fragile: if verifier checkpointing heuristic changes, this
* might need adjusting.
*/
-__msg("2: (07) r0 += 1 ; R0_w=6")
-__msg("3: (35) if r0 >= 0xa goto pc+1")
-__msg("mark_precise: frame0: last_idx 3 first_idx 3 subseq_idx -1")
-__msg("mark_precise: frame0: regs=r0 stack= before 2: (07) r0 += 1")
-__msg("mark_precise: frame0: regs=r0 stack= before 1: (07) r0 += 1")
+__msg("3: (07) r0 += 1 ; R0_w=6")
+__msg("4: (05) goto pc-4")
+__msg("1: (35) if r0 >= 0xa goto pc+3")
+__msg("mark_precise: frame0: last_idx 1 first_idx 1 subseq_idx -1")
+__msg("mark_precise: frame0: parent state regs=r0 stack=: R0_rw=P6 R1=ctx() R10=fp0")
+__msg("mark_precise: frame0: last_idx 4 first_idx 1 subseq_idx 1")
__msg("mark_precise: frame0: regs=r0 stack= before 4: (05) goto pc-4")
-__msg("mark_precise: frame0: regs=r0 stack= before 3: (35) if r0 >= 0xa goto pc+1")
-__msg("mark_precise: frame0: parent state regs= stack=: R0_rw=P4")
-__msg("3: R0_w=6")
+__msg("mark_precise: frame0: regs=r0 stack= before 3: (07) r0 += 1")
+__msg("mark_precise: frame0: regs=r0 stack= before 2: (07) r0 += 1")
+__msg("mark_precise: frame0: regs=r0 stack= before 1: (35) if r0 >= 0xa goto pc+3")
+__msg("mark_precise: frame0: parent state regs= stack=: R0_rw=P4 R1=ctx() R10=fp0")
+__msg("1: R0=6")
__naked int state_loop_first_last_equal(void)
{
asm volatile (
"r0 = 0;"
"l0_%=:"
- "r0 += 1;"
- "r0 += 1;"
/* every few iterations we'll have a checkpoint here with
* first_idx == last_idx, potentially confusing precision
* backtracking logic
*/
"if r0 >= 10 goto l1_%=;" /* checkpoint + mark_precise */
+ "r0 += 1;"
+ "r0 += 1;"
"goto l0_%=;"
"l1_%=:"
"exit;"
--
2.46.2
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH bpf-next v1 2/2] selftests/bpf: test with a very short loop
2024-10-09 2:12 [PATCH bpf-next v1 1/2] bpf: force checkpoints at loop back-edges Eduard Zingerman
@ 2024-10-09 2:12 ` Eduard Zingerman
2024-10-09 9:23 ` [PATCH bpf-next v1 1/2] bpf: force checkpoints at loop back-edges Eduard Zingerman
2024-10-09 19:41 ` Eduard Zingerman
2 siblings, 0 replies; 13+ messages in thread
From: Eduard Zingerman @ 2024-10-09 2:12 UTC (permalink / raw)
To: bpf, ast
Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song,
Eduard Zingerman
The test added is a simplified reproducer from syzbot report [1].
If the verifier does not insert a checkpoint somewhere inside the loop,
verification of the program takes a very long time.
This happens because mark_chain_precision() for register r7 repeatedly
traces the jump history of the loop backwards, processing many loop
iterations on each call.
[1] https://lore.kernel.org/bpf/670429f6.050a0220.49194.0517.GAE@google.com/
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
.../bpf/progs/verifier_search_pruning.c | 23 +++++++++++++++++++
1 file changed, 23 insertions(+)
diff --git a/tools/testing/selftests/bpf/progs/verifier_search_pruning.c b/tools/testing/selftests/bpf/progs/verifier_search_pruning.c
index 5a14498d352f..f40e57251e94 100644
--- a/tools/testing/selftests/bpf/progs/verifier_search_pruning.c
+++ b/tools/testing/selftests/bpf/progs/verifier_search_pruning.c
@@ -2,6 +2,7 @@
/* Converted from tools/testing/selftests/bpf/verifier/search_pruning.c */
#include <linux/bpf.h>
+#include <../../../include/linux/filter.h>
#include <bpf/bpf_helpers.h>
#include "bpf_misc.h"
@@ -336,4 +337,26 @@ l0_%=: r1 = 42; \
: __clobber_all);
}
+/* Without a checkpoint forcibly inserted at the loop back-edge this
+ * test would take a very long time to verify.
+ */
+SEC("kprobe")
+__failure __log_level(4)
+__msg("BPF program is too large.")
+__naked void short_loop1(void)
+{
+ asm volatile (
+ " r7 = *(u16 *)(r1 +0);"
+ "1: r7 += 0x1ab064b9;"
+ " .8byte %[jset];" /* same as 'if r7 & 0x702000 goto 1b;' */
+ " r7 &= 0x1ee60e;"
+ " r7 += r1;"
+ " if r7 s> 0x37d2 goto +0;"
+ " r0 = 0;"
+ " exit;"
+ :
+ : __imm_insn(jset, BPF_JMP_IMM(BPF_JSET, BPF_REG_7, 0x702000, -2))
+ : __clobber_all);
+}
+
char _license[] SEC("license") = "GPL";
--
2.46.2
* Re: [PATCH bpf-next v1 1/2] bpf: force checkpoints at loop back-edges
2024-10-09 2:12 [PATCH bpf-next v1 1/2] bpf: force checkpoints at loop back-edges Eduard Zingerman
2024-10-09 2:12 ` [PATCH bpf-next v1 2/2] selftests/bpf: test with a very short loop Eduard Zingerman
@ 2024-10-09 9:23 ` Eduard Zingerman
2024-10-09 19:41 ` Eduard Zingerman
2 siblings, 0 replies; 13+ messages in thread
From: Eduard Zingerman @ 2024-10-09 9:23 UTC (permalink / raw)
To: bpf, ast; +Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song
On Tue, 2024-10-08 at 19:12 -0700, Eduard Zingerman wrote:
[...]
> This patch forcibly enables checkpoints for each loop back-edge.
> This helps with the programs in question, as verification of both
> syzbot program and reduced snippet finishes in ~2.5 sec.
[...]
> File Program Insns (DIFF) States (DIFF)
> -------------------------- ------------------------------------ ---------------- -------------------
> ...
> pyperf600.bpf.o on_event -53870 (-9.81%) -3000 (-10.16%)
> ...
fwiw, this patch speeds up verification of pyperf600 by a small margin,
sufficient for it to pass in combination with the jump history bump when LLVM 20 is used.
# ./veristat pyperf600.bpf.o no_alu32/pyperf600.bpf.o cpuv4/pyperf600.bpf.o
...
File Program Verdict Duration (us) Insns States Peak states
--------------- -------- ------- ------------- ------ ------ -----------
pyperf600.bpf.o on_event success 2400571 872914 26490 25480
pyperf600.bpf.o on_event success 2460908 947038 26090 25330
pyperf600.bpf.o on_event success 2158117 788481 26329 25368
--------------- -------- ------- ------------- ------ ------ -----------
W/o this patch the jump history bump is not sufficient to get the no_alu32
version verified; the instruction limit is reached.
The draft is here:
https://github.com/eddyz87/bpf/tree/jmp-history-pyperf600
[...]
* Re: [PATCH bpf-next v1 1/2] bpf: force checkpoints at loop back-edges
2024-10-09 2:12 [PATCH bpf-next v1 1/2] bpf: force checkpoints at loop back-edges Eduard Zingerman
2024-10-09 2:12 ` [PATCH bpf-next v1 2/2] selftests/bpf: test with a very short loop Eduard Zingerman
2024-10-09 9:23 ` [PATCH bpf-next v1 1/2] bpf: force checkpoints at loop back-edges Eduard Zingerman
@ 2024-10-09 19:41 ` Eduard Zingerman
2024-10-10 1:08 ` Alexei Starovoitov
2 siblings, 1 reply; 13+ messages in thread
From: Eduard Zingerman @ 2024-10-09 19:41 UTC (permalink / raw)
To: bpf, ast; +Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song
On Tue, 2024-10-08 at 19:12 -0700, Eduard Zingerman wrote:
> In [1] syzbot reported an interesting BPF program.
> Verification for this program takes a very long time.
>
> [1] https://lore.kernel.org/bpf/670429f6.050a0220.49194.0517.GAE@google.com/
>
> The program could be simplified to the following snippet:
>
> /* Program type is kprobe */
> r7 = *(u16 *)(r1 +0);
> 1: r7 += 0x1ab064b9;
> if r7 & 0x702000 goto 1b;
> r7 &= 0x1ee60e;
> r7 += r1;
> if r7 s> 0x37d2 goto +0;
> r0 = 0;
> exit;
Answering a few questions from off-list discussion with Alexei.
The test is not specific to the jset instruction; e.g. the following
program exhibits similar behaviour:
SEC("kprobe")
__failure __log_level(4)
__msg("BPF program is too large.")
__naked void short_loop1(void)
{
asm volatile (
" r7 = *(u16 *)(r1 +0);"
" r8 = *(u64 *)(r1 +16);"
"1: r7 += 0x1ab064b9;"
"if r7 < r8 goto 1b;"
" r7 &= 0x1ee60e;"
" r7 += r1;"
" if r7 s> 0x37d2 goto +0;"
" r0 = 0;"
" exit;"
::: __clobber_all);
}
> The snippet exhibits the following behaviour depending on
> BPF_COMPLEXITY_LIMIT_INSNS:
> - at 1,000,000 verification does not finish in 15 minutes;
> - at 100,000 verification finishes in 15 seconds;
> - at 100 it is possible to get some verifier log.
Still investigating why the change in running time is non-linear.
[...]
> This patch forcibly enables checkpoints for each loop back-edge.
> This helps with the programs in question, as verification of both
> syzbot program and reduced snippet finishes in ~2.5 sec.
There is the following code in is_state_visited():
...
/* if the verifier is processing a loop, avoid adding new state
* too often, since different loop iterations have distinct
* states and may not help future pruning.
* This threshold shouldn't be too low to make sure that
* a loop with large bound will be rejected quickly.
* The most abusive loop will be:
* r1 += 1
* if r1 < 1000000 goto pc-2
* 1M insn_processed limit / 100 == 10k peak states.
* This threshold shouldn't be too high either, since states
* at the end of the loop are likely to be useful in pruning.
*/
skip_inf_loop_check:
if (!env->test_state_freq &&
env->jmps_processed - env->prev_jmps_processed < 20 &&
env->insn_processed - env->prev_insn_processed < 100)
add_new_state = false;
goto miss;
...
This runs into a direct contradiction with what I do in this patch,
so either I need to change the patch or this fragment needs adjustment.
[...]
* Re: [PATCH bpf-next v1 1/2] bpf: force checkpoints at loop back-edges
2024-10-09 19:41 ` Eduard Zingerman
@ 2024-10-10 1:08 ` Alexei Starovoitov
2024-10-10 22:13 ` Andrii Nakryiko
0 siblings, 1 reply; 13+ messages in thread
From: Alexei Starovoitov @ 2024-10-10 1:08 UTC (permalink / raw)
To: Eduard Zingerman
Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Martin KaFai Lau, Kernel Team, Yonghong Song
On Wed, Oct 9, 2024 at 12:41 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Tue, 2024-10-08 at 19:12 -0700, Eduard Zingerman wrote:
> > In [1] syzbot reported an interesting BPF program.
> > Verification for this program takes a very long time.
> >
> > [1] https://lore.kernel.org/bpf/670429f6.050a0220.49194.0517.GAE@google.com/
> >
> > The program could be simplified to the following snippet:
> >
> > /* Program type is kprobe */
> > r7 = *(u16 *)(r1 +0);
> > 1: r7 += 0x1ab064b9;
> > if r7 & 0x702000 goto 1b;
> > r7 &= 0x1ee60e;
> > r7 += r1;
> > if r7 s> 0x37d2 goto +0;
> > r0 = 0;
> > exit;
>
> Answering a few questions from off-list discussion with Alexei.
> The test is not specific for jset instruction, e.g. the following
> program exhibits similar behaviour:
>
> SEC("kprobe")
> __failure __log_level(4)
> __msg("BPF program is too large.")
> __naked void short_loop1(void)
> {
> asm volatile (
> " r7 = *(u16 *)(r1 +0);"
> " r8 = *(u64 *)(r1 +16);"
> "1: r7 += 0x1ab064b9;"
> "if r7 < r8 goto 1b;"
> " r7 &= 0x1ee60e;"
> " r7 += r1;"
> " if r7 s> 0x37d2 goto +0;"
> " r0 = 0;"
> " exit;"
> ::: __clobber_all);
> }
>
> > The snippet exhibits the following behaviour depending on
> > BPF_COMPLEXITY_LIMIT_INSNS:
> > - at 1,000,000 verification does not finish in 15 minutes;
> > - at 100,000 verification finishes in 15 seconds;
> > - at 100 it is possible to get some verifier log.
>
> Still investigating why running time change is non-linear.
>
> [...]
>
> > This patch forcibly enables checkpoints for each loop back-edge.
> > This helps with the programs in question, as verification of both
> > syzbot program and reduced snippet finishes in ~2.5 sec.
>
> There is the following code in is_state_visited():
>
> ...
> /* if the verifier is processing a loop, avoid adding new state
> * too often, since different loop iterations have distinct
> * states and may not help future pruning.
> * This threshold shouldn't be too low to make sure that
> * a loop with large bound will be rejected quickly.
> * The most abusive loop will be:
> * r1 += 1
> * if r1 < 1000000 goto pc-2
> > * 1M insn_processed limit / 100 == 10k peak states.
> * This threshold shouldn't be too high either, since states
> * at the end of the loop are likely to be useful in pruning.
> */
> skip_inf_loop_check:
> if (!env->test_state_freq &&
> env->jmps_processed - env->prev_jmps_processed < 20 &&
> env->insn_processed - env->prev_insn_processed < 100)
> add_new_state = false;
> goto miss;
> ...
>
> Which runs into a direct contradiction with what I do in this patch,
> so either I need to change the patch or this fragment needs adjustment.
Yeah. It's not saving all states exactly to avoid:
loop3.bpf.o while_true
+350359 (+106.88%) +0 (+0.00%) +101448 (+1049.86%)
100k states is a lot of memory. It might cause OOMs.
pyperf600 improvement is certainly nice though.
It feels it's better to adjust above heuristic 20 && 100 instead
of forcing save state everywhere.
Something should be done about:
71.25% [k] __mark_chain_precision
24.81% [k] bt_sync_linked_regs
as well.
The algorithm there needs some tweaks.
pw-bot: cr
* Re: [PATCH bpf-next v1 1/2] bpf: force checkpoints at loop back-edges
2024-10-10 1:08 ` Alexei Starovoitov
@ 2024-10-10 22:13 ` Andrii Nakryiko
2024-10-10 22:40 ` Eduard Zingerman
0 siblings, 1 reply; 13+ messages in thread
From: Andrii Nakryiko @ 2024-10-10 22:13 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: Eduard Zingerman, bpf, Alexei Starovoitov, Andrii Nakryiko,
Daniel Borkmann, Martin KaFai Lau, Kernel Team, Yonghong Song
On Wed, Oct 9, 2024 at 6:09 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Wed, Oct 9, 2024 at 12:41 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> >
> > On Tue, 2024-10-08 at 19:12 -0700, Eduard Zingerman wrote:
> > > In [1] syzbot reported an interesting BPF program.
> > > Verification for this program takes a very long time.
> > >
> > > [1] https://lore.kernel.org/bpf/670429f6.050a0220.49194.0517.GAE@google.com/
> > >
> > > The program could be simplified to the following snippet:
> > >
> > > /* Program type is kprobe */
> > > r7 = *(u16 *)(r1 +0);
> > > 1: r7 += 0x1ab064b9;
> > > if r7 & 0x702000 goto 1b;
> > > r7 &= 0x1ee60e;
> > > r7 += r1;
> > > if r7 s> 0x37d2 goto +0;
> > > r0 = 0;
> > > exit;
> >
> > Answering a few questions from off-list discussion with Alexei.
> > The test is not specific for jset instruction, e.g. the following
> > program exhibits similar behaviour:
> >
> > SEC("kprobe")
> > __failure __log_level(4)
> > __msg("BPF program is too large.")
> > __naked void short_loop1(void)
> > {
> > asm volatile (
> > " r7 = *(u16 *)(r1 +0);"
> > " r8 = *(u64 *)(r1 +16);"
> > "1: r7 += 0x1ab064b9;"
> > "if r7 < r8 goto 1b;"
> > " r7 &= 0x1ee60e;"
> > " r7 += r1;"
> > " if r7 s> 0x37d2 goto +0;"
> > " r0 = 0;"
> > " exit;"
> > ::: __clobber_all);
> > }
> >
> > > The snippet exhibits the following behaviour depending on
> > > BPF_COMPLEXITY_LIMIT_INSNS:
> > > - at 1,000,000 verification does not finish in 15 minutes;
> > > - at 100,000 verification finishes in 15 seconds;
> > > - at 100 it is possible to get some verifier log.
> >
> > Still investigating why running time change is non-linear.
> >
> > [...]
> >
> > > This patch forcibly enables checkpoints for each loop back-edge.
> > > This helps with the programs in question, as verification of both
> > > syzbot program and reduced snippet finishes in ~2.5 sec.
> >
> > There is the following code in is_state_visited():
> >
> > ...
> > /* if the verifier is processing a loop, avoid adding new state
> > * too often, since different loop iterations have distinct
> > * states and may not help future pruning.
> > * This threshold shouldn't be too low to make sure that
> > * a loop with large bound will be rejected quickly.
> > * The most abusive loop will be:
> > * r1 += 1
> > * if r1 < 1000000 goto pc-2
> > > * 1M insn_processed limit / 100 == 10k peak states.
> > * This threshold shouldn't be too high either, since states
> > * at the end of the loop are likely to be useful in pruning.
> > */
> > skip_inf_loop_check:
> > if (!env->test_state_freq &&
> > env->jmps_processed - env->prev_jmps_processed < 20 &&
> > env->insn_processed - env->prev_insn_processed < 100)
> > add_new_state = false;
> > goto miss;
> > ...
> >
> > Which runs into a direct contradiction with what I do in this patch,
> > so either I need to change the patch or this fragment needs adjustment.
>
> Yeah. It's not saving all states exactly to avoid:
> loop3.bpf.o while_true
> +350359 (+106.88%) +0 (+0.00%) +101448 (+1049.86%)
>
> 100k states is a lot of memory. It might cause OOMs.
>
> pyperf600 improvement is certainly nice though.
>
> It feels it's better to adjust above heuristic 20 && 100 instead
> of forcing save state everywhere.
>
> Something should be done about:
> 71.25% [k] __mark_chain_precision
> 24.81% [k] bt_sync_linked_regs
> as well.
> The algorithm there needs some tweaks.
If we were to store bpf_jmp_history_entry for each instruction (and we
can do that efficiently, memory-wise, I had the patch), and then for
each instruction we maintained a list of "input" regs/slots and
corresponding "output" regs/slots as we simulate each instruction
forward, I think __mark_chain_precision would be much simpler and thus
faster. We'd basically just walk backwards instruction by instruction,
check if any of the output regs/slots need to be precise (few bitmasks
intersection), and if yes, set all input regs/slots as "need
precision", and just continue forward.
I think it's actually a simpler approach and should be faster. Simpler
because it's easy to tell inputs/outputs while doing forward
instruction processing. Faster because __mark_chain_precision would
only do very simple operation without lots of branching and checks.
We are already halfway there, tbh, with the linked registers and
insn_stack_access_flags() (i.e., information that we couldn't
re-derive in the backwards pass). So maybe we should bite the bullet
and go all in?
P.S. And just in general, there is a certain appeal to always having a
complete history up to any current point in time. Feels like this
would be useful in the future for some extra optimizations or checks.
>
> pw-bot: cr
* Re: [PATCH bpf-next v1 1/2] bpf: force checkpoints at loop back-edges
2024-10-10 22:13 ` Andrii Nakryiko
@ 2024-10-10 22:40 ` Eduard Zingerman
2024-10-10 22:52 ` Eduard Zingerman
2024-10-11 1:44 ` Andrii Nakryiko
0 siblings, 2 replies; 13+ messages in thread
From: Eduard Zingerman @ 2024-10-10 22:40 UTC (permalink / raw)
To: Andrii Nakryiko, Alexei Starovoitov
Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Martin KaFai Lau, Kernel Team, Yonghong Song
On Thu, 2024-10-10 at 15:13 -0700, Andrii Nakryiko wrote:
> On Wed, Oct 9, 2024 at 6:09 PM Alexei Starovoitov
[...]
> > Something should be done about:
> > 71.25% [k] __mark_chain_precision
> > 24.81% [k] bt_sync_linked_regs
> > as well.
> > The algorithm there needs some tweaks.
>
> If we were to store bpf_jmp_history_entry for each instruction (and we
> can do that efficiently, memory-wise, I had the patch), and then for
> each instruction we maintained a list of "input" regs/slots and
> corresponding "output" regs/slots as we simulate each instruction
> forward, I think __mark_chain_precision would be much simpler and thus
> faster. We'd basically just walk backwards instruction by instruction,
> check if any of the output regs/slots need to be precise (few bitmasks
> intersection), and if yes, set all input regs/slots as "need
> precision", and just continue forward.
>
> I think it's actually a simpler approach and should be faster. Simpler
> because it's easy to tell inputs/outputs while doing forward
> instruction processing. Faster because __mark_chain_precision would
> only do very simple operation without lots of branching and checks.
I think this would bring significant speedup.
Not sure it would completely fix the issue at hand,
as mark_chain_precision() walks like 100 instructions back on each
iteration of the loop, but it might be a step in the right direction.
Do you mind if I refresh your old patches for jump history,
or do you want to work on this yourself?
[...]
* Re: [PATCH bpf-next v1 1/2] bpf: force checkpoints at loop back-edges
2024-10-10 22:40 ` Eduard Zingerman
@ 2024-10-10 22:52 ` Eduard Zingerman
2024-10-10 23:23 ` Alexei Starovoitov
2024-10-11 1:44 ` Andrii Nakryiko
1 sibling, 1 reply; 13+ messages in thread
From: Eduard Zingerman @ 2024-10-10 22:52 UTC (permalink / raw)
To: Andrii Nakryiko, Alexei Starovoitov
Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Martin KaFai Lau, Kernel Team, Yonghong Song
On Thu, 2024-10-10 at 15:40 -0700, Eduard Zingerman wrote:
> I think this would bring significant speedup.
> Not sure it would completely fix the issue at hand,
> as mark_chain_precision() walks like 100 instructions back on each
> iteration of the loop, but it might be a step in the right direction.
In theory, we can do mark_chain_precision() lazily:
- mark that certain registers need precision for an instruction;
- wait till next checkpoint is created;
- do the walk for marked registers.
This should be simpler to implement on top of what Andrii suggests.
[...]
* Re: [PATCH bpf-next v1 1/2] bpf: force checkpoints at loop back-edges
2024-10-10 22:52 ` Eduard Zingerman
@ 2024-10-10 23:23 ` Alexei Starovoitov
2024-10-18 2:17 ` Eduard Zingerman
0 siblings, 1 reply; 13+ messages in thread
From: Alexei Starovoitov @ 2024-10-10 23:23 UTC (permalink / raw)
To: Eduard Zingerman
Cc: Andrii Nakryiko, bpf, Alexei Starovoitov, Andrii Nakryiko,
Daniel Borkmann, Martin KaFai Lau, Kernel Team, Yonghong Song
On Thu, Oct 10, 2024 at 3:52 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Thu, 2024-10-10 at 15:40 -0700, Eduard Zingerman wrote:
>
> > I think this would bring significant speedup.
> > Not sure it would completely fix the issue at hand,
> > as mark_chain_precision() walks like 100 instructions back on each
> > iteration of the loop, but it might be a step in the right direction.
>
> In theory, we can do mark_chain_precision() lazily:
> - mark that certain registers need precision for an instruction;
> - wait till next checkpoint is created;
> - do the walk for marked registers.
>
> This should be simpler to implement on top of what Andrii suggests.
Indeed it can wait until the next new_state, since
when the next iteration of the loop starts the sl->state
are guaranteed to be some olderstates. Though the verifier may have done
multiple loop iteration traversals, but it's still in 'cur' state,
so states_equal(env, &sl->state, cur...) should still be correct.
* Re: [PATCH bpf-next v1 1/2] bpf: force checkpoints at loop back-edges
2024-10-10 23:23 ` Alexei Starovoitov
@ 2024-10-18 2:17 ` Eduard Zingerman
0 siblings, 0 replies; 13+ messages in thread
From: Eduard Zingerman @ 2024-10-18 2:17 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: Andrii Nakryiko, bpf, Alexei Starovoitov, Andrii Nakryiko,
Daniel Borkmann, Martin KaFai Lau, Kernel Team, Yonghong Song
On Thu, 2024-10-10 at 16:23 -0700, Alexei Starovoitov wrote:
> On Thu, Oct 10, 2024 at 3:52 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> >
> > On Thu, 2024-10-10 at 15:40 -0700, Eduard Zingerman wrote:
> >
> > > I think this would bring significant speedup.
> > > Not sure it would completely fix the issue at hand,
> > > as mark_chain_precision() walks like 100 instructions back on each
> > > iteration of the loop, but it might be a step in the right direction.
> >
> > In theory, we can do mark_chain_precision() lazily:
> > - mark that certain registers need precision for an instruction;
> > - wait till next checkpoint is created;
> > - do the walk for marked registers.
> >
> > This should be simpler to implement on top of what Andrii suggests.
>
> Indeed it can wait until the next new_state, since
> when the next iteration of the loop starts the sl->state
> are guaranteed to be some older states. Though the verifier may have done
> multiple loop iteration traversals, but it's still in 'cur' state,
> so states_equal(env, &sl->state, cur...) should still be correct.
After spending an absolutely unreasonable amount of time on this I
finally identified the missing piece of the puzzle.
The example program takes so much time to verify because it
accumulates a huge jmp_history array (262145 entries)
within a single state and constantly executes mark_chain_precision()
against it.
The reason for such huge jmp_history is an interplay between two
conditions:
if (env->jmps_processed - env->prev_jmps_processed >= 2 &&
env->insn_processed - env->prev_insn_processed >= 8)
add_new_state = true;
and
if (!force_new_state &&
env->jmps_processed - env->prev_jmps_processed < 20 &&
env->insn_processed - env->prev_insn_processed < 100)
add_new_state = false;
I try to explain it in the commit message for [1].
If one waits long enough, the verifier eventually finishes with
-ENOMEM, as it fails to allocate a big enough jmp_history array
(one has to wait 15 minutes on master, and about a minute with my patch [2],
the patch has not changed since our discussion on Wednesday,
the error message is the same as we discussed on Wednesday).
The discussed change to lazy precision marks propagation turned out to
be a dud:
- it does not solve the issue at hand; the verifier still exits with
-ENOMEM, just faster;
- overall verification speedup for regular programs is not big:
$ ./veristat -e file,prog,duration -f 'duration>100000' -C master-baseline.log current.log
File Program Duration (us) (A) Duration (us) (B) Duration (us) (DIFF)
------------------------ ---------------- ----------------- ----------------- --------------------
loop1.bpf.o nested_loops 124334 117382 -6952 (-5.59%)
loop3.bpf.o while_true 318165 253264 -64901 (-20.40%)
pyperf100.bpf.o on_event 704852 692300 -12552 (-1.78%)
pyperf180.bpf.o on_event 2304115 2251502 -52613 (-2.28%)
pyperf50.bpf.o on_event 183971 184124 +153 (+0.08%)
pyperf600.bpf.o on_event 2076073 2051733 -24340 (-1.17%)
pyperf600_nounroll.bpf.o on_event 394093 402552 +8459 (+2.15%)
strobemeta.bpf.o on_event 169218 169835 +617 (+0.36%)
test_verif_scale1.bpf.o balancer_ingress 144779 146486 +1707 (+1.18%)
test_verif_scale2.bpf.o balancer_ingress 147078 151902 +4824 (+3.28%)
test_verif_scale3.bpf.o balancer_ingress 184959 188723 +3764 (+2.04%)
Here only loop3 demonstrates a significant speedup (I re-ran loop3
several times), but in my opinion it does not warrant the code change.
[1] https://lore.kernel.org/bpf/20241018020307.1766906-1-eddyz87@gmail.com/
[2] https://github.com/eddyz87/bpf/tree/lazy-mark-chain-precision
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH bpf-next v1 1/2] bpf: force checkpoints at loop back-edges
2024-10-10 22:40 ` Eduard Zingerman
2024-10-10 22:52 ` Eduard Zingerman
@ 2024-10-11 1:44 ` Andrii Nakryiko
1 sibling, 0 replies; 13+ messages in thread
From: Andrii Nakryiko @ 2024-10-11 1:44 UTC (permalink / raw)
To: Eduard Zingerman
Cc: Alexei Starovoitov, bpf, Alexei Starovoitov, Andrii Nakryiko,
Daniel Borkmann, Martin KaFai Lau, Kernel Team, Yonghong Song
On Thu, Oct 10, 2024 at 3:40 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Thu, 2024-10-10 at 15:13 -0700, Andrii Nakryiko wrote:
> > On Wed, Oct 9, 2024 at 6:09 PM Alexei Starovoitov
>
> [...]
>
> > > Something should be done about:
> > > 71.25% [k] __mark_chain_precision
> > > 24.81% [k] bt_sync_linked_regs
> > > as well.
> > > The algorithm there needs some tweaks.
> >
> > If we were to store bpf_jmp_history_entry for each instruction (and we
> > can do that efficiently, memory-wise, I had the patch), and then for
> > each instruction we maintained a list of "input" regs/slots and
> > corresponding "output" regs/slots as we simulate each instruction
> > forward, I think __mark_chain_precision would be much simpler and thus
> > faster. We'd basically just walk backwards instruction by instruction,
> > check if any of the output regs/slots needs to be precise (a few
> > bitmask intersections), and if yes, set all input regs/slots as
> > "need precision", and just continue forward.
> >
> > I think it's actually a simpler approach and should be faster. Simpler
> > because it's easy to tell inputs/outputs while doing forward
> > instruction processing. Faster because __mark_chain_precision would
> > only do very simple operation without lots of branching and checks.
>
> I think this would bring significant speedup.
> Not sure it would completely fix the issue at hand,
> as mark_chain_precision() walks like 100 instructions back on each
> iteration of the loop, but it might be a step in the right direction.
>
> Do you mind if I refresh your old patches for jump history,
> or do you want to work on this yourself?
No, I don't mind at all. Go ahead.
>
> [...]
* [PATCH bpf-next v1 1/2] bpf: force checkpoint when jmp history is too long
@ 2024-10-18 2:03 Eduard Zingerman
2024-10-18 2:03 ` [PATCH bpf-next v1 2/2] selftests/bpf: test with a very short loop Eduard Zingerman
0 siblings, 1 reply; 13+ messages in thread
From: Eduard Zingerman @ 2024-10-18 2:03 UTC (permalink / raw)
To: bpf, ast
Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song,
Eduard Zingerman
A specifically crafted program might trick the verifier into growing a
very long jump history within a single bpf_verifier_state instance.
A very long jump history makes mark_chain_precision() unreasonably
slow, especially when the verifier processes a loop.
Mitigate this by forcing a new state in is_state_visited() when the
current state's jump history is too long.
Use the same constant as in `skip_inf_loop_check`, but multiply it by
an arbitrarily chosen factor of 2 to account for the jump history
containing not only information about jumps, but also information
about stack access.
For an example of a problematic program, consider the code below.
Without this patch the example is processed by the verifier for ~15
minutes before it fails to allocate a big enough chunk for
jmp_history.
0: r7 = *(u16 *)(r1 +0);
1: r7 += 0x1ab064b9;
2: if r7 & 0x702000 goto 1b;
3: r7 &= 0x1ee60e;
4: r7 += r1;
5: if r7 s> 0x37d2 goto +0;
6: r0 = 0;
7: exit;
Perf profiling shows that most of the time (~95%) is spent in
mark_chain_precision().
The easiest way to explain why this program causes problems is to
apply the following patch:
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 0c216e71cec7..4b4823961abe 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1926,7 +1926,7 @@ struct bpf_array {
};
};
-#define BPF_COMPLEXITY_LIMIT_INSNS 1000000 /* yes. 1M insns */
+#define BPF_COMPLEXITY_LIMIT_INSNS 256 /* yes. 1M insns */
#define MAX_TAIL_CALL_CNT 33
/* Maximum number of loops for bpf_loop and bpf_iter_num.
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index f514247ba8ba..75e88be3bb3e 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -18024,8 +18024,13 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
skip_inf_loop_check:
if (!force_new_state &&
env->jmps_processed - env->prev_jmps_processed < 20 &&
- env->insn_processed - env->prev_insn_processed < 100)
+ env->insn_processed - env->prev_insn_processed < 100) {
+ verbose(env, "is_state_visited: suppressing checkpoint at %d, %d jmps processed, cur->jmp_history_cnt is %d\n",
+ env->insn_idx,
+ env->jmps_processed - env->prev_jmps_processed,
+ cur->jmp_history_cnt);
add_new_state = false;
+ }
goto miss;
}
/* If sl->state is a part of a loop and this loop's entry is a part of
@@ -18142,6 +18147,9 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
if (!add_new_state)
return 0;
+ verbose(env, "is_state_visited: new checkpoint at %d, resetting env->jmps_processed\n",
+ env->insn_idx);
+
/* There were no equivalent states, remember the current one.
* Technically the current state is not proven to be safe yet,
* but it will either reach outer most bpf_exit (which means it's safe)
And observe the verification log:
...
is_state_visited: new checkpoint at 5, resetting env->jmps_processed
5: R1=ctx() R7=ctx(...)
5: (65) if r7 s> 0x37d2 goto pc+0 ; R7=ctx(...)
6: (b7) r0 = 0 ; R0_w=0
7: (95) exit
from 5 to 6: R1=ctx() R7=ctx(...) R10=fp0
6: R1=ctx() R7=ctx(...) R10=fp0
6: (b7) r0 = 0 ; R0_w=0
7: (95) exit
is_state_visited: suppressing checkpoint at 1, 3 jmps processed, cur->jmp_history_cnt is 74
from 2 to 1: R1=ctx() R7_w=scalar(...) R10=fp0
1: R1=ctx() R7_w=scalar(...) R10=fp0
1: (07) r7 += 447767737
is_state_visited: suppressing checkpoint at 2, 3 jmps processed, cur->jmp_history_cnt is 75
2: R7_w=scalar(...)
2: (45) if r7 & 0x702000 goto pc-2
... mark_precise 152 steps for r7 ...
2: R7_w=scalar(...)
is_state_visited: suppressing checkpoint at 1, 4 jmps processed, cur->jmp_history_cnt is 75
1: (07) r7 += 447767737
is_state_visited: suppressing checkpoint at 2, 4 jmps processed, cur->jmp_history_cnt is 76
2: R7_w=scalar(...)
2: (45) if r7 & 0x702000 goto pc-2
...
BPF program is too large. Processed 257 insn
The log output shows that the checkpoint at label (1) is never
created, because it is suppressed by the `skip_inf_loop_check` logic:
a. When the 'if' at (2) is processed, it pushes a state with insn_idx
   (1) onto the stack and proceeds to (3);
b. At (5) a checkpoint is created, and this resets
   env->{jmps,insns}_processed.
c. Verification proceeds and reaches `exit`;
d. The state saved at step (a) is popped from the stack and
   is_state_visited() considers whether a checkpoint needs to be
   added, but because env->{jmps,insns}_processed had just been reset
   at step (b), the `skip_inf_loop_check` logic forces `add_new_state`
   to false.
e. The verifier proceeds with the current state, which slowly
   accumulates more and more entries in the jump history.
The accumulation of entries in the jump history is a problem because
of two factors:
- it eventually exhausts the memory available for kmalloc() allocation;
- mark_chain_precision() traverses the jump history of a state,
  meaning that if `r7` is marked precise, the verifier iterates over
  the ever-growing jump history until the parent state boundary is
  reached.
(Note: the log also shows a REG INVARIANTS VIOLATION warning upon
jset processing, but that is a separate bug to fix.)
With this patch applied, the example above is rejected by the
verifier in under 1s, upon reaching the 1M instruction limit.
The program is a simplified reproducer from the syzbot report [1].
The previous discussion can be found at [2].
The patch does not cause any measurable change in verification
performance when tested on selftests from veristat.cfg and cilium
programs taken from [3].
[1] https://lore.kernel.org/bpf/670429f6.050a0220.49194.0517.GAE@google.com/
[2] https://lore.kernel.org/bpf/20241009021254.2805446-1-eddyz87@gmail.com/
[3] https://github.com/anakryiko/cilium
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
kernel/bpf/verifier.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index f514247ba8ba..f64c831a9278 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -17873,13 +17873,23 @@ static bool iter_active_depths_differ(struct bpf_verifier_state *old, struct bpf
return false;
}
+#define MAX_JMPS_PER_STATE 20
+
static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
{
struct bpf_verifier_state_list *new_sl;
struct bpf_verifier_state_list *sl, **pprev;
struct bpf_verifier_state *cur = env->cur_state, *new, *loop_entry;
int i, j, n, err, states_cnt = 0;
- bool force_new_state = env->test_state_freq || is_force_checkpoint(env, insn_idx);
+ bool force_new_state = env->test_state_freq || is_force_checkpoint(env, insn_idx) ||
+ /* - Long jmp history hinders mark_chain_precision performance,
+ * so force new state if jmp history of current state exceeds
+ * a threshold.
+ * - Jmp history records not only jumps, but also stack access,
+ * so keep this constant 2x times the limit imposed on
+ * env->jmps_processed for loop cases (see skip_inf_loop_check).
+ */
+ cur->jmp_history_cnt > MAX_JMPS_PER_STATE * 2;
bool add_new_state = force_new_state;
bool force_exact;
@@ -18023,7 +18033,7 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
*/
skip_inf_loop_check:
if (!force_new_state &&
- env->jmps_processed - env->prev_jmps_processed < 20 &&
+ env->jmps_processed - env->prev_jmps_processed < MAX_JMPS_PER_STATE &&
env->insn_processed - env->prev_insn_processed < 100)
add_new_state = false;
goto miss;
--
2.46.2
^ permalink raw reply related [flat|nested] 13+ messages in thread

* [PATCH bpf-next v1 2/2] selftests/bpf: test with a very short loop
2024-10-18 2:03 [PATCH bpf-next v1 1/2] bpf: force checkpoint when jmp history is too long Eduard Zingerman
@ 2024-10-18 2:03 ` Eduard Zingerman
2024-10-18 11:05 ` Daniel Borkmann
0 siblings, 1 reply; 13+ messages in thread
From: Eduard Zingerman @ 2024-10-18 2:03 UTC (permalink / raw)
To: bpf, ast
Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song,
Eduard Zingerman
The added test is a simplified reproducer from the syzbot report [1].
If the verifier does not insert a checkpoint somewhere inside the
loop, verification of the program takes a very long time.
This happens because mark_chain_precision() for register r7 would
repeatedly trace the jump history of the loop backwards, processing
many loop iterations on each call.
[1] https://lore.kernel.org/bpf/670429f6.050a0220.49194.0517.GAE@google.com/
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
.../bpf/progs/verifier_search_pruning.c | 23 +++++++++++++++++++
tools/testing/selftests/bpf/veristat.cfg | 1 +
2 files changed, 24 insertions(+)
diff --git a/tools/testing/selftests/bpf/progs/verifier_search_pruning.c b/tools/testing/selftests/bpf/progs/verifier_search_pruning.c
index 5a14498d352f..f40e57251e94 100644
--- a/tools/testing/selftests/bpf/progs/verifier_search_pruning.c
+++ b/tools/testing/selftests/bpf/progs/verifier_search_pruning.c
@@ -2,6 +2,7 @@
/* Converted from tools/testing/selftests/bpf/verifier/search_pruning.c */
#include <linux/bpf.h>
+#include <../../../include/linux/filter.h>
#include <bpf/bpf_helpers.h>
#include "bpf_misc.h"
@@ -336,4 +337,26 @@ l0_%=: r1 = 42; \
: __clobber_all);
}
+/* Without a checkpoint forcibly inserted at the back-edge of a loop
+ * this test would take a very long time to verify.
+ */
+SEC("kprobe")
+__failure __log_level(4)
+__msg("BPF program is too large.")
+__naked void short_loop1(void)
+{
+ asm volatile (
+ " r7 = *(u16 *)(r1 +0);"
+ "1: r7 += 0x1ab064b9;"
+ " .8byte %[jset];" /* same as 'if r7 & 0x702000 goto 1b;' */
+ " r7 &= 0x1ee60e;"
+ " r7 += r1;"
+ " if r7 s> 0x37d2 goto +0;"
+ " r0 = 0;"
+ " exit;"
+ :
+ : __imm_insn(jset, BPF_JMP_IMM(BPF_JSET, BPF_REG_7, 0x702000, -2))
+ : __clobber_all);
+}
+
char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/veristat.cfg b/tools/testing/selftests/bpf/veristat.cfg
index 1a385061618d..e661ffdcaadf 100644
--- a/tools/testing/selftests/bpf/veristat.cfg
+++ b/tools/testing/selftests/bpf/veristat.cfg
@@ -15,3 +15,4 @@ test_usdt*
test_verif_scale*
test_xdp_noinline*
xdp_synproxy*
+verifier_search_pruning*
--
2.46.2
end of thread, other threads:[~2024-10-18 11:05 UTC | newest]
Thread overview: 13+ messages
2024-10-09 2:12 [PATCH bpf-next v1 1/2] bpf: force checkpoints at loop back-edges Eduard Zingerman
2024-10-09 2:12 ` [PATCH bpf-next v1 2/2] selftests/bpf: test with a very short loop Eduard Zingerman
2024-10-09 9:23 ` [PATCH bpf-next v1 1/2] bpf: force checkpoints at loop back-edges Eduard Zingerman
2024-10-09 19:41 ` Eduard Zingerman
2024-10-10 1:08 ` Alexei Starovoitov
2024-10-10 22:13 ` Andrii Nakryiko
2024-10-10 22:40 ` Eduard Zingerman
2024-10-10 22:52 ` Eduard Zingerman
2024-10-10 23:23 ` Alexei Starovoitov
2024-10-18 2:17 ` Eduard Zingerman
2024-10-11 1:44 ` Andrii Nakryiko
-- strict thread matches above, loose matches on Subject: below --
2024-10-18 2:03 [PATCH bpf-next v1 1/2] bpf: force checkpoint when jmp history is too long Eduard Zingerman
2024-10-18 2:03 ` [PATCH bpf-next v1 2/2] selftests/bpf: test with a very short loop Eduard Zingerman
2024-10-18 11:05 ` Daniel Borkmann