BPF List
* [PATCH bpf-next v1 1/2] bpf: force checkpoint when jmp history is too long
@ 2024-10-18  2:03 Eduard Zingerman
  2024-10-18  2:03 ` [PATCH bpf-next v1 2/2] selftests/bpf: test with a very short loop Eduard Zingerman
                   ` (3 more replies)
  0 siblings, 4 replies; 15+ messages in thread
From: Eduard Zingerman @ 2024-10-18  2:03 UTC (permalink / raw)
  To: bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song,
	Eduard Zingerman

A specially crafted program might trick the verifier into growing a
very long jump history within a single bpf_verifier_state instance.
A very long jump history makes mark_chain_precision() unreasonably
slow, especially when the verifier processes a loop.

Mitigate this by forcing a new state in is_state_visited() when the
current state's jump history is too long.

Use the same constant as in `skip_inf_loop_check`, but multiply it by
an arbitrarily chosen factor of 2 to account for the jump history
containing not only information about jumps, but also information
about stack access.
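
As a rough standalone model (not kernel code; the constant and the 2x
factor mirror the diff further down), the forced-checkpoint condition
can be sketched as:

```c
#include <stdbool.h>

/* Illustrative model of the force_new_state computation; the names
 * mirror the patch below but this is not the actual verifier code. */
#define MAX_JMPS_PER_STATE 20

static bool should_force_new_state(bool test_state_freq,
				   bool force_checkpoint_insn,
				   int jmp_history_cnt)
{
	/* Jump history records stack accesses as well as jumps, hence
	 * the 2x headroom over the jmps_processed threshold used by
	 * skip_inf_loop_check. */
	return test_state_freq ||
	       force_checkpoint_insn ||
	       jmp_history_cnt > MAX_JMPS_PER_STATE * 2;
}
```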

For an example of a problematic program, consider the code below.
Without this patch the example is processed by the verifier for
~15 minutes before it fails to allocate a big-enough chunk for
jmp_history.

    0: r7 = *(u16 *)(r1 +0);
    1: r7 += 0x1ab064b9;
    2: if r7 & 0x702000 goto 1b;
    3: r7 &= 0x1ee60e;
    4: r7 += r1;
    5: if r7 s> 0x37d2 goto +0;
    6: r0 = 0;
    7: exit;

Perf profiling shows that most of the time (~95%) is spent in
mark_chain_precision().

The easiest way to explain why this program causes problems is to
apply the following patch:

    diff --git a/include/linux/bpf.h b/include/linux/bpf.h
    index 0c216e71cec7..4b4823961abe 100644
    --- a/include/linux/bpf.h
    +++ b/include/linux/bpf.h
    @@ -1926,7 +1926,7 @@ struct bpf_array {
            };
     };

    -#define BPF_COMPLEXITY_LIMIT_INSNS      1000000 /* yes. 1M insns */
    +#define BPF_COMPLEXITY_LIMIT_INSNS      256 /* yes. 1M insns */
     #define MAX_TAIL_CALL_CNT 33

     /* Maximum number of loops for bpf_loop and bpf_iter_num.
    diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
    index f514247ba8ba..75e88be3bb3e 100644
    --- a/kernel/bpf/verifier.c
    +++ b/kernel/bpf/verifier.c
    @@ -18024,8 +18024,13 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
     skip_inf_loop_check:
                            if (!force_new_state &&
                                env->jmps_processed - env->prev_jmps_processed < 20 &&
    -                           env->insn_processed - env->prev_insn_processed < 100)
    +                           env->insn_processed - env->prev_insn_processed < 100) {
    +                               verbose(env, "is_state_visited: suppressing checkpoint at %d, %d jmps processed, cur->jmp_history_cnt is %d\n",
    +                                       env->insn_idx,
    +                                       env->jmps_processed - env->prev_jmps_processed,
    +                                       cur->jmp_history_cnt);
                                    add_new_state = false;
    +                       }
                            goto miss;
                    }
                    /* If sl->state is a part of a loop and this loop's entry is a part of
    @@ -18142,6 +18147,9 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
            if (!add_new_state)
                    return 0;

    +       verbose(env, "is_state_visited: new checkpoint at %d, resetting env->jmps_processed\n",
    +               env->insn_idx);
    +
            /* There were no equivalent states, remember the current one.
             * Technically the current state is not proven to be safe yet,
             * but it will either reach outer most bpf_exit (which means it's safe)

And observe verification log:

    ...
    is_state_visited: new checkpoint at 5, resetting env->jmps_processed
    5: R1=ctx() R7=ctx(...)
    5: (65) if r7 s> 0x37d2 goto pc+0     ; R7=ctx(...)
    6: (b7) r0 = 0                        ; R0_w=0
    7: (95) exit

    from 5 to 6: R1=ctx() R7=ctx(...) R10=fp0
    6: R1=ctx() R7=ctx(...) R10=fp0
    6: (b7) r0 = 0                        ; R0_w=0
    7: (95) exit
    is_state_visited: suppressing checkpoint at 1, 3 jmps processed, cur->jmp_history_cnt is 74

    from 2 to 1: R1=ctx() R7_w=scalar(...) R10=fp0
    1: R1=ctx() R7_w=scalar(...) R10=fp0
    1: (07) r7 += 447767737
    is_state_visited: suppressing checkpoint at 2, 3 jmps processed, cur->jmp_history_cnt is 75
    2: R7_w=scalar(...)
    2: (45) if r7 & 0x702000 goto pc-2
    ... mark_precise 152 steps for r7 ...
    2: R7_w=scalar(...)
    is_state_visited: suppressing checkpoint at 1, 4 jmps processed, cur->jmp_history_cnt is 75
    1: (07) r7 += 447767737
    is_state_visited: suppressing checkpoint at 2, 4 jmps processed, cur->jmp_history_cnt is 76
    2: R7_w=scalar(...)
    2: (45) if r7 & 0x702000 goto pc-2
    ...
    BPF program is too large. Processed 257 insn

The log output shows that the checkpoint at label (1) is never
created, because it is suppressed by the `skip_inf_loop_check` logic:
a. When the 'if' at (2) is processed, it pushes a state with insn_idx
   (1) onto the stack and proceeds to (3);
b. At (5) a checkpoint is created, and this resets
   env->{jmps,insns}_processed;
c. Verification proceeds and reaches `exit`;
d. The state saved at step (a) is popped from the stack and
   is_state_visited() considers whether a checkpoint needs to be
   added, but because env->{jmps,insns}_processed had just been reset
   at step (b), the `skip_inf_loop_check` logic forces
   `add_new_state` to false;
e. The verifier proceeds with the current state, which slowly
   accumulates more and more entries in the jump history.

The accumulation of entries in the jump history is a problem because
of two factors:
- it eventually exhausts the memory available for kmalloc() allocation;
- mark_chain_precision() traverses the jump history of a state,
  meaning that if `r7` is marked precise, the verifier iterates over
  the ever-growing jump history until the parent state boundary is
  reached.
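
The second factor can be illustrated with a hypothetical cost model
(not verifier code): if each loop iteration appends a few records to
the jump history and every precision walk then covers the whole
accumulated history, total work grows quadratically with the number
of iterations.

```c
/* Hypothetical cost model: without an intermediate checkpoint every
 * backward precision walk covers the entire accumulated history, so
 * total work is quadratic in the number of loop iterations. */
static long precision_walk_cost(int iterations, int entries_per_iter)
{
	long history = 0, total = 0;
	int i;

	for (i = 0; i < iterations; i++) {
		history += entries_per_iter; /* history keeps growing */
		total += history;            /* one full backward walk */
	}
	return total;
}
```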

(note: the log also shows a REG INVARIANTS VIOLATION warning upon
       jset processing, but that is a separate bug to fix).

With this patch applied, the example above is rejected by the
verifier in under 1s, after reaching the 1M instruction limit.

The program is a simplified reproducer from the syzbot report [1].
Previous discussion can be found at [2].
The patch does not cause any changes in verification performance when
tested on selftests from veristat.cfg and on cilium programs taken
from [3].

[1] https://lore.kernel.org/bpf/670429f6.050a0220.49194.0517.GAE@google.com/
[2] https://lore.kernel.org/bpf/20241009021254.2805446-1-eddyz87@gmail.com/
[3] https://github.com/anakryiko/cilium

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 kernel/bpf/verifier.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index f514247ba8ba..f64c831a9278 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -17873,13 +17873,23 @@ static bool iter_active_depths_differ(struct bpf_verifier_state *old, struct bpf
 	return false;
 }
 
+#define MAX_JMPS_PER_STATE 20
+
 static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
 {
 	struct bpf_verifier_state_list *new_sl;
 	struct bpf_verifier_state_list *sl, **pprev;
 	struct bpf_verifier_state *cur = env->cur_state, *new, *loop_entry;
 	int i, j, n, err, states_cnt = 0;
-	bool force_new_state = env->test_state_freq || is_force_checkpoint(env, insn_idx);
+	bool force_new_state = env->test_state_freq || is_force_checkpoint(env, insn_idx) ||
+			       /* - Long jmp history hinders mark_chain_precision performance,
+				*   so force new state if jmp history of current state exceeds
+				*   a threshold.
+				* - Jmp history records not only jumps, but also stack access,
+				*   so keep this constant 2x times the limit imposed on
+				*   env->jmps_processed for loop cases (see skip_inf_loop_check).
+				*/
+			       cur->jmp_history_cnt > MAX_JMPS_PER_STATE * 2;
 	bool add_new_state = force_new_state;
 	bool force_exact;
 
@@ -18023,7 +18033,7 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
 			 */
 skip_inf_loop_check:
 			if (!force_new_state &&
-			    env->jmps_processed - env->prev_jmps_processed < 20 &&
+			    env->jmps_processed - env->prev_jmps_processed < MAX_JMPS_PER_STATE &&
 			    env->insn_processed - env->prev_insn_processed < 100)
 				add_new_state = false;
 			goto miss;
-- 
2.46.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH bpf-next v1 2/2] selftests/bpf: test with a very short loop
  2024-10-18  2:03 [PATCH bpf-next v1 1/2] bpf: force checkpoint when jmp history is too long Eduard Zingerman
@ 2024-10-18  2:03 ` Eduard Zingerman
  2024-10-18 11:05   ` Daniel Borkmann
  2024-10-18 11:03 ` [PATCH bpf-next v1 1/2] bpf: force checkpoint when jmp history is too long Daniel Borkmann
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 15+ messages in thread
From: Eduard Zingerman @ 2024-10-18  2:03 UTC (permalink / raw)
  To: bpf, ast
  Cc: andrii, daniel, martin.lau, kernel-team, yonghong.song,
	Eduard Zingerman

The test added is a simplified reproducer from the syzbot report [1].
If the verifier does not insert a checkpoint somewhere inside the
loop, verification of the program takes a very long time.

This happens because mark_chain_precision() for register r7 would
repeatedly trace the loop's jump history backwards, processing many
iterations on each mark_chain_precision() call.
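
The effect of a forced checkpoint can be sketched with a toy cost
model (hypothetical, not verifier code; the cap value is
illustrative): once the history is capped, each backward walk stops
at the most recent checkpoint, so the per-call cost stays bounded.

```c
/* Toy model: a forced checkpoint resets the history that a backward
 * precision walk has to traverse, bounding per-call cost by the cap. */
static long capped_walk_cost(int iterations, int entries_per_iter, int cap)
{
	long history = 0, total = 0;
	int i;

	for (i = 0; i < iterations; i++) {
		history += entries_per_iter;
		if (history > cap)
			history = 0;     /* new checkpoint, fresh state */
		total += history;        /* walk stops at the checkpoint */
	}
	return total;
}
```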

[1] https://lore.kernel.org/bpf/670429f6.050a0220.49194.0517.GAE@google.com/

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 .../bpf/progs/verifier_search_pruning.c       | 23 +++++++++++++++++++
 tools/testing/selftests/bpf/veristat.cfg      |  1 +
 2 files changed, 24 insertions(+)

diff --git a/tools/testing/selftests/bpf/progs/verifier_search_pruning.c b/tools/testing/selftests/bpf/progs/verifier_search_pruning.c
index 5a14498d352f..f40e57251e94 100644
--- a/tools/testing/selftests/bpf/progs/verifier_search_pruning.c
+++ b/tools/testing/selftests/bpf/progs/verifier_search_pruning.c
@@ -2,6 +2,7 @@
 /* Converted from tools/testing/selftests/bpf/verifier/search_pruning.c */
 
 #include <linux/bpf.h>
+#include <../../../include/linux/filter.h>
 #include <bpf/bpf_helpers.h>
 #include "bpf_misc.h"
 
@@ -336,4 +337,26 @@ l0_%=:	r1 = 42;					\
 	: __clobber_all);
 }
 
+/* Without a checkpoint forcibly inserted at the back-edge of a loop this
+ * test would take a very long time to verify.
+ */
+SEC("kprobe")
+__failure __log_level(4)
+__msg("BPF program is too large.")
+__naked void short_loop1(void)
+{
+	asm volatile (
+	"   r7 = *(u16 *)(r1 +0);"
+	"1: r7 += 0x1ab064b9;"
+	"   .8byte %[jset];" /* same as 'if r7 & 0x702000 goto 1b;' */
+	"   r7 &= 0x1ee60e;"
+	"   r7 += r1;"
+	"   if r7 s> 0x37d2 goto +0;"
+	"   r0 = 0;"
+	"   exit;"
+	:
+	: __imm_insn(jset, BPF_JMP_IMM(BPF_JSET, BPF_REG_7, 0x702000, -2))
+	: __clobber_all);
+}
+
 char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/veristat.cfg b/tools/testing/selftests/bpf/veristat.cfg
index 1a385061618d..e661ffdcaadf 100644
--- a/tools/testing/selftests/bpf/veristat.cfg
+++ b/tools/testing/selftests/bpf/veristat.cfg
@@ -15,3 +15,4 @@ test_usdt*
 test_verif_scale*
 test_xdp_noinline*
 xdp_synproxy*
+verifier_search_pruning*
-- 
2.46.2



* Re: [PATCH bpf-next v1 1/2] bpf: force checkpoint when jmp history is too long
  2024-10-18  2:03 [PATCH bpf-next v1 1/2] bpf: force checkpoint when jmp history is too long Eduard Zingerman
  2024-10-18  2:03 ` [PATCH bpf-next v1 2/2] selftests/bpf: test with a very short loop Eduard Zingerman
@ 2024-10-18 11:03 ` Daniel Borkmann
  2024-10-18 16:47   ` Eduard Zingerman
  2024-10-21 20:23 ` Andrii Nakryiko
  2024-10-22  2:18 ` Alexei Starovoitov
  3 siblings, 1 reply; 15+ messages in thread
From: Daniel Borkmann @ 2024-10-18 11:03 UTC (permalink / raw)
  To: Eduard Zingerman, bpf, ast; +Cc: andrii, martin.lau, kernel-team, yonghong.song

On 10/18/24 4:03 AM, Eduard Zingerman wrote:
> A specifically crafted program might trick verifier into growing very
> long jump history within a single bpf_verifier_state instance.
> Very long jump history makes mark_chain_precision() unreasonably slow,
> especially in case if verifier processes a loop.
> 
> Mitigate this by forcing new state in is_state_visited() in case if
> current state's jump history is too long.
> 
> Use same constant as in `skip_inf_loop_check`, but multiply it by
> arbitrarily chosen value 2 to account for jump history containing not
> only information about jumps, but also information about stack access.
[...]
> 
> The log output shows that checkpoint at label (1) is never created,
> because it is suppressed by `skip_inf_loop_check` logic:
> a. When 'if' at (2) is processed it pushes a state with insn_idx (1)
>     onto stack and proceeds to (3);
> b. At (5) checkpoint is created, and this resets
>     env->{jmps,insns}_processed.
> c. Verification proceeds and reaches `exit`;
> d. State saved at step (a) is popped from stack and is_state_visited()
>     considers if checkpoint needs to be added, but because
>     env->{jmps,insns}_processed had been just reset at step (b)
>     the `skip_inf_loop_check` logic forces `add_new_state` to false.
> e. Verifier proceeds with current state, which slowly accumulates
>     more and more entries in the jump history.
> 
> The accumulation of entries in the jump history is a problem because
> of two factors:
> - it eventually exhausts memory available for kmalloc() allocation;
> - mark_chain_precision() traverses the jump history of a state,
>    meaning that if `r7` is marked precise, verifier would iterate
>    ever growing jump history until parent state boundary is reached.
> 
> (note: the log also shows a REG INVARIANTS VIOLATION warning
>         upon jset processing, but that's another bug to fix).
> 
> With this patch applied, the example above is rejected by verifier
> under 1s of time, reaching 1M instructions limit.
> 
> The program is a simplified reproducer from syzbot report [1].
> Previous discussion could be found at [2].
> The patch does not cause any changes in verification performance,
> when tested on selftests from veristat.cfg and cilium programs taken
> from [3].
> 
> [1] https://lore.kernel.org/bpf/670429f6.050a0220.49194.0517.GAE@google.com/
> [2] https://lore.kernel.org/bpf/20241009021254.2805446-1-eddyz87@gmail.com/
> [3] https://github.com/anakryiko/cilium

Impressive that syzbot was able to generate this, and awesome analysis
as well as fix.

I guess we should also add :

Reported-by: syzbot+7e46cdef14bf496a3ab4@syzkaller.appspotmail.com

Can we also add a Fixes tag so that this can eventually be picked up
by stable? bpf tree would be the appropriate target, no?

> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>

Acked-by: Daniel Borkmann <daniel@iogearbox.net>

> ---
>   kernel/bpf/verifier.c | 14 ++++++++++++--
>   1 file changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index f514247ba8ba..f64c831a9278 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -17873,13 +17873,23 @@ static bool iter_active_depths_differ(struct bpf_verifier_state *old, struct bpf
>   	return false;
>   }
>   
> +#define MAX_JMPS_PER_STATE 20
> +
>   static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
>   {
>   	struct bpf_verifier_state_list *new_sl;
>   	struct bpf_verifier_state_list *sl, **pprev;
>   	struct bpf_verifier_state *cur = env->cur_state, *new, *loop_entry;
>   	int i, j, n, err, states_cnt = 0;
> -	bool force_new_state = env->test_state_freq || is_force_checkpoint(env, insn_idx);
> +	bool force_new_state = env->test_state_freq || is_force_checkpoint(env, insn_idx) ||
> +			       /* - Long jmp history hinders mark_chain_precision performance,
> +				*   so force new state if jmp history of current state exceeds
> +				*   a threshold.
> +				* - Jmp history records not only jumps, but also stack access,
> +				*   so keep this constant 2x times the limit imposed on
> +				*   env->jmps_processed for loop cases (see skip_inf_loop_check).
> +				*/
> +			       cur->jmp_history_cnt > MAX_JMPS_PER_STATE * 2;
>   	bool add_new_state = force_new_state;
>   	bool force_exact;
>   
> @@ -18023,7 +18033,7 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
>   			 */
>   skip_inf_loop_check:
>   			if (!force_new_state &&
> -			    env->jmps_processed - env->prev_jmps_processed < 20 &&
> +			    env->jmps_processed - env->prev_jmps_processed < MAX_JMPS_PER_STATE &&
>   			    env->insn_processed - env->prev_insn_processed < 100)
>   				add_new_state = false;
>   			goto miss;


* Re: [PATCH bpf-next v1 2/2] selftests/bpf: test with a very short loop
  2024-10-18  2:03 ` [PATCH bpf-next v1 2/2] selftests/bpf: test with a very short loop Eduard Zingerman
@ 2024-10-18 11:05   ` Daniel Borkmann
  0 siblings, 0 replies; 15+ messages in thread
From: Daniel Borkmann @ 2024-10-18 11:05 UTC (permalink / raw)
  To: Eduard Zingerman, bpf, ast; +Cc: andrii, martin.lau, kernel-team, yonghong.song

On 10/18/24 4:03 AM, Eduard Zingerman wrote:
> The test added is a simplified reproducer from syzbot report [1].
> If verifier does not insert checkpoint somewhere inside the loop,
> verification of the program would take a very long time.
> 
> This would happen because mark_chain_precision() for register r7 would
> constantly trace jump history of the loop back, processing many
> iterations for each mark_chain_precision() call.
> 
> [1] https://lore.kernel.org/bpf/670429f6.050a0220.49194.0517.GAE@google.com/
> 
> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>

Acked-by: Daniel Borkmann <daniel@iogearbox.net>


* Re: [PATCH bpf-next v1 1/2] bpf: force checkpoint when jmp history is too long
  2024-10-18 11:03 ` [PATCH bpf-next v1 1/2] bpf: force checkpoint when jmp history is too long Daniel Borkmann
@ 2024-10-18 16:47   ` Eduard Zingerman
  2024-10-21  7:53     ` Daniel Borkmann
  0 siblings, 1 reply; 15+ messages in thread
From: Eduard Zingerman @ 2024-10-18 16:47 UTC (permalink / raw)
  To: Daniel Borkmann, bpf, ast; +Cc: andrii, martin.lau, kernel-team, yonghong.song

On Fri, 2024-10-18 at 13:03 +0200, Daniel Borkmann wrote:

[...]

Hi Daniel,

> Impressive that syzbot was able to generate this, and awesome analysis
> as well as fix.

Thank you for taking a look. I was a bit surprised by syzbot
generating such a program as well, but I guess this is an instance of
the infinite monkey theorem...

> I guess we should also add :
> 
> Reported-by: syzbot+7e46cdef14bf496a3ab4@syzkaller.appspotmail.com

Yes, we can do that. I was hesitant to add it because the original
report was about a bug in mm/slub.c.

> Can we also add a Fixes tag so that this can eventually be picked up
> by stable? bpf tree would be the appropriate target, no?

The fixes tag can be:

Fixes: 2589726d12a1 ("bpf: introduce bounded loops")

But I'm a bit hesitant whether this is really a bug; maybe just add:

Cc: stable@vger.kernel.org

?

[...]



* Re: [PATCH bpf-next v1 1/2] bpf: force checkpoint when jmp history is too long
  2024-10-18 16:47   ` Eduard Zingerman
@ 2024-10-21  7:53     ` Daniel Borkmann
  0 siblings, 0 replies; 15+ messages in thread
From: Daniel Borkmann @ 2024-10-21  7:53 UTC (permalink / raw)
  To: Eduard Zingerman, bpf, ast; +Cc: andrii, martin.lau, kernel-team, yonghong.song

On 10/18/24 6:47 PM, Eduard Zingerman wrote:
> On Fri, 2024-10-18 at 13:03 +0200, Daniel Borkmann wrote:
> [...]
> 
>> Impressive that syzbot was able to generate this, and awesome analysis
>> as well as fix.
> 
> Thank you for taking a look. I was a bit surprised by syzbot
> generating such program as well, but I guess this is an instance of
> infinite monkey theorem...
> 
>> I guess we should also add :
>>
>> Reported-by: syzbot+7e46cdef14bf496a3ab4@syzkaller.appspotmail.com
> 
> Yes, we can do that. I was hesitant to add it because original report
> was about a bug in mm/slub.c.

Ok, but as you mentioned the program was derived from this syzbot report,
so for reference, I think it's ok to mention it.

>> Can we also add a Fixes tag so that this can eventually be picked up
>> by stable? bpf tree would be the appropriate target, no?
> 
> The fixes tag can be:
> 
> Fixes: 2589726d12a1 ("bpf: introduce bounded loops")

Thanks!

> But I'm a bit hesitant if this really a bug, maybe just add:
> 
> Cc: stable@vger.kernel.org

If we have a proper Fixes tag, then stable will pick it up anyway, but ...

> For an example of problematic program consider the code below,
> w/o this patch the example is processed by verifier for ~15 minutes,
> before failing to allocate big-enough chunk for jmp_history.

... would qualify for bpf tree imho.

Thanks,
Daniel


* Re: [PATCH bpf-next v1 1/2] bpf: force checkpoint when jmp history is too long
  2024-10-18  2:03 [PATCH bpf-next v1 1/2] bpf: force checkpoint when jmp history is too long Eduard Zingerman
  2024-10-18  2:03 ` [PATCH bpf-next v1 2/2] selftests/bpf: test with a very short loop Eduard Zingerman
  2024-10-18 11:03 ` [PATCH bpf-next v1 1/2] bpf: force checkpoint when jmp history is too long Daniel Borkmann
@ 2024-10-21 20:23 ` Andrii Nakryiko
  2024-10-22  2:03   ` Alexei Starovoitov
  2024-10-22  2:18 ` Alexei Starovoitov
  3 siblings, 1 reply; 15+ messages in thread
From: Andrii Nakryiko @ 2024-10-21 20:23 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song

On Thu, Oct 17, 2024 at 7:03 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> A specifically crafted program might trick verifier into growing very
> long jump history within a single bpf_verifier_state instance.
> Very long jump history makes mark_chain_precision() unreasonably slow,
> especially in case if verifier processes a loop.
>
> Mitigate this by forcing new state in is_state_visited() in case if
> current state's jump history is too long.
>
> Use same constant as in `skip_inf_loop_check`, but multiply it by
> arbitrarily chosen value 2 to account for jump history containing not
> only information about jumps, but also information about stack access.
>
> For an example of problematic program consider the code below,
> w/o this patch the example is processed by verifier for ~15 minutes,
> before failing to allocate big-enough chunk for jmp_history.
>
>     0: r7 = *(u16 *)(r1 +0);"
>     1: r7 += 0x1ab064b9;"
>     2: if r7 & 0x702000 goto 1b;
>     3: r7 &= 0x1ee60e;"
>     4: r7 += r1;"
>     5: if r7 s> 0x37d2 goto +0;"
>     6: r0 = 0;"
>     7: exit;"
>
> Perf profiling shows that most of the time is spent in
> mark_chain_precision() ~95%.
>
> The easiest way to explain why this program causes problems is to
> apply the following patch:
>
>     diff --git a/include/linux/bpf.h b/include/linux/bpf.h
>     index 0c216e71cec7..4b4823961abe 100644
>     --- a/include/linux/bpf.h
>     +++ b/include/linux/bpf.h
>     @@ -1926,7 +1926,7 @@ struct bpf_array {
>             };
>      };
>
>     -#define BPF_COMPLEXITY_LIMIT_INSNS      1000000 /* yes. 1M insns */
>     +#define BPF_COMPLEXITY_LIMIT_INSNS      256 /* yes. 1M insns */
>      #define MAX_TAIL_CALL_CNT 33
>
>      /* Maximum number of loops for bpf_loop and bpf_iter_num.
>     diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>     index f514247ba8ba..75e88be3bb3e 100644
>     --- a/kernel/bpf/verifier.c
>     +++ b/kernel/bpf/verifier.c
>     @@ -18024,8 +18024,13 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
>      skip_inf_loop_check:
>                             if (!force_new_state &&
>                                 env->jmps_processed - env->prev_jmps_processed < 20 &&
>     -                           env->insn_processed - env->prev_insn_processed < 100)
>     +                           env->insn_processed - env->prev_insn_processed < 100) {
>     +                               verbose(env, "is_state_visited: suppressing checkpoint at %d, %d jmps processed, cur->jmp_history_cnt is %d\n",
>     +                                       env->insn_idx,
>     +                                       env->jmps_processed - env->prev_jmps_processed,
>     +                                       cur->jmp_history_cnt);
>                                     add_new_state = false;
>     +                       }
>                             goto miss;
>                     }
>                     /* If sl->state is a part of a loop and this loop's entry is a part of
>     @@ -18142,6 +18147,9 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
>             if (!add_new_state)
>                     return 0;
>
>     +       verbose(env, "is_state_visited: new checkpoint at %d, resetting env->jmps_processed\n",
>     +               env->insn_idx);
>     +
>             /* There were no equivalent states, remember the current one.
>              * Technically the current state is not proven to be safe yet,
>              * but it will either reach outer most bpf_exit (which means it's safe)
>
> And observe verification log:
>
>     ...
>     is_state_visited: new checkpoint at 5, resetting env->jmps_processed
>     5: R1=ctx() R7=ctx(...)
>     5: (65) if r7 s> 0x37d2 goto pc+0     ; R7=ctx(...)
>     6: (b7) r0 = 0                        ; R0_w=0
>     7: (95) exit
>
>     from 5 to 6: R1=ctx() R7=ctx(...) R10=fp0
>     6: R1=ctx() R7=ctx(...) R10=fp0
>     6: (b7) r0 = 0                        ; R0_w=0
>     7: (95) exit
>     is_state_visited: suppressing checkpoint at 1, 3 jmps processed, cur->jmp_history_cnt is 74
>
>     from 2 to 1: R1=ctx() R7_w=scalar(...) R10=fp0
>     1: R1=ctx() R7_w=scalar(...) R10=fp0
>     1: (07) r7 += 447767737
>     is_state_visited: suppressing checkpoint at 2, 3 jmps processed, cur->jmp_history_cnt is 75
>     2: R7_w=scalar(...)
>     2: (45) if r7 & 0x702000 goto pc-2
>     ... mark_precise 152 steps for r7 ...
>     2: R7_w=scalar(...)
>     is_state_visited: suppressing checkpoint at 1, 4 jmps processed, cur->jmp_history_cnt is 75
>     1: (07) r7 += 447767737
>     is_state_visited: suppressing checkpoint at 2, 4 jmps processed, cur->jmp_history_cnt is 76
>     2: R7_w=scalar(...)
>     2: (45) if r7 & 0x702000 goto pc-2
>     ...
>     BPF program is too large. Processed 257 insn
>
> The log output shows that checkpoint at label (1) is never created,
> because it is suppressed by `skip_inf_loop_check` logic:
> a. When 'if' at (2) is processed it pushes a state with insn_idx (1)
>    onto stack and proceeds to (3);
> b. At (5) checkpoint is created, and this resets
>    env->{jmps,insns}_processed.
> c. Verification proceeds and reaches `exit`;
> d. State saved at step (a) is popped from stack and is_state_visited()
>    considers if checkpoint needs to be added, but because
>    env->{jmps,insns}_processed had been just reset at step (b)
>    the `skip_inf_loop_check` logic forces `add_new_state` to false.
> e. Verifier proceeds with current state, which slowly accumulates
>    more and more entries in the jump history.
>
> The accumulation of entries in the jump history is a problem because
> of two factors:
> - it eventually exhausts memory available for kmalloc() allocation;
> - mark_chain_precision() traverses the jump history of a state,
>   meaning that if `r7` is marked precise, verifier would iterate
>   ever growing jump history until parent state boundary is reached.
>
> (note: the log also shows a REG INVARIANTS VIOLATION warning
>        upon jset processing, but that's another bug to fix).
>
> With this patch applied, the example above is rejected by verifier
> under 1s of time, reaching 1M instructions limit.
>
> The program is a simplified reproducer from syzbot report [1].
> Previous discussion could be found at [2].
> The patch does not cause any changes in verification performance,
> when tested on selftests from veristat.cfg and cilium programs taken
> from [3].
>
> [1] https://lore.kernel.org/bpf/670429f6.050a0220.49194.0517.GAE@google.com/
> [2] https://lore.kernel.org/bpf/20241009021254.2805446-1-eddyz87@gmail.com/
> [3] https://github.com/anakryiko/cilium
>
> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
> ---
>  kernel/bpf/verifier.c | 14 ++++++++++++--
>  1 file changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index f514247ba8ba..f64c831a9278 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -17873,13 +17873,23 @@ static bool iter_active_depths_differ(struct bpf_verifier_state *old, struct bpf
>         return false;
>  }
>
> +#define MAX_JMPS_PER_STATE 20
> +
>  static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
>  {
>         struct bpf_verifier_state_list *new_sl;
>         struct bpf_verifier_state_list *sl, **pprev;
>         struct bpf_verifier_state *cur = env->cur_state, *new, *loop_entry;
>         int i, j, n, err, states_cnt = 0;
> -       bool force_new_state = env->test_state_freq || is_force_checkpoint(env, insn_idx);
> +       bool force_new_state = env->test_state_freq || is_force_checkpoint(env, insn_idx) ||
> +                              /* - Long jmp history hinders mark_chain_precision performance,
> +                               *   so force new state if jmp history of current state exceeds
> +                               *   a threshold.
> +                               * - Jmp history records not only jumps, but also stack access,
> +                               *   so keep this constant 2x times the limit imposed on
> +                               *   env->jmps_processed for loop cases (see skip_inf_loop_check).
> +                               */
> +                              cur->jmp_history_cnt > MAX_JMPS_PER_STATE * 2;

this feels like a wrong place to add this heuristic. Just few lines
below there is:


if (env->jmps_processed - env->prev_jmps_processed >= 2 &&
    env->insn_processed - env->prev_insn_processed >= 8)
        add_new_state = true;

Please add jmp_history_cnt check here, as it conceptually fits with
jmps_processed and insn_processed check. It also has a huge comment
with justification already, so might as well just extend that for
jmp_history_cnt.

pw-bot: cr

>         bool add_new_state = force_new_state;
>         bool force_exact;
>
> @@ -18023,7 +18033,7 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
>                          */
>  skip_inf_loop_check:
>                         if (!force_new_state &&
> -                           env->jmps_processed - env->prev_jmps_processed < 20 &&
> +                           env->jmps_processed - env->prev_jmps_processed < MAX_JMPS_PER_STATE &&
>                             env->insn_processed - env->prev_insn_processed < 100)
>                                 add_new_state = false;

and then this one is logically matching add_new_state = true; case
above that I mentioned.


With these changes, I'd drop * 2 factor for one of the checks. If
necessary, just bump it to 30 or so, if you are afraid of stack
accesses. But let's keep it simple with one threshold, if possible?
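
A rough standalone sketch of this suggested combination (hypothetical
parameter names and a single threshold mirroring the snippets above;
not the actual verifier code):

```c
#define MAX_JMPS_PER_STATE 20

/* Illustrative model of folding the jmp_history_cnt check into the
 * existing add_new_state heuristic, with one shared threshold. */
static int should_add_new_state(int jmps_delta, int insns_delta,
				int jmp_history_cnt)
{
	return (jmps_delta >= 2 && insns_delta >= 8) ||
	       jmp_history_cnt > MAX_JMPS_PER_STATE;
}
```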

>                         goto miss;
> --
> 2.46.2
>


* Re: [PATCH bpf-next v1 1/2] bpf: force checkpoint when jmp history is too long
  2024-10-21 20:23 ` Andrii Nakryiko
@ 2024-10-22  2:03   ` Alexei Starovoitov
  2024-10-22  3:19     ` Andrii Nakryiko
  0 siblings, 1 reply; 15+ messages in thread
From: Alexei Starovoitov @ 2024-10-22  2:03 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Eduard Zingerman, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Martin KaFai Lau, Kernel Team, Yonghong Song

On Mon, Oct 21, 2024 at 1:23 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Thu, Oct 17, 2024 at 7:03 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> >
> > A specifically crafted program might trick verifier into growing very
> > long jump history within a single bpf_verifier_state instance.
> > Very long jump history makes mark_chain_precision() unreasonably slow,
> > especially in case if verifier processes a loop.
> >
> > Mitigate this by forcing new state in is_state_visited() in case if
> > current state's jump history is too long.
> >
> > Use same constant as in `skip_inf_loop_check`, but multiply it by
> > arbitrarily chosen value 2 to account for jump history containing not
> > only information about jumps, but also information about stack access.
> >
> > For an example of problematic program consider the code below,
> > w/o this patch the example is processed by verifier for ~15 minutes,
> > before failing to allocate big-enough chunk for jmp_history.
> >
> >     0: r7 = *(u16 *)(r1 +0);"
> >     1: r7 += 0x1ab064b9;"
> >     2: if r7 & 0x702000 goto 1b;
> >     3: r7 &= 0x1ee60e;"
> >     4: r7 += r1;"
> >     5: if r7 s> 0x37d2 goto +0;"
> >     6: r0 = 0;"
> >     7: exit;"
> >
> > Perf profiling shows that most of the time is spent in
> > mark_chain_precision() ~95%.
> >
> > The easiest way to explain why this program causes problems is to
> > apply the following patch:
> >
> >     diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> >     index 0c216e71cec7..4b4823961abe 100644
> >     \--- a/include/linux/bpf.h
> >     \+++ b/include/linux/bpf.h
> >     \@@ -1926,7 +1926,7 @@ struct bpf_array {
> >             };
> >      };
> >
> >     -#define BPF_COMPLEXITY_LIMIT_INSNS      1000000 /* yes. 1M insns */
> >     +#define BPF_COMPLEXITY_LIMIT_INSNS      256 /* yes. 1M insns */
> >      #define MAX_TAIL_CALL_CNT 33
> >
> >      /* Maximum number of loops for bpf_loop and bpf_iter_num.
> >     diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> >     index f514247ba8ba..75e88be3bb3e 100644
> >     \--- a/kernel/bpf/verifier.c
> >     \+++ b/kernel/bpf/verifier.c
> >     \@@ -18024,8 +18024,13 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
> >      skip_inf_loop_check:
> >                             if (!force_new_state &&
> >                                 env->jmps_processed - env->prev_jmps_processed < 20 &&
> >     -                           env->insn_processed - env->prev_insn_processed < 100)
> >     +                           env->insn_processed - env->prev_insn_processed < 100) {
> >     +                               verbose(env, "is_state_visited: suppressing checkpoint at %d, %d jmps processed, cur->jmp_history_cnt is %d\n",
> >     +                                       env->insn_idx,
> >     +                                       env->jmps_processed - env->prev_jmps_processed,
> >     +                                       cur->jmp_history_cnt);
> >                                     add_new_state = false;
> >     +                       }
> >                             goto miss;
> >                     }
> >                     /* If sl->state is a part of a loop and this loop's entry is a part of
> >     \@@ -18142,6 +18147,9 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
> >             if (!add_new_state)
> >                     return 0;
> >
> >     +       verbose(env, "is_state_visited: new checkpoint at %d, resetting env->jmps_processed\n",
> >     +               env->insn_idx);
> >     +
> >             /* There were no equivalent states, remember the current one.
> >              * Technically the current state is not proven to be safe yet,
> >              * but it will either reach outer most bpf_exit (which means it's safe)
> >
> > And observe verification log:
> >
> >     ...
> >     is_state_visited: new checkpoint at 5, resetting env->jmps_processed
> >     5: R1=ctx() R7=ctx(...)
> >     5: (65) if r7 s> 0x37d2 goto pc+0     ; R7=ctx(...)
> >     6: (b7) r0 = 0                        ; R0_w=0
> >     7: (95) exit
> >
> >     from 5 to 6: R1=ctx() R7=ctx(...) R10=fp0
> >     6: R1=ctx() R7=ctx(...) R10=fp0
> >     6: (b7) r0 = 0                        ; R0_w=0
> >     7: (95) exit
> >     is_state_visited: suppressing checkpoint at 1, 3 jmps processed, cur->jmp_history_cnt is 74
> >
> >     from 2 to 1: R1=ctx() R7_w=scalar(...) R10=fp0
> >     1: R1=ctx() R7_w=scalar(...) R10=fp0
> >     1: (07) r7 += 447767737
> >     is_state_visited: suppressing checkpoint at 2, 3 jmps processed, cur->jmp_history_cnt is 75
> >     2: R7_w=scalar(...)
> >     2: (45) if r7 & 0x702000 goto pc-2
> >     ... mark_precise 152 steps for r7 ...
> >     2: R7_w=scalar(...)
> >     is_state_visited: suppressing checkpoint at 1, 4 jmps processed, cur->jmp_history_cnt is 75
> >     1: (07) r7 += 447767737
> >     is_state_visited: suppressing checkpoint at 2, 4 jmps processed, cur->jmp_history_cnt is 76
> >     2: R7_w=scalar(...)
> >     2: (45) if r7 & 0x702000 goto pc-2
> >     ...
> >     BPF program is too large. Processed 257 insn
> >
> > The log output shows that checkpoint at label (1) is never created,
> > because it is suppressed by `skip_inf_loop_check` logic:
> > a. When 'if' at (2) is processed it pushes a state with insn_idx (1)
> >    onto stack and proceeds to (3);
> > b. At (5) checkpoint is created, and this resets
> >    env->{jmps,insns}_processed.
> > c. Verification proceeds and reaches `exit`;
> > d. State saved at step (a) is popped from stack and is_state_visited()
> >    considers if checkpoint needs to be added, but because
> >    env->{jmps,insns}_processed had been just reset at step (b)
> >    the `skip_inf_loop_check` logic forces `add_new_state` to false.
> > e. Verifier proceeds with current state, which slowly accumulates
> >    more and more entries in the jump history.
> >
> > The accumulation of entries in the jump history is a problem because
> > of two factors:
> > - it eventually exhausts memory available for kmalloc() allocation;
> > - mark_chain_precision() traverses the jump history of a state,
> >   meaning that if `r7` is marked precise, verifier would iterate
> >   ever growing jump history until parent state boundary is reached.
> >
> > (note: the log also shows a REG INVARIANTS VIOLATION warning
> >        upon jset processing, but that's another bug to fix).
> >
> > With this patch applied, the example above is rejected by verifier
> > under 1s of time, reaching 1M instructions limit.
> >
> > The program is a simplified reproducer from syzbot report [1].
> > Previous discussion could be found at [2].
> > The patch does not cause any changes in verification performance,
> > when tested on selftests from veristat.cfg and cilium programs taken
> > from [3].
> >
> > [1] https://lore.kernel.org/bpf/670429f6.050a0220.49194.0517.GAE@google.com/
> > [2] https://lore.kernel.org/bpf/20241009021254.2805446-1-eddyz87@gmail.com/
> > [3] https://github.com/anakryiko/cilium
> >
> > Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
> > ---
> >  kernel/bpf/verifier.c | 14 ++++++++++++--
> >  1 file changed, 12 insertions(+), 2 deletions(-)
> >
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index f514247ba8ba..f64c831a9278 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -17873,13 +17873,23 @@ static bool iter_active_depths_differ(struct bpf_verifier_state *old, struct bpf
> >         return false;
> >  }
> >
> > +#define MAX_JMPS_PER_STATE 20
> > +
> >  static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
> >  {
> >         struct bpf_verifier_state_list *new_sl;
> >         struct bpf_verifier_state_list *sl, **pprev;
> >         struct bpf_verifier_state *cur = env->cur_state, *new, *loop_entry;
> >         int i, j, n, err, states_cnt = 0;
> > -       bool force_new_state = env->test_state_freq || is_force_checkpoint(env, insn_idx);
> > +       bool force_new_state = env->test_state_freq || is_force_checkpoint(env, insn_idx) ||
> > +                              /* - Long jmp history hinders mark_chain_precision performance,
> > +                               *   so force new state if jmp history of current state exceeds
> > +                               *   a threshold.
> > +                               * - Jmp history records not only jumps, but also stack access,
> > +                               *   so keep this constant 2x times the limit imposed on
> > +                               *   env->jmps_processed for loop cases (see skip_inf_loop_check).
> > +                               */
> > +                              cur->jmp_history_cnt > MAX_JMPS_PER_STATE * 2;
>
> This feels like the wrong place to add this heuristic. Just a few lines
> below there is:
>
>
> if (env->jmps_processed - env->prev_jmps_processed >= 2 &&
>     env->insn_processed - env->prev_insn_processed >= 8)
>         add_new_state = true;
>
> Please add the jmp_history_cnt check here, as it conceptually fits with
> the jmps_processed and insn_processed checks. That spot also has a huge
> comment with justification already, so we might as well extend it to
> cover jmp_history_cnt.

I think adding if (cur->jmp_history_cnt > 20) add_new_state = true;
won't help. It will get set back to false on the skip_inf_loop_check path.
But I agree that tweaking force_new_state is also not quite clean.

btw the "huge comment" may need to be revised :)
bpf progs today probably look different than they did in 2019.

>
> pw-bot: cr
>
> >         bool add_new_state = force_new_state;
> >         bool force_exact;
> >
> > @@ -18023,7 +18033,7 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
> >                          */
> >  skip_inf_loop_check:
> >                         if (!force_new_state &&
> > -                           env->jmps_processed - env->prev_jmps_processed < 20 &&
> > +                           env->jmps_processed - env->prev_jmps_processed < MAX_JMPS_PER_STATE &&
> >                             env->insn_processed - env->prev_insn_processed < 100)
> >                                 add_new_state = false;
>
> and then this one logically matches the add_new_state = true; case
> above that I mentioned.
>
>
> With these changes, I'd drop the * 2 factor for one of the checks. If
> necessary, just bump it to 30 or so if you are afraid of stack
> accesses. But let's keep it simple with one threshold, if possible?

+1

The 2/8 heuristic works together with the 20/100 one.

Instead of tweaking force_new_state earlier, it's better to do:
if (!force_new_state && cur->jmp_history_cnt < N &&
    env->jmps_processed - env->prev_jmps_processed < 20 && ..)
    add_new_state = false;

You're essentially proposing N == 40.
Just add your existing comment next to the check.
A #define MAX_JMPS_PER_STATE is imo overkill.


* Re: [PATCH bpf-next v1 1/2] bpf: force checkpoint when jmp history is too long
  2024-10-18  2:03 [PATCH bpf-next v1 1/2] bpf: force checkpoint when jmp history is too long Eduard Zingerman
                   ` (2 preceding siblings ...)
  2024-10-21 20:23 ` Andrii Nakryiko
@ 2024-10-22  2:18 ` Alexei Starovoitov
  2024-10-22  2:27   ` Eduard Zingerman
  3 siblings, 1 reply; 15+ messages in thread
From: Alexei Starovoitov @ 2024-10-22  2:18 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Kernel Team, Yonghong Song

On Thu, Oct 17, 2024 at 7:03 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> A specifically crafted program might trick verifier into growing very
> long jump history within a single bpf_verifier_state instance.
> Very long jump history makes mark_chain_precision() unreasonably slow,
> especially in case if verifier processes a loop.
>
> Mitigate this by forcing new state in is_state_visited() in case if
> current state's jump history is too long.
>
> Use same constant as in `skip_inf_loop_check`, but multiply it by
> arbitrarily chosen value 2 to account for jump history containing not
> only information about jumps, but also information about stack access.
>
> For an example of problematic program consider the code below,
> w/o this patch the example is processed by verifier for ~15 minutes,
> before failing to allocate big-enough chunk for jmp_history.
>
>     0: r7 = *(u16 *)(r1 +0);"
>     1: r7 += 0x1ab064b9;"
>     2: if r7 & 0x702000 goto 1b;
>     3: r7 &= 0x1ee60e;"
>     4: r7 += r1;"
>     5: if r7 s> 0x37d2 goto +0;"
>     6: r0 = 0;"
>     7: exit;"
>
> Perf profiling shows that most of the time is spent in
> mark_chain_precision() ~95%.
>
> The easiest way to explain why this program causes problems is to
> apply the following patch:
>
>     diff --git a/include/linux/bpf.h b/include/linux/bpf.h
>     index 0c216e71cec7..4b4823961abe 100644
>     \--- a/include/linux/bpf.h
>     \+++ b/include/linux/bpf.h
>     \@@ -1926,7 +1926,7 @@ struct bpf_array {
>             };
>      };
>
>     -#define BPF_COMPLEXITY_LIMIT_INSNS      1000000 /* yes. 1M insns */
>     +#define BPF_COMPLEXITY_LIMIT_INSNS      256 /* yes. 1M insns */
>      #define MAX_TAIL_CALL_CNT 33
>
>      /* Maximum number of loops for bpf_loop and bpf_iter_num.
>     diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>     index f514247ba8ba..75e88be3bb3e 100644
>     \--- a/kernel/bpf/verifier.c
>     \+++ b/kernel/bpf/verifier.c
>     \@@ -18024,8 +18024,13 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
>      skip_inf_loop_check:
>                             if (!force_new_state &&
>                                 env->jmps_processed - env->prev_jmps_processed < 20 &&
>     -                           env->insn_processed - env->prev_insn_processed < 100)
>     +                           env->insn_processed - env->prev_insn_processed < 100) {
>     +                               verbose(env, "is_state_visited: suppressing checkpoint at %d, %d jmps processed, cur->jmp_history_cnt is %d\n",
>     +                                       env->insn_idx,
>     +                                       env->jmps_processed - env->prev_jmps_processed,
>     +                                       cur->jmp_history_cnt);
>                                     add_new_state = false;
>     +                       }
>                             goto miss;
>                     }
>                     /* If sl->state is a part of a loop and this loop's entry is a part of
>     \@@ -18142,6 +18147,9 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
>             if (!add_new_state)
>                     return 0;
>
>     +       verbose(env, "is_state_visited: new checkpoint at %d, resetting env->jmps_processed\n",
>     +               env->insn_idx);
>     +
>             /* There were no equivalent states, remember the current one.
>              * Technically the current state is not proven to be safe yet,
>              * but it will either reach outer most bpf_exit (which means it's safe)
>
> And observe verification log:
>
>     ...
>     is_state_visited: new checkpoint at 5, resetting env->jmps_processed
>     5: R1=ctx() R7=ctx(...)
>     5: (65) if r7 s> 0x37d2 goto pc+0     ; R7=ctx(...)
>     6: (b7) r0 = 0                        ; R0_w=0
>     7: (95) exit
>
>     from 5 to 6: R1=ctx() R7=ctx(...) R10=fp0
>     6: R1=ctx() R7=ctx(...) R10=fp0
>     6: (b7) r0 = 0                        ; R0_w=0
>     7: (95) exit
>     is_state_visited: suppressing checkpoint at 1, 3 jmps processed, cur->jmp_history_cnt is 74
>
>     from 2 to 1: R1=ctx() R7_w=scalar(...) R10=fp0
>     1: R1=ctx() R7_w=scalar(...) R10=fp0
>     1: (07) r7 += 447767737
>     is_state_visited: suppressing checkpoint at 2, 3 jmps processed, cur->jmp_history_cnt is 75
>     2: R7_w=scalar(...)
>     2: (45) if r7 & 0x702000 goto pc-2
>     ... mark_precise 152 steps for r7 ...
>     2: R7_w=scalar(...)
>     is_state_visited: suppressing checkpoint at 1, 4 jmps processed, cur->jmp_history_cnt is 75
>     1: (07) r7 += 447767737
>     is_state_visited: suppressing checkpoint at 2, 4 jmps processed, cur->jmp_history_cnt is 76
>     2: R7_w=scalar(...)
>     2: (45) if r7 & 0x702000 goto pc-2
>     ...
>     BPF program is too large. Processed 257 insn
>
> The log output shows that checkpoint at label (1) is never created,
> because it is suppressed by `skip_inf_loop_check` logic:
> a. When 'if' at (2) is processed it pushes a state with insn_idx (1)
>    onto stack and proceeds to (3);
> b. At (5) checkpoint is created, and this resets
>    env->{jmps,insns}_processed.
> c. Verification proceeds and reaches `exit`;
> d. State saved at step (a) is popped from stack and is_state_visited()
>    considers if checkpoint needs to be added, but because
>    env->{jmps,insns}_processed had been just reset at step (b)
>    the `skip_inf_loop_check` logic forces `add_new_state` to false.
> e. Verifier proceeds with current state, which slowly accumulates
>    more and more entries in the jump history.

I'm still not sure why it grew to thousands of entries in jmp_history.
Looking at the above trace, jmps_processed grows 1:1 with jmp_history_cnt.
Also, cur->jmp_history_cnt is reset to zero at the same time as
jmps_processed.
So in the above test, did the 75 vs 4 difference come from jmp_history
entries that were there before the loop?


* Re: [PATCH bpf-next v1 1/2] bpf: force checkpoint when jmp history is too long
  2024-10-22  2:18 ` Alexei Starovoitov
@ 2024-10-22  2:27   ` Eduard Zingerman
  2024-10-22  2:53     ` Alexei Starovoitov
  0 siblings, 1 reply; 15+ messages in thread
From: Eduard Zingerman @ 2024-10-22  2:27 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Kernel Team, Yonghong Song

On Mon, 2024-10-21 at 19:18 -0700, Alexei Starovoitov wrote:

[...]

> >     0: r7 = *(u16 *)(r1 +0);"
> >     1: r7 += 0x1ab064b9;"
> >     2: if r7 & 0x702000 goto 1b;
> >     3: r7 &= 0x1ee60e;"
> >     4: r7 += r1;"
> >     5: if r7 s> 0x37d2 goto +0;"
> >     6: r0 = 0;"
> >     7: exit;"

[...]

> > And observe verification log:
> > 
> >     ...
> >     is_state_visited: new checkpoint at 5, resetting env->jmps_processed
> >     5: R1=ctx() R7=ctx(...)
> >     5: (65) if r7 s> 0x37d2 goto pc+0     ; R7=ctx(...)
> >     6: (b7) r0 = 0                        ; R0_w=0
> >     7: (95) exit
> > 
> >     from 5 to 6: R1=ctx() R7=ctx(...) R10=fp0
> >     6: R1=ctx() R7=ctx(...) R10=fp0
> >     6: (b7) r0 = 0                        ; R0_w=0
> >     7: (95) exit
> >     is_state_visited: suppressing checkpoint at 1, 3 jmps processed, cur->jmp_history_cnt is 74
> > 
> >     from 2 to 1: R1=ctx() R7_w=scalar(...) R10=fp0
> >     1: R1=ctx() R7_w=scalar(...) R10=fp0
> >     1: (07) r7 += 447767737
> >     is_state_visited: suppressing checkpoint at 2, 3 jmps processed, cur->jmp_history_cnt is 75
> >     2: R7_w=scalar(...)
> >     2: (45) if r7 & 0x702000 goto pc-2
> >     ... mark_precise 152 steps for r7 ...
> >     2: R7_w=scalar(...)
> >     is_state_visited: suppressing checkpoint at 1, 4 jmps processed, cur->jmp_history_cnt is 75
> >     1: (07) r7 += 447767737
> >     is_state_visited: suppressing checkpoint at 2, 4 jmps processed, cur->jmp_history_cnt is 76
> >     2: R7_w=scalar(...)
> >     2: (45) if r7 & 0x702000 goto pc-2
> >     ...
> >     BPF program is too large. Processed 257 insn
> > 
> > The log output shows that checkpoint at label (1) is never created,
> > because it is suppressed by `skip_inf_loop_check` logic:
> > a. When 'if' at (2) is processed it pushes a state with insn_idx (1)
> >    onto stack and proceeds to (3);
> > b. At (5) checkpoint is created, and this resets
> >    env->{jmps,insns}_processed.
> > c. Verification proceeds and reaches `exit`;
> > d. State saved at step (a) is popped from stack and is_state_visited()
> >    considers if checkpoint needs to be added, but because
> >    env->{jmps,insns}_processed had been just reset at step (b)
> >    the `skip_inf_loop_check` logic forces `add_new_state` to false.
> > e. Verifier proceeds with current state, which slowly accumulates
> >    more and more entries in the jump history.
> 
> I'm still not sure why it grew to thousands of entries in jmp_history.
> Looking at the above trace, jmps_processed grows 1:1 with jmp_history_cnt.
> Also, cur->jmp_history_cnt is reset to zero at the same time as
> jmps_processed.
> So in the above test, did the 75 vs 4 difference come from jmp_history
> entries that were there before the loop?

    0: r7 = *(u16 *)(r1 +0);
    1: r7 += 0x1ab064b9;
    2: if r7 & 0x702000 goto 1b;
    3: r7 &= 0x1ee60e;
    4: r7 += r1;
    5: if r7 s> 0x37d2 goto +0;
    6: r0 = 0;
    7: exit;

- When the 'if' at (2) is processed, the current state is copied (let's call
  this copy C); the copy is put on the stack for later processing,
  and its jump history is not cleared.
- Then the current state proceeds, verifying 3-5-6-7. At (5) a checkpoint is
  created and env->{jmps,insns}_processed are reset.
- Then state C is popped from the stack; it goes back to (1) and then (2).
  At (2) a copy C1 is created but no checkpoint, as env->{jmps,insns}_processed
  do not meet the thresholds. C1's jmp_history is one entry longer than C's.
- The whole thing repeats until ENOMEM.



* Re: [PATCH bpf-next v1 1/2] bpf: force checkpoint when jmp history is too long
  2024-10-22  2:27   ` Eduard Zingerman
@ 2024-10-22  2:53     ` Alexei Starovoitov
  2024-10-22  5:38       ` Eduard Zingerman
  0 siblings, 1 reply; 15+ messages in thread
From: Alexei Starovoitov @ 2024-10-22  2:53 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Kernel Team, Yonghong Song

On Mon, Oct 21, 2024 at 7:27 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Mon, 2024-10-21 at 19:18 -0700, Alexei Starovoitov wrote:
>
> [...]
>
> > >     0: r7 = *(u16 *)(r1 +0);"
> > >     1: r7 += 0x1ab064b9;"
> > >     2: if r7 & 0x702000 goto 1b;
> > >     3: r7 &= 0x1ee60e;"
> > >     4: r7 += r1;"
> > >     5: if r7 s> 0x37d2 goto +0;"
> > >     6: r0 = 0;"
> > >     7: exit;"
>
> [...]
>
> > > And observe verification log:
> > >
> > >     ...
> > >     is_state_visited: new checkpoint at 5, resetting env->jmps_processed
> > >     5: R1=ctx() R7=ctx(...)
> > >     5: (65) if r7 s> 0x37d2 goto pc+0     ; R7=ctx(...)
> > >     6: (b7) r0 = 0                        ; R0_w=0
> > >     7: (95) exit
> > >
> > >     from 5 to 6: R1=ctx() R7=ctx(...) R10=fp0
> > >     6: R1=ctx() R7=ctx(...) R10=fp0
> > >     6: (b7) r0 = 0                        ; R0_w=0
> > >     7: (95) exit
> > >     is_state_visited: suppressing checkpoint at 1, 3 jmps processed, cur->jmp_history_cnt is 74
> > >
> > >     from 2 to 1: R1=ctx() R7_w=scalar(...) R10=fp0
> > >     1: R1=ctx() R7_w=scalar(...) R10=fp0
> > >     1: (07) r7 += 447767737
> > >     is_state_visited: suppressing checkpoint at 2, 3 jmps processed, cur->jmp_history_cnt is 75
> > >     2: R7_w=scalar(...)
> > >     2: (45) if r7 & 0x702000 goto pc-2
> > >     ... mark_precise 152 steps for r7 ...
> > >     2: R7_w=scalar(...)
> > >     is_state_visited: suppressing checkpoint at 1, 4 jmps processed, cur->jmp_history_cnt is 75
> > >     1: (07) r7 += 447767737
> > >     is_state_visited: suppressing checkpoint at 2, 4 jmps processed, cur->jmp_history_cnt is 76
> > >     2: R7_w=scalar(...)
> > >     2: (45) if r7 & 0x702000 goto pc-2
> > >     ...
> > >     BPF program is too large. Processed 257 insn
> > >
> > > The log output shows that checkpoint at label (1) is never created,
> > > because it is suppressed by `skip_inf_loop_check` logic:
> > > a. When 'if' at (2) is processed it pushes a state with insn_idx (1)
> > >    onto stack and proceeds to (3);
> > > b. At (5) checkpoint is created, and this resets
> > >    env->{jmps,insns}_processed.
> > > c. Verification proceeds and reaches `exit`;
> > > d. State saved at step (a) is popped from stack and is_state_visited()
> > >    considers if checkpoint needs to be added, but because
> > >    env->{jmps,insns}_processed had been just reset at step (b)
> > >    the `skip_inf_loop_check` logic forces `add_new_state` to false.
> > > e. Verifier proceeds with current state, which slowly accumulates
> > >    more and more entries in the jump history.
> >
> > I'm still not sure why it grew to thousands of entries in jmp_history.
> > Looking at the above trace, jmps_processed grows 1:1 with jmp_history_cnt.
> > Also, cur->jmp_history_cnt is reset to zero at the same time as
> > jmps_processed.
> > So in the above test, did the 75 vs 4 difference come from jmp_history
> > entries that were there before the loop?
>
>     0: r7 = *(u16 *)(r1 +0);
>     1: r7 += 0x1ab064b9;
>     2: if r7 & 0x702000 goto 1b;
>     3: r7 &= 0x1ee60e;
>     4: r7 += r1;
>     5: if r7 s> 0x37d2 goto +0;
>     6: r0 = 0;
>     7: exit;
>
> - When the 'if' at (2) is processed, the current state is copied (let's call
>   this copy C); the copy is put on the stack for later processing,
>   and its jump history is not cleared.
> - Then the current state proceeds, verifying 3-5-6-7. At (5) a checkpoint is
>   created and env->{jmps,insns}_processed are reset.
> - Then state C is popped from the stack; it goes back to (1) and then (2).
>   At (2) a copy C1 is created but no checkpoint, as env->{jmps,insns}_processed
>   do not meet the thresholds. C1's jmp_history is one entry longer than C's.
> - The whole thing repeats until ENOMEM.

I see. Thanks for explaining.
So the bug is actually in the reset logic of jmps/insns_processed
coupled with the push/pop of states.
I think let's add a cur->jmp_history_cnt < 40 check for now
and target the bpf tree (assuming no veristat regressions),
but for bpf-next we probably need to follow up.
We can probably remove the jmps_processed counter
and replace it with jmp_history_cnt.

Based on the above explanation, the insns_processed counter is
also bogus.


* Re: [PATCH bpf-next v1 1/2] bpf: force checkpoint when jmp history is too long
  2024-10-22  2:03   ` Alexei Starovoitov
@ 2024-10-22  3:19     ` Andrii Nakryiko
  0 siblings, 0 replies; 15+ messages in thread
From: Andrii Nakryiko @ 2024-10-22  3:19 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Eduard Zingerman, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Martin KaFai Lau, Kernel Team, Yonghong Song

On Mon, Oct 21, 2024 at 7:03 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Mon, Oct 21, 2024 at 1:23 PM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > On Thu, Oct 17, 2024 at 7:03 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > >
> > > A specifically crafted program might trick verifier into growing very
> > > long jump history within a single bpf_verifier_state instance.
> > > Very long jump history makes mark_chain_precision() unreasonably slow,
> > > especially in case if verifier processes a loop.
> > >
> > > Mitigate this by forcing new state in is_state_visited() in case if
> > > current state's jump history is too long.
> > >
> > > Use same constant as in `skip_inf_loop_check`, but multiply it by
> > > arbitrarily chosen value 2 to account for jump history containing not
> > > only information about jumps, but also information about stack access.
> > >
> > > For an example of problematic program consider the code below,
> > > w/o this patch the example is processed by verifier for ~15 minutes,
> > > before failing to allocate big-enough chunk for jmp_history.
> > >
> > >     0: r7 = *(u16 *)(r1 +0);"
> > >     1: r7 += 0x1ab064b9;"
> > >     2: if r7 & 0x702000 goto 1b;
> > >     3: r7 &= 0x1ee60e;"
> > >     4: r7 += r1;"
> > >     5: if r7 s> 0x37d2 goto +0;"
> > >     6: r0 = 0;"
> > >     7: exit;"
> > >
> > > Perf profiling shows that most of the time is spent in
> > > mark_chain_precision() ~95%.
> > >
> > > The easiest way to explain why this program causes problems is to
> > > apply the following patch:
> > >
> > >     diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > >     index 0c216e71cec7..4b4823961abe 100644
> > >     \--- a/include/linux/bpf.h
> > >     \+++ b/include/linux/bpf.h
> > >     \@@ -1926,7 +1926,7 @@ struct bpf_array {
> > >             };
> > >      };
> > >
> > >     -#define BPF_COMPLEXITY_LIMIT_INSNS      1000000 /* yes. 1M insns */
> > >     +#define BPF_COMPLEXITY_LIMIT_INSNS      256 /* yes. 1M insns */
> > >      #define MAX_TAIL_CALL_CNT 33
> > >
> > >      /* Maximum number of loops for bpf_loop and bpf_iter_num.
> > >     diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > >     index f514247ba8ba..75e88be3bb3e 100644
> > >     \--- a/kernel/bpf/verifier.c
> > >     \+++ b/kernel/bpf/verifier.c
> > >     \@@ -18024,8 +18024,13 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
> > >      skip_inf_loop_check:
> > >                             if (!force_new_state &&
> > >                                 env->jmps_processed - env->prev_jmps_processed < 20 &&
> > >     -                           env->insn_processed - env->prev_insn_processed < 100)
> > >     +                           env->insn_processed - env->prev_insn_processed < 100) {
> > >     +                               verbose(env, "is_state_visited: suppressing checkpoint at %d, %d jmps processed, cur->jmp_history_cnt is %d\n",
> > >     +                                       env->insn_idx,
> > >     +                                       env->jmps_processed - env->prev_jmps_processed,
> > >     +                                       cur->jmp_history_cnt);
> > >                                     add_new_state = false;
> > >     +                       }
> > >                             goto miss;
> > >                     }
> > >                     /* If sl->state is a part of a loop and this loop's entry is a part of
> > >     \@@ -18142,6 +18147,9 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
> > >             if (!add_new_state)
> > >                     return 0;
> > >
> > >     +       verbose(env, "is_state_visited: new checkpoint at %d, resetting env->jmps_processed\n",
> > >     +               env->insn_idx);
> > >     +
> > >             /* There were no equivalent states, remember the current one.
> > >              * Technically the current state is not proven to be safe yet,
> > >              * but it will either reach outer most bpf_exit (which means it's safe)
> > >
> > > And observe verification log:
> > >
> > >     ...
> > >     is_state_visited: new checkpoint at 5, resetting env->jmps_processed
> > >     5: R1=ctx() R7=ctx(...)
> > >     5: (65) if r7 s> 0x37d2 goto pc+0     ; R7=ctx(...)
> > >     6: (b7) r0 = 0                        ; R0_w=0
> > >     7: (95) exit
> > >
> > >     from 5 to 6: R1=ctx() R7=ctx(...) R10=fp0
> > >     6: R1=ctx() R7=ctx(...) R10=fp0
> > >     6: (b7) r0 = 0                        ; R0_w=0
> > >     7: (95) exit
> > >     is_state_visited: suppressing checkpoint at 1, 3 jmps processed, cur->jmp_history_cnt is 74
> > >
> > >     from 2 to 1: R1=ctx() R7_w=scalar(...) R10=fp0
> > >     1: R1=ctx() R7_w=scalar(...) R10=fp0
> > >     1: (07) r7 += 447767737
> > >     is_state_visited: suppressing checkpoint at 2, 3 jmps processed, cur->jmp_history_cnt is 75
> > >     2: R7_w=scalar(...)
> > >     2: (45) if r7 & 0x702000 goto pc-2
> > >     ... mark_precise 152 steps for r7 ...
> > >     2: R7_w=scalar(...)
> > >     is_state_visited: suppressing checkpoint at 1, 4 jmps processed, cur->jmp_history_cnt is 75
> > >     1: (07) r7 += 447767737
> > >     is_state_visited: suppressing checkpoint at 2, 4 jmps processed, cur->jmp_history_cnt is 76
> > >     2: R7_w=scalar(...)
> > >     2: (45) if r7 & 0x702000 goto pc-2
> > >     ...
> > >     BPF program is too large. Processed 257 insn
> > >
> > > The log output shows that the checkpoint at label (1) is never created,
> > > because it is suppressed by the `skip_inf_loop_check` logic:
> > > a. When 'if' at (2) is processed it pushes a state with insn_idx (1)
> > >    onto stack and proceeds to (3);
> > > b. At (5) checkpoint is created, and this resets
> > >    env->{jmps,insns}_processed.
> > > c. Verification proceeds and reaches `exit`;
> > > d. State saved at step (a) is popped from stack and is_state_visited()
> > >    considers if checkpoint needs to be added, but because
> > >    env->{jmps,insns}_processed had been just reset at step (b)
> > >    the `skip_inf_loop_check` logic forces `add_new_state` to false.
> > > e. Verifier proceeds with current state, which slowly accumulates
> > >    more and more entries in the jump history.
> > >
> > > The accumulation of entries in the jump history is a problem because
> > > of two factors:
> > > - it eventually exhausts memory available for kmalloc() allocation;
> > > - mark_chain_precision() traverses the jump history of a state,
> > >   meaning that if `r7` is marked precise, the verifier would iterate
> > >   over the ever-growing jump history until the parent state boundary
> > >   is reached.
> > >
> > > (note: the log also shows a REG INVARIANTS VIOLATION warning
> > >        upon jset processing, but that's another bug to fix).
> > >
> > > With this patch applied, the example above is rejected by the verifier
> > > in under 1s, after reaching the 1M instruction limit.
> > >
> > > The program is a simplified reproducer from the syzbot report [1].
> > > Previous discussion can be found at [2].
> > > The patch does not cause any changes in verification performance
> > > when tested on selftests from veristat.cfg and cilium programs taken
> > > from [3].
> > >
> > > [1] https://lore.kernel.org/bpf/670429f6.050a0220.49194.0517.GAE@google.com/
> > > [2] https://lore.kernel.org/bpf/20241009021254.2805446-1-eddyz87@gmail.com/
> > > [3] https://github.com/anakryiko/cilium
> > >
> > > Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
> > > ---
> > >  kernel/bpf/verifier.c | 14 ++++++++++++--
> > >  1 file changed, 12 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > > index f514247ba8ba..f64c831a9278 100644
> > > --- a/kernel/bpf/verifier.c
> > > +++ b/kernel/bpf/verifier.c
> > > @@ -17873,13 +17873,23 @@ static bool iter_active_depths_differ(struct bpf_verifier_state *old, struct bpf
> > >         return false;
> > >  }
> > >
> > > +#define MAX_JMPS_PER_STATE 20
> > > +
> > >  static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
> > >  {
> > >         struct bpf_verifier_state_list *new_sl;
> > >         struct bpf_verifier_state_list *sl, **pprev;
> > >         struct bpf_verifier_state *cur = env->cur_state, *new, *loop_entry;
> > >         int i, j, n, err, states_cnt = 0;
> > > -       bool force_new_state = env->test_state_freq || is_force_checkpoint(env, insn_idx);
> > > +       bool force_new_state = env->test_state_freq || is_force_checkpoint(env, insn_idx) ||
> > > +                              /* - Long jmp history hinders mark_chain_precision performance,
> > > +                               *   so force new state if jmp history of current state exceeds
> > > +                               *   a threshold.
> > > +                               * - Jmp history records not only jumps, but also stack access,
> > > +                               *   so keep this constant 2x times the limit imposed on
> > > +                               *   env->jmps_processed for loop cases (see skip_inf_loop_check).
> > > +                               */
> > > +                              cur->jmp_history_cnt > MAX_JMPS_PER_STATE * 2;
> >
> > this feels like the wrong place to add this heuristic. Just a few lines
> > below there is:
> >
> >
> > if (env->jmps_processed - env->prev_jmps_processed >= 2 &&
> >     env->insn_processed - env->prev_insn_processed >= 8)
> >         add_new_state = true;
> >
> > Please add the jmp_history_cnt check here, as it conceptually fits with
> > the jmps_processed and insn_processed checks. It also has a huge comment
> > with justification already, so we might as well just extend that for
> > jmp_history_cnt.
>
> I think adding if (cur->jmp_history_cnt > 20) add_new_state = true;
> won't help. It will get back to false.
> But I agree that tweaking force_new_state also is not quite clean.
>
> btw the "huge comment" may need to be revised :)
> bpf progs today probably look different than they were in 2019.
>
> >
> > pw-bot: cr
> >
> > >         bool add_new_state = force_new_state;
> > >         bool force_exact;
> > >
> > > @@ -18023,7 +18033,7 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
> > >                          */
> > >  skip_inf_loop_check:
> > >                         if (!force_new_state &&
> > > -                           env->jmps_processed - env->prev_jmps_processed < 20 &&
> > > +                           env->jmps_processed - env->prev_jmps_processed < MAX_JMPS_PER_STATE &&
> > >                             env->insn_processed - env->prev_insn_processed < 100)
> > >                                 add_new_state = false;
> >
> > and then this one is logically matching add_new_state = true; case
> > above that I mentioned.
> >
> >
> > With these changes, I'd drop the * 2 factor for one of the checks. If
> > necessary, just bump it to 30 or so if you are afraid of stack
> > accesses. But let's keep it simple with one threshold, if possible?
>
> +1
>
> 2/8 heuristic is working together with 20/100.
>
> Instead of tweaking force_new_state earlier, it's better to do:
> if (!force_new_state && cur->jmp_history_cnt < N &&
>     env->jmps_processed - env->prev_jmps_processed < 20 && ..)
>     add_new_state = false;

Yep, I actually had *exactly* this in mind, ack.


Basically all these heuristics express the idea of doing just the
right amount of work between checkpoints, not too little, not too
much. Jump history size (accumulated in the current state) will be a
third measure of "enough useful work".
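
To make the combination concrete, here is a hypothetical sketch in
Python (not the kernel code; the function name and the threshold of 40
are assumptions drawn from this thread) of how the three measures could
feed the checkpointing decision:

```python
def should_add_new_state(force_new_state: bool,
                         jmps_delta: int, insns_delta: int,
                         jmp_history_cnt: int) -> bool:
    """Simplified model of the checkpointing decision discussed above."""
    # A long in-state jump history alone justifies a checkpoint, since
    # it bounds the backwards walk done by mark_chain_precision().
    if force_new_state or jmp_history_cnt > 40:
        return True
    # The existing 2/8 heuristic: enough jumps and instructions have
    # been processed since the previous checkpoint.
    return jmps_delta >= 2 and insns_delta >= 8
```

Note this deliberately omits the skip_inf_loop_check path, which can
still veto add_new_state for already-seen instructions.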

>
> You're essentially proposing N == 40.
> Just add your existing comment next to the check.
> # define MAX_JMPS_PER_STATE is imo overkill.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH bpf-next v1 1/2] bpf: force checkpoint when jmp history is too long
  2024-10-22  2:53     ` Alexei Starovoitov
@ 2024-10-22  5:38       ` Eduard Zingerman
  2024-10-23  2:52         ` Eduard Zingerman
  0 siblings, 1 reply; 15+ messages in thread
From: Eduard Zingerman @ 2024-10-22  5:38 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Kernel Team, Yonghong Song

On Mon, 2024-10-21 at 19:53 -0700, Alexei Starovoitov wrote:

[...]

> Instead of tweaking force_new_state earlier, it's better to do:
> if (!force_new_state && cur->jmp_history_cnt < N &&
>     env->jmps_processed - env->prev_jmps_processed < 20 && ..)
>     add_new_state = false;
>
> You're essentially proposing N == 40.
> Just add your existing comment next to the check.
> # define MAX_JMPS_PER_STATE is imo overkill.

This condition is only triggered when the verifier processes loops,
but this corner case is also possible w/o loops,
e.g. consider the following program:

    func#0 @0
    0: R1=ctx() R10=fp0
    0: (79) r2 = *(u64 *)(r1 +0)          ; R1=ctx() R2_w=scalar()
    1: (b7) r0 = 0                        ; R0_w=0
    2: (35) if r2 >= 0x0 goto pc+5
    mark_precise: frame0: last_idx 2 first_idx 0 subseq_idx -1 
    mark_precise: frame0: regs=r2 stack= before 1: (b7) r0 = 0
    mark_precise: frame0: regs=r2 stack= before 0: (79) r2 = *(u64 *)(r1 +0)
    2: R2_w=scalar()
    8: (35) if r2 >= 0x1 goto pc+6        ; R2_w=0
    9: (05) goto pc+0
    10: (05) goto pc+0
    11: (05) goto pc+0
    12: (05) goto pc+0
    13: (05) goto pc+0
    14: (95) exit
    
    from 8 to 15: R0_w=0 R1=ctx() R2_w=scalar(umin=1) R10=fp0
    15: R0_w=0 R1=ctx() R2_w=scalar(umin=1) R10=fp0
    15: (35) if r2 >= 0x2 goto pc+6       ; R2_w=1
    16: (05) goto pc+0
    17: (05) goto pc+0
    18: (05) goto pc+0
    19: (05) goto pc+0
    20: (05) goto pc+0
    21: (95) exit
    
    from 15 to 22: R0_w=0 R1=ctx() R2_w=scalar(umin=2) R10=fp0
    22: R0_w=0 R1=ctx() R2_w=scalar(umin=2) R10=fp0
    22: (35) if r2 >= 0x3 goto pc+6       ; R2_w=2
    23: (05) goto pc+0
    24: (05) goto pc+0
    25: (05) goto pc+0
    26: (05) goto pc+0
    27: (05) goto pc+0
    28: (95) exit

    ... and so on ...

Here the number of "goto +0;" instructions before each exit is chosen
specifically to force a checkpoint before "exit;" is processed.

This takes ~10 minutes to verify on master.
Surprisingly current patch does not seem to help,
I'll investigate this tomorrow.
Full example is in the end of the email.

> I see. Thanks for explaining.
> So the bug is actually in reset logic of jmps/insns_processed
> coupled with push/pop stack.
> I think let's add cur->jmp_history_cnt < 40 check for now
> and target bpf tree (assuming no veristat regressions),
> but for bpf-next we probably need to follow up.

You'd like different fixes for bpf and bpf-next?

> We can probably remove jmps_processed counter
> and replace it with jmp_history_cnt.

I tried that :)
Results are a bit surprising when the following patch is used:

    diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
    index 4513372c5bc8..4565c0c3e5b1 100644
    --- a/include/linux/bpf_verifier.h
    +++ b/include/linux/bpf_verifier.h
    @@ -468,6 +468,10 @@ struct bpf_verifier_state {
            u32 dfs_depth;
            u32 callback_unroll_depth;
            u32 may_goto_depth;
    +       /* number of jmps, calls, exits analyzed within this state */
    +       u32 jmps_processed;
    +       /* total number of instructions processed within this state */
    +       u32 insn_processed;
     }
    ...
    diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
    index f514247ba8ba..d61648d4c905 100644
    --- a/kernel/bpf/verifier.c
    +++ b/kernel/bpf/verifier.c
    ...
    @@ -17891,8 +17893,8 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
             * In tests that amounts to up to 50% reduction into total verifier
             * memory consumption and 20% verifier time speedup.
             */
    -       if (env->jmps_processed - env->prev_jmps_processed >= 2 &&
    -           env->insn_processed - env->prev_insn_processed >= 8)
    +       if (cur->jmps_processed >= 2 &&
    +           cur->insn_processed >= 8)
                    add_new_state = true;

File                              Program           Insns       (DIFF)  States   (DIFF)
--------------------------------  ----------------  ------------------  ---------------
bpf_xdp.o                         tail_lb_ipv4         +8003 (+18.27%)   -394 (-14.18%)
bpf_xdp.o                         tail_lb_ipv6        +19757 (+24.71%)   -431 (-11.23%)
loop1.bpf.o                       nested_loops             +0 (+0.00%)      +0 (+0.00%)
loop3.bpf.o                       while_true               +0 (+0.00%)      +0 (+0.00%)
pyperf100.bpf.o                   on_event               -213 (-0.32%)  -1128 (-19.26%)
pyperf180.bpf.o                   on_event              +7863 (+6.25%)  -1801 (-17.23%)
pyperf600.bpf.o                   on_event            -79267 (-14.44%)  -5090 (-17.24%)
pyperf600_nounroll.bpf.o          on_event             +39203 (+7.78%)  -6055 (-17.69%)
strobemeta.bpf.o                  on_event          +587119 (+246.59%)  -1909 (-37.09%)
strobemeta_nounroll1.bpf.o        on_event            +49553 (+91.96%)   -667 (-42.24%)
strobemeta_nounroll2.bpf.o        on_event           +109673 (+92.35%)  -1732 (-46.19%)
strobemeta_subprogs.bpf.o         on_event               -413 (-0.76%)   -582 (-38.77%)
test_cls_redirect.bpf.o           cls_redirect        +20875 (+58.96%)     -73 (-3.35%)
test_cls_redirect_subprogs.bpf.o  cls_redirect        +12337 (+20.37%)     +52 (+1.25%)
test_verif_scale1.bpf.o           balancer_ingress   +224652 (+41.09%)    -670 (-7.76%)
test_verif_scale2.bpf.o           balancer_ingress      +7145 (+0.92%)  +2983 (+97.87%)
test_verif_scale3.bpf.o           balancer_ingress   +162514 (+19.40%)  -1959 (-22.68%)
verifier_search_pruning.bpf.o     short_loop1                      N/A              N/A

Plus timer_mim.c starts to fail because we do not force a checkpoint at
entry to an async callback :)

> Based on the above explanation insns_processed counter is
> also bogus.

My reasoning is that jmp_history_cnt is what matters for
mark_chain_precision(), so it should be a marker for state cut-off;
insns_processed does not seem to add such a constraint.
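
To illustrate this reasoning with a toy model (an assumption-laden
sketch, not verifier code): when checkpoints are suppressed, each
processed jump both lengthens the in-state history and makes the next
precision walk longer, so total backtracking work grows quadratically:

```python
def total_precision_steps(iterations: int, entries_per_iteration: int = 2) -> int:
    """Total history entries visited by precision walks, assuming one
    walk per iteration over the whole in-state history so far."""
    total = 0
    history = 0
    for _ in range(iterations):
        history += entries_per_iteration  # jump history keeps growing
        total += history                  # one walk back to the parent state
    return total

# e.g. total_precision_steps(3) == 2 + 4 + 6 == 12
```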

---

diff --git a/tools/testing/selftests/bpf/prog_tests/jumpjump.c b/tools/testing/selftests/bpf/prog_tests/jumpjump.c
new file mode 100644
index 000000000000..f78675ba0341
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/jumpjump.c
@@ -0,0 +1,62 @@
+#include <test_progs.h>
+
+#define MAX_INSNS (512 * 1024)
+
+static char log_buf[16 * 1024 * 1024];
+
+void test_jumpjump(void)
+{
+	/* Generate a series of jumps that would take a long time to verify. */
+	struct bpf_insn *insns = NULL;
+	int jmps_processed = 0;
+	int insn_processed = 0;
+	int jmp_idx = -1;
+	int prog_fd = -1;
+	int cnt = 0;
+	int i = 0;
+
+	insns = calloc(MAX_INSNS + 32, sizeof(*insns));
+	if (!ASSERT_OK_PTR(insns, "insns = calloc()"))
+		return;
+	insns[i++] = BPF_LDX_MEM(BPF_DW, BPF_REG_2, BPF_REG_1, 0);
+	insn_processed++;
+	insns[i++] = BPF_MOV64_IMM(BPF_REG_0, 0);
+	insn_processed++;
+	while (i < MAX_INSNS) {
+		if (jmp_idx > 0)
+			insns[jmp_idx].off = i - jmp_idx - 1;
+		jmp_idx = i;
+		insns[i++] = BPF_JMP_IMM(BPF_JGE, BPF_REG_2, cnt++, 0);
+		jmps_processed++;
+		insn_processed++;
+		while (insn_processed < 7 || jmps_processed < 2) {
+			insns[i++] = BPF_JMP_IMM(BPF_JA, 0, 0, 0);
+			jmps_processed++;
+			insn_processed++;
+		}
+		insns[i++] = BPF_EXIT_INSN();
+		jmps_processed = 1;
+		insn_processed = 1;
+	}
+	if (jmp_idx > 0)
+		insns[jmp_idx].off = i - jmp_idx - 1;
+	insns[i++] = BPF_EXIT_INSN();
+
+	LIBBPF_OPTS(bpf_prog_load_opts, opts);
+	opts.log_buf = log_buf;
+	opts.log_size = sizeof(log_buf);
+	opts.log_level = 0;
+	if (env.verbosity >= VERBOSE_VERY)
+		opts.log_level = 1 | 2 | 4;
+	prog_fd = bpf_prog_load(BPF_PROG_TYPE_RAW_TRACEPOINT, NULL, "GPL", insns, i, &opts);
+	if (prog_fd < 0)
+		PRINT_FAIL("Can't load program, errno %d (%s)\n", errno, strerror(errno));
+	if (env.verbosity >= VERBOSE_VERY || prog_fd < 0) {
+		printf("Verifier log:\n%s\n", log_buf);
+	}
+	if (prog_fd < 0)
+		goto out;
+out:
+	close(prog_fd);
+	free(insns);
+}



* Re: [PATCH bpf-next v1 1/2] bpf: force checkpoint when jmp history is too long
  2024-10-22  5:38       ` Eduard Zingerman
@ 2024-10-23  2:52         ` Eduard Zingerman
  2024-10-23 17:31           ` Andrii Nakryiko
  0 siblings, 1 reply; 15+ messages in thread
From: Eduard Zingerman @ 2024-10-23  2:52 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Kernel Team, Yonghong Song

On Mon, 2024-10-21 at 22:38 -0700, Eduard Zingerman wrote:

[...]

> This takes ~10 minutes to verify on master.
> Surprisingly current patch does not seem to help,
> I'll investigate this tomorrow.
> Full example is in the end of the email.

I messed up the example a little bit.
The example shared previously takes so long to process because of the
"goto +0;" instructions: opt_remove_nops() deletes such jumps, and we
know that bpf_remove_insns() is not efficient.

The corrected example uses conditional jumps in place of "goto +0;" and
slightly adjusted counters. The full program is here:
https://gist.github.com/eddyz87/cb813387323b78bcd6a7e264fc44c817
Here is its verification log to get the idea:

    0:  (79) r2 = *(u64 *)(r1 +0)
    1:  (b7) r0 = 0
    2:  (35) if r2 >= 0x1 goto pc+5
    push_stack: at 2, jmp_history_cnt 0
    3:  (35) if r0 >= 0x0 goto pc+0
    4:  (35) if r0 >= 0x0 goto pc+0
    5:  (35) if r0 >= 0x0 goto pc+0
    6:  (35) if r0 >= 0x0 goto pc+0
    is_state_visited: new checkpoint at 7, resetting env->jmps_processed
    7:  (95) exit
    8:  (35) if r2 >= 0x2 goto pc+7
    push_stack: at 8, jmp_history_cnt 1
    9:  (35) if r0 >= 0x0 goto pc+0
    10: (35) if r0 >= 0x0 goto pc+0
    11: (35) if r0 >= 0x0 goto pc+0
    12: (35) if r0 >= 0x0 goto pc+0
    13: (35) if r0 >= 0x0 goto pc+0
    14: (35) if r0 >= 0x0 goto pc+0
    is_state_visited: new checkpoint at 15, resetting env->jmps_processed
    15: (95) exit
    ...
    320: (35) if r2 >= 0x29 goto pc+7
    push_stack: at 320, jmp_history_cnt 40
    320: R2_w=40
    321: (35) if r0 >= 0x0 goto pc+0
    322: (35) if r0 >= 0x0 goto pc+0
    323: (35) if r0 >= 0x0 goto pc+0
    324: (35) if r0 >= 0x0 goto pc+0
    325: (35) if r0 >= 0x0 goto pc+0
    326: (35) if r0 >= 0x0 goto pc+0
    is_state_visited: new checkpoint at 327, resetting env->jmps_processed
    327: (95) exit
    ...

A bpf program w/o loops: at each 'if r2 >= ...' push_stack() saves a
state with an ever increasing jump history.
- the right number of 'if r0 >= ...' instructions is maintained before
  'exit' to force a new checkpoint;
- 'exit' is processed;
- a state is popped from the stack and the first insn it processes is
  'if r2 >= ...', thus a new state is saved by push_stack()
  with a jump history longer by 1.
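
The growth pattern above can be sketched with a toy model (an
assumption, not verifier code): each pushed state inherits the current
history, one entry longer than for the previous branch, so the total
number of history entries copied by push_stack()/copy_verifier_state()
is quadratic in the number of branches:

```python
def total_history_entries_copied(num_branches: int) -> int:
    """Sum of history lengths at each push_stack() call."""
    copied = 0
    history_len = 0
    for _ in range(num_branches):
        copied += history_len  # copying the state copies its whole history
        history_len += 1       # each 'if r2 >= ...' adds one history entry
    return copied

# matches the log above: at the 41st branch push_stack() sees
# jmp_history_cnt == 40, and 0 + 1 + ... + 40 entries have been copied
```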

On master this fails with ENOMEM and the following error in the log:

    [  418.083600] test_progs: page allocation failure: order:7, mode:0x140cc0(GFP_USER|__GFP_COMP), \
                     nodemask=(null),cpuset=/,mems_allowed=0
                   ...
    [  418.084158] Call Trace:
                    ...
    [  418.084649]  krealloc_noprof+0x53/0xd0
    [  418.084688]  copy_verifier_state+0x78/0x390
                    ...

The same happens if the jmp_history_cnt check is moved to 'skip_inf_loop_check':

--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -18022,7 +18022,7 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
                         * at the end of the loop are likely to be useful in pruning.
                         */
 skip_inf_loop_check:
-                       if (!force_new_state &&
+                       if (!force_new_state && cur->jmp_history_cnt < 40 &&
                            env->jmps_processed - env->prev_jmps_processed < 20 &&
                            env->insn_processed - env->prev_insn_processed < 100)


Or if it is in the else branch, simply because 'skip_inf_loop_check' is
for instructions that have already been seen on the current verification path.

However, with the change suggested in this patch-set such an ENOMEM
situation is not possible. Hence I insist that a large enough
jmp_history_cnt should force a new state, and the point where I put
the check covers more cases than the alternatives.
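
For scale, a back-of-envelope sketch of why the jump-history krealloc()
in copy_verifier_state() ends up making the order-7 (128-page, 512 KiB)
request seen in the allocation failure above (the 16-byte per-entry
size is an assumption for illustration):

```python
import math

ENTRY_SIZE = 16   # assumed size of one jump-history entry, bytes
PAGE_SIZE = 4096

def history_alloc_order(jmp_history_cnt: int) -> int:
    """Page-allocation order needed to hold the history buffer."""
    pages = max(1, math.ceil(jmp_history_cnt * ENTRY_SIZE / PAGE_SIZE))
    return math.ceil(math.log2(pages))

# history_alloc_order(32 * 1024) == 7, i.e. a 512k-insn program that
# keeps accumulating history in a single state reaches the order where
# the page allocator starts failing
```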



* Re: [PATCH bpf-next v1 1/2] bpf: force checkpoint when jmp history is too long
  2024-10-23  2:52         ` Eduard Zingerman
@ 2024-10-23 17:31           ` Andrii Nakryiko
  0 siblings, 0 replies; 15+ messages in thread
From: Andrii Nakryiko @ 2024-10-23 17:31 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: Alexei Starovoitov, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Martin KaFai Lau, Kernel Team, Yonghong Song

On Tue, Oct 22, 2024 at 7:52 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Mon, 2024-10-21 at 22:38 -0700, Eduard Zingerman wrote:
>
> [...]
>
> > This takes ~10 minutes to verify on master.
> > Surprisingly current patch does not seem to help,
> > I'll investigate this tomorrow.
> > Full example is in the end of the email.
>

[...]

> On master this fails with ENOMEM and the following error in the log:
>

Have you tried applying my insn_history optimization? It should
definitely avoid -ENOMEM, no? It still might be slow, but probably
much faster than what we have now. It would be nice if you could rebase
my old patch and land it (without any of the precision changes, that's
a second step).

>     [  418.083600] test_progs: page allocation failure: order:7, mode:0x140cc0(GFP_USER|__GFP_COMP), \
>                      nodemask=(null),cpuset=/,mems_allowed=0
>                    ...
>     [  418.084158] Call Trace:
>                     ...
>     [  418.084649]  krealloc_noprof+0x53/0xd0
>     [  418.084688]  copy_verifier_state+0x78/0x390
>                     ...
>
> Same happens if jmp_history_cnt check is moved to 'skip_inf_loop_check':
>
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -18022,7 +18022,7 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
>                          * at the end of the loop are likely to be useful in pruning.
>                          */
>  skip_inf_loop_check:
> -                       if (!force_new_state &&
> +                       if (!force_new_state && cur->jmp_history_cnt < 40 &&
>                             env->jmps_processed - env->prev_jmps_processed < 20 &&
>                             env->insn_processed - env->prev_insn_processed < 100)
>
>
> Or if it is in the else branch. Simply because 'skip_inf_loop_check' is
> for instructions that have been already seen on the current verification path.
>
> However, with change suggested in this patch-set such ENOMEM situation
> is not possible. Hence I insist that large enough jmp_history_cnt
> should force a new state, and point where I put the check covers more
> cases then alternatives.
>

I think we are in agreement that jmp_history_cnt should cause a new
state. But ok, using force_new_state is fine by me, it actually allows
us to remove duplication of logic. But then let's do this (it gets
unwieldy to combine variable declaration and initialization; it's a
non-trivial amount of logic and should be its own statement):

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 587a6c76e564..1f30ef99b246 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -17886,9 +17886,11 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
        struct bpf_verifier_state_list *sl, **pprev;
        struct bpf_verifier_state *cur = env->cur_state, *new, *loop_entry;
        int i, j, n, err, states_cnt = 0;
-       bool force_new_state = env->test_state_freq || is_force_checkpoint(env, insn_idx);
-       bool add_new_state = force_new_state;
-       bool force_exact;
+       bool force_new_state, add_new_state, force_exact;
+
+       force_new_state = env->test_state_freq ||
+                         is_force_checkpoint(env, insn_idx) ||
+                         cur->jmp_history_cnt > 40;

        /* bpf progs typically have pruning point every 4 instructions
         * http://vger.kernel.org/bpfconf2019.html#session-1
@@ -17898,6 +17900,7 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
         * In tests that amounts to up to 50% reduction into total verifier
         * memory consumption and 20% verifier time speedup.
         */
+       add_new_state = force_new_state;
        if (env->jmps_processed - env->prev_jmps_processed >= 2 &&
            env->insn_processed - env->prev_insn_processed >= 8)
                add_new_state = true;

There is no need to adjust skip_inf_loop_check condition because it
will already work fine due to !force_new_state check there. No?

And I wouldn't elaborate *that* much in a comment about
jmp_history_cnt > 40. It's a heuristic: we keep the jump history short
enough. A multi-line comment here is just distracting, IMO. I wouldn't
add MAX_JMPS_PER_STATE either; it sounds way too "official". It's a
heuristic, not some sort of fundamental rule.


end of thread, other threads:[~2024-10-23 17:31 UTC | newest]

Thread overview: 15+ messages
2024-10-18  2:03 [PATCH bpf-next v1 1/2] bpf: force checkpoint when jmp history is too long Eduard Zingerman
2024-10-18  2:03 ` [PATCH bpf-next v1 2/2] selftests/bpf: test with a very short loop Eduard Zingerman
2024-10-18 11:05   ` Daniel Borkmann
2024-10-18 11:03 ` [PATCH bpf-next v1 1/2] bpf: force checkpoint when jmp history is too long Daniel Borkmann
2024-10-18 16:47   ` Eduard Zingerman
2024-10-21  7:53     ` Daniel Borkmann
2024-10-21 20:23 ` Andrii Nakryiko
2024-10-22  2:03   ` Alexei Starovoitov
2024-10-22  3:19     ` Andrii Nakryiko
2024-10-22  2:18 ` Alexei Starovoitov
2024-10-22  2:27   ` Eduard Zingerman
2024-10-22  2:53     ` Alexei Starovoitov
2024-10-22  5:38       ` Eduard Zingerman
2024-10-23  2:52         ` Eduard Zingerman
2024-10-23 17:31           ` Andrii Nakryiko
