* [PATCH bpf-next v3 0/2] bpf, verifier: Improve precision of BPF_MUL
@ 2024-12-14 19:04 Matan Shachnai
From: Matan Shachnai @ 2024-12-14 19:04 UTC
  To: ast
  Cc: harishankar.vishwanathan, srinivas.narayana, santosh.nagarakatte,
	m.shachnai, Matan Shachnai, Daniel Borkmann, John Fastabend,
	Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Mykola Lysenko, Shuah Khan, Cupertino Miranda, Menglong Dong, bpf,
	linux-kernel, linux-kselftest

Hi All,

This patch set improves the precision of BPF_MUL and adds test cases
that illustrate the precision gains from using signed and unsigned
bounds.

Thanks for taking the time to review, and in particular to Eduard for
his feedback!

Best,
Matan

Changes from v1:
 - Fixed a typo in the patch.

Changes from v2:
 - Added signed multiplication to BPF_MUL.
 - Added test cases to exercise BPF_MUL.
 - Reordered patches in the series.

Matan Shachnai (2):
  bpf, verifier: Improve precision of BPF_MUL
  selftests/bpf: Add testcases for BPF_MUL

 kernel/bpf/verifier.c                         |  72 +++++-----
 .../selftests/bpf/progs/verifier_bounds.c     | 134 ++++++++++++++++++
 2 files changed, 166 insertions(+), 40 deletions(-)

-- 
2.25.1


* [PATCH bpf-next v3 1/2] bpf, verifier: Improve precision of BPF_MUL
@ 2024-12-14 19:04 Matan Shachnai
From: Matan Shachnai @ 2024-12-14 19:04 UTC
  To: ast
  Cc: harishankar.vishwanathan, srinivas.narayana, santosh.nagarakatte,
	m.shachnai, Matan Shachnai, Daniel Borkmann, John Fastabend,
	Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Mykola Lysenko, Shuah Khan, Cupertino Miranda, Menglong Dong, bpf,
	linux-kernel, linux-kselftest

This patch improves (or maintains) the precision of register value tracking
in BPF_MUL across all possible inputs. It also simplifies
scalar32_min_max_mul() and scalar_min_max_mul().

As it stands, BPF_MUL is composed of three functions:

case BPF_MUL:
  tnum_mul();
  scalar32_min_max_mul();
  scalar_min_max_mul();

The current implementation of scalar_min_max_mul() restricts the u64 input
ranges of dst_reg and src_reg to be within [0, U32_MAX]:

    /* Both values are positive, so we can work with unsigned and
     * copy the result to signed (unless it exceeds S64_MAX).
     */
    if (umax_val > U32_MAX || dst_reg->umax_value > U32_MAX) {
        /* Potential overflow, we know nothing */
        __mark_reg64_unbounded(dst_reg);
        return;
    }

This restriction is done to avoid unsigned overflow, which could otherwise
wrap the result around 0, and leave an unsound output where umin > umax. We
also observe that limiting these u64 input ranges to [0, U32_MAX] leads to
a loss of precision. Consider the case where the u64 bounds of dst_reg
are [0, 2^34] and the u64 bounds of src_reg are [0, 2^2]. The product of
these two ranges, [0, 2^36], doesn't overflow and is sound, yet the
current scalar_min_max_mul() sets the entire register state to
unbounded.
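
As a quick sanity check of this example (a userspace sketch, not part of
the patch), __builtin_mul_overflow() -- the compiler builtin that the
kernel's check_mul_overflow() wraps -- confirms that the product of
these upper bounds fits in a u64:

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        uint64_t dst_umax = 1ULL << 34; /* dst_reg u64 upper bound */
        uint64_t src_umax = 1ULL << 2;  /* src_reg u64 upper bound */
        uint64_t prod;

        /* 2^34 * 2^2 = 2^36, well below U64_MAX: no overflow */
        if (!__builtin_mul_overflow(dst_umax, src_umax, &prod))
            printf("sound umax: %llu (2^36)\n",
                   (unsigned long long)prod);
        return 0;
    }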

Importantly, we update BPF_MUL to allow signed bound multiplication
(i.e. multiplying negative bounds) and to let the u64 inputs take on
any value in [0, U64_MAX]. We perform signed multiplication on two
bounds [a, b] and [c, d] by multiplying every combination of the bounds
(i.e. a*c, a*d, b*c, and b*d) and checking each product for overflow.
If any product overflows, we mark the signed bounds unbounded
[S64_MIN, S64_MAX]. If none overflows, we take the minimum of these
products as the resulting smin, and the maximum as the resulting smax.
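
For illustration (a standalone sketch, not the patch itself), the
four-product computation over plain s64 values, with the kernel's
min_array()/max_array() helpers replaced by an explicit loop, might
look like:

    #include <stdbool.h>
    #include <stdint.h>

    /* Multiply signed bounds [a, b] * [c, d]. On success, returns
     * false and stores the result in [*smin, *smax]. Returns true if
     * any corner product overflows, in which case the caller must
     * fall back to the unbounded range [S64_MIN, S64_MAX].
     */
    static bool mul_signed_bounds(int64_t a, int64_t b, int64_t c,
                                  int64_t d, int64_t *smin,
                                  int64_t *smax)
    {
        int64_t p[4];
        int i;

        if (__builtin_mul_overflow(a, c, &p[0]) ||
            __builtin_mul_overflow(a, d, &p[1]) ||
            __builtin_mul_overflow(b, c, &p[2]) ||
            __builtin_mul_overflow(b, d, &p[3]))
            return true;

        *smin = *smax = p[0];
        for (i = 1; i < 4; i++) {
            *smin = p[i] < *smin ? p[i] : *smin;
            *smax = p[i] > *smax ? p[i] : *smax;
        }
        return false;
    }

Checking the four corner products suffices because, for fixed y, x * y
is linear in x (and vice versa), so the extreme values of x * y over
the box [a, b] x [c, d] are always attained at its corners.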

The key idea here is that if there’s no possibility of overflow, either
when multiplying signed bounds or unsigned bounds, we can safely multiply the
respective bounds; otherwise, we set the bounds that exhibit overflow
(during multiplication) to unbounded.

    if (check_mul_overflow(*dst_umax, src_reg->umax_value, dst_umax) ||
        check_mul_overflow(*dst_umin, src_reg->umin_value, dst_umin)) {
        /* Overflow possible, we know nothing */
        dst_reg->umin_value = 0;
        dst_reg->umax_value = U64_MAX;
    }
    ...

Below, we provide an example BPF program that exhibits the imprecision
of the current BPF_MUL, where the output bounds are unbounded. In
contrast, the updated BPF_MUL produces a bounded register state:

BPF_LD_IMM64(BPF_REG_1, 11),
BPF_LD_IMM64(BPF_REG_2, 4503599627370624),
BPF_ALU64_IMM(BPF_NEG, BPF_REG_2, 0),
BPF_ALU64_IMM(BPF_NEG, BPF_REG_2, 0),
BPF_ALU64_REG(BPF_AND, BPF_REG_1, BPF_REG_2),
BPF_LD_IMM64(BPF_REG_3, 809591906117232263),
BPF_ALU64_REG(BPF_MUL, BPF_REG_3, BPF_REG_1),
BPF_MOV64_IMM(BPF_REG_0, 1),
BPF_EXIT_INSN(),

Verifier log using the old BPF_MUL:

func#0 @0
0: R1=ctx() R10=fp0
0: (18) r1 = 0xb                      ; R1_w=11
2: (18) r2 = 0x10000000000080         ; R2_w=0x10000000000080
4: (87) r2 = -r2                      ; R2_w=scalar()
5: (87) r2 = -r2                      ; R2_w=scalar()
6: (5f) r1 &= r2                      ; R1_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=11,var_off=(0x0; 0xb)) R2_w=scalar()
7: (18) r3 = 0xb3c3f8c99262687        ; R3_w=0xb3c3f8c99262687
9: (2f) r3 *= r1                      ; R1_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=11,var_off=(0x0; 0xb)) R3_w=scalar()
...

Verifier log using the new BPF_MUL (note the more precise bounds at
instruction 9):

func#0 @0
0: R1=ctx() R10=fp0
0: (18) r1 = 0xb                      ; R1_w=11
2: (18) r2 = 0x10000000000080         ; R2_w=0x10000000000080
4: (87) r2 = -r2                      ; R2_w=scalar()
5: (87) r2 = -r2                      ; R2_w=scalar()
6: (5f) r1 &= r2                      ; R1_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=11,var_off=(0x0; 0xb)) R2_w=scalar()
7: (18) r3 = 0xb3c3f8c99262687        ; R3_w=0xb3c3f8c99262687
9: (2f) r3 *= r1                      ; R1_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=11,var_off=(0x0; 0xb)) R3_w=scalar(smin=0,smax=umax=0x7b96bb0a94a3a7cd,var_off=(0x0; 0x7fffffffffffffff))
...

Finally, we proved the soundness of the new scalar_min_max_mul() and
scalar32_min_max_mul() functions. Typically, multiplication operations are
expensive to check with bitvector-based solvers. We were able to prove the
soundness of these functions using Non-Linear Integer Arithmetic (NIA)
theory. Additionally, using Agni [2,3], we obtained the encodings for
scalar32_min_max_mul() and scalar_min_max_mul() in bitvector theory, and
were able to prove their soundness using 8-bit bitvectors (instead of
the 64-bit bitvectors that the functions actually use).
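
As a lightweight complement to these proofs (a randomized spot-check,
not a proof, and not part of the patch), the same 8-bit idea can be
exercised in userspace: model the signed-bound computation at int8_t
width and assert that wrapped products of sampled inputs always land
inside the computed bounds:

    #include <assert.h>
    #include <stdint.h>
    #include <stdlib.h>

    /* 8-bit model of the new signed-bound computation */
    static void mul_sbounds8(int8_t a, int8_t b, int8_t c, int8_t d,
                             int8_t *smin, int8_t *smax)
    {
        int8_t p[4];
        int i;

        if (__builtin_mul_overflow(a, c, &p[0]) ||
            __builtin_mul_overflow(a, d, &p[1]) ||
            __builtin_mul_overflow(b, c, &p[2]) ||
            __builtin_mul_overflow(b, d, &p[3])) {
            *smin = INT8_MIN; /* overflow: fall back to unbounded */
            *smax = INT8_MAX;
            return;
        }
        *smin = *smax = p[0];
        for (i = 1; i < 4; i++) {
            *smin = p[i] < *smin ? p[i] : *smin;
            *smax = p[i] > *smax ? p[i] : *smax;
        }
    }

    int main(void)
    {
        int iter;

        for (iter = 0; iter < 1000000; iter++) {
            int8_t a = (int8_t)rand(), b = (int8_t)rand();
            int8_t c = (int8_t)rand(), d = (int8_t)rand();
            int8_t t, lo, hi, x, y, prod;

            if (a > b) { t = a; a = b; b = t; }
            if (c > d) { t = c; c = d; d = t; }
            mul_sbounds8(a, b, c, d, &lo, &hi);
            /* sample one member of each input interval */
            x = (int8_t)(a + rand() % (b - a + 1));
            y = (int8_t)(c + rand() % (d - c + 1));
            prod = (int8_t)(x * y); /* wrapping multiply, as in BPF */
            assert(lo <= prod && prod <= hi);
        }
        return 0;
    }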

In conclusion, with this patch,

1. We were able to show that we can improve the overall precision of
   BPF_MUL. We proved (using an SMT solver) that this new version of
   BPF_MUL is at least as precise as the current version for all inputs
   and more precise for some inputs.

2. We were able to prove the soundness of the new scalar_min_max_mul() and
   scalar32_min_max_mul(). By leveraging the existing proof of tnum_mul
   [1], we can say that the composition of these three functions within
   BPF_MUL is sound.

[1] https://ieeexplore.ieee.org/abstract/document/9741267
[2] https://link.springer.com/chapter/10.1007/978-3-031-37709-9_12
[3] https://people.cs.rutgers.edu/~sn349/papers/sas24-preprint.pdf

Co-developed-by: Harishankar Vishwanathan <harishankar.vishwanathan@gmail.com>
Signed-off-by: Harishankar Vishwanathan <harishankar.vishwanathan@gmail.com>
Co-developed-by: Srinivas Narayana <srinivas.narayana@rutgers.edu>
Signed-off-by: Srinivas Narayana <srinivas.narayana@rutgers.edu>
Co-developed-by: Santosh Nagarakatte <santosh.nagarakatte@rutgers.edu>
Signed-off-by: Santosh Nagarakatte <santosh.nagarakatte@rutgers.edu>
Signed-off-by: Matan Shachnai <m.shachnai@gmail.com>
---
 kernel/bpf/verifier.c | 72 +++++++++++++++++++------------------------
 1 file changed, 32 insertions(+), 40 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index c855e7905c35..5b0f83cc7f4d 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -14118,64 +14118,56 @@ static void scalar_min_max_sub(struct bpf_reg_state *dst_reg,
 static void scalar32_min_max_mul(struct bpf_reg_state *dst_reg,
 				 struct bpf_reg_state *src_reg)
 {
-	s32 smin_val = src_reg->s32_min_value;
-	u32 umin_val = src_reg->u32_min_value;
-	u32 umax_val = src_reg->u32_max_value;
+	s32 *dst_smin = &dst_reg->s32_min_value;
+	s32 *dst_smax = &dst_reg->s32_max_value;
+	u32 *dst_umin = &dst_reg->u32_min_value;
+	u32 *dst_umax = &dst_reg->u32_max_value;
+	s32 tmp_prod[4];
 
-	if (smin_val < 0 || dst_reg->s32_min_value < 0) {
-		/* Ain't nobody got time to multiply that sign */
-		__mark_reg32_unbounded(dst_reg);
-		return;
-	}
-	/* Both values are positive, so we can work with unsigned and
-	 * copy the result to signed (unless it exceeds S32_MAX).
-	 */
-	if (umax_val > U16_MAX || dst_reg->u32_max_value > U16_MAX) {
-		/* Potential overflow, we know nothing */
-		__mark_reg32_unbounded(dst_reg);
-		return;
+	if (check_mul_overflow(*dst_umax, src_reg->u32_max_value, dst_umax) ||
+	    check_mul_overflow(*dst_umin, src_reg->u32_min_value, dst_umin)) {
+		/* Overflow possible, we know nothing */
+		dst_reg->u32_min_value = 0;
+		dst_reg->u32_max_value = U32_MAX;
 	}
-	dst_reg->u32_min_value *= umin_val;
-	dst_reg->u32_max_value *= umax_val;
-	if (dst_reg->u32_max_value > S32_MAX) {
+	if (check_mul_overflow(*dst_smin, src_reg->s32_min_value, &tmp_prod[0]) ||
+	    check_mul_overflow(*dst_smin, src_reg->s32_max_value, &tmp_prod[1]) ||
+	    check_mul_overflow(*dst_smax, src_reg->s32_min_value, &tmp_prod[2]) ||
+	    check_mul_overflow(*dst_smax, src_reg->s32_max_value, &tmp_prod[3])) {
 		/* Overflow possible, we know nothing */
 		dst_reg->s32_min_value = S32_MIN;
 		dst_reg->s32_max_value = S32_MAX;
 	} else {
-		dst_reg->s32_min_value = dst_reg->u32_min_value;
-		dst_reg->s32_max_value = dst_reg->u32_max_value;
+		dst_reg->s32_min_value = min_array(tmp_prod, 4);
+		dst_reg->s32_max_value = max_array(tmp_prod, 4);
 	}
 }
 
 static void scalar_min_max_mul(struct bpf_reg_state *dst_reg,
 			       struct bpf_reg_state *src_reg)
 {
-	s64 smin_val = src_reg->smin_value;
-	u64 umin_val = src_reg->umin_value;
-	u64 umax_val = src_reg->umax_value;
+	s64 *dst_smin = &dst_reg->smin_value;
+	s64 *dst_smax = &dst_reg->smax_value;
+	u64 *dst_umin = &dst_reg->umin_value;
+	u64 *dst_umax = &dst_reg->umax_value;
+	s64 tmp_prod[4];
 
-	if (smin_val < 0 || dst_reg->smin_value < 0) {
-		/* Ain't nobody got time to multiply that sign */
-		__mark_reg64_unbounded(dst_reg);
-		return;
-	}
-	/* Both values are positive, so we can work with unsigned and
-	 * copy the result to signed (unless it exceeds S64_MAX).
-	 */
-	if (umax_val > U32_MAX || dst_reg->umax_value > U32_MAX) {
-		/* Potential overflow, we know nothing */
-		__mark_reg64_unbounded(dst_reg);
-		return;
+	if (check_mul_overflow(*dst_umax, src_reg->umax_value, dst_umax) ||
+	    check_mul_overflow(*dst_umin, src_reg->umin_value, dst_umin)) {
+		/* Overflow possible, we know nothing */
+		dst_reg->umin_value = 0;
+		dst_reg->umax_value = U64_MAX;
 	}
-	dst_reg->umin_value *= umin_val;
-	dst_reg->umax_value *= umax_val;
-	if (dst_reg->umax_value > S64_MAX) {
+	if (check_mul_overflow(*dst_smin, src_reg->smin_value, &tmp_prod[0]) ||
+	    check_mul_overflow(*dst_smin, src_reg->smax_value, &tmp_prod[1]) ||
+	    check_mul_overflow(*dst_smax, src_reg->smin_value, &tmp_prod[2]) ||
+	    check_mul_overflow(*dst_smax, src_reg->smax_value, &tmp_prod[3])) {
 		/* Overflow possible, we know nothing */
 		dst_reg->smin_value = S64_MIN;
 		dst_reg->smax_value = S64_MAX;
 	} else {
-		dst_reg->smin_value = dst_reg->umin_value;
-		dst_reg->smax_value = dst_reg->umax_value;
+		dst_reg->smin_value = min_array(tmp_prod, 4);
+		dst_reg->smax_value = max_array(tmp_prod, 4);
 	}
 }
 
-- 
2.25.1


* [PATCH bpf-next v3 2/2] selftests/bpf: Add testcases for BPF_MUL
@ 2024-12-14 19:04 Matan Shachnai
From: Matan Shachnai @ 2024-12-14 19:04 UTC
  To: ast
  Cc: harishankar.vishwanathan, srinivas.narayana, santosh.nagarakatte,
	m.shachnai, Matan Shachnai, Daniel Borkmann, John Fastabend,
	Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Mykola Lysenko, Shuah Khan, Cupertino Miranda, Menglong Dong, bpf,
	linux-kernel, linux-kselftest

The previous commit improves the precision of BPF_MUL.
Add tests that exercise the updated BPF_MUL.

Signed-off-by: Matan Shachnai <m.shachnai@gmail.com>
---
 .../selftests/bpf/progs/verifier_bounds.c     | 134 ++++++++++++++++++
 1 file changed, 134 insertions(+)

diff --git a/tools/testing/selftests/bpf/progs/verifier_bounds.c b/tools/testing/selftests/bpf/progs/verifier_bounds.c
index a0bb7fb40ea5..0eb33bb801b5 100644
--- a/tools/testing/selftests/bpf/progs/verifier_bounds.c
+++ b/tools/testing/selftests/bpf/progs/verifier_bounds.c
@@ -1200,4 +1200,138 @@ l0_%=:	r0 = 0;						\
 	: __clobber_all);
 }
 
+SEC("tc")
+__description("multiply mixed sign bounds. test 1")
+__success __log_level(2)
+__msg("r6 *= r7 {{.*}}; R6_w=scalar(smin=umin=0x1bc16d5cd4927ee1,smax=umax=0x1bc16d674ec80000,smax32=0x7ffffeff,umax32=0xfffffeff,var_off=(0x1bc16d4000000000; 0x3ffffffeff))")
+__naked void mult_mixed0_sign(void)
+{
+	asm volatile (
+	"call %[bpf_get_prandom_u32];"
+	"r6 = r0;"
+	"call %[bpf_get_prandom_u32];"
+	"r7 = r0;"
+	"r6 &= 0xf;"
+	"r6 -= 1000000000;"
+	"r7 &= 0xf;"
+	"r7 -= 2000000000;"
+	"r6 *= r7;"
+	"exit"
+	:
+	: __imm(bpf_get_prandom_u32),
+	  __imm(bpf_skb_store_bytes)
+	: __clobber_all);
+}
+
+SEC("tc")
+__description("multiply mixed sign bounds. test 2")
+__success __log_level(2)
+__msg("r6 *= r7 {{.*}}; R6_w=scalar(smin=smin32=-100,smax=smax32=200)")
+__naked void mult_mixed1_sign(void)
+{
+	asm volatile (
+	"call %[bpf_get_prandom_u32];"
+	"r6 = r0;"
+	"call %[bpf_get_prandom_u32];"
+	"r7 = r0;"
+	"r6 &= 0xf;"
+	"r6 -= 0xa;"
+	"r7 &= 0xf;"
+	"r7 -= 0x14;"
+	"r6 *= r7;"
+	"exit"
+	:
+	: __imm(bpf_get_prandom_u32),
+	  __imm(bpf_skb_store_bytes)
+	: __clobber_all);
+}
+
+SEC("tc")
+__description("multiply negative bounds")
+__success __log_level(2)
+__msg("r6 *= r7 {{.*}}; R6_w=scalar(smin=umin=smin32=umin32=0x3ff280b0,smax=umax=smax32=umax32=0x3fff0001,var_off=(0x3ff00000; 0xf81ff))")
+__naked void mult_sign_bounds(void)
+{
+	asm volatile (
+	"r8 = 0x7fff;"
+	"call %[bpf_get_prandom_u32];"
+	"r6 = r0;"
+	"call %[bpf_get_prandom_u32];"
+	"r7 = r0;"
+	"r6 &= 0xa;"
+	"r6 -= r8;"
+	"r7 &= 0xf;"
+	"r7 -= r8;"
+	"r6 *= r7;"
+	"exit"
+	:
+	: __imm(bpf_get_prandom_u32),
+	  __imm(bpf_skb_store_bytes)
+	: __clobber_all);
+}
+
+SEC("tc")
+__description("multiply bounds that don't cross signed boundary")
+__success __log_level(2)
+__msg("r8 *= r6 {{.*}}; R6_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=11,var_off=(0x0; 0xb)) R8_w=scalar(smin=0,smax=umax=0x7b96bb0a94a3a7cd,var_off=(0x0; 0x7fffffffffffffff))")
+__naked void mult_no_sign_crossing(void)
+{
+	asm volatile (
+	"r6 = 0xb;"
+	"r8 = 0xb3c3f8c99262687 ll;"
+	"call %[bpf_get_prandom_u32];"
+	"r7 = r0;"
+	"r6 &= r7;"
+	"r8 *= r6;"
+	"exit"
+	:
+	: __imm(bpf_get_prandom_u32),
+	  __imm(bpf_skb_store_bytes)
+	: __clobber_all);
+}
+
+SEC("tc")
+__description("multiplication overflow, result in unbounded reg. test 1")
+__success __log_level(2)
+__msg("r6 *= r7 {{.*}}; R6_w=scalar()")
+__naked void mult_unsign_ovf(void)
+{
+	asm volatile (
+	"r8 = 0x7ffffffffff ll;"
+	"call %[bpf_get_prandom_u32];"
+	"r6 = r0;"
+	"call %[bpf_get_prandom_u32];"
+	"r7 = r0;"
+	"r6 &= 0x7fffffff;"
+	"r7 &= r8;"
+	"r6 *= r7;"
+	"exit"
+	:
+	: __imm(bpf_get_prandom_u32),
+	  __imm(bpf_skb_store_bytes)
+	: __clobber_all);
+}
+
+SEC("tc")
+__description("multiplication overflow, result in unbounded reg. test 2")
+__success __log_level(2)
+__msg("r6 *= r7 {{.*}}; R6_w=scalar()")
+__naked void mult_sign_ovf(void)
+{
+	asm volatile (
+	"r8 = 0x7ffffffff ll;"
+	"call %[bpf_get_prandom_u32];"
+	"r6 = r0;"
+	"call %[bpf_get_prandom_u32];"
+	"r7 = r0;"
+	"r6 &= 0xa;"
+	"r6 -= r8;"
+	"r7 &= 0x7fffffff;"
+	"r6 *= r7;"
+	"exit"
+	:
+	: __imm(bpf_get_prandom_u32),
+	  __imm(bpf_skb_store_bytes)
+	: __clobber_all);
+}
 char _license[] SEC("license") = "GPL";
-- 
2.25.1


* Re: [PATCH bpf-next v3 1/2] bpf, verifier: Improve precision of BPF_MUL
@ 2024-12-16 20:31 Alexei Starovoitov
From: Alexei Starovoitov @ 2024-12-16 20:31 UTC
  To: Matan Shachnai
  Cc: Alexei Starovoitov, Harishankar Vishwanathan, Srinivas Narayana,
	Santosh Nagarakatte, Matan Shachnai, Daniel Borkmann,
	John Fastabend, Andrii Nakryiko, Martin KaFai Lau,
	Eduard Zingerman, Song Liu, Yonghong Song, KP Singh,
	Stanislav Fomichev, Hao Luo, Jiri Olsa, Mykola Lysenko,
	Shuah Khan, Cupertino Miranda, Menglong Dong, bpf, LKML,
	open list:KERNEL SELFTEST FRAMEWORK

On Sat, Dec 14, 2024 at 11:04 AM Matan Shachnai <m.shachnai@gmail.com> wrote:
>
> [...]
>
> +       if (check_mul_overflow(*dst_umax, src_reg->u32_max_value, dst_umax) ||
> +           check_mul_overflow(*dst_umin, src_reg->u32_min_value, dst_umin)) {
> +               /* Overflow possible, we know nothing */
> +               dst_reg->u32_min_value = 0;
> +               dst_reg->u32_max_value = U32_MAX;

It would be cleaner to use:
*dst_umin = 0;
*dst_umax = U32_MAX;

to make it obvious that though check_mul_overflow()-s wrote something
into dst_umax and dst_umin,
we clean them up with these two assignments.

Just like scalar32_min_max_add() does already.

>         }
> -       dst_reg->u32_min_value *= umin_val;
> -       dst_reg->u32_max_value *= umax_val;
> -       if (dst_reg->u32_max_value > S32_MAX) {
> +       if (check_mul_overflow(*dst_smin, src_reg->s32_min_value, &tmp_prod[0]) ||
> +           check_mul_overflow(*dst_smin, src_reg->s32_max_value, &tmp_prod[1]) ||
> +           check_mul_overflow(*dst_smax, src_reg->s32_min_value, &tmp_prod[2]) ||
> +           check_mul_overflow(*dst_smax, src_reg->s32_max_value, &tmp_prod[3])) {
>                 /* Overflow possible, we know nothing */
>                 dst_reg->s32_min_value = S32_MIN;
>                 dst_reg->s32_max_value = S32_MAX;
>         } else {
> -               dst_reg->s32_min_value = dst_reg->u32_min_value;
> -               dst_reg->s32_max_value = dst_reg->u32_max_value;
> +               dst_reg->s32_min_value = min_array(tmp_prod, 4);
> +               dst_reg->s32_max_value = max_array(tmp_prod, 4);

dst_smin/smax here too.


* Re: [PATCH bpf-next v3 1/2] bpf, verifier: Improve precision of BPF_MUL
@ 2024-12-17 19:13 Matan Shachnai
From: Matan Shachnai @ 2024-12-17 19:13 UTC
  To: Alexei Starovoitov
  Cc: Alexei Starovoitov, Harishankar Vishwanathan, Srinivas Narayana,
	Santosh Nagarakatte, Matan Shachnai, Daniel Borkmann,
	John Fastabend, Andrii Nakryiko, Martin KaFai Lau,
	Eduard Zingerman, Song Liu, Yonghong Song, KP Singh,
	Stanislav Fomichev, Hao Luo, Jiri Olsa, Mykola Lysenko,
	Shuah Khan, Cupertino Miranda, Menglong Dong, bpf, LKML,
	open list:KERNEL SELFTEST FRAMEWORK

On Mon, Dec 16, 2024 at 3:31 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> [...]
>
> It would be cleaner to use:
> *dst_umin = 0;
> *dst_umax = U32_MAX;
>
> to make it obvious that though check_mul_overflow()-s wrote something
> into dst_umax and dst_umin,
> we clean them up with these two assignments.
>
> Just like scalar32_min_max_add() does already.

Thanks, Alexei.
I'll fix it up and follow up with a v4 soon.

>
> [...]
>
> dst_smin/smax here too.
