[PATCH bpf-next v3 1/2] bpf: Get better reg range with ldsx and 32bit compare

BPF List
 help / color / mirror / Atom feed

* [PATCH bpf-next v3 1/2] bpf: Get better reg range with ldsx and 32bit compare
@ 2024-07-12 23:43 Yonghong Song
  2024-07-12 23:44 ` [PATCH bpf-next v3 2/2] selftests/bpf: Add ldsx selftests for ldsx and subreg compare Yonghong Song
  2024-07-15 23:55 ` [PATCH bpf-next v3 1/2] bpf: Get better reg range with ldsx and 32bit compare Eduard Zingerman
  0 siblings, 2 replies; 8+ messages in thread
From: Yonghong Song @ 2024-07-12 23:43 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team,
	Martin KaFai Lau

With latest llvm19, the selftest iters/iter_arr_with_actual_elem_count
failed with -mcpu=v4.

The following are the details:
  0: R1=ctx() R10=fp0
  ; int iter_arr_with_actual_elem_count(const void *ctx) @ iters.c:1420
  0: (b4) w7 = 0                        ; R7_w=0
  ; int i, n = loop_data.n, sum = 0; @ iters.c:1422
  1: (18) r1 = 0xffffc90000191478       ; R1_w=map_value(map=iters.bss,ks=4,vs=1280,off=1144)
  3: (61) r6 = *(u32 *)(r1 +128)        ; R1_w=map_value(map=iters.bss,ks=4,vs=1280,off=1144) R6_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff))
  ; if (n > ARRAY_SIZE(loop_data.data)) @ iters.c:1424
  4: (26) if w6 > 0x20 goto pc+27       ; R6_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=32,var_off=(0x0; 0x3f))
  5: (bf) r8 = r10                      ; R8_w=fp0 R10=fp0
  6: (07) r8 += -8                      ; R8_w=fp-8
  ; bpf_for(i, 0, n) { @ iters.c:1427
  7: (bf) r1 = r8                       ; R1_w=fp-8 R8_w=fp-8
  8: (b4) w2 = 0                        ; R2_w=0
  9: (bc) w3 = w6                       ; R3_w=scalar(id=1,smin=smin32=0,smax=umax=smax32=umax32=32,var_off=(0x0; 0x3f)) R6_w=scalar(id=1,smin=smin32=0,smax=umax=smax32=umax32=32,var_off=(0x0; 0x3f))
  10: (85) call bpf_iter_num_new#45179          ; R0=scalar() fp-8=iter_num(ref_id=2,state=active,depth=0) refs=2
  11: (bf) r1 = r8                      ; R1=fp-8 R8=fp-8 refs=2
  12: (85) call bpf_iter_num_next#45181 13: R0=rdonly_mem(id=3,ref_obj_id=2,sz=4) R6=scalar(id=1,smin=smin32=0,smax=umax=smax32=umax32=32,var_off=(0x0; 0x3f)) R7=0 R8=fp-8 R10=fp0 fp-8=iter_num(ref_id=2,state=active,depth=1) refs=2
  ; bpf_for(i, 0, n) { @ iters.c:1427
  13: (15) if r0 == 0x0 goto pc+2       ; R0=rdonly_mem(id=3,ref_obj_id=2,sz=4) refs=2
  14: (81) r1 = *(s32 *)(r0 +0)         ; R0=rdonly_mem(id=3,ref_obj_id=2,sz=4) R1_w=scalar(smin=0xffffffff80000000,smax=0x7fffffff) refs=2
  15: (ae) if w1 < w6 goto pc+4 20: R0=rdonly_mem(id=3,ref_obj_id=2,sz=4) R1=scalar(smin=0xffffffff80000000,smax=smax32=umax32=31,umax=0xffffffff0000001f,smin32=0,var_off=(0x0; 0xffffffff0000001f)) R6=scalar(id=1,smin=umin=smin32=umin32=1,smax=umax=smax32=umax32=32,var_off=(0x0; 0x3f)) R7=0 R8=fp-8 R10=fp0 fp-8=iter_num(ref_id=2,state=active,depth=1) refs=2
  ; sum += loop_data.data[i]; @ iters.c:1429
  20: (67) r1 <<= 2                     ; R1_w=scalar(smax=0x7ffffffc0000007c,umax=0xfffffffc0000007c,smin32=0,smax32=umax32=124,var_off=(0x0; 0xfffffffc0000007c)) refs=2
  21: (18) r2 = 0xffffc90000191478      ; R2_w=map_value(map=iters.bss,ks=4,vs=1280,off=1144) refs=2
  23: (0f) r2 += r1
  math between map_value pointer and register with unbounded min value is not allowed

The source code:
  int iter_arr_with_actual_elem_count(const void *ctx)
  {
        int i, n = loop_data.n, sum = 0;

        if (n > ARRAY_SIZE(loop_data.data))
                return 0;

        bpf_for(i, 0, n) {
                /* no rechecking of i against ARRAY_SIZE(loop_data.n) */
                sum += loop_data.data[i];
        }

        return sum;
  }

The insn #14 is a sign-extenstion load which is related to 'int i'.
The insn #15 did a subreg comparision. Note that smin=0xffffffff80000000 and this caused later
insn #23 failed verification due to unbounded min value.

Actually insn #15 R1 smin range can be better. Before insn #15, we have
  R1_w=scalar(smin=0xffffffff80000000,smax=0x7fffffff)
With the above range, we know for R1, upper 32bit can only be 0xffffffff or 0.
Otherwise, the value range for R1 could be beyond [smin=0xffffffff80000000,smax=0x7fffffff].

After insn #15, for the true patch, we know smin32=0 and smax32=32. With the upper 32bit 0xffffffff,
then the corresponding value is [0xffffffff00000000, 0xffffffff00000020]. The range is
obviously beyond the original range [smin=0xffffffff80000000,smax=0x7fffffff] and the
range is not possible. So the upper 32bit must be 0, which implies smin = smin32 and
smax = smax32.

This patch fixed the issue by adding additional register deduction after 32-bit compare
insn. If the signed 32-bit register range is non-negative then 64-bit smin is
in range of [S32_MIN, S32_MAX], then the actual 64-bit smin/smax should be the same
as 32-bit smin32/smax32.

With this patch, iters/iter_arr_with_actual_elem_count succeeded with better register range:

from 15 to 20: R0=rdonly_mem(id=7,ref_obj_id=2,sz=4) R1_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=31,var_off=(0x0; 0x1f)) R6=scalar(id=1,smin=umin=smin32=umin32=1,smax=umax=smax32=umax32=32,var_off=(0x0; 0x3f)) R7=scalar(id=9,smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) R8=scalar(id=9,smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) R10=fp0 fp-8=iter_num(ref_id=2,state=active,depth=3) refs=2

Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
 kernel/bpf/verifier.c | 36 ++++++++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 8da132a1ef28..40abed6459f3 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -2182,6 +2182,42 @@ static void __reg_deduce_mixed_bounds(struct bpf_reg_state *reg)
 		reg->smin_value = max_t(s64, reg->smin_value, new_smin);
 		reg->smax_value = min_t(s64, reg->smax_value, new_smax);
 	}
+
+	/* Here we would like to handle a special case after sign extending load,
+	 * when upper bits for a 64-bit range are all 1s or all 0s.
+	 *
+	 * Upper bits are all 1s when register is in a range:
+	 *   [0xffff_ffff_0000_0000, 0xffff_ffff_ffff_ffff]
+	 * Upper bits are all 0s when register is in a range:
+	 *   [0x0000_0000_0000_0000, 0x0000_0000_ffff_ffff]
+	 * Together this forms are continuous range:
+	 *   [0xffff_ffff_0000_0000, 0x0000_0000_ffff_ffff]
+	 *
+	 * Now, suppose that register range is in fact tighter:
+	 *   [0xffff_ffff_8000_0000, 0x0000_0000_ffff_ffff] (R)
+	 * Also suppose that it's 32-bit range is positive,
+	 * meaning that lower 32-bits of the full 64-bit register
+	 * are in the range:
+	 *   [0x0000_0000, 0x7fff_ffff] (W)
+	 *
+	 * It this happens, then any value in a range:
+	 *   [0xffff_ffff_0000_0000, 0xffff_ffff_7fff_ffff]
+	 * is smaller than a lowest bound of the range (R):
+	 *   0xffff_ffff_8000_0000
+	 * which means that upper bits of the full 64-bit register
+	 * can't be all 1s, when lower bits are in range (W).
+	 *
+	 * Note that:
+	 *  - 0xffff_ffff_8000_0000 == (s64)S32_MIN
+	 *  - 0x0000_0000_ffff_ffff == (s64)S32_MAX
+	 * These relations are used in the conditions below.
+	 */
+	if (reg->s32_min_value >= 0 && reg->smin_value >= S32_MIN && reg->smax_value <= S32_MAX) {
+		reg->smin_value = reg->umin_value = reg->s32_min_value;
+		reg->smax_value = reg->umax_value = reg->s32_max_value;
+		reg->var_off = tnum_intersect(reg->var_off,
+					      tnum_range(reg->smin_value, reg->smax_value));
+	}
 }
 
 static void __reg_deduce_bounds(struct bpf_reg_state *reg)
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH bpf-next v3 2/2] selftests/bpf: Add ldsx selftests for ldsx and subreg compare
  2024-07-12 23:43 [PATCH bpf-next v3 1/2] bpf: Get better reg range with ldsx and 32bit compare Yonghong Song
@ 2024-07-12 23:44 ` Yonghong Song
  2024-07-16  0:44   ` Eduard Zingerman
  2024-07-15 23:55 ` [PATCH bpf-next v3 1/2] bpf: Get better reg range with ldsx and 32bit compare Eduard Zingerman
  1 sibling, 1 reply; 8+ messages in thread
From: Yonghong Song @ 2024-07-12 23:44 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team,
	Martin KaFai Lau

Add a few selftests to test 32/16/8-bit ldsx followed by subreg comparison.
Without the previous patch, all added tests will fail.

Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
 .../selftests/bpf/progs/verifier_ldsx.c       | 106 ++++++++++++++++++
 1 file changed, 106 insertions(+)

diff --git a/tools/testing/selftests/bpf/progs/verifier_ldsx.c b/tools/testing/selftests/bpf/progs/verifier_ldsx.c
index d4427d8e1217..3b96a9436a0c 100644
--- a/tools/testing/selftests/bpf/progs/verifier_ldsx.c
+++ b/tools/testing/selftests/bpf/progs/verifier_ldsx.c
@@ -144,6 +144,112 @@ __naked void ldsx_s32_range(void)
 	: __clobber_all);
 }
 
+#define MAX_ENTRIES 12
+
+struct test_val {
+        int foo[MAX_ENTRIES];
+};
+
+struct {
+        __uint(type, BPF_MAP_TYPE_HASH);
+        __uint(max_entries, 1);
+        __type(key, long long);
+        __type(value, struct test_val);
+} map_hash_48b SEC(".maps");
+
+SEC("socket")
+__description("LDSX, S8, subreg compare")
+__success __success_unpriv
+__naked void ldsx_s8_subreg_compare(void)
+{
+	asm volatile (
+	"call %[bpf_get_prandom_u32];"
+	"*(u64 *)(r10 - 8) = r0;"
+	"w6 = w0;"
+	"if w6 > 0x1f goto l0_%=;"
+	"r7 = *(s8 *)(r10 - 8);"
+	"if w7 > w6 goto l0_%=;"
+	"r1 = 0;"
+	"*(u64 *)(r10 - 8) = r1;"
+	"r2 = r10;"
+	"r2 += -8;"
+	"r1 = %[map_hash_48b] ll;"
+	"call %[bpf_map_lookup_elem];"
+	"if r0 == 0 goto l0_%=;"
+	"r0 += r7;"
+	"*(u64 *)(r0 + 0) = 1;"
+"l0_%=:"
+	"r0 = 0;"
+	"exit;"
+	:
+	: __imm(bpf_get_prandom_u32),
+	  __imm_addr(map_hash_48b),
+	  __imm(bpf_map_lookup_elem)
+	: __clobber_all);
+}
+
+SEC("socket")
+__description("LDSX, S16, subreg compare")
+__success __success_unpriv
+__naked void ldsx_s16_subreg_compare(void)
+{
+	asm volatile (
+	"call %[bpf_get_prandom_u32];"
+	"*(u64 *)(r10 - 8) = r0;"
+	"w6 = w0;"
+	"if w6 > 0x1f goto l0_%=;"
+	"r7 = *(s16 *)(r10 - 8);"
+	"if w7 > w6 goto l0_%=;"
+	"r1 = 0;"
+	"*(u64 *)(r10 - 8) = r1;"
+	"r2 = r10;"
+	"r2 += -8;"
+	"r1 = %[map_hash_48b] ll;"
+	"call %[bpf_map_lookup_elem];"
+	"if r0 == 0 goto l0_%=;"
+	"r0 += r7;"
+	"*(u64 *)(r0 + 0) = 1;"
+"l0_%=:"
+	"r0 = 0;"
+	"exit;"
+	:
+	: __imm(bpf_get_prandom_u32),
+	  __imm_addr(map_hash_48b),
+	  __imm(bpf_map_lookup_elem)
+	: __clobber_all);
+}
+
+SEC("socket")
+__description("LDSX, S32, subreg compare")
+__success __success_unpriv
+__naked void ldsx_s32_subreg_compare(void)
+{
+	asm volatile (
+	"call %[bpf_get_prandom_u32];"
+	"*(u64 *)(r10 - 8) = r0;"
+	"w6 = w0;"
+	"if w6 > 0x1f goto l0_%=;"
+	"r7 = *(s32 *)(r10 - 8);"
+	"if w7 > w6 goto l0_%=;"
+	"r1 = 0;"
+	"*(u64 *)(r10 - 8) = r1;"
+	"r2 = r10;"
+	"r2 += -8;"
+	"r1 = %[map_hash_48b] ll;"
+	"call %[bpf_map_lookup_elem];"
+	"if r0 == 0 goto l0_%=;"
+	"r0 += r7;"
+	"*(u64 *)(r0 + 0) = 1;"
+"l0_%=:"
+	"r0 = 0;"
+	"exit;"
+	:
+	: __imm(bpf_get_prandom_u32),
+	  __imm_addr(map_hash_48b),
+	  __imm(bpf_map_lookup_elem)
+	: __clobber_all);
+}
+
 #else
 
 SEC("socket")
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH bpf-next v3 1/2] bpf: Get better reg range with ldsx and 32bit compare
  2024-07-12 23:43 [PATCH bpf-next v3 1/2] bpf: Get better reg range with ldsx and 32bit compare Yonghong Song
  2024-07-12 23:44 ` [PATCH bpf-next v3 2/2] selftests/bpf: Add ldsx selftests for ldsx and subreg compare Yonghong Song
@ 2024-07-15 23:55 ` Eduard Zingerman
  2024-07-17  7:27   ` Shung-Hsi Yu
  1 sibling, 1 reply; 8+ messages in thread
From: Eduard Zingerman @ 2024-07-15 23:55 UTC (permalink / raw)
  To: Yonghong Song, bpf
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team,
	Martin KaFai Lau

On Fri, 2024-07-12 at 16:43 -0700, Yonghong Song wrote:

[...]

> This patch fixed the issue by adding additional register deduction after 32-bit compare
> insn. If the signed 32-bit register range is non-negative then 64-bit smin is
> in range of [S32_MIN, S32_MAX], then the actual 64-bit smin/smax should be the same
> as 32-bit smin32/smax32.

[...]

Acked-by: Eduard Zingerman <eddyz87@gmail.com>

[...]

> +	 * Now, suppose that register range is in fact tighter:
> +	 *   [0xffff_ffff_8000_0000, 0x0000_0000_ffff_ffff] (R)
> +	 * Also suppose that it's 32-bit range is positive,
> +	 * meaning that lower 32-bits of the full 64-bit register
> +	 * are in the range:
> +	 *   [0x0000_0000, 0x7fff_ffff] (W)
> +	 *
> +	 * It this happens, then any value in a range:
           ^^
Sorry, one more typo, should be "If".
Maybe could be changed when the patch would be applied.

[...]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH bpf-next v3 2/2] selftests/bpf: Add ldsx selftests for ldsx and subreg compare
  2024-07-12 23:44 ` [PATCH bpf-next v3 2/2] selftests/bpf: Add ldsx selftests for ldsx and subreg compare Yonghong Song
@ 2024-07-16  0:44   ` Eduard Zingerman
  2024-07-16 22:38     ` Yonghong Song
  0 siblings, 1 reply; 8+ messages in thread
From: Eduard Zingerman @ 2024-07-16  0:44 UTC (permalink / raw)
  To: Yonghong Song, bpf
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team,
	Martin KaFai Lau

On Fri, 2024-07-12 at 16:44 -0700, Yonghong Song wrote:

Note: it feels like these test cases should be a part of the
      reg_bounds.c:test_reg_bounds_crafted(), e.g.:

  diff --git a/tools/testing/selftests/bpf/prog_tests/reg_bounds.c b/tools/testing/selftests/bpf/prog_tests/reg_bounds.c
  index eb74363f9f70..4918414f8e36 100644
  --- a/tools/testing/selftests/bpf/prog_tests/reg_bounds.c
  +++ b/tools/testing/selftests/bpf/prog_tests/reg_bounds.c
  @@ -2108,6 +2108,7 @@ static struct subtest_case crafted_cases[] = {
          {S32, U32, {(u32)S32_MIN, 0}, {0, 0}},
          {S32, U32, {(u32)S32_MIN, 0}, {(u32)S32_MIN, (u32)S32_MIN}},
          {S32, U32, {(u32)S32_MIN, S32_MAX}, {S32_MAX, S32_MAX}},
  +       {S64, U32, {0x0, 0x1f}, {0xffffffff80000000ULL, 0x00000000ffffffffULL}},
   };
   
   /* Go over crafted hard-coded cases. This is fast, so we do it as part of

Which produces the following log:

  VERIFIER LOG:
  ========================
  ...
  21: (ae) if w6 < w7 goto pc+3         ; R6=scalar(id=1,smin=0xffffffff80000000,smax=0xffffffff)
                                          R7=scalar(id=2,smin=smin32=0,smax=umax=smax32=umax32=31,var_off=(0x0; 0x1f))
  ...  
  from 21 to 25: ... R6=scalar(id=1,smin=smin32=0,smax=umax=smax32=umax32=30,var_off=(0x0; 0x1f))
                     R7=scalar(id=2,smin=umin=smin32=umin32=1,smax=umax=smax32=umax32=31,var_off=(0x0; 0x1f)
  25: ... R6=scalar(id=1,smin=smin32=0,smax=umax=smax32=umax32=30,var_off=(0x0; 0x1f))
          R7=scalar(id=2,smin=umin=smin32=umin32=1,smax=umax=smax32=umax32=31,var_off=(0x0; 0x1f))
  ...

However, this would require adjustments to reg_bounds.c logic to
include this new range rule.

[...]

> +SEC("socket")
> +__description("LDSX, S8, subreg compare")
                     ^^^ ^^^

Nit: test_progs parsing logic for -t option is too simplistic to
     understand comas inside description, for example here is happens
     after the command below:

       # ./test_progs-cpuv4 -t "verifier_ldsx/LDSX, S8, subreg compare"
       #455/1   verifier_ldsx/LDSX, S8:OK
       #455/2   verifier_ldsx/LDSX, S8 @unpriv:OK
       #455/3   verifier_ldsx/LDSX, S16:OK
       #455/4   verifier_ldsx/LDSX, S16 @unpriv:OK
       #455/5   verifier_ldsx/LDSX, S32:OK
       ...

     As far as I understand, this happens because test_progs tries to
     match words LDSX, S8 and "subreg compare" separately.

     This does not happen when comas are not used in the description
     (or if description is omitted in favor of the C function name of the test
      (it is not possible to do -t verifier_ldsx/ldsx_s8_subreg_compare
       if __description is provided)).

> +__success __success_unpriv
> +__naked void ldsx_s8_subreg_compare(void)
> +{
> +	asm volatile (
> +	"call %[bpf_get_prandom_u32];"
> +	"*(u64 *)(r10 - 8) = r0;"
> +	"w6 = w0;"
> +	"if w6 > 0x1f goto l0_%=;"
> +	"r7 = *(s8 *)(r10 - 8);"
> +	"if w7 > w6 goto l0_%=;"
> +	"r1 = 0;"
> +	"*(u64 *)(r10 - 8) = r1;"
> +	"r2 = r10;"
> +	"r2 += -8;"
> +	"r1 = %[map_hash_48b] ll;"
> +	"call %[bpf_map_lookup_elem];"
> +	"if r0 == 0 goto l0_%=;"
> +	"r0 += r7;"
> +	"*(u64 *)(r0 + 0) = 1;"
> +"l0_%=:"
> +	"r0 = 0;"
> +	"exit;"
> +	:
> +	: __imm(bpf_get_prandom_u32),
> +	  __imm_addr(map_hash_48b),
> +	  __imm(bpf_map_lookup_elem)
> +	: __clobber_all);
> +}

[...]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH bpf-next v3 2/2] selftests/bpf: Add ldsx selftests for ldsx and subreg compare
  2024-07-16  0:44   ` Eduard Zingerman
@ 2024-07-16 22:38     ` Yonghong Song
  2024-07-17  0:12       ` Eduard Zingerman
  0 siblings, 1 reply; 8+ messages in thread
From: Yonghong Song @ 2024-07-16 22:38 UTC (permalink / raw)
  To: Eduard Zingerman, bpf
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team,
	Martin KaFai Lau


On 7/15/24 5:44 PM, Eduard Zingerman wrote:
> On Fri, 2024-07-12 at 16:44 -0700, Yonghong Song wrote:
>
> Note: it feels like these test cases should be a part of the
>        reg_bounds.c:test_reg_bounds_crafted(), e.g.:
>
>    diff --git a/tools/testing/selftests/bpf/prog_tests/reg_bounds.c b/tools/testing/selftests/bpf/prog_tests/reg_bounds.c
>    index eb74363f9f70..4918414f8e36 100644
>    --- a/tools/testing/selftests/bpf/prog_tests/reg_bounds.c
>    +++ b/tools/testing/selftests/bpf/prog_tests/reg_bounds.c
>    @@ -2108,6 +2108,7 @@ static struct subtest_case crafted_cases[] = {
>            {S32, U32, {(u32)S32_MIN, 0}, {0, 0}},
>            {S32, U32, {(u32)S32_MIN, 0}, {(u32)S32_MIN, (u32)S32_MIN}},
>            {S32, U32, {(u32)S32_MIN, S32_MAX}, {S32_MAX, S32_MAX}},
>    +       {S64, U32, {0x0, 0x1f}, {0xffffffff80000000ULL, 0x00000000ffffffffULL}},
>     };
>     
>     /* Go over crafted hard-coded cases. This is fast, so we do it as part of
>
> Which produces the following log:
>
>    VERIFIER LOG:
>    ========================
>    ...
>    21: (ae) if w6 < w7 goto pc+3         ; R6=scalar(id=1,smin=0xffffffff80000000,smax=0xffffffff)
>                                            R7=scalar(id=2,smin=smin32=0,smax=umax=smax32=umax32=31,var_off=(0x0; 0x1f))
>    ...
>    from 21 to 25: ... R6=scalar(id=1,smin=smin32=0,smax=umax=smax32=umax32=30,var_off=(0x0; 0x1f))
>                       R7=scalar(id=2,smin=umin=smin32=umin32=1,smax=umax=smax32=umax32=31,var_off=(0x0; 0x1f)
>    25: ... R6=scalar(id=1,smin=smin32=0,smax=umax=smax32=umax32=30,var_off=(0x0; 0x1f))
>            R7=scalar(id=2,smin=umin=smin32=umin32=1,smax=umax=smax32=umax32=31,var_off=(0x0; 0x1f))
>    ...
>
> However, this would require adjustments to reg_bounds.c logic to
> include this new range rule.

I added some logic in reg_bounds.c.

diff --git a/tools/testing/selftests/bpf/prog_tests/reg_bounds.c b/tools/testing/selftests/bpf/prog_tests/reg_bounds.c
index eb74363f9f70..c88602908cfe 100644
--- a/tools/testing/selftests/bpf/prog_tests/reg_bounds.c
+++ b/tools/testing/selftests/bpf/prog_tests/reg_bounds.c
@@ -441,6 +441,22 @@ static struct range range_refine(enum num_t x_t, struct range x, enum num_t y_t,
         if (t_is_32(y_t) && !t_is_32(x_t)) {
                 struct range x_swap;
  
+               /* If we know that
+                *   - *x* is in the range of signed 32bit value
+                *   - *y_cast* range is 32-bit sign non-negative, and
+                * then *x* range can be narrowed to the interaction of
+                * *x* and *y_cast*. Otherwise, if the new range for *x*
+                * allows upper 32-bit 0xffffffff then the eventual new
+                * range for *x* will be out of signed 32-bit range
+                * which violates the origin *x* range.
+                */
+               if (x_t == S64 && y_t == S32 &&
+                   !(y_cast.a & 0xffffffff80000000ULL) && !(y_cast.b & 0xffffffff80000000) &&
+                   (long long)x.a >= S32_MIN && (long long)x.b <= S32_MAX) {
+                               return range(S64, max_t(S64, y_cast.a, x.a),
+                                            min_t(S64, y_cast.b, x.b));
+               }
+
                 /* some combinations of upper 32 bits and sign bit can lead to
                  * invalid ranges, in such cases it's easier to detect them
                  * after cast/swap than try to enumerate all the conditions
@@ -2108,6 +2124,9 @@ static struct subtest_case crafted_cases[] = {
         {S32, U32, {(u32)S32_MIN, 0}, {0, 0}},
         {S32, U32, {(u32)S32_MIN, 0}, {(u32)S32_MIN, (u32)S32_MIN}},
         {S32, U32, {(u32)S32_MIN, S32_MAX}, {S32_MAX, S32_MAX}},
+       {S64, U32, {0x0, 0x1f}, {0xffffffff80000000ULL, 0x000000007fffffffULL}},
+       {S64, U32, {0x0, 0x1f}, {0xffffffffffff8000ULL, 0x0000000000007fffULL}},
+       {S64, U32, {0x0, 0x1f}, {0xffffffffffffff80ULL, 0x000000000000007fULL}},
  };
  
  /* Go over crafted hard-coded cases. This is fast, so we do it as part of

The logic is very similar to kernel implementation but has a difference in generating
the final range. In reg_bounds implementation, the range is narrowed by intersecting
y_cast and x range which are necessary.

In kernel implementation, there is no interection since we only have one register
and two register has been compared before.

Eduard, could you take a look at the above code?

>
> [...]
>
>> +SEC("socket")
>> +__description("LDSX, S8, subreg compare")
>                       ^^^ ^^^
>
> Nit: test_progs parsing logic for -t option is too simplistic to
>       understand comas inside description, for example here is happens
>       after the command below:
>
>         # ./test_progs-cpuv4 -t "verifier_ldsx/LDSX, S8, subreg compare"
>         #455/1   verifier_ldsx/LDSX, S8:OK
>         #455/2   verifier_ldsx/LDSX, S8 @unpriv:OK
>         #455/3   verifier_ldsx/LDSX, S16:OK
>         #455/4   verifier_ldsx/LDSX, S16 @unpriv:OK
>         #455/5   verifier_ldsx/LDSX, S32:OK
>         ...
>
>       As far as I understand, this happens because test_progs tries to
>       match words LDSX, S8 and "subreg compare" separately.
>
>       This does not happen when comas are not used in the description
>       (or if description is omitted in favor of the C function name of the test
>        (it is not possible to do -t verifier_ldsx/ldsx_s8_subreg_compare
>         if __description is provided)).

There are around 10 (or more) verifier_*.c files which has ',' in their
description. So I think we should add ',' support in test_progs infrastructure
w.r.t. description tag. I think this can be a follow up.

>> +__success __success_unpriv
>> +__naked void ldsx_s8_subreg_compare(void)
>> +{
>> +	asm volatile (
>> +	"call %[bpf_get_prandom_u32];"
>> +	"*(u64 *)(r10 - 8) = r0;"
>> +	"w6 = w0;"
>> +	"if w6 > 0x1f goto l0_%=;"
>> +	"r7 = *(s8 *)(r10 - 8);"
>> +	"if w7 > w6 goto l0_%=;"
>> +	"r1 = 0;"
>> +	"*(u64 *)(r10 - 8) = r1;"
>> +	"r2 = r10;"
>> +	"r2 += -8;"
>> +	"r1 = %[map_hash_48b] ll;"
>> +	"call %[bpf_map_lookup_elem];"
>> +	"if r0 == 0 goto l0_%=;"
>> +	"r0 += r7;"
>> +	"*(u64 *)(r0 + 0) = 1;"
>> +"l0_%=:"
>> +	"r0 = 0;"
>> +	"exit;"
>> +	:
>> +	: __imm(bpf_get_prandom_u32),
>> +	  __imm_addr(map_hash_48b),
>> +	  __imm(bpf_map_lookup_elem)
>> +	: __clobber_all);
>> +}
> [...]

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH bpf-next v3 2/2] selftests/bpf: Add ldsx selftests for ldsx and subreg compare
  2024-07-16 22:38     ` Yonghong Song
@ 2024-07-17  0:12       ` Eduard Zingerman
  2024-07-17  6:06         ` Yonghong Song
  0 siblings, 1 reply; 8+ messages in thread
From: Eduard Zingerman @ 2024-07-17  0:12 UTC (permalink / raw)
  To: Yonghong Song, bpf
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team,
	Martin KaFai Lau

On Tue, 2024-07-16 at 15:38 -0700, Yonghong Song wrote:

[...]

> diff --git a/tools/testing/selftests/bpf/prog_tests/reg_bounds.c b/tools/testing/selftests/bpf/prog_tests/reg_bounds.c
> index eb74363f9f70..c88602908cfe 100644
> --- a/tools/testing/selftests/bpf/prog_tests/reg_bounds.c
> +++ b/tools/testing/selftests/bpf/prog_tests/reg_bounds.c
> @@ -441,6 +441,22 @@ static struct range range_refine(enum num_t x_t, struct range x, enum num_t y_t,
>          if (t_is_32(y_t) && !t_is_32(x_t)) {
>                  struct range x_swap;
>   
> +               /* If we know that
> +                *   - *x* is in the range of signed 32bit value
> +                *   - *y_cast* range is 32-bit sign non-negative, and
> +                * then *x* range can be narrowed to the interaction of
> +                * *x* and *y_cast*. Otherwise, if the new range for *x*
> +                * allows upper 32-bit 0xffffffff then the eventual new
> +                * range for *x* will be out of signed 32-bit range
> +                * which violates the origin *x* range.
> +                */
> +               if (x_t == S64 && y_t == S32 &&
> +                   !(y_cast.a & 0xffffffff80000000ULL) && !(y_cast.b & 0xffffffff80000000) &&
> +                   (long long)x.a >= S32_MIN && (long long)x.b <= S32_MAX) {
> +                               return range(S64, max_t(S64, y_cast.a, x.a),
> +                                            min_t(S64, y_cast.b, x.b));
> +               }
> +
>                  /* some combinations of upper 32 bits and sign bit can lead to
>                   * invalid ranges, in such cases it's easier to detect them
>                   * after cast/swap than try to enumerate all the conditions
> @@ -2108,6 +2124,9 @@ static struct subtest_case crafted_cases[] = {
>          {S32, U32, {(u32)S32_MIN, 0}, {0, 0}},
>          {S32, U32, {(u32)S32_MIN, 0}, {(u32)S32_MIN, (u32)S32_MIN}},
>          {S32, U32, {(u32)S32_MIN, S32_MAX}, {S32_MAX, S32_MAX}},
> +       {S64, U32, {0x0, 0x1f}, {0xffffffff80000000ULL, 0x000000007fffffffULL}},
> +       {S64, U32, {0x0, 0x1f}, {0xffffffffffff8000ULL, 0x0000000000007fffULL}},
> +       {S64, U32, {0x0, 0x1f}, {0xffffffffffffff80ULL, 0x000000000000007fULL}},
>   };
>   
>   /* Go over crafted hard-coded cases. This is fast, so we do it as part of
> 
> The logic is very similar to kernel implementation but has a difference in generating
> the final range. In reg_bounds implementation, the range is narrowed by intersecting
> y_cast and x range which are necessary.
> 
> In kernel implementation, there is no interection since we only have one register
> and two register has been compared before.
> 
> Eduard, could you take a look at the above code?

I think this change is correct.
The return clause could be simplified a bit:

	return range_improve(x_t, x, y_cast);

[...]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH bpf-next v3 2/2] selftests/bpf: Add ldsx selftests for ldsx and subreg compare
  2024-07-17  0:12       ` Eduard Zingerman
@ 2024-07-17  6:06         ` Yonghong Song
  0 siblings, 0 replies; 8+ messages in thread
From: Yonghong Song @ 2024-07-17  6:06 UTC (permalink / raw)
  To: Eduard Zingerman, bpf
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team,
	Martin KaFai Lau


On 7/16/24 5:12 PM, Eduard Zingerman wrote:
> On Tue, 2024-07-16 at 15:38 -0700, Yonghong Song wrote:
>
> [...]
>
>> diff --git a/tools/testing/selftests/bpf/prog_tests/reg_bounds.c b/tools/testing/selftests/bpf/prog_tests/reg_bounds.c
>> index eb74363f9f70..c88602908cfe 100644
>> --- a/tools/testing/selftests/bpf/prog_tests/reg_bounds.c
>> +++ b/tools/testing/selftests/bpf/prog_tests/reg_bounds.c
>> @@ -441,6 +441,22 @@ static struct range range_refine(enum num_t x_t, struct range x, enum num_t y_t,
>>           if (t_is_32(y_t) && !t_is_32(x_t)) {
>>                   struct range x_swap;
>>    
>> +               /* If we know that
>> +                *   - *x* is in the range of signed 32bit value
>> +                *   - *y_cast* range is 32-bit sign non-negative, and
>> +                * then *x* range can be narrowed to the interaction of
>> +                * *x* and *y_cast*. Otherwise, if the new range for *x*
>> +                * allows upper 32-bit 0xffffffff then the eventual new
>> +                * range for *x* will be out of signed 32-bit range
>> +                * which violates the origin *x* range.
>> +                */
>> +               if (x_t == S64 && y_t == S32 &&
>> +                   !(y_cast.a & 0xffffffff80000000ULL) && !(y_cast.b & 0xffffffff80000000) &&
>> +                   (long long)x.a >= S32_MIN && (long long)x.b <= S32_MAX) {
>> +                               return range(S64, max_t(S64, y_cast.a, x.a),
>> +                                            min_t(S64, y_cast.b, x.b));
>> +               }
>> +
>>                   /* some combinations of upper 32 bits and sign bit can lead to
>>                    * invalid ranges, in such cases it's easier to detect them
>>                    * after cast/swap than try to enumerate all the conditions
>> @@ -2108,6 +2124,9 @@ static struct subtest_case crafted_cases[] = {
>>           {S32, U32, {(u32)S32_MIN, 0}, {0, 0}},
>>           {S32, U32, {(u32)S32_MIN, 0}, {(u32)S32_MIN, (u32)S32_MIN}},
>>           {S32, U32, {(u32)S32_MIN, S32_MAX}, {S32_MAX, S32_MAX}},
>> +       {S64, U32, {0x0, 0x1f}, {0xffffffff80000000ULL, 0x000000007fffffffULL}},
>> +       {S64, U32, {0x0, 0x1f}, {0xffffffffffff8000ULL, 0x0000000000007fffULL}},
>> +       {S64, U32, {0x0, 0x1f}, {0xffffffffffffff80ULL, 0x000000000000007fULL}},
>>    };
>>    
>>    /* Go over crafted hard-coded cases. This is fast, so we do it as part of
>>
>> The logic is very similar to kernel implementation but has a difference in generating
>> the final range. In reg_bounds implementation, the range is narrowed by intersecting
>> y_cast and x range which are necessary.
>>
>> In kernel implementation, there is no interection since we only have one register
>> and two register has been compared before.
>>
>> Eduard, could you take a look at the above code?
> I think this change is correct.
> The return clause could be simplified a bit:
>
> 	return range_improve(x_t, x, y_cast);

Indeed. This is much simpler. I will use reg_bounds testing instead of verifier_ldsx testing
in next revision.

>
> [...]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH bpf-next v3 1/2] bpf: Get better reg range with ldsx and 32bit compare
  2024-07-15 23:55 ` [PATCH bpf-next v3 1/2] bpf: Get better reg range with ldsx and 32bit compare Eduard Zingerman
@ 2024-07-17  7:27   ` Shung-Hsi Yu
  0 siblings, 0 replies; 8+ messages in thread
From: Shung-Hsi Yu @ 2024-07-17  7:27 UTC (permalink / raw)
  To: Yonghong Song
  Cc: Eduard Zingerman, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, kernel-team, Martin KaFai Lau

On Mon, Jul 15, 2024 at 04:55:42PM GMT, Eduard Zingerman wrote:
> On Fri, 2024-07-12 at 16:43 -0700, Yonghong Song wrote:
> 
> [...]
> 
> > This patch fixed the issue by adding additional register deduction after 32-bit compare
> > insn. If the signed 32-bit register range is non-negative then 64-bit smin is
> > in range of [S32_MIN, S32_MAX], then the actual 64-bit smin/smax should be the same
> > as 32-bit smin32/smax32.
> 
> [...]
> 
> Acked-by: Eduard Zingerman <eddyz87@gmail.com>

Other than the already mentioned typo LGTM,

Acked-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>

> [...]
> 
> > +	 * Now, suppose that register range is in fact tighter:
> > +	 *   [0xffff_ffff_8000_0000, 0x0000_0000_ffff_ffff] (R)
> > +	 * Also suppose that it's 32-bit range is positive,
> > +	 * meaning that lower 32-bits of the full 64-bit register
> > +	 * are in the range:
> > +	 *   [0x0000_0000, 0x7fff_ffff] (W)
> > +	 *
> > +	 * It this happens, then any value in a range:
>            ^^
> Sorry, one more typo, should be "If".
> Maybe could be changed when the patch would be applied.
> 
> [...]

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2024-07-17  7:28 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-07-12 23:43 [PATCH bpf-next v3 1/2] bpf: Get better reg range with ldsx and 32bit compare Yonghong Song
2024-07-12 23:44 ` [PATCH bpf-next v3 2/2] selftests/bpf: Add ldsx selftests for ldsx and subreg compare Yonghong Song
2024-07-16  0:44   ` Eduard Zingerman
2024-07-16 22:38     ` Yonghong Song
2024-07-17  0:12       ` Eduard Zingerman
2024-07-17  6:06         ` Yonghong Song
2024-07-15 23:55 ` [PATCH bpf-next v3 1/2] bpf: Get better reg range with ldsx and 32bit compare Eduard Zingerman
2024-07-17  7:27   ` Shung-Hsi Yu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox