* [PATCH bpf-next 1/2] bpf: Get better reg range with ldsx and 32bit compare
@ 2024-07-10 4:29 Yonghong Song
2024-07-10 4:29 ` [PATCH bpf-next 2/2] selftests/bpf: Add ldsx selftests for ldsx and subreg compare Yonghong Song
2024-07-11 22:20 ` [PATCH bpf-next 1/2] bpf: Get better reg range with ldsx and 32bit compare Eduard Zingerman
0 siblings, 2 replies; 6+ messages in thread
From: Yonghong Song @ 2024-07-10 4:29 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team,
Martin KaFai Lau
With latest llvm19, the selftest iters/iter_arr_with_actual_elem_count
failed with -mcpu=v4.
The following are the details:
0: R1=ctx() R10=fp0
; int iter_arr_with_actual_elem_count(const void *ctx) @ iters.c:1420
0: (b4) w7 = 0 ; R7_w=0
; int i, n = loop_data.n, sum = 0; @ iters.c:1422
1: (18) r1 = 0xffffc90000191478 ; R1_w=map_value(map=iters.bss,ks=4,vs=1280,off=1144)
3: (61) r6 = *(u32 *)(r1 +128) ; R1_w=map_value(map=iters.bss,ks=4,vs=1280,off=1144) R6_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff))
; if (n > ARRAY_SIZE(loop_data.data)) @ iters.c:1424
4: (26) if w6 > 0x20 goto pc+27 ; R6_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=32,var_off=(0x0; 0x3f))
5: (bf) r8 = r10 ; R8_w=fp0 R10=fp0
6: (07) r8 += -8 ; R8_w=fp-8
; bpf_for(i, 0, n) { @ iters.c:1427
7: (bf) r1 = r8 ; R1_w=fp-8 R8_w=fp-8
8: (b4) w2 = 0 ; R2_w=0
9: (bc) w3 = w6 ; R3_w=scalar(id=1,smin=smin32=0,smax=umax=smax32=umax32=32,var_off=(0x0; 0x3f)) R6_w=scalar(id=1,smin=smin32=0,smax=umax=smax32=umax32=32,var_off=(0x0; 0x3f))
10: (85) call bpf_iter_num_new#45179 ; R0=scalar() fp-8=iter_num(ref_id=2,state=active,depth=0) refs=2
11: (bf) r1 = r8 ; R1=fp-8 R8=fp-8 refs=2
12: (85) call bpf_iter_num_next#45181 13: R0=rdonly_mem(id=3,ref_obj_id=2,sz=4) R6=scalar(id=1,smin=smin32=0,smax=umax=smax32=umax32=32,var_off=(0x0; 0x3f)) R7=0 R8=fp-8 R10=fp0 fp-8=iter_num(ref_id=2,state=active,depth=1) refs=2
; bpf_for(i, 0, n) { @ iters.c:1427
13: (15) if r0 == 0x0 goto pc+2 ; R0=rdonly_mem(id=3,ref_obj_id=2,sz=4) refs=2
14: (81) r1 = *(s32 *)(r0 +0) ; R0=rdonly_mem(id=3,ref_obj_id=2,sz=4) R1_w=scalar(smin=0xffffffff80000000,smax=0x7fffffff) refs=2
15: (ae) if w1 < w6 goto pc+4 20: R0=rdonly_mem(id=3,ref_obj_id=2,sz=4) R1=scalar(smin=0xffffffff80000000,smax=smax32=umax32=31,umax=0xffffffff0000001f,smin32=0,var_off=(0x0; 0xffffffff0000001f)) R6=scalar(id=1,smin=umin=smin32=umin32=1,smax=umax=smax32=umax32=32,var_off=(0x0; 0x3f)) R7=0 R8=fp-8 R10=fp0 fp-8=iter_num(ref_id=2,state=active,depth=1) refs=2
; sum += loop_data.data[i]; @ iters.c:1429
20: (67) r1 <<= 2 ; R1_w=scalar(smax=0x7ffffffc0000007c,umax=0xfffffffc0000007c,smin32=0,smax32=umax32=124,var_off=(0x0; 0xfffffffc0000007c)) refs=2
21: (18) r2 = 0xffffc90000191478 ; R2_w=map_value(map=iters.bss,ks=4,vs=1280,off=1144) refs=2
23: (0f) r2 += r1
math between map_value pointer and register with unbounded min value is not allowed
The source code:
int iter_arr_with_actual_elem_count(const void *ctx)
{
int i, n = loop_data.n, sum = 0;
if (n > ARRAY_SIZE(loop_data.data))
return 0;
bpf_for(i, 0, n) {
/* no rechecking of i against ARRAY_SIZE(loop_data.n) */
sum += loop_data.data[i];
}
return sum;
}
Insn #14 is a sign-extension load, which corresponds to 'int i'.
Insn #15 does a subreg (32-bit) comparison. Note that smin=0xffffffff80000000, and this later
caused insn #23 to fail verification due to an unbounded min value.
Actually, the R1 smin range after insn #15 can be better. Before insn #15, we have
R1_w=scalar(smin=0xffffffff80000000,smax=0x7fffffff)
With the above range, we know the upper 32 bits of R1 can only be 0xffffffff or 0;
otherwise, the value of R1 would fall outside [smin=0xffffffff80000000,smax=0x7fffffff].
After insn #15, on the true path, we know smin32=0 and smax32=31. If the upper 32 bits were
0xffffffff, the corresponding values would be in [0xffffffff00000000, 0xffffffff0000001f]. That
range is clearly beyond the original range [smin=0xffffffff80000000,smax=0x7fffffff], so it
is not possible. Therefore the upper 32 bits must be 0, which implies smin = smin32 and
smax = smax32.
This patch fixes the issue by adding an additional register deduction after a 32-bit compare
insn: if the signed 32-bit register range is non-negative, the 64-bit smin equals
{S32/S16/S8}_MIN, and the 64-bit smax is no greater than {S32/S16/S8}_MAX, then the 64-bit
range can be narrowed to the 32-bit range.
Here, we check smin against {S32/S16/S8}_MIN since this is the most common result of a
sign-extension load.
With this patch, iters/iter_arr_with_actual_elem_count succeeded with better register range:
from 15 to 20: R0=rdonly_mem(id=7,ref_obj_id=2,sz=4) R1_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=31,var_off=(0x0; 0x1f)) R6=scalar(id=1,smin=umin=smin32=umin32=1,smax=umax=smax32=umax32=32,var_off=(0x0; 0x3f)) R7=scalar(id=9,smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) R8=scalar(id=9,smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) R10=fp0 fp-8=iter_num(ref_id=2,state=active,depth=3) refs=2
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
kernel/bpf/verifier.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index c0263fb5ca4b..3fc557f99b24 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -2182,6 +2182,21 @@ static void __reg_deduce_mixed_bounds(struct bpf_reg_state *reg)
reg->smin_value = max_t(s64, reg->smin_value, new_smin);
reg->smax_value = min_t(s64, reg->smax_value, new_smax);
}
+
+ /* if s32 range is non-negative and s64 range is in [S32/S16/S8_MIN, <= S32/S16/S8_MAX],
+ * the s64/u64 range can be refined.
+ */
+ if (reg->s32_min_value >= 0) {
+ if ((reg->smin_value == S32_MIN && reg->smax_value <= S32_MAX) ||
+ (reg->smin_value == S16_MIN && reg->smax_value <= S16_MAX) ||
+ (reg->smin_value == S8_MIN && reg->smax_value <= S8_MAX)) {
+ reg->smin_value = reg->umin_value = reg->s32_min_value;
+ reg->smax_value = reg->umax_value = reg->s32_max_value;
+ reg->var_off = tnum_intersect(reg->var_off,
+ tnum_range(reg->smin_value,
+ reg->smax_value));
+ }
+ }
}
static void __reg_deduce_bounds(struct bpf_reg_state *reg)
--
2.43.0
* [PATCH bpf-next 2/2] selftests/bpf: Add ldsx selftests for ldsx and subreg compare
2024-07-10 4:29 [PATCH bpf-next 1/2] bpf: Get better reg range with ldsx and 32bit compare Yonghong Song
@ 2024-07-10 4:29 ` Yonghong Song
2024-07-11 22:20 ` [PATCH bpf-next 1/2] bpf: Get better reg range with ldsx and 32bit compare Eduard Zingerman
1 sibling, 0 replies; 6+ messages in thread
From: Yonghong Song @ 2024-07-10 4:29 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team,
Martin KaFai Lau
Add a few selftests covering a 32/16/8-bit ldsx followed by a subreg comparison.
Without the previous patch, all added tests will fail.
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
.../selftests/bpf/progs/verifier_ldsx.c | 106 ++++++++++++++++++
1 file changed, 106 insertions(+)
diff --git a/tools/testing/selftests/bpf/progs/verifier_ldsx.c b/tools/testing/selftests/bpf/progs/verifier_ldsx.c
index d4427d8e1217..3b96a9436a0c 100644
--- a/tools/testing/selftests/bpf/progs/verifier_ldsx.c
+++ b/tools/testing/selftests/bpf/progs/verifier_ldsx.c
@@ -144,6 +144,112 @@ __naked void ldsx_s32_range(void)
: __clobber_all);
}
+#define MAX_ENTRIES 12
+
+struct test_val {
+ int foo[MAX_ENTRIES];
+};
+
+struct {
+ __uint(type, BPF_MAP_TYPE_HASH);
+ __uint(max_entries, 1);
+ __type(key, long long);
+ __type(value, struct test_val);
+} map_hash_48b SEC(".maps");
+
+SEC("socket")
+__description("LDSX, S8, subreg compare")
+__success __success_unpriv
+__naked void ldsx_s8_subreg_compare(void)
+{
+ asm volatile (
+ "call %[bpf_get_prandom_u32];"
+ "*(u64 *)(r10 - 8) = r0;"
+ "w6 = w0;"
+ "if w6 > 0x1f goto l0_%=;"
+ "r7 = *(s8 *)(r10 - 8);"
+ "if w7 > w6 goto l0_%=;"
+ "r1 = 0;"
+ "*(u64 *)(r10 - 8) = r1;"
+ "r2 = r10;"
+ "r2 += -8;"
+ "r1 = %[map_hash_48b] ll;"
+ "call %[bpf_map_lookup_elem];"
+ "if r0 == 0 goto l0_%=;"
+ "r0 += r7;"
+ "*(u64 *)(r0 + 0) = 1;"
+"l0_%=:"
+ "r0 = 0;"
+ "exit;"
+ :
+ : __imm(bpf_get_prandom_u32),
+ __imm_addr(map_hash_48b),
+ __imm(bpf_map_lookup_elem)
+ : __clobber_all);
+}
+
+SEC("socket")
+__description("LDSX, S16, subreg compare")
+__success __success_unpriv
+__naked void ldsx_s16_subreg_compare(void)
+{
+ asm volatile (
+ "call %[bpf_get_prandom_u32];"
+ "*(u64 *)(r10 - 8) = r0;"
+ "w6 = w0;"
+ "if w6 > 0x1f goto l0_%=;"
+ "r7 = *(s16 *)(r10 - 8);"
+ "if w7 > w6 goto l0_%=;"
+ "r1 = 0;"
+ "*(u64 *)(r10 - 8) = r1;"
+ "r2 = r10;"
+ "r2 += -8;"
+ "r1 = %[map_hash_48b] ll;"
+ "call %[bpf_map_lookup_elem];"
+ "if r0 == 0 goto l0_%=;"
+ "r0 += r7;"
+ "*(u64 *)(r0 + 0) = 1;"
+"l0_%=:"
+ "r0 = 0;"
+ "exit;"
+ :
+ : __imm(bpf_get_prandom_u32),
+ __imm_addr(map_hash_48b),
+ __imm(bpf_map_lookup_elem)
+ : __clobber_all);
+}
+
+SEC("socket")
+__description("LDSX, S32, subreg compare")
+__success __success_unpriv
+__naked void ldsx_s32_subreg_compare(void)
+{
+ asm volatile (
+ "call %[bpf_get_prandom_u32];"
+ "*(u64 *)(r10 - 8) = r0;"
+ "w6 = w0;"
+ "if w6 > 0x1f goto l0_%=;"
+ "r7 = *(s32 *)(r10 - 8);"
+ "if w7 > w6 goto l0_%=;"
+ "r1 = 0;"
+ "*(u64 *)(r10 - 8) = r1;"
+ "r2 = r10;"
+ "r2 += -8;"
+ "r1 = %[map_hash_48b] ll;"
+ "call %[bpf_map_lookup_elem];"
+ "if r0 == 0 goto l0_%=;"
+ "r0 += r7;"
+ "*(u64 *)(r0 + 0) = 1;"
+"l0_%=:"
+ "r0 = 0;"
+ "exit;"
+ :
+ : __imm(bpf_get_prandom_u32),
+ __imm_addr(map_hash_48b),
+ __imm(bpf_map_lookup_elem)
+ : __clobber_all);
+}
+
#else
SEC("socket")
--
2.43.0
* Re: [PATCH bpf-next 1/2] bpf: Get better reg range with ldsx and 32bit compare
2024-07-10 4:29 [PATCH bpf-next 1/2] bpf: Get better reg range with ldsx and 32bit compare Yonghong Song
2024-07-10 4:29 ` [PATCH bpf-next 2/2] selftests/bpf: Add ldsx selftests for ldsx and subreg compare Yonghong Song
@ 2024-07-11 22:20 ` Eduard Zingerman
2024-07-12 5:07 ` Yonghong Song
1 sibling, 1 reply; 6+ messages in thread
From: Eduard Zingerman @ 2024-07-11 22:20 UTC (permalink / raw)
To: Yonghong Song, bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team,
Martin KaFai Lau
On Tue, 2024-07-09 at 21:29 -0700, Yonghong Song wrote:
[...]
> 14: (81) r1 = *(s32 *)(r0 +0) ; R0=rdonly_mem(id=3,ref_obj_id=2,sz=4) R1_w=scalar(smin=0xffffffff80000000,smax=0x7fffffff) refs=2
> 15: (ae) if w1 < w6 goto pc+4 20: R0=rdonly_mem(id=3,ref_obj_id=2,sz=4) R1=scalar(smin=0xffffffff80000000,smax=smax32=umax32=31,umax=0xffffffff0000001f,smin32=0,var_off=(0x0; 0xffffffff0000001f)) R6=scalar(id=1,smin=umin=smin32=umin32=1,smax=umax=smax32=umax32=32,var_off=(0x0; 0x3f)) R7=0 R8=fp-8 R10=fp0 fp-8=iter_num(ref_id=2,state=active,depth=1) refs=2
[...]
> The insn #14 is a sign-extenstion load which is related to 'int i'.
> The insn #15 did a subreg comparision. Note that smin=0xffffffff80000000 and this caused later
> insn #23 failed verification due to unbounded min value.
>
> Actually insn #15 R1 smin range can be better. Before insn #15, we have
> R1_w=scalar(smin=0xffffffff80000000,smax=0x7fffffff)
> With the above range, we know for R1, upper 32bit can only be 0xffffffff or 0.
> Otherwise, the value range for R1 could be beyond [smin=0xffffffff80000000,smax=0x7fffffff].
>
> After insn #15, for the true patch, we know smin32=0 and smax32=32. With the upper 32bit 0xffffffff,
> then the corresponding value is [0xffffffff00000000, 0xffffffff00000020]. The range is
> obviously beyond the original range [smin=0xffffffff80000000,smax=0x7fffffff] and the
> range is not possible. So the upper 32bit must be 0, which implies smin = smin32 and
> smax = smax32.
>
> This patch fixed the issue by adding additional register deduction after 32-bit compare
> insn such that if the signed 32-bit register range is non-negative and 64-bit smin is
> {S32/S16/S8}_MIN and 64-bit max is no greater than {U32/U16/U8}_MAX.
> Here, we check smin with {S32/S16/S8}_MIN since this is the most common result related to
> signed extension load.
[...]
> Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
> ---
> kernel/bpf/verifier.c | 15 +++++++++++++++
> 1 file changed, 15 insertions(+)
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index c0263fb5ca4b..3fc557f99b24 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -2182,6 +2182,21 @@ static void __reg_deduce_mixed_bounds(struct bpf_reg_state *reg)
> reg->smin_value = max_t(s64, reg->smin_value, new_smin);
> reg->smax_value = min_t(s64, reg->smax_value, new_smax);
> }
> +
> + /* if s32 range is non-negative and s64 range is in [S32/S16/S8_MIN, <= S32/S16/S8_MAX],
> + * the s64/u64 range can be refined.
> + */
Hi Yonghong,
Sorry for the delayed response. Nice patch, it finally clicked for me.
I'd suggest a slightly different comment; maybe it's just me being
slow, but it took a while to understand why this is correct.
How about a text like below:
Here we would like to handle a special case after a sign-extending load,
when the upper bits of a 64-bit range are all 1s or all 0s.
Upper bits are all 1s when the register is in the range:
[0xffff_ffff_0000_0000, 0xffff_ffff_ffff_ffff]
Upper bits are all 0s when register is in a range:
[0x0000_0000_0000_0000, 0x0000_0000_ffff_ffff]
Together these form a continuous range:
[0xffff_ffff_0000_0000, 0x0000_0000_ffff_ffff]
Now, suppose that register range is in fact tighter:
[0xffff_ffff_8000_0000, 0x0000_0000_ffff_ffff] (R)
Also suppose that its 32-bit range is positive,
meaning that lower 32-bits of the full 64-bit register
are in the range:
[0x0000_0000, 0x7fff_ffff] (W)
It so happens that any value in the range:
[0xffff_ffff_0000_0000, 0xffff_ffff_7fff_ffff]
is smaller than the lowest bound of the range (R):
0xffff_ffff_8000_0000
which means that upper bits of the full 64-bit register
can't be all 1s, when lower bits are in range (W).
Note that:
- 0xffff_ffff_8000_0000 == (s64)S32_MIN
- 0x0000_0000_ffff_ffff == (s64)S32_MAX
These relations are used in the conditions below.
> + if (reg->s32_min_value >= 0) {
> + if ((reg->smin_value == S32_MIN && reg->smax_value <= S32_MAX) ||
> + (reg->smin_value == S16_MIN && reg->smax_value <= S16_MAX) ||
> + (reg->smin_value == S8_MIN && reg->smax_value <= S8_MAX)) {
The explanation above also raises a question: would it be correct to
replace the checks above with a single one?
reg->smin_value >= S32_MIN && reg->smax_value <= S32_MAX
> + reg->smin_value = reg->umin_value = reg->s32_min_value;
> + reg->smax_value = reg->umax_value = reg->s32_max_value;
> + reg->var_off = tnum_intersect(reg->var_off,
> + tnum_range(reg->smin_value,
> + reg->smax_value));
> + }
> + }
> }
>
> static void __reg_deduce_bounds(struct bpf_reg_state *reg)
* Re: [PATCH bpf-next 1/2] bpf: Get better reg range with ldsx and 32bit compare
2024-07-11 22:20 ` [PATCH bpf-next 1/2] bpf: Get better reg range with ldsx and 32bit compare Eduard Zingerman
@ 2024-07-12 5:07 ` Yonghong Song
2024-07-12 18:30 ` Alexei Starovoitov
0 siblings, 1 reply; 6+ messages in thread
From: Yonghong Song @ 2024-07-12 5:07 UTC (permalink / raw)
To: Eduard Zingerman, bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team,
Martin KaFai Lau
On 7/11/24 3:20 PM, Eduard Zingerman wrote:
> On Tue, 2024-07-09 at 21:29 -0700, Yonghong Song wrote:
>
> [...]
>
>> 14: (81) r1 = *(s32 *)(r0 +0) ; R0=rdonly_mem(id=3,ref_obj_id=2,sz=4) R1_w=scalar(smin=0xffffffff80000000,smax=0x7fffffff) refs=2
>> 15: (ae) if w1 < w6 goto pc+4 20: R0=rdonly_mem(id=3,ref_obj_id=2,sz=4) R1=scalar(smin=0xffffffff80000000,smax=smax32=umax32=31,umax=0xffffffff0000001f,smin32=0,var_off=(0x0; 0xffffffff0000001f)) R6=scalar(id=1,smin=umin=smin32=umin32=1,smax=umax=smax32=umax32=32,var_off=(0x0; 0x3f)) R7=0 R8=fp-8 R10=fp0 fp-8=iter_num(ref_id=2,state=active,depth=1) refs=2
> [...]
>
>> The insn #14 is a sign-extenstion load which is related to 'int i'.
>> The insn #15 did a subreg comparision. Note that smin=0xffffffff80000000 and this caused later
>> insn #23 failed verification due to unbounded min value.
>>
>> Actually insn #15 R1 smin range can be better. Before insn #15, we have
>> R1_w=scalar(smin=0xffffffff80000000,smax=0x7fffffff)
>> With the above range, we know for R1, upper 32bit can only be 0xffffffff or 0.
>> Otherwise, the value range for R1 could be beyond [smin=0xffffffff80000000,smax=0x7fffffff].
>>
>> After insn #15, for the true patch, we know smin32=0 and smax32=32. With the upper 32bit 0xffffffff,
>> then the corresponding value is [0xffffffff00000000, 0xffffffff00000020]. The range is
>> obviously beyond the original range [smin=0xffffffff80000000,smax=0x7fffffff] and the
>> range is not possible. So the upper 32bit must be 0, which implies smin = smin32 and
>> smax = smax32.
>>
>> This patch fixed the issue by adding additional register deduction after 32-bit compare
>> insn such that if the signed 32-bit register range is non-negative and 64-bit smin is
>> {S32/S16/S8}_MIN and 64-bit max is no greater than {U32/U16/U8}_MAX.
>> Here, we check smin with {S32/S16/S8}_MIN since this is the most common result related to
>> signed extension load.
> [...]
>
>> Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
>> ---
>> kernel/bpf/verifier.c | 15 +++++++++++++++
>> 1 file changed, 15 insertions(+)
>>
>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>> index c0263fb5ca4b..3fc557f99b24 100644
>> --- a/kernel/bpf/verifier.c
>> +++ b/kernel/bpf/verifier.c
>> @@ -2182,6 +2182,21 @@ static void __reg_deduce_mixed_bounds(struct bpf_reg_state *reg)
>> reg->smin_value = max_t(s64, reg->smin_value, new_smin);
>> reg->smax_value = min_t(s64, reg->smax_value, new_smax);
>> }
>> +
>> + /* if s32 range is non-negative and s64 range is in [S32/S16/S8_MIN, <= S32/S16/S8_MAX],
>> + * the s64/u64 range can be refined.
>> + */
> Hi Yonghong,
>
> Sorry for delayed response, nice patch, it finally clicked for me.
> I'd suggest a slightly different comment, maybe it's just me being
> slow, but it took a while to understand why is this correct.
> How about a text like below:
>
> Here we would like to handle a special case after sign extending load,
> when upper bits for a 64-bit range are all 1s or all 0s.
>
> Upper bits are all 1s when register is in a rage:
> [0xffff_ffff_0000_0000, 0xffff_ffff_ffff_ffff]
> Upper bits are all 0s when register is in a range:
> [0x0000_0000_0000_0000, 0x0000_0000_ffff_ffff]
> Together this forms are continuous range:
> [0xffff_ffff_0000_0000, 0x0000_0000_ffff_ffff]
>
> Now, suppose that register range is in fact tighter:
> [0xffff_ffff_8000_0000, 0x0000_0000_ffff_ffff] (R)
> Also suppose that it's 32-bit range is positive,
> meaning that lower 32-bits of the full 64-bit register
> are in the range:
> [0x0000_0000, 0x7fff_ffff] (W)
>
> It so happens, that any value in a range:
> [0xffff_ffff_0000_0000, 0xffff_ffff_7fff_ffff]
> is smaller than a lowest bound of the range (R):
> 0xffff_ffff_8000_0000
> which means that upper bits of the full 64-bit register
> can't be all 1s, when lower bits are in range (W).
>
> Note that:
> - 0xffff_ffff_8000_0000 == (s64)S32_MIN
> - 0x0000_0000_ffff_ffff == (s64)S32_MAX
> These relations are used in the conditions below.
Sounds good. I will add some comments like the above in v2.
>
>> + if (reg->s32_min_value >= 0) {
>> + if ((reg->smin_value == S32_MIN && reg->smax_value <= S32_MAX) ||
>> + (reg->smin_value == S16_MIN && reg->smax_value <= S16_MAX) ||
>> + (reg->smin_value == S8_MIN && reg->smax_value <= S8_MAX)) {
> The explanation above also lands a question, would it be correct to
> replace the checks above by a single one?
>
> reg->smin_value >= S32_MIN && reg->smax_value <= S32_MAX
You are correct, the range check can be better. The following is the related
description in the commit message:
> This patch fixed the issue by adding additional register deduction after 32-bit compare
> insn such that if the signed 32-bit register range is non-negative and 64-bit smin is
> {S32/S16/S8}_MIN and 64-bit max is no greater than {U32/U16/U8}_MAX.
> Here, we check smin with {S32/S16/S8}_MIN since this is the most common result related to
> signed extension load.
The current code simply represents the most common pattern.
Since you mention this, I will revise it as below in v2:
reg->smin_value >= S32_MIN && reg->smin_value < 0 && reg->smax_value <= S32_MAX
>
>> + reg->smin_value = reg->umin_value = reg->s32_min_value;
>> + reg->smax_value = reg->umax_value = reg->s32_max_value;
>> + reg->var_off = tnum_intersect(reg->var_off,
>> + tnum_range(reg->smin_value,
>> + reg->smax_value));
>> + }
>> + }
>> }
>>
>> static void __reg_deduce_bounds(struct bpf_reg_state *reg)
* Re: [PATCH bpf-next 1/2] bpf: Get better reg range with ldsx and 32bit compare
2024-07-12 5:07 ` Yonghong Song
@ 2024-07-12 18:30 ` Alexei Starovoitov
2024-07-12 20:10 ` Yonghong Song
0 siblings, 1 reply; 6+ messages in thread
From: Alexei Starovoitov @ 2024-07-12 18:30 UTC (permalink / raw)
To: Yonghong Song
Cc: Eduard Zingerman, bpf, Alexei Starovoitov, Andrii Nakryiko,
Daniel Borkmann, Kernel Team, Martin KaFai Lau
On Thu, Jul 11, 2024 at 10:07 PM Yonghong Song <yonghong.song@linux.dev> wrote:
>
> >
> > Here we would like to handle a special case after sign extending load,
> > when upper bits for a 64-bit range are all 1s or all 0s.
> >
> > Upper bits are all 1s when register is in a rage:
> > [0xffff_ffff_0000_0000, 0xffff_ffff_ffff_ffff]
> > Upper bits are all 0s when register is in a range:
> > [0x0000_0000_0000_0000, 0x0000_0000_ffff_ffff]
> > Together this forms are continuous range:
> > [0xffff_ffff_0000_0000, 0x0000_0000_ffff_ffff]
> >
> > Now, suppose that register range is in fact tighter:
> > [0xffff_ffff_8000_0000, 0x0000_0000_ffff_ffff] (R)
> > Also suppose that it's 32-bit range is positive,
> > meaning that lower 32-bits of the full 64-bit register
> > are in the range:
> > [0x0000_0000, 0x7fff_ffff] (W)
> >
> > It so happens, that any value in a range:
> > [0xffff_ffff_0000_0000, 0xffff_ffff_7fff_ffff]
> > is smaller than a lowest bound of the range (R):
> > 0xffff_ffff_8000_0000
> > which means that upper bits of the full 64-bit register
> > can't be all 1s, when lower bits are in range (W).
> >
> > Note that:
> > - 0xffff_ffff_8000_0000 == (s64)S32_MIN
> > - 0x0000_0000_ffff_ffff == (s64)S32_MAX
> > These relations are used in the conditions below.
>
> Sounds good. I will add some comments like the above in v2.
I would add Ed's explanation verbatim as a comment to verifier.c
> >
> >> + if (reg->s32_min_value >= 0) {
> >> + if ((reg->smin_value == S32_MIN && reg->smax_value <= S32_MAX) ||
> >> + (reg->smin_value == S16_MIN && reg->smax_value <= S16_MAX) ||
> >> + (reg->smin_value == S8_MIN && reg->smax_value <= S8_MAX)) {
> > The explanation above also lands a question, would it be correct to
> > replace the checks above by a single one?
> >
> > reg->smin_value >= S32_MIN && reg->smax_value <= S32_MAX
>
> You are correct, the range check can be better. The following is the related
> description in the commit message:
>
> > This patch fixed the issue by adding additional register deduction after 32-bit compare
> > insn such that if the signed 32-bit register range is non-negative and 64-bit smin is
> > {S32/S16/S8}_MIN and 64-bit max is no greater than {U32/U16/U8}_MAX.
> > Here, we check smin with {S32/S16/S8}_MIN since this is the most common result related to
> > signed extension load.
>
> The corrent code simply represents the most common pattern.
> Since you mention this, I will resive it as below in v2:
> reg->smin_value >= S32_MIN && reg->smin_value < 0 && reg->smax_value <= S32_MAX
Why add smin_value < 0 check ?
I'd think
if (reg->s32_min_value >= 0 && reg->smin_value >= S32_MIN &&
reg->smax_value <= S32_MAX)
is enough?
If smin_value is >=0 it's fine to reassign it with s32_min_value
which is positive as well.
* Re: [PATCH bpf-next 1/2] bpf: Get better reg range with ldsx and 32bit compare
2024-07-12 18:30 ` Alexei Starovoitov
@ 2024-07-12 20:10 ` Yonghong Song
0 siblings, 0 replies; 6+ messages in thread
From: Yonghong Song @ 2024-07-12 20:10 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: Eduard Zingerman, bpf, Alexei Starovoitov, Andrii Nakryiko,
Daniel Borkmann, Kernel Team, Martin KaFai Lau
On 7/12/24 11:30 AM, Alexei Starovoitov wrote:
> On Thu, Jul 11, 2024 at 10:07 PM Yonghong Song <yonghong.song@linux.dev> wrote:
>>> Here we would like to handle a special case after sign extending load,
>>> when upper bits for a 64-bit range are all 1s or all 0s.
>>>
>>> Upper bits are all 1s when register is in a rage:
>>> [0xffff_ffff_0000_0000, 0xffff_ffff_ffff_ffff]
>>> Upper bits are all 0s when register is in a range:
>>> [0x0000_0000_0000_0000, 0x0000_0000_ffff_ffff]
>>> Together this forms are continuous range:
>>> [0xffff_ffff_0000_0000, 0x0000_0000_ffff_ffff]
>>>
>>> Now, suppose that register range is in fact tighter:
>>> [0xffff_ffff_8000_0000, 0x0000_0000_ffff_ffff] (R)
>>> Also suppose that it's 32-bit range is positive,
>>> meaning that lower 32-bits of the full 64-bit register
>>> are in the range:
>>> [0x0000_0000, 0x7fff_ffff] (W)
>>>
>>> It so happens, that any value in a range:
>>> [0xffff_ffff_0000_0000, 0xffff_ffff_7fff_ffff]
>>> is smaller than a lowest bound of the range (R):
>>> 0xffff_ffff_8000_0000
>>> which means that upper bits of the full 64-bit register
>>> can't be all 1s, when lower bits are in range (W).
>>>
>>> Note that:
>>> - 0xffff_ffff_8000_0000 == (s64)S32_MIN
>>> - 0x0000_0000_ffff_ffff == (s64)S32_MAX
>>> These relations are used in the conditions below.
>> Sounds good. I will add some comments like the above in v2.
> I would add Ed's explanation verbatim as a comment to verifier.c
>
>>>> + if (reg->s32_min_value >= 0) {
>>>> + if ((reg->smin_value == S32_MIN && reg->smax_value <= S32_MAX) ||
>>>> + (reg->smin_value == S16_MIN && reg->smax_value <= S16_MAX) ||
>>>> + (reg->smin_value == S8_MIN && reg->smax_value <= S8_MAX)) {
>>> The explanation above also lands a question, would it be correct to
>>> replace the checks above by a single one?
>>>
>>> reg->smin_value >= S32_MIN && reg->smax_value <= S32_MAX
>> You are correct, the range check can be better. The following is the related
>> description in the commit message:
>>
>>> This patch fixed the issue by adding additional register deduction after 32-bit compare
>>> insn such that if the signed 32-bit register range is non-negative and 64-bit smin is
>>> {S32/S16/S8}_MIN and 64-bit max is no greater than {U32/U16/U8}_MAX.
>>> Here, we check smin with {S32/S16/S8}_MIN since this is the most common result related to
>>> signed extension load.
>> The corrent code simply represents the most common pattern.
>> Since you mention this, I will resive it as below in v2:
>> reg->smin_value >= S32_MIN && reg->smin_value < 0 && reg->smax_value <= S32_MAX
> Why add smin_value < 0 check ?
>
> I'd think
> if (reg->s32_min_value >= 0 && reg->smin_value >= S32_MIN &&
> reg->smax_value <= S32_MAX)
>
> is enough?
This is enough and correct. As you mentioned below, if smin_value >= 0 the reassignment is
just redundant work, but it does no harm.
I will do as you suggested.
>
> If smin_value is >=0 it's fine to reassign it with s32_min_value
> which is positive as well.