* [PATCH bpf-next v5 1/2] bpf: Get better reg range with ldsx and 32bit compare
@ 2024-07-23 16:29 Yonghong Song
2024-07-23 16:29 ` [PATCH bpf-next v5 2/2] selftests/bpf: Add reg_bounds tests for ldsx and subreg compare Yonghong Song
2024-07-24 2:50 ` [PATCH bpf-next v5 1/2] bpf: Get better reg range with ldsx and 32bit compare patchwork-bot+netdevbpf
0 siblings, 2 replies; 3+ messages in thread
From: Yonghong Song @ 2024-07-23 16:29 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team,
Martin KaFai Lau, Eduard Zingerman, Shung-Hsi Yu
With latest llvm19, the selftest iters/iter_arr_with_actual_elem_count
failed with -mcpu=v4.
The following are the details:
0: R1=ctx() R10=fp0
; int iter_arr_with_actual_elem_count(const void *ctx) @ iters.c:1420
0: (b4) w7 = 0 ; R7_w=0
; int i, n = loop_data.n, sum = 0; @ iters.c:1422
1: (18) r1 = 0xffffc90000191478 ; R1_w=map_value(map=iters.bss,ks=4,vs=1280,off=1144)
3: (61) r6 = *(u32 *)(r1 +128) ; R1_w=map_value(map=iters.bss,ks=4,vs=1280,off=1144) R6_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff))
; if (n > ARRAY_SIZE(loop_data.data)) @ iters.c:1424
4: (26) if w6 > 0x20 goto pc+27 ; R6_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=32,var_off=(0x0; 0x3f))
5: (bf) r8 = r10 ; R8_w=fp0 R10=fp0
6: (07) r8 += -8 ; R8_w=fp-8
; bpf_for(i, 0, n) { @ iters.c:1427
7: (bf) r1 = r8 ; R1_w=fp-8 R8_w=fp-8
8: (b4) w2 = 0 ; R2_w=0
9: (bc) w3 = w6 ; R3_w=scalar(id=1,smin=smin32=0,smax=umax=smax32=umax32=32,var_off=(0x0; 0x3f)) R6_w=scalar(id=1,smin=smin32=0,smax=umax=smax32=umax32=32,var_off=(0x0; 0x3f))
10: (85) call bpf_iter_num_new#45179 ; R0=scalar() fp-8=iter_num(ref_id=2,state=active,depth=0) refs=2
11: (bf) r1 = r8 ; R1=fp-8 R8=fp-8 refs=2
12: (85) call bpf_iter_num_next#45181 13: R0=rdonly_mem(id=3,ref_obj_id=2,sz=4) R6=scalar(id=1,smin=smin32=0,smax=umax=smax32=umax32=32,var_off=(0x0; 0x3f)) R7=0 R8=fp-8 R10=fp0 fp-8=iter_num(ref_id=2,state=active,depth=1) refs=2
; bpf_for(i, 0, n) { @ iters.c:1427
13: (15) if r0 == 0x0 goto pc+2 ; R0=rdonly_mem(id=3,ref_obj_id=2,sz=4) refs=2
14: (81) r1 = *(s32 *)(r0 +0) ; R0=rdonly_mem(id=3,ref_obj_id=2,sz=4) R1_w=scalar(smin=0xffffffff80000000,smax=0x7fffffff) refs=2
15: (ae) if w1 < w6 goto pc+4 20: R0=rdonly_mem(id=3,ref_obj_id=2,sz=4) R1=scalar(smin=0xffffffff80000000,smax=smax32=umax32=31,umax=0xffffffff0000001f,smin32=0,var_off=(0x0; 0xffffffff0000001f)) R6=scalar(id=1,smin=umin=smin32=umin32=1,smax=umax=smax32=umax32=32,var_off=(0x0; 0x3f)) R7=0 R8=fp-8 R10=fp0 fp-8=iter_num(ref_id=2,state=active,depth=1) refs=2
; sum += loop_data.data[i]; @ iters.c:1429
20: (67) r1 <<= 2 ; R1_w=scalar(smax=0x7ffffffc0000007c,umax=0xfffffffc0000007c,smin32=0,smax32=umax32=124,var_off=(0x0; 0xfffffffc0000007c)) refs=2
21: (18) r2 = 0xffffc90000191478 ; R2_w=map_value(map=iters.bss,ks=4,vs=1280,off=1144) refs=2
23: (0f) r2 += r1
math between map_value pointer and register with unbounded min value is not allowed
The source code:
int iter_arr_with_actual_elem_count(const void *ctx)
{
int i, n = loop_data.n, sum = 0;
if (n > ARRAY_SIZE(loop_data.data))
return 0;
bpf_for(i, 0, n) {
/* no rechecking of i against ARRAY_SIZE(loop_data.n) */
sum += loop_data.data[i];
}
return sum;
}
The insn #14 is a sign-extenstion load which is related to 'int i'.
The insn #15 did a subreg comparision. Note that smin=0xffffffff80000000 and this caused later
insn #23 failed verification due to unbounded min value.
Actually insn #15 R1 smin range can be better. Before insn #15, we have
R1_w=scalar(smin=0xffffffff80000000,smax=0x7fffffff)
With the above range, we know for R1, upper 32bit can only be 0xffffffff or 0.
Otherwise, the value range for R1 could be beyond [smin=0xffffffff80000000,smax=0x7fffffff].
After insn #15, for the true patch, we know smin32=0 and smax32=32. With the upper 32bit 0xffffffff,
then the corresponding value is [0xffffffff00000000, 0xffffffff00000020]. The range is
obviously beyond the original range [smin=0xffffffff80000000,smax=0x7fffffff] and the
range is not possible. So the upper 32bit must be 0, which implies smin = smin32 and
smax = smax32.
This patch fixed the issue by adding additional register deduction after 32-bit compare
insn. If the signed 32-bit register range is non-negative then 64-bit smin is
in range of [S32_MIN, S32_MAX], then the actual 64-bit smin/smax should be the same
as 32-bit smin32/smax32.
With this patch, iters/iter_arr_with_actual_elem_count succeeded with better register range:
from 15 to 20: R0=rdonly_mem(id=7,ref_obj_id=2,sz=4) R1_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=31,var_off=(0x0; 0x1f)) R6=scalar(id=1,smin=umin=smin32=umin32=1,smax=umax=smax32=umax32=32,var_off=(0x0; 0x3f)) R7=scalar(id=9,smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) R8=scalar(id=9,smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) R10=fp0 fp-8=iter_num(ref_id=2,state=active,depth=3) refs=2
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
kernel/bpf/verifier.c | 38 ++++++++++++++++++++++++++++++++++++++
1 file changed, 38 insertions(+)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 6de17b99c74d..4fd164c6d1e6 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -2182,6 +2182,44 @@ static void __reg_deduce_mixed_bounds(struct bpf_reg_state *reg)
reg->smin_value = max_t(s64, reg->smin_value, new_smin);
reg->smax_value = min_t(s64, reg->smax_value, new_smax);
}
+
+ /* Here we would like to handle a special case after sign extending load,
+ * when upper bits for a 64-bit range are all 1s or all 0s.
+ *
+ * Upper bits are all 1s when register is in a range:
+ * [0xffff_ffff_0000_0000, 0xffff_ffff_ffff_ffff]
+ * Upper bits are all 0s when register is in a range:
+ * [0x0000_0000_0000_0000, 0x0000_0000_ffff_ffff]
+ * Together this forms are continuous range:
+ * [0xffff_ffff_0000_0000, 0x0000_0000_ffff_ffff]
+ *
+ * Now, suppose that register range is in fact tighter:
+ * [0xffff_ffff_8000_0000, 0x0000_0000_ffff_ffff] (R)
+ * Also suppose that it's 32-bit range is positive,
+ * meaning that lower 32-bits of the full 64-bit register
+ * are in the range:
+ * [0x0000_0000, 0x7fff_ffff] (W)
+ *
+ * If this happens, then any value in a range:
+ * [0xffff_ffff_0000_0000, 0xffff_ffff_7fff_ffff]
+ * is smaller than a lowest bound of the range (R):
+ * 0xffff_ffff_8000_0000
+ * which means that upper bits of the full 64-bit register
+ * can't be all 1s, when lower bits are in range (W).
+ *
+ * Note that:
+ * - 0xffff_ffff_8000_0000 == (s64)S32_MIN
+ * - 0x0000_0000_7fff_ffff == (s64)S32_MAX
+ * These relations are used in the conditions below.
+ */
+ if (reg->s32_min_value >= 0 && reg->smin_value >= S32_MIN && reg->smax_value <= S32_MAX) {
+ reg->smin_value = reg->s32_min_value;
+ reg->smax_value = reg->s32_max_value;
+ reg->umin_value = reg->s32_min_value;
+ reg->umax_value = reg->s32_max_value;
+ reg->var_off = tnum_intersect(reg->var_off,
+ tnum_range(reg->smin_value, reg->smax_value));
+ }
}
static void __reg_deduce_bounds(struct bpf_reg_state *reg)
--
2.43.0
^ permalink raw reply related [flat|nested] 3+ messages in thread
* [PATCH bpf-next v5 2/2] selftests/bpf: Add reg_bounds tests for ldsx and subreg compare
2024-07-23 16:29 [PATCH bpf-next v5 1/2] bpf: Get better reg range with ldsx and 32bit compare Yonghong Song
@ 2024-07-23 16:29 ` Yonghong Song
2024-07-24 2:50 ` [PATCH bpf-next v5 1/2] bpf: Get better reg range with ldsx and 32bit compare patchwork-bot+netdevbpf
1 sibling, 0 replies; 3+ messages in thread
From: Yonghong Song @ 2024-07-23 16:29 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team,
Martin KaFai Lau, Eduard Zingerman
Add a few reg_bounds selftests to test 32/16/8-bit ldsx and subreg comparison.
Without the previous patch, all added tests will fail.
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
.../selftests/bpf/prog_tests/reg_bounds.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/tools/testing/selftests/bpf/prog_tests/reg_bounds.c b/tools/testing/selftests/bpf/prog_tests/reg_bounds.c
index eb74363f9f70..0da4225749bd 100644
--- a/tools/testing/selftests/bpf/prog_tests/reg_bounds.c
+++ b/tools/testing/selftests/bpf/prog_tests/reg_bounds.c
@@ -433,6 +433,19 @@ static struct range range_refine(enum num_t x_t, struct range x, enum num_t y_t,
y_cast = range_cast(y_t, x_t, y);
+ /* If we know that
+ * - *x* is in the range of signed 32bit value, and
+ * - *y_cast* range is 32-bit signed non-negative
+ * then *x* range can be improved with *y_cast* such that *x* range
+ * is 32-bit signed non-negative. Otherwise, if the new range for *x*
+ * allows upper 32-bit * 0xffffffff then the eventual new range for
+ * *x* will be out of signed 32-bit range which violates the origin
+ * *x* range.
+ */
+ if (x_t == S64 && y_t == S32 && y_cast.a <= S32_MAX && y_cast.b <= S32_MAX &&
+ (s64)x.a >= S32_MIN && (s64)x.b <= S32_MAX)
+ return range_improve(x_t, x, y_cast);
+
/* the case when new range knowledge, *y*, is a 32-bit subregister
* range, while previous range knowledge, *x*, is a full register
* 64-bit range, needs special treatment to take into account upper 32
@@ -2108,6 +2121,9 @@ static struct subtest_case crafted_cases[] = {
{S32, U32, {(u32)S32_MIN, 0}, {0, 0}},
{S32, U32, {(u32)S32_MIN, 0}, {(u32)S32_MIN, (u32)S32_MIN}},
{S32, U32, {(u32)S32_MIN, S32_MAX}, {S32_MAX, S32_MAX}},
+ {S64, U32, {0x0, 0x1f}, {0xffffffff80000000ULL, 0x000000007fffffffULL}},
+ {S64, U32, {0x0, 0x1f}, {0xffffffffffff8000ULL, 0x0000000000007fffULL}},
+ {S64, U32, {0x0, 0x1f}, {0xffffffffffffff80ULL, 0x000000000000007fULL}},
};
/* Go over crafted hard-coded cases. This is fast, so we do it as part of
--
2.43.0
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH bpf-next v5 1/2] bpf: Get better reg range with ldsx and 32bit compare
2024-07-23 16:29 [PATCH bpf-next v5 1/2] bpf: Get better reg range with ldsx and 32bit compare Yonghong Song
2024-07-23 16:29 ` [PATCH bpf-next v5 2/2] selftests/bpf: Add reg_bounds tests for ldsx and subreg compare Yonghong Song
@ 2024-07-24 2:50 ` patchwork-bot+netdevbpf
1 sibling, 0 replies; 3+ messages in thread
From: patchwork-bot+netdevbpf @ 2024-07-24 2:50 UTC (permalink / raw)
To: Yonghong Song
Cc: bpf, ast, andrii, daniel, kernel-team, martin.lau, eddyz87,
shung-hsi.yu
Hello:
This series was applied to bpf/bpf-next.git (master)
by Alexei Starovoitov <ast@kernel.org>:
On Tue, 23 Jul 2024 09:29:33 -0700 you wrote:
> With latest llvm19, the selftest iters/iter_arr_with_actual_elem_count
> failed with -mcpu=v4.
>
> The following are the details:
> 0: R1=ctx() R10=fp0
> ; int iter_arr_with_actual_elem_count(const void *ctx) @ iters.c:1420
> 0: (b4) w7 = 0 ; R7_w=0
> ; int i, n = loop_data.n, sum = 0; @ iters.c:1422
> 1: (18) r1 = 0xffffc90000191478 ; R1_w=map_value(map=iters.bss,ks=4,vs=1280,off=1144)
> 3: (61) r6 = *(u32 *)(r1 +128) ; R1_w=map_value(map=iters.bss,ks=4,vs=1280,off=1144) R6_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff))
> ; if (n > ARRAY_SIZE(loop_data.data)) @ iters.c:1424
> 4: (26) if w6 > 0x20 goto pc+27 ; R6_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=32,var_off=(0x0; 0x3f))
> 5: (bf) r8 = r10 ; R8_w=fp0 R10=fp0
> 6: (07) r8 += -8 ; R8_w=fp-8
> ; bpf_for(i, 0, n) { @ iters.c:1427
> 7: (bf) r1 = r8 ; R1_w=fp-8 R8_w=fp-8
> 8: (b4) w2 = 0 ; R2_w=0
> 9: (bc) w3 = w6 ; R3_w=scalar(id=1,smin=smin32=0,smax=umax=smax32=umax32=32,var_off=(0x0; 0x3f)) R6_w=scalar(id=1,smin=smin32=0,smax=umax=smax32=umax32=32,var_off=(0x0; 0x3f))
> 10: (85) call bpf_iter_num_new#45179 ; R0=scalar() fp-8=iter_num(ref_id=2,state=active,depth=0) refs=2
> 11: (bf) r1 = r8 ; R1=fp-8 R8=fp-8 refs=2
> 12: (85) call bpf_iter_num_next#45181 13: R0=rdonly_mem(id=3,ref_obj_id=2,sz=4) R6=scalar(id=1,smin=smin32=0,smax=umax=smax32=umax32=32,var_off=(0x0; 0x3f)) R7=0 R8=fp-8 R10=fp0 fp-8=iter_num(ref_id=2,state=active,depth=1) refs=2
> ; bpf_for(i, 0, n) { @ iters.c:1427
> 13: (15) if r0 == 0x0 goto pc+2 ; R0=rdonly_mem(id=3,ref_obj_id=2,sz=4) refs=2
> 14: (81) r1 = *(s32 *)(r0 +0) ; R0=rdonly_mem(id=3,ref_obj_id=2,sz=4) R1_w=scalar(smin=0xffffffff80000000,smax=0x7fffffff) refs=2
> 15: (ae) if w1 < w6 goto pc+4 20: R0=rdonly_mem(id=3,ref_obj_id=2,sz=4) R1=scalar(smin=0xffffffff80000000,smax=smax32=umax32=31,umax=0xffffffff0000001f,smin32=0,var_off=(0x0; 0xffffffff0000001f)) R6=scalar(id=1,smin=umin=smin32=umin32=1,smax=umax=smax32=umax32=32,var_off=(0x0; 0x3f)) R7=0 R8=fp-8 R10=fp0 fp-8=iter_num(ref_id=2,state=active,depth=1) refs=2
> ; sum += loop_data.data[i]; @ iters.c:1429
> 20: (67) r1 <<= 2 ; R1_w=scalar(smax=0x7ffffffc0000007c,umax=0xfffffffc0000007c,smin32=0,smax32=umax32=124,var_off=(0x0; 0xfffffffc0000007c)) refs=2
> 21: (18) r2 = 0xffffc90000191478 ; R2_w=map_value(map=iters.bss,ks=4,vs=1280,off=1144) refs=2
> 23: (0f) r2 += r1
> math between map_value pointer and register with unbounded min value is not allowed
>
> [...]
Here is the summary with links:
- [bpf-next,v5,1/2] bpf: Get better reg range with ldsx and 32bit compare
https://git.kernel.org/bpf/bpf-next/c/2e716457a823
- [bpf-next,v5,2/2] selftests/bpf: Add reg_bounds tests for ldsx and subreg compare
https://git.kernel.org/bpf/bpf-next/c/1d6189935f1d
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2024-07-24 2:50 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-07-23 16:29 [PATCH bpf-next v5 1/2] bpf: Get better reg range with ldsx and 32bit compare Yonghong Song
2024-07-23 16:29 ` [PATCH bpf-next v5 2/2] selftests/bpf: Add reg_bounds tests for ldsx and subreg compare Yonghong Song
2024-07-24 2:50 ` [PATCH bpf-next v5 1/2] bpf: Get better reg range with ldsx and 32bit compare patchwork-bot+netdevbpf
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox