* [PATCH bpf-next v2 1/2] bpf: Track aligned st store as imprecise spilled registers
@ 2024-01-03 23:26 Yonghong Song
2024-01-03 23:26 ` [PATCH bpf-next v2 2/2] selftests/bpf: Add a selftest with not-8-byte aligned BPF_ST Yonghong Song
` (5 more replies)
0 siblings, 6 replies; 26+ messages in thread
From: Yonghong Song @ 2024-01-03 23:26 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team,
Martin KaFai Lau, Kuniyuki Iwashima, Martin KaFai Lau
With patch set [1], precision backtracing supports register spill/fill
to/from the stack. Patch [2] allows an initial imprecise register spill
with content 0. This is a common pattern on cpuv3 and lower for
initializing stack variables:
  r1 = 0
  *(u64 *)(r10 - 8) = r1
and [2] demonstrated good verification improvements.
For cpuv4, the initialization could be
  *(u64 *)(r10 - 8) = 0
The current verifier marks the r10-8 contents with STACK_ZERO.
Similar to [2], let us permit the above insn to behave like an
imprecise register spill, which can reduce the number of verified states.
The change is in function check_stack_write_fixed_off().
Before this patch, a stored zero was marked as STACK_ZERO, which can
provide precise values. In check_stack_write_var_off(), STACK_ZERO is
maintained when a const zero is written, so it can later provide
precise values if needed.
Handling '*(u64 *)(r10 - 8) = 0' as a spill causes an issue in
check_stack_write_var_off(), as the spill would be converted to
STACK_MISC and the precise value 0 would be lost. To fix this issue,
if the spill slot holds a const zero and the BPF_ST also writes a
const zero, the spill slot is preserved, so it can later provide
precise values if needed. Without the change in
check_stack_write_var_off(), the test_verifier subtest
'BPF_ST_MEM stack imm zero, variable offset' would fail.
I checked cpuv3 and cpuv4 with and without this patch using veristat.
There is no state change for cpuv3, since '*(u64 *)(r10 - 8) = 0'
is only generated with cpuv4.
For cpuv4:
$ ../veristat -C old.cpuv4.csv new.cpuv4.csv -e file,prog,insns,states -f 'insns_diff!=0'
File Program Insns (A) Insns (B) Insns (DIFF) States (A) States (B) States (DIFF)
------------------------------------------ ------------------- --------- --------- --------------- ---------- ---------- -------------
local_storage_bench.bpf.linked3.o get_local 228 168 -60 (-26.32%) 17 14 -3 (-17.65%)
pyperf600_bpf_loop.bpf.linked3.o on_event 6066 4889 -1177 (-19.40%) 403 321 -82 (-20.35%)
test_cls_redirect.bpf.linked3.o cls_redirect 35483 35387 -96 (-0.27%) 2179 2177 -2 (-0.09%)
test_l4lb_noinline.bpf.linked3.o balancer_ingress 4494 4522 +28 (+0.62%) 217 219 +2 (+0.92%)
test_l4lb_noinline_dynptr.bpf.linked3.o balancer_ingress 1432 1455 +23 (+1.61%) 92 94 +2 (+2.17%)
test_xdp_noinline.bpf.linked3.o balancer_ingress_v6 3462 3458 -4 (-0.12%) 216 216 +0 (+0.00%)
verifier_iterating_callbacks.bpf.linked3.o widening 52 41 -11 (-21.15%) 4 3 -1 (-25.00%)
xdp_synproxy_kern.bpf.linked3.o syncookie_tc 12412 11719 -693 (-5.58%) 345 330 -15 (-4.35%)
xdp_synproxy_kern.bpf.linked3.o syncookie_xdp 12478 11794 -684 (-5.48%) 346 331 -15 (-4.34%)
test_l4lb_noinline and test_l4lb_noinline_dynptr have minor regressions, but
pyperf600_bpf_loop and local_storage_bench get pretty good improvements.
[1] https://lore.kernel.org/all/20231205184248.1502704-1-andrii@kernel.org/
[2] https://lore.kernel.org/all/20231205184248.1502704-9-andrii@kernel.org/
Cc: Kuniyuki Iwashima <kuniyu@amazon.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
kernel/bpf/verifier.c | 21 +++++++++++++++++--
.../selftests/bpf/progs/verifier_spill_fill.c | 16 +++++++-------
2 files changed, 27 insertions(+), 10 deletions(-)
Changelog:
v1 -> v2:
  - Preserve a const-zero spill if the write is also const zero
    in check_stack_write_var_off().
  - Add a test with a not-8-byte-aligned BPF_ST store.
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index d4e31f61de0e..cfe7a68d90a5 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -4491,7 +4491,7 @@ static int check_stack_write_fixed_off(struct bpf_verifier_env *env,
 		if (fls64(reg->umax_value) > BITS_PER_BYTE * size)
 			state->stack[spi].spilled_ptr.id = 0;
 	} else if (!reg && !(off % BPF_REG_SIZE) && is_bpf_st_mem(insn) &&
-		   insn->imm != 0 && env->bpf_capable) {
+		   env->bpf_capable) {
 		struct bpf_reg_state fake_reg = {};
 
 		__mark_reg_known(&fake_reg, insn->imm);
@@ -4613,11 +4613,28 @@ static int check_stack_write_var_off(struct bpf_verifier_env *env,
 
 	/* Variable offset writes destroy any spilled pointers in range. */
 	for (i = min_off; i < max_off; i++) {
+		struct bpf_reg_state *spill_reg;
 		u8 new_type, *stype;
-		int slot, spi;
+		int slot, spi, j;
 
 		slot = -i - 1;
 		spi = slot / BPF_REG_SIZE;
+
+		/* If writing_zero and the spi slot contains a spill of value 0,
+		 * maintain the spill type.
+		 */
+		if (writing_zero && !(i % BPF_REG_SIZE) && is_spilled_scalar_reg(&state->stack[spi])) {
+			spill_reg = &state->stack[spi].spilled_ptr;
+			if (tnum_is_const(spill_reg->var_off) && spill_reg->var_off.value == 0) {
+				for (j = BPF_REG_SIZE; j > 0; j--) {
+					if (state->stack[spi].slot_type[j - 1] != STACK_SPILL)
+						break;
+				}
+				i += BPF_REG_SIZE - j - 1;
+				continue;
+			}
+		}
+
 		stype = &state->stack[spi].slot_type[slot % BPF_REG_SIZE];
 		mark_stack_slot_scratched(env, spi);
diff --git a/tools/testing/selftests/bpf/progs/verifier_spill_fill.c b/tools/testing/selftests/bpf/progs/verifier_spill_fill.c
index 39fe3372e0e0..d4b3188afe07 100644
--- a/tools/testing/selftests/bpf/progs/verifier_spill_fill.c
+++ b/tools/testing/selftests/bpf/progs/verifier_spill_fill.c
@@ -495,14 +495,14 @@ char single_byte_buf[1] SEC(".data.single_byte_buf");
 SEC("raw_tp")
 __log_level(2)
 __success
-/* make sure fp-8 is all STACK_ZERO */
-__msg("2: (7a) *(u64 *)(r10 -8) = 0          ; R10=fp0 fp-8_w=00000000")
+/* fp-8 is spilled IMPRECISE value zero (represented by a zero value fake reg) */
+__msg("2: (7a) *(u64 *)(r10 -8) = 0          ; R10=fp0 fp-8_w=0")
 /* but fp-16 is spilled IMPRECISE zero const reg */
 __msg("4: (7b) *(u64 *)(r10 -16) = r0        ; R0_w=0 R10=fp0 fp-16_w=0")
-/* validate that assigning R2 from STACK_ZERO doesn't mark register
+/* validate that assigning R2 from STACK_SPILL with zero value doesn't mark register
  * precise immediately; if necessary, it will be marked precise later
  */
-__msg("6: (71) r2 = *(u8 *)(r10 -1)          ; R2_w=0 R10=fp0 fp-8_w=00000000")
+__msg("6: (71) r2 = *(u8 *)(r10 -1)          ; R2_w=0 R10=fp0 fp-8_w=0")
 /* similarly, when R2 is assigned from spilled register, it is initially
  * imprecise, but will be marked precise later once it is used in precise context
  */
@@ -520,14 +520,14 @@ __msg("mark_precise: frame0: regs=r0 stack= before 3: (b7) r0 = 0")
 __naked void partial_stack_load_preserves_zeros(void)
 {
 	asm volatile (
-		/* fp-8 is all STACK_ZERO */
+		/* fp-8 is value zero (represented by a zero value fake reg) */
 		".8byte %[fp8_st_zero];" /* LLVM-18+: *(u64 *)(r10 -8) = 0; */
 		/* fp-16 is const zero register */
 		"r0 = 0;"
 		"*(u64 *)(r10 -16) = r0;"
-		/* load single U8 from non-aligned STACK_ZERO slot */
+		/* load single U8 from non-aligned spilled value zero slot */
 		"r1 = %[single_byte_buf];"
 		"r2 = *(u8 *)(r10 -1);"
 		"r1 += r2;"
@@ -539,7 +539,7 @@ __naked void partial_stack_load_preserves_zeros(void)
 		"r1 += r2;"
 		"*(u8 *)(r1 + 0) = r2;" /* this should be fine */
-		/* load single U16 from non-aligned STACK_ZERO slot */
+		/* load single U16 from non-aligned spilled value zero slot */
 		"r1 = %[single_byte_buf];"
 		"r2 = *(u16 *)(r10 -2);"
 		"r1 += r2;"
@@ -551,7 +551,7 @@ __naked void partial_stack_load_preserves_zeros(void)
 		"r1 += r2;"
 		"*(u8 *)(r1 + 0) = r2;" /* this should be fine */
-		/* load single U32 from non-aligned STACK_ZERO slot */
+		/* load single U32 from non-aligned spilled value zero slot */
 		"r1 = %[single_byte_buf];"
 		"r2 = *(u32 *)(r10 -4);"
 		"r1 += r2;"
--
2.34.1
^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH bpf-next v2 2/2] selftests/bpf: Add a selftest with not-8-byte aligned BPF_ST
  2024-01-03 23:26 ` Yonghong Song
From: Yonghong Song @ 2024-01-03 23:26 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team,
	Martin KaFai Lau

Add a selftest with a 4-byte BPF_ST of 0 where the store is not 8-byte
aligned. The goal is to ensure that STACK_ZERO is properly marked for
the store and that the STACK_ZERO value can propagate properly during
the load.

Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
 .../selftests/bpf/progs/verifier_spill_fill.c | 44 +++++++++++++++++++
 1 file changed, 44 insertions(+)

diff --git a/tools/testing/selftests/bpf/progs/verifier_spill_fill.c b/tools/testing/selftests/bpf/progs/verifier_spill_fill.c
index d4b3188afe07..6017b26d957d 100644
--- a/tools/testing/selftests/bpf/progs/verifier_spill_fill.c
+++ b/tools/testing/selftests/bpf/progs/verifier_spill_fill.c
@@ -583,6 +583,50 @@ __naked void partial_stack_load_preserves_zeros(void)
 		: __clobber_common);
 }
 
+SEC("raw_tp")
+__log_level(2)
+__success
+/* fp-4 is STACK_ZERO */
+__msg("2: (62) *(u32 *)(r10 -4) = 0          ; R10=fp0 fp-8=0000????")
+/* validate that assigning R2 from STACK_ZERO with zero value doesn't mark register
+ * precise immediately; if necessary, it will be marked precise later
+ */
+__msg("4: (71) r2 = *(u8 *)(r10 -1)          ; R2_w=0 R10=fp0 fp-8=0000????")
+__msg("5: (0f) r1 += r2")
+__msg("mark_precise: frame0: last_idx 5 first_idx 0 subseq_idx -1")
+__msg("mark_precise: frame0: regs=r2 stack= before 4: (71) r2 = *(u8 *)(r10 -1)")
+__naked void partial_stack_load_preserves_partial_zeros(void)
+{
+	asm volatile (
+		/* fp-4 is value zero */
+		".8byte %[fp4_st_zero];" /* LLVM-18+: *(u32 *)(r10 -4) = 0; */
+
+		/* load single U8 from non-aligned stack zero slot */
+		"r1 = %[single_byte_buf];"
+		"r2 = *(u8 *)(r10 -1);"
+		"r1 += r2;"
+		"*(u8 *)(r1 + 0) = r2;" /* this should be fine */
+
+		/* load single U16 from non-aligned stack zero slot */
+		"r1 = %[single_byte_buf];"
+		"r2 = *(u16 *)(r10 -2);"
+		"r1 += r2;"
+		"*(u8 *)(r1 + 0) = r2;" /* this should be fine */
+
+		/* load single U32 from non-aligned stack zero slot */
+		"r1 = %[single_byte_buf];"
+		"r2 = *(u32 *)(r10 -4);"
+		"r1 += r2;"
+		"*(u8 *)(r1 + 0) = r2;" /* this should be fine */
+
+		"r0 = 0;"
+		"exit;"
+		:
+		: __imm_ptr(single_byte_buf),
+		  __imm_insn(fp4_st_zero, BPF_ST_MEM(BPF_W, BPF_REG_FP, -4, 0))
+		: __clobber_common);
+}
+
 char two_byte_buf[2] SEC(".data.two_byte_buf");
 
 SEC("raw_tp")
-- 
2.34.1
* Re: [PATCH bpf-next v2 1/2] bpf: Track aligned st store as imprecise spilled registers
  2024-01-04 16:37 ` Eduard Zingerman
From: Eduard Zingerman @ 2024-01-04 16:37 UTC (permalink / raw)
To: Yonghong Song, bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team,
	Martin KaFai Lau, Kuniyuki Iwashima, Martin KaFai Lau

On Wed, 2024-01-03 at 15:26 -0800, Yonghong Song wrote:
> With patch set [1], precision backtracing supports register spill/fill
> to/from the stack. The patch [2] allows initial imprecise register spill
> with content 0. This is a common case for cpuv3 and lower for
> initializing the stack variables with pattern
>   r1 = 0
>   *(u64 *)(r10 - 8) = r1
> and the [2] has demonstrated good verification improvement.
>
> For cpuv4, the initialization could be
>   *(u64 *)(r10 - 8) = 0
> The current verifier marks the r10-8 contents with STACK_ZERO.
> Similar to [2], let us permit the above insn to behave like
> imprecise register spill which can reduce number of verified states.
> The change is in function check_stack_write_fixed_off().

Hi Yonghong,

I agree with this change, but I don't understand under which conditions
the current STACK_ZERO logic is sub-optimal.
I tried executing the test case from patch #2 without applying patch #1,
and it passes. Could you please elaborate / conjure a test case that
would fail w/o patch #1?

Thanks,
Eduard

[...]
* Re: [PATCH bpf-next v2 1/2] bpf: Track aligned st store as imprecise spilled registers
  2024-01-04 17:13 ` Yonghong Song
From: Yonghong Song @ 2024-01-04 17:13 UTC (permalink / raw)
To: Eduard Zingerman, bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team,
	Martin KaFai Lau, Kuniyuki Iwashima, Martin KaFai Lau

On 1/4/24 8:37 AM, Eduard Zingerman wrote:
> On Wed, 2024-01-03 at 15:26 -0800, Yonghong Song wrote:
>> [...]
> Hi Yonghong,
>
> I agree with this change, but I don't understand under which conditions
> current STACK_ZERO logic is sub-optimal.
> I tried executing test case from patch #2 w/o applying patch #1 and it passes.
> Could you please elaborate / conjure a test case that would fail w/o patch #1?

The logic is similar to
https://lore.kernel.org/all/20231205184248.1502704-9-andrii@kernel.org/

The STACK_ZERO logic is sub-optimal in some cases only w.r.t. the number
of verifier states, so there is no correctness issue.

Patch 2 was added in response to Andrii's request in
https://lore.kernel.org/all/CAEf4BzaWets3fHUGtctwCNWecR9ASRCO2kFagNy8jJZmPBWYDA@mail.gmail.com/
Since with patch 1 the original STACK_ZERO case is converted to
STACK_SPILL, patch 2 was added to cover the STACK_ZERO case. So with or
without patch 1, patch 2 will succeed, since it uses the STACK_ZERO
logic.

> Thanks,
> Eduard
>
> [...]
* Re: [PATCH bpf-next v2 1/2] bpf: Track aligned st store as imprecise spilled registers
  2024-01-04 18:43 ` Eduard Zingerman
From: Eduard Zingerman @ 2024-01-04 18:43 UTC (permalink / raw)
To: Yonghong Song, bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team,
	Martin KaFai Lau, Kuniyuki Iwashima, Martin KaFai Lau

On Thu, 2024-01-04 at 09:13 -0800, Yonghong Song wrote:
> On 1/4/24 8:37 AM, Eduard Zingerman wrote:
[...]
>> I agree with this change, but I don't understand under which conditions
>> current STACK_ZERO logic is sub-optimal.
>> I tried executing test case from patch #2 w/o applying patch #1 and it passes.
>> Could you please elaborate / conjure a test case that would fail w/o patch #1?
>
> The logic is similar to
> https://lore.kernel.org/all/20231205184248.1502704-9-andrii@kernel.org/
>
> STACK_ZERO logic is sub-optimal in some cases only w.r.t. the number of
> verifier states. So there is no correctness issue.
> Patch 2 is added in response to Andrii's request in
> https://lore.kernel.org/all/CAEf4BzaWets3fHUGtctwCNWecR9ASRCO2kFagNy8jJZmPBWYDA@mail.gmail.com/
> Since with patch 1 the original STACK_ZERO case is converted to STACK_SPILL,
> Patch 2 is added to cover STACK_ZERO case. So with or without patch 1,
> patch 2 will succeed since it uses STACK_ZERO logic.

Understood, thank you.
Probably no need to add more tests then.

A patch [1] might be related; it handles STACK_ZERO vs zero spill vs
unbound scalar spill on the regsafe side.

[1] https://lore.kernel.org/bpf/20231220214013.3327288-15-maxtram95@gmail.com/
* Re: [PATCH bpf-next v2 1/2] bpf: Track aligned st store as imprecise spilled registers
  2024-01-04 18:30 ` Eduard Zingerman
From: Eduard Zingerman @ 2024-01-04 18:30 UTC (permalink / raw)
To: Yonghong Song, bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team,
	Martin KaFai Lau, Kuniyuki Iwashima, Martin KaFai Lau

On Wed, 2024-01-03 at 15:26 -0800, Yonghong Song wrote:

I missed one thing while looking at this patch, please see below.

[...]

> @@ -4613,11 +4613,28 @@ static int check_stack_write_var_off(struct bpf_verifier_env *env,
>
>  	/* Variable offset writes destroy any spilled pointers in range. */
>  	for (i = min_off; i < max_off; i++) {
> +		struct bpf_reg_state *spill_reg;
>  		u8 new_type, *stype;
> -		int slot, spi;
> +		int slot, spi, j;
>
>  		slot = -i - 1;
>  		spi = slot / BPF_REG_SIZE;
> +
> +		/* If writing_zero and the spi slot contains a spill of value 0,
> +		 * maintain the spill type.
> +		 */
> +		if (writing_zero && !(i % BPF_REG_SIZE) && is_spilled_scalar_reg(&state->stack[spi])) {

Talked to Andrii today, and he noted that the spilled reg should be
marked precise at this point.

> +			spill_reg = &state->stack[spi].spilled_ptr;
> +			if (tnum_is_const(spill_reg->var_off) && spill_reg->var_off.value == 0) {
> +				for (j = BPF_REG_SIZE; j > 0; j--) {
> +					if (state->stack[spi].slot_type[j - 1] != STACK_SPILL)
> +						break;
> +				}
> +				i += BPF_REG_SIZE - j - 1;
> +				continue;
> +			}
> +		}
> +
>  		stype = &state->stack[spi].slot_type[slot % BPF_REG_SIZE];
>  		mark_stack_slot_scratched(env, spi);
* Re: [PATCH bpf-next v2 1/2] bpf: Track aligned st store as imprecise spilled registers
  2024-01-04 20:12 ` Yonghong Song
From: Yonghong Song @ 2024-01-04 20:12 UTC (permalink / raw)
To: Eduard Zingerman, bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team,
	Martin KaFai Lau, Kuniyuki Iwashima, Martin KaFai Lau

On 1/4/24 10:30 AM, Eduard Zingerman wrote:
> On Wed, 2024-01-03 at 15:26 -0800, Yonghong Song wrote:
>
> I missed one thing while looking at this patch, please see below.
>
> [...]
>
>> +		/* If writing_zero and the spi slot contains a spill of value 0,
>> +		 * maintain the spill type.
>> +		 */
>> +		if (writing_zero && !(i % BPF_REG_SIZE) && is_spilled_scalar_reg(&state->stack[spi])) {
> Talked to Andrii today, and he noted that spilled reg should be marked
> precise at this point.

Could you help explain why?

Looks like we did not mark the reg as precise with a fixed offset, as below:

	if (reg && !(off % BPF_REG_SIZE) && register_is_bounded(reg) && env->bpf_capable) {
		save_register_state(env, state, spi, reg, size);
		/* Break the relation on a narrowing spill. */
		if (fls64(reg->umax_value) > BITS_PER_BYTE * size)
			state->stack[spi].spilled_ptr.id = 0;
	} else if (!reg && !(off % BPF_REG_SIZE) && is_bpf_st_mem(insn) &&
		   insn->imm != 0 && env->bpf_capable) {

I probably missed something about precision tracking...

>> +			spill_reg = &state->stack[spi].spilled_ptr;
>> [...]
>>  		stype = &state->stack[spi].slot_type[slot % BPF_REG_SIZE];
>>  		mark_stack_slot_scratched(env, spi);
* Re: [PATCH bpf-next v2 1/2] bpf: Track aligned st store as imprecise spilled registers
  2024-01-04 21:10 ` Eduard Zingerman
From: Eduard Zingerman @ 2024-01-04 21:10 UTC (permalink / raw)
To: Yonghong Song, bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team,
	Martin KaFai Lau, Kuniyuki Iwashima, Martin KaFai Lau

On Thu, 2024-01-04 at 12:12 -0800, Yonghong Song wrote:
[...]
>>> +		if (writing_zero && !(i % BPF_REG_SIZE) && is_spilled_scalar_reg(&state->stack[spi])) {
>> Talked to Andrii today, and he noted that spilled reg should be marked
>> precise at this point.
>
> Could you help explain why?
>
> Looks we did not mark reg as precise with fixed offset as below:
>
> 	if (reg && !(off % BPF_REG_SIZE) && register_is_bounded(reg) && env->bpf_capable) {
> 		save_register_state(env, state, spi, reg, size);
> 		/* Break the relation on a narrowing spill. */
> 		if (fls64(reg->umax_value) > BITS_PER_BYTE * size)
> 			state->stack[spi].spilled_ptr.id = 0;
> 	} else if (!reg && !(off % BPF_REG_SIZE) && is_bpf_st_mem(insn) &&
> 		   insn->imm != 0 && env->bpf_capable) {
>
> I probably missed something about precision tracking...

The discussed justification was that if the verifier assumes something
about the value of a scalar (in this case that it is 0), such a scalar
should be marked precise (e.g. this is done for value_regno in
check_stack_write_var_off()).

This seemed logical at the time of discussion; however, I can't figure
out a counter-example at the moment. It appears that whatever the
assumptions in check_stack_write_var_off() are, if the spill is used in
a precise context it would be marked eventually.
E.g. the following is correctly rejected:

SEC("raw_tp")
__log_level(2) __flag(BPF_F_TEST_STATE_FREQ)
__failure
__naked void var_stack_1(void)
{
	asm volatile (
		"call %[bpf_get_prandom_u32];"
		"r9 = 100500;"
		"if r0 > 42 goto +1;"
		"r9 = 0;"
		"*(u64 *)(r10 - 16) = r9;"
		"call %[bpf_get_prandom_u32];"
		"r0 &= 0xf;"
		"r1 = -1;"
		"r1 -= r0;"
		"r2 = r10;"
		"r2 += r1;"
		"r0 = 0;"
		"*(u8 *)(r2 + 0) = r0;"
		"r1 = %[two_byte_buf];"
		"r2 = *(u32 *)(r10 -16);"
		"r1 += r2;"
		"*(u8 *)(r1 + 0) = r2;" /* this should not be fine */
		"exit;"
		:
		: __imm_ptr(two_byte_buf),
		  __imm(bpf_get_prandom_u32)
		: __clobber_common);
}

So now I'm not sure :(
Sorry for too much noise.
* Re: [PATCH bpf-next v2 1/2] bpf: Track aligned st store as imprecise spilled registers
  2024-01-04 23:09 ` Andrii Nakryiko
From: Andrii Nakryiko @ 2024-01-04 23:09 UTC (permalink / raw)
To: Eduard Zingerman
Cc: Yonghong Song, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, kernel-team, Martin KaFai Lau, Kuniyuki Iwashima,
	Martin KaFai Lau

On Thu, Jan 4, 2024 at 1:11 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Thu, 2024-01-04 at 12:12 -0800, Yonghong Song wrote:
> [...]
> This seemed logical at the time of discussion, however, I can't figure
> a counter example at the moment. It appears that whatever are
> assumptions in check_stack_write_var_off() if spill is used in the
> precise context it would be marked eventually.
> E.g. the following is correctly rejected:
>
> [... var_stack_1 test elided ...]
>
> So now I'm not sure :(
> Sorry for too much noise.

hm... does that test have to do so many things and do all these u64 vs
u32 vs u8 conversions? Can we try a simple test where we spill a u64
SCALAR (imprecise) zero register to fp-8 or fp-16, and then use that
fp-8|fp-16 slot as an index into an array in a precise context? Then
have a separate delayed branch that will write non-zero to fp-8|fp-16.
States shouldn't converge and this should be rejected.

Yonghong, the reason the fixed-offset stack write works is that we know
exactly the stack slot in which the spilled register is, and we can
backtrack and mark it as precise, if necessary. With variable-offset
stack access there is no single stack slot (in the general case), so we
lose the link to that spilled register. So we need to either eagerly
mark spilled registers as precise or just do STACK_MISC kind of logic.
* Re: [PATCH bpf-next v2 1/2] bpf: Track aligned st store as imprecise spilled registers
  2024-01-04 23:29 ` Eduard Zingerman
From: Eduard Zingerman @ 2024-01-04 23:29 UTC (permalink / raw)
To: Andrii Nakryiko
Cc: Yonghong Song, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, kernel-team, Martin KaFai Lau, Kuniyuki Iwashima,
	Martin KaFai Lau

On Thu, 2024-01-04 at 15:09 -0800, Andrii Nakryiko wrote:
[...]
> hm... does that test have to do so many things and do all these u64 vs
> u32 vs u8 conversions?

The test is actually quite minimal; the longest part is conjuring a
varying-offset pointer in r2. Here it is with additional comments:

	/* Write 0 or 100500 to fp-16, 0 is on the first verification pass */
	"call %[bpf_get_prandom_u32];"
	"r9 = 100500;"
	"if r0 > 42 goto +1;"
	"r9 = 0;"
	"*(u64 *)(r10 - 16) = r9;"
	/* prepare a variable length access */
	"call %[bpf_get_prandom_u32];"
	"r0 &= 0xf;"	/* r0 range is [0; 15] */
	"r1 = -1;"
	"r1 -= r0;"	/* r1 range is [-16; -1] */
	"r2 = r10;"
	"r2 += r1;"	/* r2 range is [fp-16; fp-1] */
	/* do a variable length write of constant 0 */
	"r0 = 0;"
	"*(u8 *)(r2 + 0) = r0;"
	/* use fp-16 to access an array of length 2 */
	"r1 = %[two_byte_buf];"
	"r2 = *(u32 *)(r10 -16);"
	"r1 += r2;"
	"*(u8 *)(r1 + 0) = r2;" /* this should not be fine */
	"exit;"

> Can we try a simple test were we spill u64
> SCALAR (imprecise) zero register to fp-8 or fp-16, and then use those
> fp-8|fp-16 slot as an index into an array in precise context. Then
> have a separate delayed branch that will write non-zero to fp-8|fp-16.
> States shouldn't converge and this should be rejected.

That is what the test above does, but it also includes a varying-offset
access.

[...]
* Re: [PATCH bpf-next v2 1/2] bpf: Track aligned st store as imprecise spilled registers 2024-01-04 23:29 ` Eduard Zingerman @ 2024-01-05 1:05 ` Andrii Nakryiko 2024-01-05 7:14 ` Yonghong Song 2024-01-05 23:52 ` Eduard Zingerman 0 siblings, 2 replies; 26+ messages in thread From: Andrii Nakryiko @ 2024-01-05 1:05 UTC (permalink / raw) To: Eduard Zingerman Cc: Yonghong Song, bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team, Martin KaFai Lau, Kuniyuki Iwashima, Martin KaFai Lau On Thu, Jan 4, 2024 at 3:29 PM Eduard Zingerman <eddyz87@gmail.com> wrote: > > On Thu, 2024-01-04 at 15:09 -0800, Andrii Nakryiko wrote: > [...] > > > This seemed logical at the time of discussion, however, I can't figure > > > a counter example at the moment. It appears that whatever are > > > assumptions in check_stack_write_var_off() if spill is used in the > > > precise context it would be marked eventually. > > > E.g. the following is correctly rejected: > > > > > > SEC("raw_tp") > > > __log_level(2) __flag(BPF_F_TEST_STATE_FREQ) > > > __failure > > > __naked void var_stack_1(void) > > > { > > > asm volatile ( > > > "call %[bpf_get_prandom_u32];" > > > "r9 = 100500;" > > > "if r0 > 42 goto +1;" > > > "r9 = 0;" > > > "*(u64 *)(r10 - 16) = r9;" > > > "call %[bpf_get_prandom_u32];" > > > "r0 &= 0xf;" > > > "r1 = -1;" > > > "r1 -= r0;" > > > "r2 = r10;" > > > "r2 += r1;" > > > "r0 = 0;" > > > "*(u8 *)(r2 + 0) = r0;" > > > "r1 = %[two_byte_buf];" > > > "r2 = *(u32 *)(r10 -16);" > > > "r1 += r2;" > > > "*(u8 *)(r1 + 0) = r2;" /* this should not be fine */ > > > "exit;" > > > : > > > : __imm_ptr(two_byte_buf), > > > __imm(bpf_get_prandom_u32) > > > : __clobber_common); > > > } > > > > > > So now I'm not sure :( > > > Sorry for too much noise. > > > > > > hm... does that test have to do so many things and do all these u64 vs > > u32 vs u8 conversions? 
> > The test is actually quite minimal, the longest part is conjuring of > varying offset pointer in r2, here it is with additional comments: > > /* Write 0 or 100500 to fp-16, 0 is on the first verification pass */ > "call %[bpf_get_prandom_u32];" > "r9 = 100500;" > "if r0 > 42 goto +1;" > "r9 = 0;" > "*(u64 *)(r10 - 16) = r9;" > /* prepare a variable length access */ > "call %[bpf_get_prandom_u32];" > "r0 &= 0xf;" /* r0 range is [0; 15] */ > "r1 = -1;" > "r1 -= r0;" /* r1 range is [-16; -1] */ > "r2 = r10;" > "r2 += r1;" /* r2 range is [fp-16; fp-1] */ > /* do a variable length write of constant 0 */ > "r0 = 0;" > "*(u8 *)(r2 + 0) = r0;" I meant this u8 > /* use fp-16 to access an array of length 2 */ > "r1 = %[two_byte_buf];" > "r2 = *(u32 *)(r10 -16);" and this u32. I'm not saying it's anything wrong, but it's simpler to deal with u64 consistently. There is nothing wrong with the test per se, I'm just saying we should try eliminate unnecessary cross-plays with narrowing/widening stores/loads. But that's offtopic, sorry. > "r1 += r2;" > "*(u8 *)(r1 + 0) = r2;" /* this should not be fine */ > "exit;" > > > Can we try a simple test were we spill u64 > > SCALAR (imprecise) zero register to fp-8 or fp-16, and then use those > > fp-8|fp-16 slot as an index into an array in precise context. Then > > have a separate delayed branch that will write non-zero to fp-8|fp-16. > > States shouldn't converge and this should be rejected. > > That is what test above does but it also includes varying offset access. > Yes, and the test fails, but if you read the log, you'll see that fp-8 is never marked precise, but it should. So we need more elaborate test that would somehow exploit fp-8 imprecision. I ran out of time. But what I tried was replacing "r2 = *(u32 *)(r10 -16);" with "r2 = *(u8 *)(r2 +0);" So keep both read and write as variable offset. 
And we are saved by
some missing logic in read_var_off that would set r2 as known zero
(because it should be for the branch where both fp-8 and fp-16 are
zero). But that fails in the branch that should succeed, and if that
branch actually succeeds, I suspect the branch where we initialize
with non-zero r9 will erroneously succeed.

Anyways, I still claim that we are mishandling a precision of spilled
register when doing zero var_off writes.

> [...]

^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: [PATCH bpf-next v2 1/2] bpf: Track aligned st store as imprecise spilled registers 2024-01-05 1:05 ` Andrii Nakryiko @ 2024-01-05 7:14 ` Yonghong Song 2024-01-05 8:10 ` Yonghong Song 2024-01-05 23:37 ` Eduard Zingerman 2024-01-05 23:52 ` Eduard Zingerman 1 sibling, 2 replies; 26+ messages in thread From: Yonghong Song @ 2024-01-05 7:14 UTC (permalink / raw) To: Andrii Nakryiko, Eduard Zingerman Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team, Martin KaFai Lau, Kuniyuki Iwashima, Martin KaFai Lau On 1/4/24 5:05 PM, Andrii Nakryiko wrote: > On Thu, Jan 4, 2024 at 3:29 PM Eduard Zingerman <eddyz87@gmail.com> wrote: >> On Thu, 2024-01-04 at 15:09 -0800, Andrii Nakryiko wrote: >> [...] >>>> This seemed logical at the time of discussion, however, I can't figure >>>> a counter example at the moment. It appears that whatever are >>>> assumptions in check_stack_write_var_off() if spill is used in the >>>> precise context it would be marked eventually. >>>> E.g. the following is correctly rejected: >>>> >>>> SEC("raw_tp") >>>> __log_level(2) __flag(BPF_F_TEST_STATE_FREQ) >>>> __failure >>>> __naked void var_stack_1(void) >>>> { >>>> asm volatile ( >>>> "call %[bpf_get_prandom_u32];" >>>> "r9 = 100500;" >>>> "if r0 > 42 goto +1;" >>>> "r9 = 0;" >>>> "*(u64 *)(r10 - 16) = r9;" >>>> "call %[bpf_get_prandom_u32];" >>>> "r0 &= 0xf;" >>>> "r1 = -1;" >>>> "r1 -= r0;" >>>> "r2 = r10;" >>>> "r2 += r1;" >>>> "r0 = 0;" >>>> "*(u8 *)(r2 + 0) = r0;" >>>> "r1 = %[two_byte_buf];" >>>> "r2 = *(u32 *)(r10 -16);" >>>> "r1 += r2;" >>>> "*(u8 *)(r1 + 0) = r2;" /* this should not be fine */ >>>> "exit;" >>>> : >>>> : __imm_ptr(two_byte_buf), >>>> __imm(bpf_get_prandom_u32) >>>> : __clobber_common); >>>> } >>>> >>>> So now I'm not sure :( >>>> Sorry for too much noise. >>> >>> hm... does that test have to do so many things and do all these u64 vs >>> u32 vs u8 conversions? 
>> The test is actually quite minimal, the longest part is conjuring of >> varying offset pointer in r2, here it is with additional comments: >> >> /* Write 0 or 100500 to fp-16, 0 is on the first verification pass */ >> "call %[bpf_get_prandom_u32];" >> "r9 = 100500;" >> "if r0 > 42 goto +1;" >> "r9 = 0;" >> "*(u64 *)(r10 - 16) = r9;" >> /* prepare a variable length access */ >> "call %[bpf_get_prandom_u32];" >> "r0 &= 0xf;" /* r0 range is [0; 15] */ >> "r1 = -1;" >> "r1 -= r0;" /* r1 range is [-16; -1] */ >> "r2 = r10;" >> "r2 += r1;" /* r2 range is [fp-16; fp-1] */ >> /* do a variable length write of constant 0 */ >> "r0 = 0;" >> "*(u8 *)(r2 + 0) = r0;" > I meant this u8 > >> /* use fp-16 to access an array of length 2 */ >> "r1 = %[two_byte_buf];" >> "r2 = *(u32 *)(r10 -16);" > and this u32. I'm not saying it's anything wrong, but it's simpler to > deal with u64 consistently. There is nothing wrong with the test per > se, I'm just saying we should try eliminate unnecessary cross-plays > with narrowing/widening stores/loads. > > But that's offtopic, sorry. > >> "r1 += r2;" >> "*(u8 *)(r1 + 0) = r2;" /* this should not be fine */ >> "exit;" >> >>> Can we try a simple test were we spill u64 >>> SCALAR (imprecise) zero register to fp-8 or fp-16, and then use those >>> fp-8|fp-16 slot as an index into an array in precise context. Then >>> have a separate delayed branch that will write non-zero to fp-8|fp-16. >>> States shouldn't converge and this should be rejected. >> That is what test above does but it also includes varying offset access. >> > Yes, and the test fails, but if you read the log, you'll see that fp-8 > is never marked precise, but it should. So we need more elaborate test > that would somehow exploit fp-8 imprecision. > > I ran out of time. But what I tried was replacing > > > "r2 = *(u32 *)(r10 -16);" > > with > > "r2 = *(u8 *)(r2 +0);" > > So keep both read and write as variable offset. 
And we are saved by > some missing logic in read_var_off that would set r2 as known zero > (because it should be for the branch where both fp-8 and fp-16 are > zero). But that fails in the branch that should succeed, and if that > branch actually succeeds, I suspect the branch where we initialize > with non-zero r9 will erroneously succeed. I did some experiments but still confused. With the current patch set and the above Andrii's suggested changes, we have ... 13: R1_w=scalar(smin=smin32=-16,smax=smax32=-1,umin=0xfffffffffffffff0,umin32=0xfffffff0,var_off=(0xfffffffffffffff0; 0xf)) R2_w=fp(smin=smin32=-16,smax=smax32=-1,umin=0xfffffffffffffff0,umin32=0xfffffff0,var_off=(0xfffffffffffffff0; 0xf)) 13: (b7) r0 = 0 ; R0_w=0 14: (73) *(u8 *)(r2 +0) = r0 ; R0_w=0 R2_w=fp(smin=smin32=-16,smax=smax32=-1,umin=0xfffffffffffffff0,umin32=0xfffffff0,var_off=(0xfffffffffffffff0; 0xf)) fp-8=mmmmmmmm fp-16=0 15: (bf) r1 = r6 ; R1_w=map_value(map=.data.two_byte_,ks=4,vs=2) R6=map_value(map=.data.two_byte_,ks=4,vs=2) 16: (71) r2 = *(u8 *)(r2 +0) ; R2_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=255,var_off=(0x0; 0xff)) fp-16=0 17: (0f) r1 += r2 mark_precise: frame0: last_idx 17 first_idx 8 subseq_idx -1 mark_precise: frame0: regs=r2 stack= before 16: (71) r2 = *(u8 *)(r2 +0) 18: R1_w=map_value(map=.data.two_byte_,ks=4,vs=2,smin=smin32=0,smax=umax=smax32=umax32=255,var_off=(0x0; 0xff)) R2_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=255,var_off=(0x0; 0xff)) 18: (73) *(u8 *)(r1 +0) = r2 invalid access to map value, value_size=2 off=255 size=1 R1 max value is outside of the allowed memory range Now, let us add the precision marking, --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -4619,11 +4619,18 @@ static int check_stack_write_var_off(struct bpf_verifier_env *env, slot = -i - 1; spi = slot / BPF_REG_SIZE; + mark_stack_slot_scratched(env, spi); /* If writing_zero and the the spi slot contains a spill of value 0, * maintain the spill type. 
*/ if (writing_zero && !(i % BPF_REG_SIZE) && is_spilled_scalar_reg(&state->stack[spi])) { + if (value_regno >= 0) { + err = mark_chain_precision(env, value_regno); + if (err) + return err; + } + spill_reg = &state->stack[spi].spilled_ptr; if (tnum_is_const(spill_reg->var_off) && spill_reg->var_off.value == 0) { for (j = BPF_REG_SIZE; j > 0; j--) { @@ -4636,7 +4643,6 @@ static int check_stack_write_var_off(struct bpf_verifier_env *env, } stype = &state->stack[spi].slot_type[slot % BPF_REG_SIZE]; - mark_stack_slot_scratched(env, spi); if (!env->allow_ptr_leaks && *stype != STACK_MISC && *stype != STACK_ZERO) { /* Reject the write if range we may write to has not With the above change, the verifier output: ... 13: R1_w=scalar(smin=smin32=-16,smax=smax32=-1,umin=0xfffffffffffffff0,umin32=0xfffffff0,var_off=(0xfffffffffffffff0; 0xf)) R2_w=fp(smin=smin32=-16,smax=smax32=-1,umin=0xfffffffffffffff0,umin32=0xfffffff0,var_off=(0xfffffffffffffff0; 0xf)) 13: (b7) r0 = 0 ; R0_w=0 14: (73) *(u8 *)(r2 +0) = r0 mark_precise: frame0: last_idx 14 first_idx 8 subseq_idx -1 mark_precise: frame0: regs=r0 stack= before 13: (b7) r0 = 0 <==== added precision marking for the value register 15: R0_w=0 R2_w=fp(smin=smin32=-16,smax=smax32=-1,umin=0xfffffffffffffff0,umin32=0xfffffff0,var_off=(0xfffffffffffffff0; 0xf)) fp-8=mmmmmmmm fp-16=0 15: (bf) r1 = r6 ; R1_w=map_value(map=.data.two_byte_,ks=4,vs=2) R6=map_value(map=.data.two_byte_,ks=4,vs=2) 16: (71) r2 = *(u8 *)(r2 +0) ; R2_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=255,var_off=(0x0; 0xff)) fp-16=0 17: (0f) r1 += r2 mark_precise: frame0: last_idx 17 first_idx 8 subseq_idx -1 mark_precise: frame0: regs=r2 stack= before 16: (71) r2 = *(u8 *)(r2 +0) 18: R1_w=map_value(map=.data.two_byte_,ks=4,vs=2,smin=smin32=0,smax=umax=smax32=umax32=255,var_off=(0x0; 0xff)) R2_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=255,var_off=(0x0; 0xff)) 18: (73) *(u8 *)(r1 +0) = r2 invalid access to map value, value_size=2 off=255 size=1 R1 max value is 
outside of the allowed memory range Note that we do have precision marking for register r0 at insn 14. But backtracking at insn 17 stops at insn 16 and it did not reach back to insn 14, so precision marking is not really needed in this particular case. Maybe I missed something here. There is an alternative implementation in check_stack_write_var_off(). For a spill of value/reg 0, we can convert it to STACK_ZERO instead of trying to maintain STACK_SPILL. If we convert it to STACK_ZERO, then we can reuse the rest of logic in check_stack_write_var_off() and at the end we have if (zero_used) { /* backtracking doesn't work for STACK_ZERO yet. */ err = mark_chain_precision(env, value_regno); if (err) return err; } although I do not fully understand the above either. Need to go back to git history to find why. > > Anyways, I still claim that we are mishandling a precision of spilled > register when doing zero var_off writes. > > > >> [...] ^ permalink raw reply [flat|nested] 26+ messages in thread
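The var_off=(0x0; 0xff) pairs in the verifier logs above are the tnum ("tracked number") encoding: (value; mask), where a set mask bit marks an unknown bit. Here is a standalone sketch of the tnum_is_const() condition used by the diff's spill-preserving check — the struct is modeled on the kernel's, but this block is an illustration, not kernel code:

```c
#include <stdbool.h>

/* Simplified model of the kernel's struct tnum: a bit set in 'mask'
 * means that bit of the value is unknown. Verifier logs print a tnum
 * as var_off=(value; mask). */
struct tnum {
	unsigned long long value;
	unsigned long long mask;
};

/* A tnum is a known constant iff it has no unknown bits. */
static bool tnum_is_const(struct tnum a)
{
	return !a.mask;
}

/* The condition the diff uses to keep a spill slot as a zero spill:
 * the spilled register must be a provably-zero constant. */
static bool is_zero_const(struct tnum a)
{
	return tnum_is_const(a) && a.value == 0;
}
```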
* Re: [PATCH bpf-next v2 1/2] bpf: Track aligned st store as imprecise spilled registers 2024-01-05 7:14 ` Yonghong Song @ 2024-01-05 8:10 ` Yonghong Song 2024-01-05 23:37 ` Eduard Zingerman 1 sibling, 0 replies; 26+ messages in thread From: Yonghong Song @ 2024-01-05 8:10 UTC (permalink / raw) To: Andrii Nakryiko, Eduard Zingerman Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team, Martin KaFai Lau, Kuniyuki Iwashima, Martin KaFai Lau On 1/4/24 11:14 PM, Yonghong Song wrote: > > On 1/4/24 5:05 PM, Andrii Nakryiko wrote: >> On Thu, Jan 4, 2024 at 3:29 PM Eduard Zingerman <eddyz87@gmail.com> >> wrote: >>> On Thu, 2024-01-04 at 15:09 -0800, Andrii Nakryiko wrote: >>> [...] >>>>> This seemed logical at the time of discussion, however, I can't >>>>> figure >>>>> a counter example at the moment. It appears that whatever are >>>>> assumptions in check_stack_write_var_off() if spill is used in the >>>>> precise context it would be marked eventually. >>>>> E.g. the following is correctly rejected: >>>>> >>>>> SEC("raw_tp") >>>>> __log_level(2) __flag(BPF_F_TEST_STATE_FREQ) >>>>> __failure >>>>> __naked void var_stack_1(void) >>>>> { >>>>> asm volatile ( >>>>> "call %[bpf_get_prandom_u32];" >>>>> "r9 = 100500;" >>>>> "if r0 > 42 goto +1;" >>>>> "r9 = 0;" >>>>> "*(u64 *)(r10 - 16) = r9;" >>>>> "call %[bpf_get_prandom_u32];" >>>>> "r0 &= 0xf;" >>>>> "r1 = -1;" >>>>> "r1 -= r0;" >>>>> "r2 = r10;" >>>>> "r2 += r1;" >>>>> "r0 = 0;" >>>>> "*(u8 *)(r2 + 0) = r0;" >>>>> "r1 = %[two_byte_buf];" >>>>> "r2 = *(u32 *)(r10 -16);" >>>>> "r1 += r2;" >>>>> "*(u8 *)(r1 + 0) = r2;" /* this should not be >>>>> fine */ >>>>> "exit;" >>>>> : >>>>> : __imm_ptr(two_byte_buf), >>>>> __imm(bpf_get_prandom_u32) >>>>> : __clobber_common); >>>>> } >>>>> >>>>> So now I'm not sure :( >>>>> Sorry for too much noise. >>>> >>>> hm... does that test have to do so many things and do all these u64 vs >>>> u32 vs u8 conversions? 
>>> The test is actually quite minimal, the longest part is conjuring of >>> varying offset pointer in r2, here it is with additional comments: >>> >>> /* Write 0 or 100500 to fp-16, 0 is on the first verification >>> pass */ >>> "call %[bpf_get_prandom_u32];" >>> "r9 = 100500;" >>> "if r0 > 42 goto +1;" >>> "r9 = 0;" >>> "*(u64 *)(r10 - 16) = r9;" >>> /* prepare a variable length access */ >>> "call %[bpf_get_prandom_u32];" >>> "r0 &= 0xf;" /* r0 range is [0; 15] */ >>> "r1 = -1;" >>> "r1 -= r0;" /* r1 range is [-16; -1] */ >>> "r2 = r10;" >>> "r2 += r1;" /* r2 range is [fp-16; fp-1] */ >>> /* do a variable length write of constant 0 */ >>> "r0 = 0;" >>> "*(u8 *)(r2 + 0) = r0;" >> I meant this u8 >> >>> /* use fp-16 to access an array of length 2 */ >>> "r1 = %[two_byte_buf];" >>> "r2 = *(u32 *)(r10 -16);" >> and this u32. I'm not saying it's anything wrong, but it's simpler to >> deal with u64 consistently. There is nothing wrong with the test per >> se, I'm just saying we should try eliminate unnecessary cross-plays >> with narrowing/widening stores/loads. >> >> But that's offtopic, sorry. >> >>> "r1 += r2;" >>> "*(u8 *)(r1 + 0) = r2;" /* this should not be fine */ >>> "exit;" >>> >>>> Can we try a simple test were we spill u64 >>>> SCALAR (imprecise) zero register to fp-8 or fp-16, and then use those >>>> fp-8|fp-16 slot as an index into an array in precise context. Then >>>> have a separate delayed branch that will write non-zero to fp-8|fp-16. >>>> States shouldn't converge and this should be rejected. >>> That is what test above does but it also includes varying offset >>> access. >>> >> Yes, and the test fails, but if you read the log, you'll see that fp-8 >> is never marked precise, but it should. So we need more elaborate test >> that would somehow exploit fp-8 imprecision. >> >> I ran out of time. But what I tried was replacing >> >> >> "r2 = *(u32 *)(r10 -16);" >> >> with >> >> "r2 = *(u8 *)(r2 +0);" >> >> So keep both read and write as variable offset. 
And we are saved by >> some missing logic in read_var_off that would set r2 as known zero >> (because it should be for the branch where both fp-8 and fp-16 are >> zero). But that fails in the branch that should succeed, and if that >> branch actually succeeds, I suspect the branch where we initialize >> with non-zero r9 will erroneously succeed. > > I did some experiments but still confused. > With the current patch set and the above Andrii's suggested changes, > we have > ... > 13: > R1_w=scalar(smin=smin32=-16,smax=smax32=-1,umin=0xfffffffffffffff0,umin32=0xfffffff0,var_off=(0xfffffffffffffff0; > 0xf)) > R2_w=fp(smin=smin32=-16,smax=smax32=-1,umin=0xfffffffffffffff0,umin32=0xfffffff0,var_off=(0xfffffffffffffff0; > 0xf)) > 13: (b7) r0 = 0 ; R0_w=0 > 14: (73) *(u8 *)(r2 +0) = r0 ; R0_w=0 > R2_w=fp(smin=smin32=-16,smax=smax32=-1,umin=0xfffffffffffffff0,umin32=0xfffffff0,var_off=(0xfffffffffffffff0; > 0xf)) fp-8=mmmmmmmm fp-16=0 > 15: (bf) r1 = r6 ; > R1_w=map_value(map=.data.two_byte_,ks=4,vs=2) > R6=map_value(map=.data.two_byte_,ks=4,vs=2) > 16: (71) r2 = *(u8 *)(r2 +0) ; > R2_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=255,var_off=(0x0; > 0xff)) fp-16=0 > 17: (0f) r1 += r2 > mark_precise: frame0: last_idx 17 first_idx 8 subseq_idx -1 > mark_precise: frame0: regs=r2 stack= before 16: (71) r2 = *(u8 *)(r2 +0) > 18: > R1_w=map_value(map=.data.two_byte_,ks=4,vs=2,smin=smin32=0,smax=umax=smax32=umax32=255,var_off=(0x0; > 0xff)) > R2_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=255,var_off=(0x0; > 0xff)) > 18: (73) *(u8 *)(r1 +0) = r2 > invalid access to map value, value_size=2 off=255 size=1 > R1 max value is outside of the allowed memory range > > Now, let us add the precision marking, > --- a/kernel/bpf/verifier.c > +++ b/kernel/bpf/verifier.c > @@ -4619,11 +4619,18 @@ static int check_stack_write_var_off(struct > bpf_verifier_env *env, > > slot = -i - 1; > spi = slot / BPF_REG_SIZE; > + mark_stack_slot_scratched(env, spi); > > /* If writing_zero and the the spi 
slot contains a > spill of value 0, > * maintain the spill type. > */ > if (writing_zero && !(i % BPF_REG_SIZE) && > is_spilled_scalar_reg(&state->stack[spi])) { > + if (value_regno >= 0) { > + err = mark_chain_precision(env, > value_regno); > + if (err) > + return err; > + } > + > spill_reg = &state->stack[spi].spilled_ptr; > if (tnum_is_const(spill_reg->var_off) && > spill_reg->var_off.value == 0) { > for (j = BPF_REG_SIZE; j > 0; j--) { > @@ -4636,7 +4643,6 @@ static int check_stack_write_var_off(struct > bpf_verifier_env *env, > } > > stype = &state->stack[spi].slot_type[slot % > BPF_REG_SIZE]; > - mark_stack_slot_scratched(env, spi); > > if (!env->allow_ptr_leaks && *stype != STACK_MISC && > *stype != STACK_ZERO) { > /* Reject the write if range we may write to > has not > > > With the above change, the verifier output: > ... > 13: > R1_w=scalar(smin=smin32=-16,smax=smax32=-1,umin=0xfffffffffffffff0,umin32=0xfffffff0,var_off=(0xfffffffffffffff0; > 0xf)) > R2_w=fp(smin=smin32=-16,smax=smax32=-1,umin=0xfffffffffffffff0,umin32=0xfffffff0,var_off=(0xfffffffffffffff0; > 0xf)) > 13: (b7) r0 = 0 ; R0_w=0 > 14: (73) *(u8 *)(r2 +0) = r0 > mark_precise: frame0: last_idx 14 first_idx 8 subseq_idx -1 > mark_precise: frame0: regs=r0 stack= before 13: (b7) r0 = 0 > <==== added precision marking for the value register > 15: R0_w=0 > R2_w=fp(smin=smin32=-16,smax=smax32=-1,umin=0xfffffffffffffff0,umin32=0xfffffff0,var_off=(0xfffffffffffffff0; > 0xf)) fp-8=mmmmmmmm fp-16=0 > 15: (bf) r1 = r6 ; > R1_w=map_value(map=.data.two_byte_,ks=4,vs=2) > R6=map_value(map=.data.two_byte_,ks=4,vs=2) > 16: (71) r2 = *(u8 *)(r2 +0) ; > R2_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=255,var_off=(0x0; > 0xff)) fp-16=0 > 17: (0f) r1 += r2 > mark_precise: frame0: last_idx 17 first_idx 8 subseq_idx -1 > mark_precise: frame0: regs=r2 stack= before 16: (71) r2 = *(u8 *)(r2 +0) > 18: > R1_w=map_value(map=.data.two_byte_,ks=4,vs=2,smin=smin32=0,smax=umax=smax32=umax32=255,var_off=(0x0; > 0xff)) > 
R2_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=255,var_off=(0x0; > 0xff)) > 18: (73) *(u8 *)(r1 +0) = r2 > invalid access to map value, value_size=2 off=255 size=1 > R1 max value is outside of the allowed memory range > > Note that we do have precision marking for register r0 at insn 14. > But backtracking at insn 17 stops at insn 16 and it did not reach back > to insn 14, so precision marking is not really needed in this particular > case. Maybe I missed something here. > > There is an alternative implementation in check_stack_write_var_off(). > For a spill of value/reg 0, we can convert it to STACK_ZERO instead > of trying to maintain STACK_SPILL. If we convert it to STACK_ZERO, > then we can reuse the rest of logic in check_stack_write_var_off() > and at the end we have > > if (zero_used) { > /* backtracking doesn't work for STACK_ZERO yet. */ > err = mark_chain_precision(env, value_regno); > if (err) > return err; > } > > although I do not fully understand the above either. Need to go back to > git history to find why. Did some investigation with selftests. test_progs/test_progs-cpuv4 run did not hit the above mark_chain_precision. Only one test in test_verifier hits the above code. { "BPF_ST_MEM stack imm zero, variable offset", .insns = { /* set fp[-16], fp[-24] to zeros */ BPF_ST_MEM(BPF_DW, BPF_REG_10, -16, 0), BPF_ST_MEM(BPF_DW, BPF_REG_10, -24, 0), /* r0 = random value in range [-32, -15] */ BPF_EMIT_CALL(BPF_FUNC_get_prandom_u32), BPF_JMP_IMM(BPF_JLE, BPF_REG_0, 16, 2), BPF_MOV64_IMM(BPF_REG_0, 0), BPF_EXIT_INSN(), BPF_ALU64_IMM(BPF_SUB, BPF_REG_0, 32), /* fp[r0] = 0, make a variable offset write of zero, * this should preserve zero marks on stack. */ BPF_ALU64_REG(BPF_ADD, BPF_REG_0, BPF_REG_10), BPF_ST_MEM(BPF_B, BPF_REG_0, 0, 0), /* r0 = fp[-20], if variable offset write was tracked correctly * r0 would be a known zero. */ BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_10, -20), /* Would fail return code verification if r0 range is not tracked correctly. 
*/ BPF_EXIT_INSN(), }, .result = ACCEPT, /* Use prog type that requires return value in range [0, 1] */ .prog_type = BPF_PROG_TYPE_SK_LOOKUP, .expected_attach_type = BPF_SK_LOOKUP, .runs = -1, }, The verifier output for the insn with variable range and with writing 0: 8: (72) *(u8 *)(r0 +0) = 0 ; R0_w=fp(smin=smin32=-32,smax=smax32=-16,umin=0xffffffffffffffe0,umax=0xfffffffffffffff0,umin32=0xffffffe0,umax32=0xfffffff0,var_off=(0xffffffffffffffe0; 0x1f)) fp-16_w=00000000 fp-24_w=00000000 fp-32=mmmmmmmm 9: (71) r0 = *(u8 *)(r10 -20) ; R0_w=0 R10=fp0 fp-24_w=00000000 In the above, value_regno is -1, so backtracking is actually a noop. > > >> >> Anyways, I still claim that we are mishandling a precision of spilled >> register when doing zero var_off writes. >> >> >> >>> [...] > ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH bpf-next v2 1/2] bpf: Track aligned st store as imprecise spilled registers
  2024-01-05  7:14         ` Yonghong Song
  2024-01-05  8:10           ` Yonghong Song
@ 2024-01-05 23:37           ` Eduard Zingerman
  2024-01-08 18:59             ` Yonghong Song
  1 sibling, 1 reply; 26+ messages in thread
From: Eduard Zingerman @ 2024-01-05 23:37 UTC (permalink / raw)
  To: Yonghong Song, Andrii Nakryiko
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	kernel-team, Martin KaFai Lau, Kuniyuki Iwashima, Martin KaFai Lau

On Thu, 2024-01-04 at 23:14 -0800, Yonghong Song wrote:
[...]
> There is an alternative implementation in check_stack_write_var_off().
> For a spill of value/reg 0, we can convert it to STACK_ZERO instead
> of trying to maintain STACK_SPILL. If we convert it to STACK_ZERO,
> then we can reuse the rest of logic in check_stack_write_var_off()
> and at the end we have
>
>          if (zero_used) {
>                  /* backtracking doesn't work for STACK_ZERO yet. */
>                  err = mark_chain_precision(env, value_regno);
>                  if (err)
>                          return err;
>          }
>
> although I do not fully understand the above either. Need to go back to
> git history to find why.

Regarding this particular code (unrelated to this patch set):

Both check_stack_read_fixed_off() and check_stack_read_var_off()
set the destination register to zero if all bytes at the varying offset
are STACK_ZERO. Backtracking can handle fixed-offset writes, but does
not know how to handle varying-offset writes. E.g. if:
- there is some code 'arr[i] = r0';
- and r0 happens to be zero for some state;
- later arr[i] is used in a precise context;
the verifier would have no means to propagate the precision mark to r0.
Hence the precision mark is applied conservatively.

Does that make sense?

^ permalink raw reply	[flat|nested] 26+ messages in thread
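Eduard's bullet list can be condensed into a toy model. Everything below is invented for illustration and is not verifier code; the point is only that backtracking, walking instructions in reverse, cannot match a varying-offset store to a later precise read of a particular slot, so the source register must be marked precise eagerly at store time:

```c
#include <stdbool.h>

enum store_kind { STORE_FIXED, STORE_VAR };

struct store {
	enum store_kind kind;
	int spi; /* target spill slot; meaningful only for STORE_FIXED */
};

/* Backtracking can match a precise read of slot 'spi' only to a store
 * whose target slot is known to be that slot. */
static bool backtrack_can_match(struct store st, int spi)
{
	return st.kind == STORE_FIXED && st.spi == spi;
}

/* The conservative rule: if backtracking would be unable to find the
 * store later, mark the source register precise immediately
 * (mirroring the eager mark_chain_precision() call discussed above). */
static bool must_mark_precise_eagerly(struct store st)
{
	return st.kind == STORE_VAR; /* target slot unknown */
}
```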
* Re: [PATCH bpf-next v2 1/2] bpf: Track aligned st store as imprecise spilled registers 2024-01-05 23:37 ` Eduard Zingerman @ 2024-01-08 18:59 ` Yonghong Song 2024-01-08 19:06 ` Eduard Zingerman 0 siblings, 1 reply; 26+ messages in thread From: Yonghong Song @ 2024-01-08 18:59 UTC (permalink / raw) To: Eduard Zingerman, Andrii Nakryiko Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team, Martin KaFai Lau, Kuniyuki Iwashima, Martin KaFai Lau On 1/5/24 3:37 PM, Eduard Zingerman wrote: > On Thu, 2024-01-04 at 23:14 -0800, Yonghong Song wrote: > [...] >> There is an alternative implementation in check_stack_write_var_off(). >> For a spill of value/reg 0, we can convert it to STACK_ZERO instead >> of trying to maintain STACK_SPILL. If we convert it to STACK_ZERO, >> then we can reuse the rest of logic in check_stack_write_var_off() >> and at the end we have >> >> if (zero_used) { >> /* backtracking doesn't work for STACK_ZERO yet. */ >> err = mark_chain_precision(env, value_regno); >> if (err) >> return err; >> } >> >> although I do not fully understand the above either. Need to go back to >> git history to find why. > Regarding this particular code (unrelated to this the patch-set). > > Both check_stack_read_fixed_off() and check_stack_read_var_off() > set destination register to zero if all bytes at varying offset are STACK_ZERO. > Backtracking can handle fixed offset writes, but does not know how to > handle varying offset writes. E.g. if: > - there is some code 'arr[i] = r0'; > - and r0 happens to be zero for some state; > - later arr[i] is used in precise context; > Verifier would have no means to propagate precision mark to r0. > Hence apply precision mark conservatively. > > Does that makes sense? Thanks for explanation. I guess I understand now, it does make sense. let us say arr array's element type is long (8 byte) and the range of i could be from -32 to -1. I guess one way could be doing backtracking with "... 
= arr[i]"
is to have four ranges, [-32, -24), [-24, -16), [-16, -8), [-8, 0).
Later, when we see arr[i] = r0 and i has range [-32, 0), since it covers
[-32, -24), etc., precision marking can proceed with 'r0'. But I guess
this can potentially increase verifier backtracking states a lot and is
not scalable. Conservatively doing precision marking with 'r0'
(in arr[i] = r0) is a better idea.

Andrii has similar comments in
https://lore.kernel.org/bpf/CAEf4Bzb0LdSPnFZ-kPRftofA6LsaOkxXLN4_fr9BLR3iG-te-g@mail.gmail.com/

^ permalink raw reply	[flat|nested] 26+ messages in thread
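The four 8-byte ranges correspond one-to-one to spill slot indices, which fall out of the slot/spi arithmetic already visible in the earlier diff (`slot = -i - 1; spi = slot / BPF_REG_SIZE;`). A quick standalone check of that mapping:

```c
#define BPF_REG_SIZE 8

/* Same arithmetic as in check_stack_write_var_off(): a byte at stack
 * offset 'off' (negative, relative to the frame pointer) lives in
 * slot = -off - 1, and slots are grouped into 8-byte spill slots
 * indexed by spi. */
static int spi_for_off(int off)
{
	int slot = -off - 1;
	return slot / BPF_REG_SIZE;
}
```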
* Re: [PATCH bpf-next v2 1/2] bpf: Track aligned st store as imprecise spilled registers
  2024-01-08 18:59             ` Yonghong Song
@ 2024-01-08 19:06               ` Eduard Zingerman
  2024-01-08 19:40                 ` Yonghong Song
  0 siblings, 1 reply; 26+ messages in thread
From: Eduard Zingerman @ 2024-01-08 19:06 UTC (permalink / raw)
  To: Yonghong Song, Andrii Nakryiko
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	kernel-team, Martin KaFai Lau, Kuniyuki Iwashima, Martin KaFai Lau

On Mon, 2024-01-08 at 10:59 -0800, Yonghong Song wrote:
[...]
> I guess one way could be doing backtracking with "... = arr[i]"
> is to have four ranges, [-32, -24), [-24, -16), [-16, -8), [-8, 0).
> Later, when we see arr[i] = r0 and i has range [-32, 0). Since it covers [-32, -24), etc.,
> precision marking can proceed with 'r0'. But I guess this can potentially
> increase verifier backtracking states a lot and is not scalable. Conservatively
> doing precision marking with 'r0' (in arr[i] = r0) is a better idea.

In theory it should be possible to collapse this range to min/max pair.
But it is a complication, and I'd say it shouldn't be implemented
unless we have evidence that it significantly improves verification
performance.

>
> Andrii has similar comments in
> https://lore.kernel.org/bpf/CAEf4Bzb0LdSPnFZ-kPRftofA6LsaOkxXLN4_fr9BLR3iG-te-g@mail.gmail.com/
>

^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: [PATCH bpf-next v2 1/2] bpf: Track aligned st store as imprecise spilled registers
  2024-01-08 19:06               ` Eduard Zingerman
@ 2024-01-08 19:40                 ` Yonghong Song
  0 siblings, 0 replies; 26+ messages in thread
From: Yonghong Song @ 2024-01-08 19:40 UTC (permalink / raw)
  To: Eduard Zingerman, Andrii Nakryiko
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	kernel-team, Martin KaFai Lau, Kuniyuki Iwashima, Martin KaFai Lau

On 1/8/24 11:06 AM, Eduard Zingerman wrote:
> On Mon, 2024-01-08 at 10:59 -0800, Yonghong Song wrote:
> [...]
>
>> I guess one way could be doing backtracking with "... = arr[i]"
>> is to have four ranges, [-32, -24), [-24, -16), [-16, -8), [-8, 0).
>> Later, when we see arr[i] = r0 and i has range [-32, 0). Since it covers [-32, -24), etc.,
>> precision marking can proceed with 'r0'. But I guess this can potentially
>> increase verifier backtracking states a lot and is not scalable. Conservatively
>> doing precision marking with 'r0' (in arr[i] = r0) is a better idea.
> In theory it should be possible to collapse this range to min/max pair.
> But it is a complication, and I'd say it shouldn't be implemented
> unless we have evidence that it significantly improves verification
> performance.

Ack. We do not need to introduce this yet as the variable index range
should be much much less common.

>
>> Andrii has similar comments in
>> https://lore.kernel.org/bpf/CAEf4Bzb0LdSPnFZ-kPRftofA6LsaOkxXLN4_fr9BLR3iG-te-g@mail.gmail.com/
>>

^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: [PATCH bpf-next v2 1/2] bpf: Track aligned st store as imprecise spilled registers
  2024-01-05  1:05       ` Andrii Nakryiko
  2024-01-05  7:14         ` Yonghong Song
@ 2024-01-05 23:52         ` Eduard Zingerman
  2024-01-08 19:51           ` Yonghong Song
  2024-01-08 23:18           ` Andrii Nakryiko
  1 sibling, 2 replies; 26+ messages in thread
From: Eduard Zingerman @ 2024-01-05 23:52 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Yonghong Song, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, kernel-team, Martin KaFai Lau, Kuniyuki Iwashima,
	Martin KaFai Lau

On Thu, 2024-01-04 at 17:05 -0800, Andrii Nakryiko wrote:
[...]
> > The test is actually quite minimal, the longest part is conjuring of
> > varying offset pointer in r2, here it is with additional comments:
> >
> >     /* Write 0 or 100500 to fp-16, 0 is on the first verification pass */
> >     "call %[bpf_get_prandom_u32];"
> >     "r9 = 100500;"
> >     "if r0 > 42 goto +1;"
> >     "r9 = 0;"
> >     "*(u64 *)(r10 - 16) = r9;"
> >     /* prepare a variable length access */
> >     "call %[bpf_get_prandom_u32];"
> >     "r0 &= 0xf;"          /* r0 range is [0; 15] */
> >     "r1 = -1;"
> >     "r1 -= r0;"           /* r1 range is [-16; -1] */
> >     "r2 = r10;"
> >     "r2 += r1;"           /* r2 range is [fp-16; fp-1] */
> >     /* do a variable length write of constant 0 */
> >     "r0 = 0;"
> >     "*(u8 *)(r2 + 0) = r0;"
[...]
> Yes, and the test fails, but if you read the log, you'll see that fp-8
> is never marked precise, but it should. So we need more elaborate test
> that would somehow exploit fp-8 imprecision.

Sorry, I don't understand why fp-8 should be precise for this particular test.
Only the value read from fp-16 is used in a precise context.

[...]
> So keep both read and write as variable offset. And we are saved by
> some missing logic in read_var_off that would set r2 as known zero
> (because it should be for the branch where both fp-8 and fp-16 are
> zero). But that fails in the branch that should succeed, and if that
> branch actually succeeds, I suspect the branch where we initialize
> with non-zero r9 will erroneously succeed.
>
> Anyways, I still claim that we are mishandling a precision of spilled
> register when doing zero var_off writes.

Currently check_stack_read_var_off() has two possible outcomes:
- if all bytes at the varying offset are STACK_ZERO, the destination
  register is set to zero;
- otherwise, the destination register is set to an unbound scalar.

Unless I missed something, STACK_ZERO is assigned to .slot_type only in
check_stack_write_fixed_off(), and there the source register is marked
as precise immediately. So, it appears to me that the current state of
patch #1 is ok.

If check_stack_read_var_off() were modified to check not only for
STACK_ZERO, but also for zero spills, I think that all such spills would
have to be marked precise at the time of the read, as backtracking would
not be able to find those later. But that is not related to the change
in check_stack_write_var_off() introduced by this patch.

^ permalink raw reply	[flat|nested] 26+ messages in thread
* Re: [PATCH bpf-next v2 1/2] bpf: Track aligned st store as imprecise spilled registers 2024-01-05 23:52 ` Eduard Zingerman @ 2024-01-08 19:51 ` Yonghong Song 2024-01-08 20:05 ` Eduard Zingerman 2024-01-08 23:18 ` Andrii Nakryiko 1 sibling, 1 reply; 26+ messages in thread From: Yonghong Song @ 2024-01-08 19:51 UTC (permalink / raw) To: Eduard Zingerman, Andrii Nakryiko Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team, Martin KaFai Lau, Kuniyuki Iwashima, Martin KaFai Lau On 1/5/24 3:52 PM, Eduard Zingerman wrote: > On Thu, 2024-01-04 at 17:05 -0800, Andrii Nakryiko wrote: > [...] >>> The test is actually quite minimal, the longest part is conjuring of >>> varying offset pointer in r2, here it is with additional comments: >>> >>> /* Write 0 or 100500 to fp-16, 0 is on the first verification pass */ >>> "call %[bpf_get_prandom_u32];" >>> "r9 = 100500;" >>> "if r0 > 42 goto +1;" >>> "r9 = 0;" >>> "*(u64 *)(r10 - 16) = r9;" >>> /* prepare a variable length access */ >>> "call %[bpf_get_prandom_u32];" >>> "r0 &= 0xf;" /* r0 range is [0; 15] */ >>> "r1 = -1;" >>> "r1 -= r0;" /* r1 range is [-16; -1] */ >>> "r2 = r10;" >>> "r2 += r1;" /* r2 range is [fp-16; fp-1] */ >>> /* do a variable length write of constant 0 */ >>> "r0 = 0;" >>> "*(u8 *)(r2 + 0) = r0;" > [...] >> Yes, and the test fails, but if you read the log, you'll see that fp-8 >> is never marked precise, but it should. So we need more elaborate test >> that would somehow exploit fp-8 imprecision. > Sorry, I don't understand why fp-8 should be precise for this particular test. > Only value read from fp-16 is used in precise context. > > [...] >> So keep both read and write as variable offset. And we are saved by >> some missing logic in read_var_off that would set r2 as known zero >> (because it should be for the branch where both fp-8 and fp-16 are >> zero). 
But that fails in the branch that should succeed, and if that >> branch actually succeeds, I suspect the branch where we initialize >> with non-zero r9 will erroneously succeed. >> >> Anyways, I still claim that we are mishandling a precision of spilled >> register when doing zero var_off writes. > Currently check_stack_read_var_off() has two possible outcomes: > - if all bytes at varying offset are STACK_ZERO destination register > is set to zero; > - otherwise destination register is set to unbound scalar. > > Unless I missed something, STACK_ZERO is assigned to .slot_type only > in check_stack_write_fixed_off(), and there the source register is > marked as precise immediately. > > So, it appears to me that current state of patch #1 is ok. > > In case if check_stack_read_var_off() would be modified to check not > only for STACK_ZERO, but also for zero spills, I think that all such > spills would have to be marked precise at the time of read, > as backtracking would not be able to find those later. I don't understand the above. If the code pattern looks like r1 = ...; /* r1 range [-32, -16); *(u8 *)(r10 + r1) = r2; ... r3 = *(u8 *)(r10 + r1); r3 needs to be marked as precise. Conservatively marking r2 in '*(u8 *)(r10 + r1) = r2' as precise should be the correct way to do. Or you are thinking even more complex code pattern like *(u64 *)(r10 - 32) = r4; *(u64 *)(r10 - 24) = r5; ... r1 = ...; /* r1 range [-32, -16) */ r3 = *(u8 *)(r10 + r1); r3 needs to be marked as precise. In this case, we should proactively mark r4 and r5 as precise. But currently we did not do it, right? I think this later case is a very unlikely case. > But that is not related to change in check_stack_write_var_off() > introduced by this patch. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH bpf-next v2 1/2] bpf: Track aligned st store as imprecise spilled registers 2024-01-08 19:51 ` Yonghong Song @ 2024-01-08 20:05 ` Eduard Zingerman 2024-01-08 21:51 ` Yonghong Song 0 siblings, 1 reply; 26+ messages in thread From: Eduard Zingerman @ 2024-01-08 20:05 UTC (permalink / raw) To: Yonghong Song, Andrii Nakryiko Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team, Martin KaFai Lau, Kuniyuki Iwashima, Martin KaFai Lau On Mon, 2024-01-08 at 11:51 -0800, Yonghong Song wrote: [...] > > In case if check_stack_read_var_off() would be modified to check not > > only for STACK_ZERO, but also for zero spills, I think that all such > > spills would have to be marked precise at the time of read, > > as backtracking would not be able to find those later. > > I don't understand the above. If the code pattern looks like > r1 = ...; /* r1 range [-32, -16); > *(u8 *)(r10 + r1) = r2; > ... > r3 = *(u8 *)(r10 + r1); > r3 needs to be marked as precise. > > Conservatively marking r2 in '*(u8 *)(r10 + r1) = r2' as precise > should be the correct way to do. > > Or you are thinking even more complex code pattern like > *(u64 *)(r10 - 32) = r4; > *(u64 *)(r10 - 24) = r5; > ... > r1 = ...; /* r1 range [-32, -16) */ > r3 = *(u8 *)(r10 + r1); > r3 needs to be marked as precise. > > In this case, we should proactively mark r4 and r5 as precise. > But currently we did not do it, right? Yes, I'm thinking about the latter scenario. There would be zero spills for fp-32 and fp-24. If check_stack_read_var_off() is modified to handle zero spills, then it would conclude that r3 is zero. But if r3 is later marked precise, there would be no info for backtracking to mark fp-32, fp-24, r4, r5 as precise: - either backtracking info would have to be supplemented with a list of stack locations that were spilled zeros at time of check_stack_read_var_off(); - or check_stack_read_var_off() would need to conservatively mark all spilled zeros as precise. 
Nothing like that is needed now, because check_stack_read_var_off() would return unbound scalar for r3 upon seeing zero spill. > I think this later case is a very unlikely case. But it is possible and verifier has to be conservative. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH bpf-next v2 1/2] bpf: Track aligned st store as imprecise spilled registers 2024-01-08 20:05 ` Eduard Zingerman @ 2024-01-08 21:51 ` Yonghong Song 0 siblings, 0 replies; 26+ messages in thread From: Yonghong Song @ 2024-01-08 21:51 UTC (permalink / raw) To: Eduard Zingerman, Andrii Nakryiko Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team, Martin KaFai Lau, Kuniyuki Iwashima, Martin KaFai Lau On 1/8/24 12:05 PM, Eduard Zingerman wrote: > On Mon, 2024-01-08 at 11:51 -0800, Yonghong Song wrote: > [...] >>> In case if check_stack_read_var_off() would be modified to check not >>> only for STACK_ZERO, but also for zero spills, I think that all such >>> spills would have to be marked precise at the time of read, >>> as backtracking would not be able to find those later. >> I don't understand the above. If the code pattern looks like >> r1 = ...; /* r1 range [-32, -16); >> *(u8 *)(r10 + r1) = r2; >> ... >> r3 = *(u8 *)(r10 + r1); >> r3 needs to be marked as precise. >> >> Conservatively marking r2 in '*(u8 *)(r10 + r1) = r2' as precise >> should be the correct way to do. >> >> Or you are thinking even more complex code pattern like >> *(u64 *)(r10 - 32) = r4; >> *(u64 *)(r10 - 24) = r5; >> ... >> r1 = ...; /* r1 range [-32, -16) */ >> r3 = *(u8 *)(r10 + r1); >> r3 needs to be marked as precise. >> >> In this case, we should proactively mark r4 and r5 as precise. >> But currently we did not do it, right? > Yes, I'm thinking about the latter scenario. > There would be zero spills for fp-32 and fp-24. > If check_stack_read_var_off() is modified to handle zero spills, > then it would conclude that r3 is zero. 
> But if r3 is later marked precise, there would be no info for > backtracking to mark fp-32, fp-24, r4, r5 as precise: > - either backtracking info would have to be supplemented with a list > of stack locations that were spilled zeros at time of > check_stack_read_var_off(); > - or check_stack_read_var_off() would need to conservatively mark > all spilled zeros as precise. > > Nothing like that is needed now, because check_stack_read_var_off() > would return unbound scalar for r3 upon seeing zero spill. > >> I think this later case is a very unlikely case. > But it is possible and verifier has to be conservative. Indeed. Such variable stack read has been handled properly. Now I think I understand better on how verifier works for load/store with var offsets. Basically some small changes in check_stack_write_var_off(). Will post v3 soon. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH bpf-next v2 1/2] bpf: Track aligned st store as imprecise spilled registers 2024-01-05 23:52 ` Eduard Zingerman 2024-01-08 19:51 ` Yonghong Song @ 2024-01-08 23:18 ` Andrii Nakryiko 1 sibling, 0 replies; 26+ messages in thread From: Andrii Nakryiko @ 2024-01-08 23:18 UTC (permalink / raw) To: Eduard Zingerman Cc: Yonghong Song, bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team, Martin KaFai Lau, Kuniyuki Iwashima, Martin KaFai Lau On Fri, Jan 5, 2024 at 3:52 PM Eduard Zingerman <eddyz87@gmail.com> wrote: > > On Thu, 2024-01-04 at 17:05 -0800, Andrii Nakryiko wrote: > [...] > > > The test is actually quite minimal, the longest part is conjuring of > > > varying offset pointer in r2, here it is with additional comments: > > > > > > /* Write 0 or 100500 to fp-16, 0 is on the first verification pass */ > > > "call %[bpf_get_prandom_u32];" > > > "r9 = 100500;" > > > "if r0 > 42 goto +1;" > > > "r9 = 0;" > > > "*(u64 *)(r10 - 16) = r9;" > > > /* prepare a variable length access */ > > > "call %[bpf_get_prandom_u32];" > > > "r0 &= 0xf;" /* r0 range is [0; 15] */ > > > "r1 = -1;" > > > "r1 -= r0;" /* r1 range is [-16; -1] */ > > > "r2 = r10;" > > > "r2 += r1;" /* r2 range is [fp-16; fp-1] */ > > > /* do a variable length write of constant 0 */ > > > "r0 = 0;" > > > "*(u8 *)(r2 + 0) = r0;" > [...] > > Yes, and the test fails, but if you read the log, you'll see that fp-8 > > is never marked precise, but it should. So we need more elaborate test > > that would somehow exploit fp-8 imprecision. > > Sorry, I don't understand why fp-8 should be precise for this particular test. > Only value read from fp-16 is used in precise context. > > [...] > > So keep both read and write as variable offset. And we are saved by > > some missing logic in read_var_off that would set r2 as known zero > > (because it should be for the branch where both fp-8 and fp-16 are > > zero). 
But that fails in the branch that should succeed, and if that > > branch actually succeeds, I suspect the branch where we initialize > > with non-zero r9 will erroneously succeed. > > > > Anyways, I still claim that we are mishandling a precision of spilled > > register when doing zero var_off writes. > > Currently check_stack_read_var_off() has two possible outcomes: > - if all bytes at varying offset are STACK_ZERO destination register > is set to zero; > - otherwise destination register is set to unbound scalar. > > Unless I missed something, STACK_ZERO is assigned to .slot_type only > in check_stack_write_fixed_off(), and there the source register is > marked as precise immediately. > > So, it appears to me that current state of patch #1 is ok. I agree. Thinking some more I agree with you, what I was concerned about should be handled properly at read time. So I think what we have in this patch is ok. Sorry for the noise :) > > In case if check_stack_read_var_off() would be modified to check not > only for STACK_ZERO, but also for zero spills, I think that all such > spills would have to be marked precise at the time of read, > as backtracking would not be able to find those later. > But that is not related to change in check_stack_write_var_off() > introduced by this patch. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH bpf-next v2 1/2] bpf: Track aligned st store as imprecise spilled registers 2024-01-03 23:26 [PATCH bpf-next v2 1/2] bpf: Track aligned st store as imprecise spilled registers Yonghong Song ` (2 preceding siblings ...) 2024-01-04 18:30 ` Eduard Zingerman @ 2024-01-04 23:03 ` Andrii Nakryiko 2024-01-05 0:49 ` Andrii Nakryiko 2024-01-08 23:23 ` Andrii Nakryiko 5 siblings, 0 replies; 26+ messages in thread From: Andrii Nakryiko @ 2024-01-04 23:03 UTC (permalink / raw) To: Yonghong Song Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team, Martin KaFai Lau, Kuniyuki Iwashima, Martin KaFai Lau On Wed, Jan 3, 2024 at 3:26 PM Yonghong Song <yonghong.song@linux.dev> wrote: > > With patch set [1], precision backtracing supports register spill/fill > to/from the stack. The patch [2] allows initial imprecise register spill > with content 0. This is a common case for cpuv3 and lower for > initializing the stack variables with pattern > r1 = 0 > *(u64 *)(r10 - 8) = r1 > and the [2] has demonstrated good verification improvement. > > For cpuv4, the initialization could be > *(u64 *)(r10 - 8) = 0 > The current verifier marks the r10-8 contents with STACK_ZERO. > Similar to [2], let us permit the above insn to behave like > imprecise register spill which can reduce number of verified states. > The change is in function check_stack_write_fixed_off(). > > Before this patch, spilled zero will be marked as STACK_ZERO > which can provide precise values. In check_stack_write_var_off(), > STACK_ZERO will be maintained if writing a const zero > so later it can provide precise values if needed. > > The above handling of '*(u64 *)(r10 - 8) = 0' as a spill > will have issues in check_stack_write_var_off() as the spill > will be converted to STACK_MISC and the precise value 0 > is lost. 
To fix this issue, if the spill slots with const > zero and the BPF_ST write also with const zero, the spill slots > are preserved, which can later provide precise values > if needed. Without the change in check_stack_write_var_off(), > the test_verifier subtest 'BPF_ST_MEM stack imm zero, variable offset' > will fail. > > I checked cpuv3 and cpuv4 with and without this patch with veristat. > There is no state change for cpuv3 since '*(u64 *)(r10 - 8) = 0' > is only generated with cpuv4. > > For cpuv4: > $ ../veristat -C old.cpuv4.csv new.cpuv4.csv -e file,prog,insns,states -f 'insns_diff!=0' > File Program Insns (A) Insns (B) Insns (DIFF) States (A) States (B) States (DIFF) > ------------------------------------------ ------------------- --------- --------- --------------- ---------- ---------- ------------- > local_storage_bench.bpf.linked3.o get_local 228 168 -60 (-26.32%) 17 14 -3 (-17.65%) > pyperf600_bpf_loop.bpf.linked3.o on_event 6066 4889 -1177 (-19.40%) 403 321 -82 (-20.35%) > test_cls_redirect.bpf.linked3.o cls_redirect 35483 35387 -96 (-0.27%) 2179 2177 -2 (-0.09%) > test_l4lb_noinline.bpf.linked3.o balancer_ingress 4494 4522 +28 (+0.62%) 217 219 +2 (+0.92%) > test_l4lb_noinline_dynptr.bpf.linked3.o balancer_ingress 1432 1455 +23 (+1.61%) 92 94 +2 (+2.17%) > test_xdp_noinline.bpf.linked3.o balancer_ingress_v6 3462 3458 -4 (-0.12%) 216 216 +0 (+0.00%) > verifier_iterating_callbacks.bpf.linked3.o widening 52 41 -11 (-21.15%) 4 3 -1 (-25.00%) > xdp_synproxy_kern.bpf.linked3.o syncookie_tc 12412 11719 -693 (-5.58%) 345 330 -15 (-4.35%) > xdp_synproxy_kern.bpf.linked3.o syncookie_xdp 12478 11794 -684 (-5.48%) 346 331 -15 (-4.34%) > > test_l4lb_noinline and test_l4lb_noinline_dynptr has minor regression, but > pyperf600_bpf_loop and local_storage_bench gets pretty good improvement. 
> > [1] https://lore.kernel.org/all/20231205184248.1502704-1-andrii@kernel.org/ > [2] https://lore.kernel.org/all/20231205184248.1502704-9-andrii@kernel.org/ > > Cc: Kuniyuki Iwashima <kuniyu@amazon.com> > Cc: Martin KaFai Lau <kafai@fb.com> > Signed-off-by: Yonghong Song <yonghong.song@linux.dev> > --- > kernel/bpf/verifier.c | 21 +++++++++++++++++-- > .../selftests/bpf/progs/verifier_spill_fill.c | 16 +++++++------- > 2 files changed, 27 insertions(+), 10 deletions(-) > > Changelogs: > v1 -> v2: > - Preserve with-const-zero spill if writing is also zero > in check_stack_write_var_off(). > - Add a test with not-8-byte-aligned BPF_ST store. > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > index d4e31f61de0e..cfe7a68d90a5 100644 > --- a/kernel/bpf/verifier.c > +++ b/kernel/bpf/verifier.c > @@ -4491,7 +4491,7 @@ static int check_stack_write_fixed_off(struct bpf_verifier_env *env, > if (fls64(reg->umax_value) > BITS_PER_BYTE * size) > state->stack[spi].spilled_ptr.id = 0; > } else if (!reg && !(off % BPF_REG_SIZE) && is_bpf_st_mem(insn) && > - insn->imm != 0 && env->bpf_capable) { > + env->bpf_capable) { > struct bpf_reg_state fake_reg = {}; > > __mark_reg_known(&fake_reg, insn->imm); > @@ -4613,11 +4613,28 @@ static int check_stack_write_var_off(struct bpf_verifier_env *env, > > /* Variable offset writes destroy any spilled pointers in range. */ > for (i = min_off; i < max_off; i++) { > + struct bpf_reg_state *spill_reg; > u8 new_type, *stype; > - int slot, spi; > + int slot, spi, j; > > slot = -i - 1; > spi = slot / BPF_REG_SIZE; > + > + /* If writing_zero and the the spi slot contains a spill of value 0, > + * maintain the spill type. > + */ > + if (writing_zero && !(i % BPF_REG_SIZE) && is_spilled_scalar_reg(&state->stack[spi])) { > + spill_reg = &state->stack[spi].spilled_ptr; > + if (tnum_is_const(spill_reg->var_off) && spill_reg->var_off.value == 0) { here we assume that *spilled* register is zero and will stay zero, even if it's imprecise. 
This is wrong, because imprecise SCALAR 0 is actually an unknown scalar when doing state pruning. So we need to either force the spilled register to be precise, or overwrite it with STACK_MISC. > + for (j = BPF_REG_SIZE; j > 0; j--) { > + if (state->stack[spi].slot_type[j - 1] != STACK_SPILL) > + break; > + } > + i += BPF_REG_SIZE - j - 1; > + continue; > + } > + } > + > stype = &state->stack[spi].slot_type[slot % BPF_REG_SIZE]; > mark_stack_slot_scratched(env, spi); > [...] ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH bpf-next v2 1/2] bpf: Track aligned st store as imprecise spilled registers 2024-01-03 23:26 [PATCH bpf-next v2 1/2] bpf: Track aligned st store as imprecise spilled registers Yonghong Song ` (3 preceding siblings ...) 2024-01-04 23:03 ` Andrii Nakryiko @ 2024-01-05 0:49 ` Andrii Nakryiko 2024-01-08 23:23 ` Andrii Nakryiko 5 siblings, 0 replies; 26+ messages in thread From: Andrii Nakryiko @ 2024-01-05 0:49 UTC (permalink / raw) To: Yonghong Song Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team, Martin KaFai Lau, Kuniyuki Iwashima, Martin KaFai Lau On Wed, Jan 3, 2024 at 3:26 PM Yonghong Song <yonghong.song@linux.dev> wrote: > > With patch set [1], precision backtracing supports register spill/fill > to/from the stack. The patch [2] allows initial imprecise register spill > with content 0. This is a common case for cpuv3 and lower for > initializing the stack variables with pattern > r1 = 0 > *(u64 *)(r10 - 8) = r1 > and the [2] has demonstrated good verification improvement. > > For cpuv4, the initialization could be > *(u64 *)(r10 - 8) = 0 > The current verifier marks the r10-8 contents with STACK_ZERO. > Similar to [2], let us permit the above insn to behave like > imprecise register spill which can reduce number of verified states. > The change is in function check_stack_write_fixed_off(). > > Before this patch, spilled zero will be marked as STACK_ZERO > which can provide precise values. In check_stack_write_var_off(), > STACK_ZERO will be maintained if writing a const zero > so later it can provide precise values if needed. > > The above handling of '*(u64 *)(r10 - 8) = 0' as a spill > will have issues in check_stack_write_var_off() as the spill > will be converted to STACK_MISC and the precise value 0 > is lost. To fix this issue, if the spill slots with const > zero and the BPF_ST write also with const zero, the spill slots > are preserved, which can later provide precise values > if needed. 
Without the change in check_stack_write_var_off(), > the test_verifier subtest 'BPF_ST_MEM stack imm zero, variable offset' > will fail. > > I checked cpuv3 and cpuv4 with and without this patch with veristat. > There is no state change for cpuv3 since '*(u64 *)(r10 - 8) = 0' > is only generated with cpuv4. > > For cpuv4: > $ ../veristat -C old.cpuv4.csv new.cpuv4.csv -e file,prog,insns,states -f 'insns_diff!=0' > File Program Insns (A) Insns (B) Insns (DIFF) States (A) States (B) States (DIFF) > ------------------------------------------ ------------------- --------- --------- --------------- ---------- ---------- ------------- > local_storage_bench.bpf.linked3.o get_local 228 168 -60 (-26.32%) 17 14 -3 (-17.65%) > pyperf600_bpf_loop.bpf.linked3.o on_event 6066 4889 -1177 (-19.40%) 403 321 -82 (-20.35%) > test_cls_redirect.bpf.linked3.o cls_redirect 35483 35387 -96 (-0.27%) 2179 2177 -2 (-0.09%) > test_l4lb_noinline.bpf.linked3.o balancer_ingress 4494 4522 +28 (+0.62%) 217 219 +2 (+0.92%) > test_l4lb_noinline_dynptr.bpf.linked3.o balancer_ingress 1432 1455 +23 (+1.61%) 92 94 +2 (+2.17%) > test_xdp_noinline.bpf.linked3.o balancer_ingress_v6 3462 3458 -4 (-0.12%) 216 216 +0 (+0.00%) > verifier_iterating_callbacks.bpf.linked3.o widening 52 41 -11 (-21.15%) 4 3 -1 (-25.00%) > xdp_synproxy_kern.bpf.linked3.o syncookie_tc 12412 11719 -693 (-5.58%) 345 330 -15 (-4.35%) > xdp_synproxy_kern.bpf.linked3.o syncookie_xdp 12478 11794 -684 (-5.48%) 346 331 -15 (-4.34%) > > test_l4lb_noinline and test_l4lb_noinline_dynptr has minor regression, but > pyperf600_bpf_loop and local_storage_bench gets pretty good improvement. 
> > [1] https://lore.kernel.org/all/20231205184248.1502704-1-andrii@kernel.org/ > [2] https://lore.kernel.org/all/20231205184248.1502704-9-andrii@kernel.org/ > > Cc: Kuniyuki Iwashima <kuniyu@amazon.com> > Cc: Martin KaFai Lau <kafai@fb.com> > Signed-off-by: Yonghong Song <yonghong.song@linux.dev> > --- > kernel/bpf/verifier.c | 21 +++++++++++++++++-- > .../selftests/bpf/progs/verifier_spill_fill.c | 16 +++++++------- > 2 files changed, 27 insertions(+), 10 deletions(-) > > Changelogs: > v1 -> v2: > - Preserve with-const-zero spill if writing is also zero > in check_stack_write_var_off(). > - Add a test with not-8-byte-aligned BPF_ST store. > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > index d4e31f61de0e..cfe7a68d90a5 100644 > --- a/kernel/bpf/verifier.c > +++ b/kernel/bpf/verifier.c > @@ -4491,7 +4491,7 @@ static int check_stack_write_fixed_off(struct bpf_verifier_env *env, > if (fls64(reg->umax_value) > BITS_PER_BYTE * size) > state->stack[spi].spilled_ptr.id = 0; > } else if (!reg && !(off % BPF_REG_SIZE) && is_bpf_st_mem(insn) && > - insn->imm != 0 && env->bpf_capable) { > + env->bpf_capable) { > struct bpf_reg_state fake_reg = {}; > > __mark_reg_known(&fake_reg, insn->imm); > @@ -4613,11 +4613,28 @@ static int check_stack_write_var_off(struct bpf_verifier_env *env, > > /* Variable offset writes destroy any spilled pointers in range. */ > for (i = min_off; i < max_off; i++) { > + struct bpf_reg_state *spill_reg; > u8 new_type, *stype; > - int slot, spi; > + int slot, spi, j; > > slot = -i - 1; > spi = slot / BPF_REG_SIZE; > + > + /* If writing_zero and the the spi slot contains a spill of value 0, > + * maintain the spill type. 
> + */ > + if (writing_zero && !(i % BPF_REG_SIZE) && is_spilled_scalar_reg(&state->stack[spi])) { > + spill_reg = &state->stack[spi].spilled_ptr; > + if (tnum_is_const(spill_reg->var_off) && spill_reg->var_off.value == 0) { > + for (j = BPF_REG_SIZE; j > 0; j--) { > + if (state->stack[spi].slot_type[j - 1] != STACK_SPILL) > + break; > + } > + i += BPF_REG_SIZE - j - 1; > + continue; > + } > + } > + > stype = &state->stack[spi].slot_type[slot % BPF_REG_SIZE]; > mark_stack_slot_scratched(env, spi); > you are skipping this in some situations, which makes debugging harder because we won't see state of all affected slots, so please move it up and set it right after calculating spi > diff --git a/tools/testing/selftests/bpf/progs/verifier_spill_fill.c b/tools/testing/selftests/bpf/progs/verifier_spill_fill.c > index 39fe3372e0e0..d4b3188afe07 100644 > --- a/tools/testing/selftests/bpf/progs/verifier_spill_fill.c > +++ b/tools/testing/selftests/bpf/progs/verifier_spill_fill.c > @@ -495,14 +495,14 @@ char single_byte_buf[1] SEC(".data.single_byte_buf"); > SEC("raw_tp") > __log_level(2) > __success [...] ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH bpf-next v2 1/2] bpf: Track aligned st store as imprecise spilled registers 2024-01-03 23:26 [PATCH bpf-next v2 1/2] bpf: Track aligned st store as imprecise spilled registers Yonghong Song ` (4 preceding siblings ...) 2024-01-05 0:49 ` Andrii Nakryiko @ 2024-01-08 23:23 ` Andrii Nakryiko 2024-01-08 23:39 ` Yonghong Song 5 siblings, 1 reply; 26+ messages in thread From: Andrii Nakryiko @ 2024-01-08 23:23 UTC (permalink / raw) To: Yonghong Song Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team, Martin KaFai Lau, Kuniyuki Iwashima, Martin KaFai Lau On Wed, Jan 3, 2024 at 3:26 PM Yonghong Song <yonghong.song@linux.dev> wrote: > > With patch set [1], precision backtracing supports register spill/fill > to/from the stack. The patch [2] allows initial imprecise register spill > with content 0. This is a common case for cpuv3 and lower for > initializing the stack variables with pattern > r1 = 0 > *(u64 *)(r10 - 8) = r1 > and the [2] has demonstrated good verification improvement. > > For cpuv4, the initialization could be > *(u64 *)(r10 - 8) = 0 > The current verifier marks the r10-8 contents with STACK_ZERO. > Similar to [2], let us permit the above insn to behave like > imprecise register spill which can reduce number of verified states. > The change is in function check_stack_write_fixed_off(). > > Before this patch, spilled zero will be marked as STACK_ZERO > which can provide precise values. In check_stack_write_var_off(), > STACK_ZERO will be maintained if writing a const zero > so later it can provide precise values if needed. > > The above handling of '*(u64 *)(r10 - 8) = 0' as a spill > will have issues in check_stack_write_var_off() as the spill > will be converted to STACK_MISC and the precise value 0 > is lost. To fix this issue, if the spill slots with const > zero and the BPF_ST write also with const zero, the spill slots > are preserved, which can later provide precise values > if needed. 
Without the change in check_stack_write_var_off(), > the test_verifier subtest 'BPF_ST_MEM stack imm zero, variable offset' > will fail. > > I checked cpuv3 and cpuv4 with and without this patch with veristat. > There is no state change for cpuv3 since '*(u64 *)(r10 - 8) = 0' > is only generated with cpuv4. > > For cpuv4: > $ ../veristat -C old.cpuv4.csv new.cpuv4.csv -e file,prog,insns,states -f 'insns_diff!=0' > File Program Insns (A) Insns (B) Insns (DIFF) States (A) States (B) States (DIFF) > ------------------------------------------ ------------------- --------- --------- --------------- ---------- ---------- ------------- > local_storage_bench.bpf.linked3.o get_local 228 168 -60 (-26.32%) 17 14 -3 (-17.65%) > pyperf600_bpf_loop.bpf.linked3.o on_event 6066 4889 -1177 (-19.40%) 403 321 -82 (-20.35%) > test_cls_redirect.bpf.linked3.o cls_redirect 35483 35387 -96 (-0.27%) 2179 2177 -2 (-0.09%) > test_l4lb_noinline.bpf.linked3.o balancer_ingress 4494 4522 +28 (+0.62%) 217 219 +2 (+0.92%) > test_l4lb_noinline_dynptr.bpf.linked3.o balancer_ingress 1432 1455 +23 (+1.61%) 92 94 +2 (+2.17%) > test_xdp_noinline.bpf.linked3.o balancer_ingress_v6 3462 3458 -4 (-0.12%) 216 216 +0 (+0.00%) > verifier_iterating_callbacks.bpf.linked3.o widening 52 41 -11 (-21.15%) 4 3 -1 (-25.00%) > xdp_synproxy_kern.bpf.linked3.o syncookie_tc 12412 11719 -693 (-5.58%) 345 330 -15 (-4.35%) > xdp_synproxy_kern.bpf.linked3.o syncookie_xdp 12478 11794 -684 (-5.48%) 346 331 -15 (-4.34%) > > test_l4lb_noinline and test_l4lb_noinline_dynptr has minor regression, but > pyperf600_bpf_loop and local_storage_bench gets pretty good improvement. 
> > [1] https://lore.kernel.org/all/20231205184248.1502704-1-andrii@kernel.org/ > [2] https://lore.kernel.org/all/20231205184248.1502704-9-andrii@kernel.org/ > > Cc: Kuniyuki Iwashima <kuniyu@amazon.com> > Cc: Martin KaFai Lau <kafai@fb.com> > Signed-off-by: Yonghong Song <yonghong.song@linux.dev> > --- > kernel/bpf/verifier.c | 21 +++++++++++++++++-- > .../selftests/bpf/progs/verifier_spill_fill.c | 16 +++++++------- > 2 files changed, 27 insertions(+), 10 deletions(-) > > Changelogs: > v1 -> v2: > - Preserve with-const-zero spill if writing is also zero > in check_stack_write_var_off(). > - Add a test with not-8-byte-aligned BPF_ST store. > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > index d4e31f61de0e..cfe7a68d90a5 100644 > --- a/kernel/bpf/verifier.c > +++ b/kernel/bpf/verifier.c > @@ -4491,7 +4491,7 @@ static int check_stack_write_fixed_off(struct bpf_verifier_env *env, > if (fls64(reg->umax_value) > BITS_PER_BYTE * size) > state->stack[spi].spilled_ptr.id = 0; > } else if (!reg && !(off % BPF_REG_SIZE) && is_bpf_st_mem(insn) && > - insn->imm != 0 && env->bpf_capable) { > + env->bpf_capable) { > struct bpf_reg_state fake_reg = {}; > > __mark_reg_known(&fake_reg, insn->imm); > @@ -4613,11 +4613,28 @@ static int check_stack_write_var_off(struct bpf_verifier_env *env, > > /* Variable offset writes destroy any spilled pointers in range. */ > for (i = min_off; i < max_off; i++) { > + struct bpf_reg_state *spill_reg; > u8 new_type, *stype; > - int slot, spi; > + int slot, spi, j; > > slot = -i - 1; > spi = slot / BPF_REG_SIZE; > + > + /* If writing_zero and the the spi slot contains a spill of value 0, > + * maintain the spill type. 
> + */ > + if (writing_zero && !(i % BPF_REG_SIZE) && is_spilled_scalar_reg(&state->stack[spi])) { > + spill_reg = &state->stack[spi].spilled_ptr; > + if (tnum_is_const(spill_reg->var_off) && spill_reg->var_off.value == 0) { > + for (j = BPF_REG_SIZE; j > 0; j--) { > + if (state->stack[spi].slot_type[j - 1] != STACK_SPILL) > + break; > + } > + i += BPF_REG_SIZE - j - 1; > + continue; > + } > + } > + Yonghong, I just replied to one of Eduard's email. I think the overall approach will be correct. But two small things. Above, if you detect tnum_is_conxt() and value is zero, it seems like you'd need to set zero_used=true. But I actually want to propose to implement this slightly differently. Instead of skipping multiple bytes, I think it would be better to just check one byte at a time. Just like we have if (writing_zero && *stype == STACK_ZERO) { new_type = STACK_ZERO; zero_used = true; } we can insert if (writing_zero && *stype == STACK_SPILL && tnum_is_const(..) && var_off.value == 0) { zero_used = true; continue; } It will be very similar to STACK_ZERO handling, but we won't be overwriting slot type. But handling one byte at a time is more in line with the rest of the logic. WDYT? > stype = &state->stack[spi].slot_type[slot % BPF_REG_SIZE]; > mark_stack_slot_scratched(env, spi); > [...] ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH bpf-next v2 1/2] bpf: Track aligned st store as imprecise spilled registers
  2024-01-08 23:23     ` Andrii Nakryiko
@ 2024-01-08 23:39       ` Yonghong Song
  0 siblings, 0 replies; 26+ messages in thread
From: Yonghong Song @ 2024-01-08 23:39 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	kernel-team, Martin KaFai Lau, Kuniyuki Iwashima, Martin KaFai Lau

On 1/8/24 3:23 PM, Andrii Nakryiko wrote:
> On Wed, Jan 3, 2024 at 3:26 PM Yonghong Song <yonghong.song@linux.dev> wrote:
>> With patch set [1], precision backtracing supports register spill/fill
>> to/from the stack. The patch [2] allows an initial imprecise register spill
>> with content 0. This is a common case for cpuv3 and lower for
>> initializing stack variables with the pattern
>>   r1 = 0
>>   *(u64 *)(r10 - 8) = r1
>> and [2] has demonstrated good verification improvement.
>>
>> For cpuv4, the initialization could be
>>   *(u64 *)(r10 - 8) = 0
>> The current verifier marks the r10-8 contents with STACK_ZERO.
>> Similar to [2], let us permit the above insn to behave like an
>> imprecise register spill, which can reduce the number of verified states.
>> The change is in function check_stack_write_fixed_off().
>>
>> Before this patch, a spilled zero will be marked as STACK_ZERO,
>> which can provide precise values. In check_stack_write_var_off(),
>> STACK_ZERO will be maintained if writing a const zero,
>> so later it can provide precise values if needed.
>>
>> The above handling of '*(u64 *)(r10 - 8) = 0' as a spill
>> will have issues in check_stack_write_var_off() as the spill
>> will be converted to STACK_MISC and the precise value 0
>> is lost. To fix this issue, if the spill slots hold const
>> zero and the BPF_ST also writes const zero, the spill slots
>> are preserved, which can later provide precise values
>> if needed.
>> Without the change in check_stack_write_var_off(),
>> the test_verifier subtest 'BPF_ST_MEM stack imm zero, variable offset'
>> will fail.
>>
>> I checked cpuv3 and cpuv4 with and without this patch with veristat.
>> There is no state change for cpuv3 since '*(u64 *)(r10 - 8) = 0'
>> is only generated with cpuv4.
>>
>> For cpuv4:
>> $ ../veristat -C old.cpuv4.csv new.cpuv4.csv -e file,prog,insns,states -f 'insns_diff!=0'
>> File                                        Program              Insns (A)  Insns (B)  Insns    (DIFF)  States (A)  States (B)  States (DIFF)
>> ------------------------------------------  -------------------  ---------  ---------  ---------------  ----------  ----------  -------------
>> local_storage_bench.bpf.linked3.o           get_local                  228        168    -60 (-26.32%)          17          14   -3 (-17.65%)
>> pyperf600_bpf_loop.bpf.linked3.o            on_event                  6066       4889  -1177 (-19.40%)         403         321  -82 (-20.35%)
>> test_cls_redirect.bpf.linked3.o             cls_redirect             35483      35387     -96 (-0.27%)        2179        2177    -2 (-0.09%)
>> test_l4lb_noinline.bpf.linked3.o            balancer_ingress          4494       4522     +28 (+0.62%)         217         219    +2 (+0.92%)
>> test_l4lb_noinline_dynptr.bpf.linked3.o     balancer_ingress          1432       1455     +23 (+1.61%)          92          94    +2 (+2.17%)
>> test_xdp_noinline.bpf.linked3.o             balancer_ingress_v6       3462       3458      -4 (-0.12%)         216         216    +0 (+0.00%)
>> verifier_iterating_callbacks.bpf.linked3.o  widening                    52         41    -11 (-21.15%)           4           3   -1 (-25.00%)
>> xdp_synproxy_kern.bpf.linked3.o             syncookie_tc             12412      11719    -693 (-5.58%)         345         330   -15 (-4.35%)
>> xdp_synproxy_kern.bpf.linked3.o             syncookie_xdp            12478      11794    -684 (-5.48%)         346         331   -15 (-4.34%)
>>
>> test_l4lb_noinline and test_l4lb_noinline_dynptr have a minor regression, but
>> pyperf600_bpf_loop and local_storage_bench get a pretty good improvement.
>>
>> [1] https://lore.kernel.org/all/20231205184248.1502704-1-andrii@kernel.org/
>> [2] https://lore.kernel.org/all/20231205184248.1502704-9-andrii@kernel.org/
>>
>> Cc: Kuniyuki Iwashima <kuniyu@amazon.com>
>> Cc: Martin KaFai Lau <kafai@fb.com>
>> Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
>> ---
>>  kernel/bpf/verifier.c                         | 21 +++++++++++++++++--
>>  .../selftests/bpf/progs/verifier_spill_fill.c | 16 +++++++-------
>>  2 files changed, 27 insertions(+), 10 deletions(-)
>>
>> Changelogs:
>>   v1 -> v2:
>>     - Preserve with-const-zero spill if writing is also zero
>>       in check_stack_write_var_off().
>>     - Add a test with not-8-byte-aligned BPF_ST store.
>>
>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>> index d4e31f61de0e..cfe7a68d90a5 100644
>> --- a/kernel/bpf/verifier.c
>> +++ b/kernel/bpf/verifier.c
>> @@ -4491,7 +4491,7 @@ static int check_stack_write_fixed_off(struct bpf_verifier_env *env,
>>                 if (fls64(reg->umax_value) > BITS_PER_BYTE * size)
>>                         state->stack[spi].spilled_ptr.id = 0;
>>         } else if (!reg && !(off % BPF_REG_SIZE) && is_bpf_st_mem(insn) &&
>> -                  insn->imm != 0 && env->bpf_capable) {
>> +                  env->bpf_capable) {
>>                 struct bpf_reg_state fake_reg = {};
>>
>>                 __mark_reg_known(&fake_reg, insn->imm);
>> @@ -4613,11 +4613,28 @@ static int check_stack_write_var_off(struct bpf_verifier_env *env,
>>
>>         /* Variable offset writes destroy any spilled pointers in range. */
>>         for (i = min_off; i < max_off; i++) {
>> +               struct bpf_reg_state *spill_reg;
>>                 u8 new_type, *stype;
>> -               int slot, spi;
>> +               int slot, spi, j;
>>
>>                 slot = -i - 1;
>>                 spi = slot / BPF_REG_SIZE;
>> +
>> +               /* If writing_zero and the spi slot contains a spill of value 0,
>> +                * maintain the spill type.
>> +                */
>> +               if (writing_zero && !(i % BPF_REG_SIZE) && is_spilled_scalar_reg(&state->stack[spi])) {
>> +                       spill_reg = &state->stack[spi].spilled_ptr;
>> +                       if (tnum_is_const(spill_reg->var_off) && spill_reg->var_off.value == 0) {
>> +                               for (j = BPF_REG_SIZE; j > 0; j--) {
>> +                                       if (state->stack[spi].slot_type[j - 1] != STACK_SPILL)
>> +                                               break;
>> +                               }
>> +                               i += BPF_REG_SIZE - j - 1;
>> +                               continue;
>> +                       }
>> +               }
>> +
> Yonghong, I just replied to one of Eduard's emails. I think the overall
> approach will be correct.
>
> But two small things. Above, if you detect tnum_is_const() and the value
> is zero, it seems like you'd need to set zero_used=true.

Yes, my planned change is to add mark_chain_precision() explicitly after
  if (writing_zero && !(i % BPF_REG_SIZE) && is_spilled_scalar_reg(&state->stack[spi])) {
But yes, setting zero_used=true is much simpler.

>
> But I actually want to propose to implement this slightly differently.
> Instead of skipping multiple bytes, I think it would be better to just
> check one byte at a time. Just like we have
>
>   if (writing_zero && *stype == STACK_ZERO) {
>           new_type = STACK_ZERO;
>           zero_used = true;
>   }
>
> we can insert
>
>   if (writing_zero && *stype == STACK_SPILL && tnum_is_const(..) &&
>       var_off.value == 0) {
>           zero_used = true;
>           continue;
>   }
>
> It will be very similar to STACK_ZERO handling, but we won't be
> overwriting the slot type. Handling one byte at a time is more in line
> with the rest of the logic.
>
> WDYT?

Thanks for the suggestion. Sounds good. Will do.

>
>>                 stype = &state->stack[spi].slot_type[slot % BPF_REG_SIZE];
>>                 mark_stack_slot_scratched(env, spi);
>>
> [...]

^ permalink raw reply	[flat|nested]	26+ messages in thread
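The fixed-offset half of the patch (dropping the `insn->imm != 0` guard) works because `__mark_reg_known()` records the BPF_ST immediate as a fully known constant in the register's tnum, which is exactly what the variable-offset path then tests with `tnum_is_const()`. A minimal illustrative model follows: `tnum_const()`/`tnum_is_const()` mirror the semantics of kernel/bpf/tnum.h (mask == 0 means every bit is known), while `spill_st_imm()` is a hypothetical stand-in for the fake_reg spill, not the kernel function.

```c
#include <assert.h>
#include <stdint.h>

/* Minimal model of the kernel's tnum ("tristate number"):
 * a bit is known iff the corresponding mask bit is 0. */
struct tnum { uint64_t value; uint64_t mask; };

static struct tnum tnum_const(uint64_t value)
{
	return (struct tnum){ .value = value, .mask = 0 };
}

static int tnum_is_const(struct tnum a) { return a.mask == 0; }

/* Stand-in for what check_stack_write_fixed_off() now does for any
 * 'BPF_ST *(u64 *)(r10 - 8) = imm', zero included: the slot records a
 * spilled scalar whose tnum is the (sign-extended) constant imm, so a
 * later variable-offset zero write can recognize a known-zero spill. */
struct spilled_scalar { struct tnum var_off; };

static struct spilled_scalar spill_st_imm(int32_t imm)
{
	struct spilled_scalar s = {
		.var_off = tnum_const((uint64_t)(int64_t)imm),
	};
	return s;
}
```

Spilling imm 0 this way yields a scalar that is both constant and zero, so it can survive the new check in check_stack_write_var_off() instead of collapsing to STACK_MISC.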
end of thread, other threads:[~2024-01-08 23:39 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-01-03 23:26 [PATCH bpf-next v2 1/2] bpf: Track aligned st store as imprecise spilled registers Yonghong Song
2024-01-03 23:26 ` [PATCH bpf-next v2 2/2] selftests/bpf: Add a selftest with not-8-byte aligned BPF_ST Yonghong Song
2024-01-04 16:37 ` [PATCH bpf-next v2 1/2] bpf: Track aligned st store as imprecise spilled registers Eduard Zingerman
2024-01-04 17:13 ` Yonghong Song
2024-01-04 18:43 ` Eduard Zingerman
2024-01-04 18:30 ` Eduard Zingerman
2024-01-04 20:12 ` Yonghong Song
2024-01-04 21:10 ` Eduard Zingerman
2024-01-04 23:09 ` Andrii Nakryiko
2024-01-04 23:29 ` Eduard Zingerman
2024-01-05  1:05 ` Andrii Nakryiko
2024-01-05  7:14 ` Yonghong Song
2024-01-05  8:10 ` Yonghong Song
2024-01-05 23:37 ` Eduard Zingerman
2024-01-08 18:59 ` Yonghong Song
2024-01-08 19:06 ` Eduard Zingerman
2024-01-08 19:40 ` Yonghong Song
2024-01-05 23:52 ` Eduard Zingerman
2024-01-08 19:51 ` Yonghong Song
2024-01-08 20:05 ` Eduard Zingerman
2024-01-08 21:51 ` Yonghong Song
2024-01-08 23:18 ` Andrii Nakryiko
2024-01-04 23:03 ` Andrii Nakryiko
2024-01-05  0:49 ` Andrii Nakryiko
2024-01-08 23:23 ` Andrii Nakryiko
2024-01-08 23:39 ` Yonghong Song