* [PATCH bpf-next v2 0/2] bpf: arg tracking for imprecise/multi-offset BPF_ST/STX
@ 2026-04-13 23:30 Eduard Zingerman
2026-04-13 23:30 ` [PATCH bpf-next v2 1/2] bpf: fix " Eduard Zingerman
` (2 more replies)
0 siblings, 3 replies; 5+ messages in thread
From: Eduard Zingerman @ 2026-04-13 23:30 UTC (permalink / raw)
To: bpf, ast, andrii; +Cc: daniel, martin.lau, kernel-team, yonghong.song, eddyz87
When the static arg tracking analysis encounters a store through a
pointer with imprecise or multi-offset destination, it must use weak
updates (join) instead of strong updates (overwrite) for the affected
at_stack slots. At runtime only one slot is actually written; the
others retain their old values.
Two cases are addressed:
- BPF_STX, handled by spill_to_stack(). It was gated on
`dst_is_local_fp = (frame == depth)`, which missed ARG_IMPRECISE
pointers entirely.
- BPF_ST, handled by clear_stack_for_all_offs(). It delegates to
clear_overlapping_stack_slots() which unconditionally set
`at_stack[i] = none`. Change to `at_stack[i] = join(old, none)`
when multiple candidate slots exist (cnt != 1), so that untouched
slots preserve their tracked values.
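A minimal sketch of the strong versus weak update distinction described above. Everything here is a hypothetical simplification: `slot_val`, `join()` and `store()` are not the kernel's `struct arg_track` or `__arg_track_join()`. A tracked value is modelled as a bitmask of possible origins, with bit 0 standing for "none".

```c
#include <assert.h>
#include <stdint.h>

#define NSLOTS   4
#define VAL_NONE 0x1u	/* hypothetical: bit 0 means "not a tracked pointer" */

/* Hypothetical abstract value: the set of possible origins, as a bitmask. */
typedef uint32_t slot_val;

/* Join is set union: the slot may hold either the old or the new value. */
static slot_val join(slot_val a, slot_val b)
{
	return a | b;
}

/*
 * Model a store of 'val' through a pointer that may target any slot
 * whose index bit is set in 'candidates'.
 */
static void store(slot_val at_stack[NSLOTS], uint32_t candidates, slot_val val)
{
	/* Exactly one candidate bit set means the target slot is known. */
	int single = candidates && !(candidates & (candidates - 1));

	assert(candidates < (1u << NSLOTS));
	for (int i = 0; i < NSLOTS; i++) {
		if (!(candidates & (1u << i)))
			continue;
		if (single)
			at_stack[i] = val;			/* strong update */
		else
			at_stack[i] = join(at_stack[i], val);	/* weak update */
	}
}
```

With two candidate slots, both keep their old origin bits in addition to "none", because only one of them is written at runtime. With a single candidate the old value is dropped.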
No veristat diff compared to current master when tested on selftests,
sched_ext, cilium and a set of Meta internal programs.
This addresses issues reported by sashiko for patch #7 in [1].
[1] https://sashiko.dev/#/patchset/20260410-patch-set-v4-0-5d4eecb343db%40gmail.com
Changelog:
v2 -> v3:
- Use check_add_overflow() in arg_add() (Alexei).
- Add missing fixes tag (CI bot).
- Remove unused __imm in the selftest (sashiko).
v1 -> v2:
- Delete the OFF_IMPRECISE constant and always rely on
arg_track->cnt == 0 as the marker that the offset is imprecise
(Alexei).
- Squash all patches together to simplify backporting to
'bpf' branch (Alexei).
v1: https://lore.kernel.org/bpf/20260413-stacklive-fixes-v1-0-9f48a9999d6e@gmail.com/T/
v2: https://lore.kernel.org/bpf/20260413-stacklive-fixes-v2-0-ff91c4f8d273@gmail.com/T/
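A userspace sketch of the overflow-checked offset addition mentioned in the changelog. The kernel's `check_add_overflow()` is not available outside the kernel tree, so this sketch assumes a GCC or Clang compiler and uses `__builtin_add_overflow()` in its place. The name `arg_add_sketch` is hypothetical. It follows the contract visible in the patch: return true when the result is unusable, otherwise store the sum through `out`.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Add 'delta' to a 16-bit frame-pointer offset.
 * Returns true if 'delta' does not fit in 16 bits or the sum overflows;
 * returns false and writes the sum to '*out' otherwise.
 */
static bool arg_add_sketch(int16_t off, int64_t delta, int16_t *out)
{
	int16_t d = (int16_t)delta;

	/* Reject deltas that were truncated by the narrowing conversion. */
	if (d != delta)
		return true;
	/* The builtin reports whether off + d fits in the type of *out. */
	return __builtin_add_overflow(off, d, out);
}
```

A caller would treat a true return as "offset became imprecise", which matches how the patch resets `off_cnt` to 0 on failure.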
---
Eduard Zingerman (2):
bpf: fix arg tracking for imprecise/multi-offset BPF_ST/STX
selftests/bpf: arg tracking for imprecise/multi-offset BPF_ST/STX
kernel/bpf/liveness.c | 114 ++++++------
.../selftests/bpf/progs/verifier_live_stack.c | 193 +++++++++++++++++++++
2 files changed, 255 insertions(+), 52 deletions(-)
---
base-commit: 71b500afd2f7336f5b6c6026f2af546fc079be26
change-id: 20260413-stacklive-fixes-42e258cf0397
^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH bpf-next v2 1/2] bpf: fix arg tracking for imprecise/multi-offset BPF_ST/STX
2026-04-13 23:30 [PATCH bpf-next v2 0/2] bpf: arg tracking for imprecise/multi-offset BPF_ST/STX Eduard Zingerman
@ 2026-04-13 23:30 ` Eduard Zingerman
2026-04-13 23:30 ` [PATCH bpf-next v2 2/2] selftests/bpf: " Eduard Zingerman
2026-04-15 16:00 ` [PATCH bpf-next v2 0/2] bpf: " patchwork-bot+netdevbpf
2 siblings, 0 replies; 5+ messages in thread
From: Eduard Zingerman @ 2026-04-13 23:30 UTC (permalink / raw)
To: bpf, ast, andrii; +Cc: daniel, martin.lau, kernel-team, yonghong.song, eddyz87

BPF_STX through ARG_IMPRECISE dst should be recognized as a local
spill and join at_stack with the written value. For example,
consider the following situation:

   // r1 = ARG_IMPRECISE{mask=BIT(0)|BIT(1)}
   *(u64 *)(r1 + 0) = r8

Here the analysis should produce an equivalent of

   at_stack[*] = join(old, r8)

BPF_ST through multi-offset or imprecise dst should join at_stack
with none instead of overwriting the slots. For example, consider the
following situation:

   // r1 = ARG_IMPRECISE{mask=BIT(0)|BIT(1)}
   *(u64 *)(r1 + 0) = 0

Here the analysis should produce an equivalent of

   at_stack[*r1] = join(old, none).

Move the definition of the clear_overlapping_stack_slots() in order to
have __arg_track_join() visible. Remove the OFF_IMPRECISE constant to
avoid having two ways to express imprecise offset.
Only 'offset-imprecise {frame=N, cnt=0}' remains.

Fixes: bf0c571f7feb ("bpf: introduce forward arg-tracking dataflow analysis")
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 kernel/bpf/liveness.c | 114 +++++++++++++++++++++++++++-----------------------
 1 file changed, 62 insertions(+), 52 deletions(-)

diff --git a/kernel/bpf/liveness.c b/kernel/bpf/liveness.c
index 1fb4c511db5a..332e6e003f27 100644
--- a/kernel/bpf/liveness.c
+++ b/kernel/bpf/liveness.c
@@ -574,7 +574,7 @@ static int print_instances(struct bpf_verifier_env *env)
  *
  *   precise {frame=N, off=V}      -- known absolute frame index and byte offset
  *                        |
- *   offset-imprecise {frame=N, off=OFF_IMPRECISE}
+ *   offset-imprecise {frame=N, cnt=0}
  *                        |        -- known frame identity, unknown offset
  *   fully-imprecise {frame=ARG_IMPRECISE, mask=bitmask}
  *                                 -- unknown frame identity; .mask is a
@@ -607,8 +607,6 @@ enum arg_track_state {
 	ARG_IMPRECISE	= -3,	/* lost identity; .mask is arg bitmask */
 };
 
-#define OFF_IMPRECISE	S16_MIN	/* arg identity known but offset unknown */
-
 /* Track callee stack slots fp-8 through fp-512 (64 slots of 8 bytes each) */
 #define MAX_ARG_SPILL_SLOTS 64
 
@@ -622,28 +620,6 @@ static bool arg_is_fp(const struct arg_track *at)
 	return at->frame >= 0 || at->frame == ARG_IMPRECISE;
 }
 
-/*
- * Clear all tracked callee stack slots overlapping the byte range
- * [off, off+sz-1] where off is a negative FP-relative offset.
- */
-static void clear_overlapping_stack_slots(struct arg_track *at_stack, s16 off, u32 sz)
-{
-	struct arg_track none = { .frame = ARG_NONE };
-
-	if (off == OFF_IMPRECISE) {
-		for (int i = 0; i < MAX_ARG_SPILL_SLOTS; i++)
-			at_stack[i] = none;
-		return;
-	}
-	for (int i = 0; i < MAX_ARG_SPILL_SLOTS; i++) {
-		int slot_start = -((i + 1) * 8);
-		int slot_end = slot_start + 8;
-
-		if (slot_start < off + (int)sz && slot_end > off)
-			at_stack[i] = none;
-	}
-}
-
 static void verbose_arg_track(struct bpf_verifier_env *env, struct arg_track *at)
 {
 	int i;
@@ -863,16 +839,13 @@ static void arg_track_alu64(struct arg_track *dst, const struct arg_track *src)
 		*dst = arg_join_imprecise(*dst, *src);
 }
 
-static s16 arg_add(s16 off, s64 delta)
+static bool arg_add(s16 off, s64 delta, s16 *out)
 {
-	s64 res;
-
-	if (off == OFF_IMPRECISE)
-		return OFF_IMPRECISE;
-	res = (s64)off + delta;
-	if (res < S16_MIN + 1 || res > S16_MAX)
-		return OFF_IMPRECISE;
-	return res;
+	s16 d = delta;
+
+	if (d != delta)
+		return true;
+	return check_add_overflow(off, d, out);
 }
 
 static void arg_padd(struct arg_track *at, s64 delta)
@@ -882,9 +855,9 @@ static void arg_padd(struct arg_track *at, s64 delta)
 	if (at->off_cnt == 0)
 		return;
 	for (i = 0; i < at->off_cnt; i++) {
-		s16 new_off = arg_add(at->off[i], delta);
+		s16 new_off;
 
-		if (new_off == OFF_IMPRECISE) {
+		if (arg_add(at->off[i], delta, &new_off)) {
 			at->off_cnt = 0;
 			return;
 		}
@@ -899,8 +872,6 @@ static void arg_padd(struct arg_track *at, s64 delta)
  */
 static int fp_off_to_slot(s16 off)
 {
-	if (off == OFF_IMPRECISE)
-		return -1;
 	if (off >= 0 || off < -(int)(MAX_ARG_SPILL_SLOTS * 8))
 		return -1;
 	if (off % 8)
@@ -930,9 +901,11 @@ static struct arg_track fill_from_stack(struct bpf_insn *insn,
 		return imp;
 
 	for (i = 0; i < cnt; i++) {
-		s16 fp_off = arg_add(at_out[reg].off[i], insn->off);
-		int slot = fp_off_to_slot(fp_off);
+		s16 fp_off, slot;
 
+		if (arg_add(at_out[reg].off[i], insn->off, &fp_off))
+			return imp;
+		slot = fp_off_to_slot(fp_off);
 		if (slot < 0)
 			return imp;
 		result = __arg_track_join(result, at_stack_out[slot]);
@@ -968,9 +941,12 @@ static void spill_to_stack(struct bpf_insn *insn, struct arg_track *at_out,
 		return;
 	}
 	for (i = 0; i < cnt; i++) {
-		s16 fp_off = arg_add(at_out[reg].off[i], insn->off);
-		int slot = fp_off_to_slot(fp_off);
+		s16 fp_off;
+		int slot;
 
+		if (arg_add(at_out[reg].off[i], insn->off, &fp_off))
+			continue;
+		slot = fp_off_to_slot(fp_off);
 		if (slot < 0)
 			continue;
 		if (cnt == 1)
@@ -980,6 +956,32 @@ static void spill_to_stack(struct bpf_insn *insn, struct arg_track *at_out,
 	}
 }
 
+/*
+ * Clear all tracked callee stack slots overlapping the byte range
+ * [off, off+sz-1] where off is a negative FP-relative offset.
+ */
+static void clear_overlapping_stack_slots(struct arg_track *at_stack, s16 off, u32 sz, int cnt)
+{
+	struct arg_track none = { .frame = ARG_NONE };
+
+	if (cnt == 0) {
+		for (int i = 0; i < MAX_ARG_SPILL_SLOTS; i++)
+			at_stack[i] = __arg_track_join(at_stack[i], none);
+		return;
+	}
+	for (int i = 0; i < MAX_ARG_SPILL_SLOTS; i++) {
+		int slot_start = -((i + 1) * 8);
+		int slot_end = slot_start + 8;
+
+		if (slot_start < off + (int)sz && slot_end > off) {
+			if (cnt == 1)
+				at_stack[i] = none;
+			else
+				at_stack[i] = __arg_track_join(at_stack[i], none);
+		}
+	}
+}
+
 /*
  * Clear stack slots overlapping all possible FP offsets in @reg.
  */
@@ -990,18 +992,22 @@ static void clear_stack_for_all_offs(struct bpf_insn *insn,
 	int cnt, i;
 
 	if (reg == BPF_REG_FP) {
-		clear_overlapping_stack_slots(at_stack_out, insn->off, sz);
+		clear_overlapping_stack_slots(at_stack_out, insn->off, sz, 1);
 		return;
 	}
 	cnt = at_out[reg].off_cnt;
 	if (cnt == 0) {
-		clear_overlapping_stack_slots(at_stack_out, OFF_IMPRECISE, sz);
+		clear_overlapping_stack_slots(at_stack_out, 0, sz, cnt);
 		return;
 	}
 	for (i = 0; i < cnt; i++) {
-		s16 fp_off = arg_add(at_out[reg].off[i], insn->off);
+		s16 fp_off;
 
-		clear_overlapping_stack_slots(at_stack_out, fp_off, sz);
+		if (arg_add(at_out[reg].off[i], insn->off, &fp_off)) {
+			clear_overlapping_stack_slots(at_stack_out, 0, sz, 0);
+			break;
+		}
+		clear_overlapping_stack_slots(at_stack_out, fp_off, sz, cnt);
 	}
 }
 
@@ -1042,6 +1048,12 @@ static void arg_track_log(struct bpf_verifier_env *env, struct bpf_insn *insn, i
 	verbose(env, "\n");
 }
 
+static bool can_be_local_fp(int depth, int regno, struct arg_track *at)
+{
+	return regno == BPF_REG_FP || at->frame == depth ||
+	       (at->frame == ARG_IMPRECISE && (at->mask & BIT(depth)));
+}
+
 /*
  * Pure dataflow transfer function for arg_track state.
  * Updates at_out[] based on how the instruction modifies registers.
@@ -1111,8 +1123,7 @@ static void arg_track_xfer(struct bpf_verifier_env *env, struct bpf_insn *insn,
 			at_out[r] = none;
 	} else if (class == BPF_LDX) {
 		u32 sz = bpf_size_to_bytes(BPF_SIZE(insn->code));
-		bool src_is_local_fp = insn->src_reg == BPF_REG_FP || src->frame == depth ||
-				       (src->frame == ARG_IMPRECISE && (src->mask & BIT(depth)));
+		bool src_is_local_fp = can_be_local_fp(depth, insn->src_reg, src);
 
 		/*
 		 * Reload from callee stack: if src is current-frame FP-derived
@@ -1147,7 +1158,7 @@ static void arg_track_xfer(struct bpf_verifier_env *env, struct bpf_insn *insn,
 		bool dst_is_local_fp;
 
 		/* Track spills to current-frame FP-derived callee stack */
-		dst_is_local_fp = insn->dst_reg == BPF_REG_FP || dst->frame == depth;
+		dst_is_local_fp = can_be_local_fp(depth, insn->dst_reg, dst);
 		if (dst_is_local_fp && BPF_MODE(insn->code) == BPF_MEM)
 			spill_to_stack(insn, at_out, insn->dst_reg,
 				       at_stack_out, src, sz);
@@ -1166,7 +1177,7 @@ static void arg_track_xfer(struct bpf_verifier_env *env, struct bpf_insn *insn,
 		}
 	} else if (class == BPF_ST && BPF_MODE(insn->code) == BPF_MEM) {
 		u32 sz = bpf_size_to_bytes(BPF_SIZE(insn->code));
-		bool dst_is_local_fp = insn->dst_reg == BPF_REG_FP || dst->frame == depth;
+		bool dst_is_local_fp = can_be_local_fp(depth, insn->dst_reg, dst);
 
 		/* BPF_ST to FP-derived dst: clear overlapping stack slots */
 		if (dst_is_local_fp)
@@ -1316,8 +1327,7 @@ static int record_load_store_access(struct bpf_verifier_env *env,
 		resolved.off_cnt = ptr->off_cnt;
 		resolved.frame = ptr->frame;
 		for (oi = 0; oi < ptr->off_cnt; oi++) {
-			resolved.off[oi] = arg_add(ptr->off[oi], insn->off);
-			if (resolved.off[oi] == OFF_IMPRECISE) {
+			if (arg_add(ptr->off[oi], insn->off, &resolved.off[oi])) {
 				resolved.off_cnt = 0;
 				break;
 			}
-- 
2.53.0

^ permalink raw reply related [flat|nested] 5+ messages in thread
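The byte-range overlap test used by clear_overlapping_stack_slots() in the patch above can be tried in isolation. The sketch below is a userspace illustration only: `overlapping_slots()` is a hypothetical helper that returns a bitmask instead of updating `at_stack[]`, but the slot geometry (slot i covers bytes [-(i+1)*8, -i*8) relative to FP) and the comparison are copied from the patch.

```c
#include <stdint.h>

#define MAX_ARG_SPILL_SLOTS 64	/* fp-8 through fp-512, 8 bytes each */

/* Return a bitmask with bit i set when slot i overlaps [off, off + sz). */
static uint64_t overlapping_slots(int off, unsigned int sz)
{
	uint64_t mask = 0;

	for (int i = 0; i < MAX_ARG_SPILL_SLOTS; i++) {
		int slot_start = -((i + 1) * 8);	/* inclusive */
		int slot_end = slot_start + 8;		/* exclusive */

		/* Two half-open ranges overlap iff each starts before the other ends. */
		if (slot_start < off + (int)sz && slot_end > off)
			mask |= 1ULL << i;
	}
	return mask;
}
```

An aligned 8-byte store at fp-8 touches only slot 0, while a misaligned 8-byte store at fp-12 straddles slots 0 and 1, so both would be cleared or joined.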
* [PATCH bpf-next v2 2/2] selftests/bpf: arg tracking for imprecise/multi-offset BPF_ST/STX
2026-04-13 23:30 [PATCH bpf-next v2 0/2] bpf: arg tracking for imprecise/multi-offset BPF_ST/STX Eduard Zingerman
2026-04-13 23:30 ` [PATCH bpf-next v2 1/2] bpf: fix " Eduard Zingerman
@ 2026-04-13 23:30 ` Eduard Zingerman
2026-04-15 16:00 ` [PATCH bpf-next v2 0/2] bpf: " patchwork-bot+netdevbpf
2 siblings, 0 replies; 5+ messages in thread
From: Eduard Zingerman @ 2026-04-13 23:30 UTC (permalink / raw)
To: bpf, ast, andrii; +Cc: daniel, martin.lau, kernel-team, yonghong.song, eddyz87

Add test cases for clear_stack_for_all_offs and dst_is_local_fp
handling of multi-offset and ARG_IMPRECISE stack pointers:

- st_imm_join_with_multi_off: BPF_ST through multi-offset dst should
  join at_stack with none instead of overwriting both candidate
  slots.
- st_imm_join_with_imprecise_off: BPF_ST through offset-imprecise dst
  should join at_stack with none instead of clearing all slots.
- st_imm_join_with_single_off: a canary checking that BPF_ST with a
  known offset overwrites slot instead of joining.
- imprecise_dst_spill_join: BPF_STX through ARG_IMPRECISE dst should
  be recognized as a local spill and join at_stack with the written
  value.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 .../selftests/bpf/progs/verifier_live_stack.c | 193 +++++++++++++++++++++
 1 file changed, 193 insertions(+)

diff --git a/tools/testing/selftests/bpf/progs/verifier_live_stack.c b/tools/testing/selftests/bpf/progs/verifier_live_stack.c
index b7a9fa10e84d..401152b2b64f 100644
--- a/tools/testing/selftests/bpf/progs/verifier_live_stack.c
+++ b/tools/testing/selftests/bpf/progs/verifier_live_stack.c
@@ -2647,3 +2647,196 @@ __naked void spill_join_with_imprecise_off(void)
 	"exit;"
 	::: __clobber_all);
 }
+
+/*
+ * Same as spill_join_with_multi_off but the write is BPF_ST (store
+ * immediate) instead of BPF_STX. BPF_ST goes through
+ * clear_stack_for_all_offs() rather than spill_to_stack(), and that
+ * path also needs to join instead of overwriting.
+ *
+ *   fp-8 = &fp-24
+ *   fp-16 = &fp-32
+ *   r1 = fp-8 or fp-16 (two offsets from branch)
+ *   *(u64 *)(r1 + 0) = 0 -- BPF_ST with immediate
+ *   r0 = *(u64 *)(r10 - 16) -- fill from fp-16
+ *   r0 = *(u64 *)(r0 + 0) -- deref: should produce use
+ */
+SEC("socket")
+__log_level(2)
+__failure
+__msg("15: (7a) *(u64 *)(r1 +0) = 0 fp-8: fp0-24 -> fp0-24|fp0+0 fp-16: fp0-32 -> fp0-32|fp0+0")
+__msg("17: (79) r0 = *(u64 *)(r0 +0) ; use: fp0-32")
+__naked void st_imm_join_with_multi_off(void)
+{
+	asm volatile (
+	"*(u64 *)(r10 - 24) = 0;"
+	"*(u64 *)(r10 - 32) = 0;"
+	"r1 = r10;"
+	"r1 += -24;"
+	"*(u64 *)(r10 - 8) = r1;"
+	"r1 = r10;"
+	"r1 += -32;"
+	"*(u64 *)(r10 - 16) = r1;"
+	/* create r1 with two candidate offsets: fp-8 or fp-16 */
+	"call %[bpf_get_prandom_u32];"
+	"if r0 == 0 goto 1f;"
+	"r1 = r10;"
+	"r1 += -8;"
+	"goto 2f;"
+"1:"
+	"r1 = r10;"
+	"r1 += -16;"
+"2:"
+	/* BPF_ST: store immediate through multi-offset r1 */
+	"*(u64 *)(r1 + 0) = 0;"
+	/* read back fp-16 and deref */
+	"r0 = *(u64 *)(r10 - 16);"
+	"r0 = *(u64 *)(r0 + 0);"
+	"r0 = 0;"
+	"exit;"
+	:: __imm(bpf_get_prandom_u32)
+	: __clobber_all);
+}
+
+/*
+ * Check that BPF_ST with a known offset fully overwrites stack slot
+ * from the arg tracking point of view.
+ */
+SEC("socket")
+__log_level(2)
+__success
+__msg("5: (7a) *(u64 *)(r1 +0) = 0 fp-8: fp0-16 -> _{{$}}")
+__naked void st_imm_join_with_single_off(void)
+{
+	asm volatile (
+	"r2 = r10;"
+	"r2 += -16;"
+	"*(u64 *)(r10 - 8) = r2;"
+	"r1 = r10;"
+	"r1 += -8;"
+	"*(u64 *)(r1 + 0) = 0;"
+	"r0 = 0;"
+	"exit;"
+	::: __clobber_all);
+}
+
+/*
+ * Same as spill_join_with_imprecise_off but the write is BPF_ST.
+ * Use "r2 = -8; r1 += r2" to make arg tracking lose offset
+ * precision while the main verifier keeps r1 as fixed-offset.
+ *
+ *   fp-8 = &fp-24
+ *   fp-16 = &fp-32
+ *   r1 = fp-8 (imprecise to arg tracking)
+ *   *(u64 *)(r1 + 0) = 0 -- BPF_ST with immediate
+ *   r0 = *(u64 *)(r10 - 16) -- fill from fp-16
+ *   r0 = *(u64 *)(r0 + 0) -- deref: should produce use
+ */
+SEC("socket")
+__log_level(2)
+__success
+__msg("13: (79) r0 = *(u64 *)(r0 +0) ; use: fp0-32")
+__naked void st_imm_join_with_imprecise_off(void)
+{
+	asm volatile (
+	"*(u64 *)(r10 - 24) = 0;"
+	"*(u64 *)(r10 - 32) = 0;"
+	"r1 = r10;"
+	"r1 += -24;"
+	"*(u64 *)(r10 - 8) = r1;"
+	"r1 = r10;"
+	"r1 += -32;"
+	"*(u64 *)(r10 - 16) = r1;"
+	/* r1 = fp-8 but arg tracking sees off_cnt == 0 */
+	"r1 = r10;"
+	"r2 = -8;"
+	"r1 += r2;"
+	/* store immediate through imprecise r1 */
+	"*(u64 *)(r1 + 0) = 0;"
+	/* read back fp-16 */
+	"r0 = *(u64 *)(r10 - 16);"
+	/* deref: should produce use */
+	"r0 = *(u64 *)(r0 + 0);"
+	"r0 = 0;"
+	"exit;"
+	::: __clobber_all);
+}
+
+/*
+ * Test that spilling through an ARG_IMPRECISE pointer joins with
+ * existing at_stack values. Subprog receives r1 = fp0-24 and
+ * r2 = map_value, creates an ARG_IMPRECISE pointer by joining caller
+ * and callee FP on two branches.
+ *
+ * Setup: callee spills &fp1-16 to fp1-8 (precise, tracked).
+ * Then writes map_value through ARG_IMPRECISE r1 — on path A
+ * this hits fp1-8, on path B it hits caller stack.
+ * Since spill_to_stack is skipped for ARG_IMPRECISE dst,
+ * fp1-8 tracking isn't joined with none.
+ *
+ * Expected after the imprecise write:
+ * - arg tracking should show fp1-8 = fp1-16|fp1+0 (joined with none)
+ * - read from fp1-8 and deref should produce use for fp1-16
+ * - write through it should NOT produce def for fp1-16
+ */
+SEC("socket")
+__log_level(2)
+__success
+__msg("26: (79) r0 = *(u64 *)(r10 -8) // r1=IMP3 r6=fp0-24 r7=fp1-16 fp-8=fp1-16|fp1+0")
+__naked void imprecise_dst_spill_join(void)
+{
+	asm volatile (
+	"*(u64 *)(r10 - 24) = 0;"
+	/* map lookup for a valid non-FP pointer */
+	"*(u32 *)(r10 - 32) = 0;"
+	"r1 = %[map] ll;"
+	"r2 = r10;"
+	"r2 += -32;"
+	"call %[bpf_map_lookup_elem];"
+	"if r0 == 0 goto 1f;"
+	/* r1 = &caller_fp-24, r2 = map_value */
+	"r1 = r10;"
+	"r1 += -24;"
+	"r2 = r0;"
+	"call imprecise_dst_spill_join_sub;"
+"1:"
+	"r0 = 0;"
+	"exit;"
+	:: __imm_addr(map),
+	   __imm(bpf_map_lookup_elem)
+	: __clobber_all);
+}
+
+static __used __naked void imprecise_dst_spill_join_sub(void)
+{
+	asm volatile (
+	/* r6 = &caller_fp-24 (frame=0), r8 = map_value */
+	"r6 = r1;"
+	"r8 = r2;"
+	/* spill &fp1-16 to fp1-8: at_stack[0] = fp1-16 */
+	"*(u64 *)(r10 - 16) = 0;"
+	"r7 = r10;"
+	"r7 += -16;"
+	"*(u64 *)(r10 - 8) = r7;"
+	/* branch to create ARG_IMPRECISE pointer */
+	"call %[bpf_get_prandom_u32];"
+	/* path B: r1 = caller fp-24 (frame=0) */
+	"r1 = r6;"
+	"if r0 == 0 goto 1f;"
+	/* path A: r1 = callee fp-8 (frame=1) */
+	"r1 = r10;"
+	"r1 += -8;"
+"1:"
+	/* r1 = ARG_IMPRECISE{mask=BIT(0)|BIT(1)}.
+	 * Write map_value (non-FP) through r1. On path A this overwrites fp1-8.
+	 * Should join at_stack[0] with none: fp1-16|fp1+0.
+	 */
+	"*(u64 *)(r1 + 0) = r8;"
+	/* read fp1-8: should be fp1-16|fp1+0 (joined) */
+	"r0 = *(u64 *)(r10 - 8);"
+	"*(u64 *)(r0 + 0) = 42;"
+	"r0 = 0;"
+	"exit;"
+	:: __imm(bpf_get_prandom_u32)
+	: __clobber_all);
+}
-- 
2.53.0

^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH bpf-next v2 0/2] bpf: arg tracking for imprecise/multi-offset BPF_ST/STX
2026-04-13 23:30 [PATCH bpf-next v2 0/2] bpf: arg tracking for imprecise/multi-offset BPF_ST/STX Eduard Zingerman
2026-04-13 23:30 ` [PATCH bpf-next v2 1/2] bpf: fix " Eduard Zingerman
2026-04-13 23:30 ` [PATCH bpf-next v2 2/2] selftests/bpf: " Eduard Zingerman
@ 2026-04-15 16:00 ` patchwork-bot+netdevbpf
2 siblings, 0 replies; 5+ messages in thread
From: patchwork-bot+netdevbpf @ 2026-04-15 16:00 UTC (permalink / raw)
To: Eduard Zingerman
Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song

Hello:

This series was applied to bpf/bpf.git (master)
by Alexei Starovoitov <ast@kernel.org>:

On Mon, 13 Apr 2026 16:30:51 -0700 you wrote:
> When the static arg tracking analysis encounters a store through a
> pointer with imprecise or multi-offset destination, it must use weak
> updates (join) instead of strong updates (overwrite) for the affected
> at_stack slots. At runtime only one slot is actually written; the
> others retain their old values.
>
> Two cases are addressed:
> - BPF_STX, handled by spill_to_stack(). It was gated on
> `dst_is_local_fp = (frame == depth)`, which missed ARG_IMPRECISE
> pointers entirely.
> - BPF_ST, handled by clear_stack_for_all_offs(). It delegates to
> clear_overlapping_stack_slots() which unconditionally set
> `at_stack[i] = none`. Change to `at_stack[i] = join(old, none)`
> when multiple candidate slots exist (cnt != 1), so that untouched
> slots preserve their tracked values.
>
> [...]

Here is the summary with links:
  - [bpf-next,v2,1/2] bpf: fix arg tracking for imprecise/multi-offset BPF_ST/STX
    https://git.kernel.org/bpf/bpf/c/ecdd4fd8a54c
  - [bpf-next,v2,2/2] selftests/bpf: arg tracking for imprecise/multi-offset BPF_ST/STX
    https://git.kernel.org/bpf/bpf/c/d97cc8fc997c

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html

^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH bpf-next v2 0/2] bpf: arg tracking for imprecise/multi-offset BPF_ST/STX
@ 2026-04-13 21:58 Eduard Zingerman
2026-04-13 21:58 ` [PATCH bpf-next v2 2/2] selftests/bpf: " Eduard Zingerman
0 siblings, 1 reply; 5+ messages in thread
From: Eduard Zingerman @ 2026-04-13 21:58 UTC (permalink / raw)
To: bpf, ast, andrii; +Cc: daniel, martin.lau, kernel-team, yonghong.song, eddyz87
When the static arg tracking analysis encounters a store through a
pointer with imprecise or multi-offset destination, it must use weak
updates (join) instead of strong updates (overwrite) for the affected
at_stack slots. At runtime only one slot is actually written; the
others retain their old values.
Two cases are addressed:
- BPF_STX, handled by spill_to_stack(). It was gated on
`dst_is_local_fp = (frame == depth)`, which missed ARG_IMPRECISE
pointers entirely.
- BPF_ST, handled by clear_stack_for_all_offs(). It delegates to
clear_overlapping_stack_slots() which unconditionally set
`at_stack[i] = none`. Change to `at_stack[i] = join(old, none)`
when multiple candidate slots exist (cnt != 1), so that untouched
slots preserve their tracked values.
No veristat diff compared to current master when tested on selftests,
sched_ext, cilium and a set of Meta internal programs.
This addresses issues reported by sashiko for patch #7 in [1].
[1] https://sashiko.dev/#/patchset/20260410-patch-set-v4-0-5d4eecb343db%40gmail.com
Changelog:
v1 -> v2:
- Delete the OFF_IMPRECISE constant and always rely on
arg_track->cnt == 0 as the marker that the offset is imprecise
(Alexei).
- Squash all patches together to simplify backporting to
'bpf' branch (Alexei).
v1: https://lore.kernel.org/bpf/20260413-stacklive-fixes-v1-0-9f48a9999d6e@gmail.com/T/#u
---
Eduard Zingerman (2):
bpf: fix arg tracking for imprecise/multi-offset BPF_ST/STX
selftests/bpf: arg tracking for imprecise/multi-offset BPF_ST/STX
kernel/bpf/liveness.c | 110 ++++++------
.../selftests/bpf/progs/verifier_live_stack.c | 194 +++++++++++++++++++++
2 files changed, 255 insertions(+), 49 deletions(-)
---
base-commit: 71b500afd2f7336f5b6c6026f2af546fc079be26
change-id: 20260413-stacklive-fixes-42e258cf0397
^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH bpf-next v2 2/2] selftests/bpf: arg tracking for imprecise/multi-offset BPF_ST/STX
2026-04-13 21:58 Eduard Zingerman
@ 2026-04-13 21:58 ` Eduard Zingerman
0 siblings, 0 replies; 5+ messages in thread
From: Eduard Zingerman @ 2026-04-13 21:58 UTC (permalink / raw)
To: bpf, ast, andrii; +Cc: daniel, martin.lau, kernel-team, yonghong.song, eddyz87

Add test cases for clear_stack_for_all_offs and dst_is_local_fp
handling of multi-offset and ARG_IMPRECISE stack pointers:

- st_imm_join_with_multi_off: BPF_ST through multi-offset dst should
  join at_stack with none instead of overwriting both candidate
  slots.
- st_imm_join_with_imprecise_off: BPF_ST through offset-imprecise dst
  should join at_stack with none instead of clearing all slots.
- st_imm_join_with_single_off: a canary checking that BPF_ST with a
  known offset overwrites slot instead of joining.
- imprecise_dst_spill_join: BPF_STX through ARG_IMPRECISE dst should
  be recognized as a local spill and join at_stack with the written
  value.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 .../selftests/bpf/progs/verifier_live_stack.c | 194 +++++++++++++++++++++
 1 file changed, 194 insertions(+)

diff --git a/tools/testing/selftests/bpf/progs/verifier_live_stack.c b/tools/testing/selftests/bpf/progs/verifier_live_stack.c
index b7a9fa10e84d..536e214d0376 100644
--- a/tools/testing/selftests/bpf/progs/verifier_live_stack.c
+++ b/tools/testing/selftests/bpf/progs/verifier_live_stack.c
@@ -2647,3 +2647,197 @@ __naked void spill_join_with_imprecise_off(void)
 	"exit;"
 	::: __clobber_all);
 }
+
+/*
+ * Same as spill_join_with_multi_off but the write is BPF_ST (store
+ * immediate) instead of BPF_STX. BPF_ST goes through
+ * clear_stack_for_all_offs() rather than spill_to_stack(), and that
+ * path also needs to join instead of overwriting.
+ *
+ *   fp-8 = &fp-24
+ *   fp-16 = &fp-32
+ *   r1 = fp-8 or fp-16 (two offsets from branch)
+ *   *(u64 *)(r1 + 0) = 0 -- BPF_ST with immediate
+ *   r0 = *(u64 *)(r10 - 16) -- fill from fp-16
+ *   r0 = *(u64 *)(r0 + 0) -- deref: should produce use
+ */
+SEC("socket")
+__log_level(2)
+__failure
+__msg("15: (7a) *(u64 *)(r1 +0) = 0 fp-8: fp0-24 -> fp0-24|fp0+0 fp-16: fp0-32 -> fp0-32|fp0+0")
+__msg("17: (79) r0 = *(u64 *)(r0 +0) ; use: fp0-32")
+__naked void st_imm_join_with_multi_off(void)
+{
+	asm volatile (
+	"*(u64 *)(r10 - 24) = 0;"
+	"*(u64 *)(r10 - 32) = 0;"
+	"r1 = r10;"
+	"r1 += -24;"
+	"*(u64 *)(r10 - 8) = r1;"
+	"r1 = r10;"
+	"r1 += -32;"
+	"*(u64 *)(r10 - 16) = r1;"
+	/* create r1 with two candidate offsets: fp-8 or fp-16 */
+	"call %[bpf_get_prandom_u32];"
+	"if r0 == 0 goto 1f;"
+	"r1 = r10;"
+	"r1 += -8;"
+	"goto 2f;"
+"1:"
+	"r1 = r10;"
+	"r1 += -16;"
+"2:"
+	/* BPF_ST: store immediate through multi-offset r1 */
+	"*(u64 *)(r1 + 0) = 0;"
+	/* read back fp-16 and deref */
+	"r0 = *(u64 *)(r10 - 16);"
+	"r0 = *(u64 *)(r0 + 0);"
+	"r0 = 0;"
+	"exit;"
+	:: __imm(bpf_get_prandom_u32)
+	: __clobber_all);
+}
+
+/*
+ * Check that BPF_ST with a known offset fully overwrites stack slot
+ * from the arg tracking point of view.
+ */
+SEC("socket")
+__log_level(2)
+__success
+__msg("5: (7a) *(u64 *)(r1 +0) = 0 fp-8: fp0-16 -> _{{$}}")
+__naked void st_imm_join_with_single_off(void)
+{
+	asm volatile (
+	"r2 = r10;"
+	"r2 += -16;"
+	"*(u64 *)(r10 - 8) = r2;"
+	"r1 = r10;"
+	"r1 += -8;"
+	"*(u64 *)(r1 + 0) = 0;"
+	"r0 = 0;"
+	"exit;"
+	:: __imm(bpf_get_prandom_u32)
+	: __clobber_all);
+}
+
+/*
+ * Same as spill_join_with_imprecise_off but the write is BPF_ST.
+ * Use "r2 = -8; r1 += r2" to make arg tracking lose offset
+ * precision while the main verifier keeps r1 as fixed-offset.
+ *
+ *   fp-8 = &fp-24
+ *   fp-16 = &fp-32
+ *   r1 = fp-8 (imprecise to arg tracking)
+ *   *(u64 *)(r1 + 0) = 0 -- BPF_ST with immediate
+ *   r0 = *(u64 *)(r10 - 16) -- fill from fp-16
+ *   r0 = *(u64 *)(r0 + 0) -- deref: should produce use
+ */
+SEC("socket")
+__log_level(2)
+__success
+__msg("13: (79) r0 = *(u64 *)(r0 +0) ; use: fp0-32")
+__naked void st_imm_join_with_imprecise_off(void)
+{
+	asm volatile (
+	"*(u64 *)(r10 - 24) = 0;"
+	"*(u64 *)(r10 - 32) = 0;"
+	"r1 = r10;"
+	"r1 += -24;"
+	"*(u64 *)(r10 - 8) = r1;"
+	"r1 = r10;"
+	"r1 += -32;"
+	"*(u64 *)(r10 - 16) = r1;"
+	/* r1 = fp-8 but arg tracking sees off_cnt == 0 */
+	"r1 = r10;"
+	"r2 = -8;"
+	"r1 += r2;"
+	/* store immediate through imprecise r1 */
+	"*(u64 *)(r1 + 0) = 0;"
+	/* read back fp-16 */
+	"r0 = *(u64 *)(r10 - 16);"
+	/* deref: should produce use */
+	"r0 = *(u64 *)(r0 + 0);"
+	"r0 = 0;"
+	"exit;"
+	::: __clobber_all);
+}
+
+/*
+ * Test that spilling through an ARG_IMPRECISE pointer joins with
+ * existing at_stack values. Subprog receives r1 = fp0-24 and
+ * r2 = map_value, creates an ARG_IMPRECISE pointer by joining caller
+ * and callee FP on two branches.
+ *
+ * Setup: callee spills &fp1-16 to fp1-8 (precise, tracked).
+ * Then writes map_value through ARG_IMPRECISE r1 — on path A
+ * this hits fp1-8, on path B it hits caller stack.
+ * Since spill_to_stack is skipped for ARG_IMPRECISE dst,
+ * fp1-8 tracking isn't joined with none.
+ *
+ * Expected after the imprecise write:
+ * - arg tracking should show fp1-8 = fp1-16|fp1+0 (joined with none)
+ * - read from fp1-8 and deref should produce use for fp1-16
+ * - write through it should NOT produce def for fp1-16
+ */
+SEC("socket")
+__log_level(2)
+__success
+__msg("26: (79) r0 = *(u64 *)(r10 -8) // r1=IMP3 r6=fp0-24 r7=fp1-16 fp-8=fp1-16|fp1+0")
+__naked void imprecise_dst_spill_join(void)
+{
+	asm volatile (
+	"*(u64 *)(r10 - 24) = 0;"
+	/* map lookup for a valid non-FP pointer */
+	"*(u32 *)(r10 - 32) = 0;"
+	"r1 = %[map] ll;"
+	"r2 = r10;"
+	"r2 += -32;"
+	"call %[bpf_map_lookup_elem];"
+	"if r0 == 0 goto 1f;"
+	/* r1 = &caller_fp-24, r2 = map_value */
+	"r1 = r10;"
+	"r1 += -24;"
+	"r2 = r0;"
+	"call imprecise_dst_spill_join_sub;"
+"1:"
+	"r0 = 0;"
+	"exit;"
+	:: __imm_addr(map),
+	   __imm(bpf_map_lookup_elem)
+	: __clobber_all);
+}
+
+static __used __naked void imprecise_dst_spill_join_sub(void)
+{
+	asm volatile (
+	/* r6 = &caller_fp-24 (frame=0), r8 = map_value */
+	"r6 = r1;"
+	"r8 = r2;"
+	/* spill &fp1-16 to fp1-8: at_stack[0] = fp1-16 */
+	"*(u64 *)(r10 - 16) = 0;"
+	"r7 = r10;"
+	"r7 += -16;"
+	"*(u64 *)(r10 - 8) = r7;"
+	/* branch to create ARG_IMPRECISE pointer */
+	"call %[bpf_get_prandom_u32];"
+	/* path B: r1 = caller fp-24 (frame=0) */
+	"r1 = r6;"
+	"if r0 == 0 goto 1f;"
+	/* path A: r1 = callee fp-8 (frame=1) */
+	"r1 = r10;"
+	"r1 += -8;"
+"1:"
+	/* r1 = ARG_IMPRECISE{mask=BIT(0)|BIT(1)}.
+	 * Write map_value (non-FP) through r1. On path A this overwrites fp1-8.
+	 * Should join at_stack[0] with none: fp1-16|fp1+0.
+	 */
+	"*(u64 *)(r1 + 0) = r8;"
+	/* read fp1-8: should be fp1-16|fp1+0 (joined) */
+	"r0 = *(u64 *)(r10 - 8);"
+	"*(u64 *)(r0 + 0) = 42;"
+	"r0 = 0;"
+	"exit;"
+	:: __imm(bpf_get_prandom_u32)
+	: __clobber_all);
+}
-- 
2.53.0

^ permalink raw reply related [flat|nested] 5+ messages in thread