public inbox for bpf@vger.kernel.org
* [PATCH bpf-next v2 0/2] bpf: arg tracking for imprecise/multi-offset BPF_ST/STX
@ 2026-04-13 21:58 Eduard Zingerman
  2026-04-13 21:58 ` [PATCH bpf-next v2 1/2] bpf: fix " Eduard Zingerman
  2026-04-13 21:58 ` [PATCH bpf-next v2 2/2] selftests/bpf: " Eduard Zingerman
  0 siblings, 2 replies; 9+ messages in thread
From: Eduard Zingerman @ 2026-04-13 21:58 UTC (permalink / raw)
  To: bpf, ast, andrii; +Cc: daniel, martin.lau, kernel-team, yonghong.song, eddyz87

When the static arg tracking analysis encounters a store through a
pointer with imprecise or multi-offset destination, it must use weak
updates (join) instead of strong updates (overwrite) for the affected
at_stack slots. At runtime only one slot is actually written; the
others retain their old values.

Two cases are addressed:
- BPF_STX, handled by spill_to_stack(). It was gated on
  `dst_is_local_fp = (frame == depth)`, which missed ARG_IMPRECISE
  pointers entirely.
- BPF_ST, handled by clear_stack_for_all_offs(). It delegates to
  clear_overlapping_stack_slots() which unconditionally set
  `at_stack[i] = none`. Change to `at_stack[i] = join(old, none)`
  when multiple candidate slots exist (cnt != 1), so that untouched
  slots preserve their tracked values.
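
The difference between the two kinds of update can be sketched in a
small userspace model. This is only an illustration under stated
assumptions: toy_val, toy_join() and toy_clear() are hypothetical
names, and the bitmask lattice is a stand-in, not the kernel's
struct arg_track or __arg_track_join(). Only the overlap test and the
cnt convention follow the patch.

```c
#include <assert.h>

#define TOY_SLOTS 4	/* toy model: fp-8, fp-16, fp-24, fp-32 */

/* Toy abstract value: bitmask of possible contents, join is set union. */
typedef unsigned int toy_val;

#define TOY_NONE  0x1u	/* "not a tracked pointer" */
#define TOY_PTR_A 0x2u	/* some tracked pointer, e.g. &fp-24 */
#define TOY_PTR_B 0x4u	/* another tracked pointer, e.g. &fp-32 */

static toy_val toy_join(toy_val a, toy_val b)
{
	return a | b;
}

/*
 * Model a store of TOY_NONE to bytes [off, off+sz), off being a negative
 * FP-relative offset. cnt is the number of candidate offsets of the
 * destination pointer:
 *   cnt == 0: offset unknown, weak update of every slot;
 *   cnt == 1: single known offset, strong update of overlapping slots;
 *   cnt  > 1: several candidates, weak update of overlapping slots.
 */
static void toy_clear(toy_val *slots, int off, int sz, int cnt)
{
	for (int i = 0; i < TOY_SLOTS; i++) {
		int slot_start = -((i + 1) * 8);
		int slot_end = slot_start + 8;
		int overlaps = slot_start < off + sz && slot_end > off;

		if (cnt != 0 && !overlaps)
			continue;
		if (cnt == 1)
			slots[i] = TOY_NONE;
		else
			slots[i] = toy_join(slots[i], TOY_NONE);
	}
}

/* Store through a pointer that may be fp-8 or fp-16 (two candidates). */
static void toy_multi_off_store(toy_val *slots)
{
	toy_clear(slots, -8, 8, 2);
	toy_clear(slots, -16, 8, 2);
}
```

In this model, a multi-offset store leaves both candidate slots holding
"old value or none", while a store with a single known offset replaces
the slot outright.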

No veristat diff compared to current master when tested on selftests,
sched_ext, cilium and a set of Meta internal programs.

This addresses issues reported by sashiko for patch #7 in [1].

[1] https://sashiko.dev/#/patchset/20260410-patch-set-v4-0-5d4eecb343db%40gmail.com

Changelog:
v1 -> v2:
- Delete the OFF_IMPRECISE constant, always rely on
  arg_track->off_cnt == 0 as a marker that the offset is imprecise
  (Alexei).
- Squash all patches together to simplify backporting to
  'bpf' branch (Alexei).

v1: https://lore.kernel.org/bpf/20260413-stacklive-fixes-v1-0-9f48a9999d6e@gmail.com/T/#u
---
Eduard Zingerman (2):
      bpf: fix arg tracking for imprecise/multi-offset BPF_ST/STX
      selftests/bpf: arg tracking for imprecise/multi-offset BPF_ST/STX

 kernel/bpf/liveness.c                              | 110 ++++++------
 .../selftests/bpf/progs/verifier_live_stack.c      | 194 +++++++++++++++++++++
 2 files changed, 255 insertions(+), 49 deletions(-)
---
base-commit: 71b500afd2f7336f5b6c6026f2af546fc079be26
change-id: 20260413-stacklive-fixes-42e258cf0397

* [PATCH bpf-next v2 1/2] bpf: fix arg tracking for imprecise/multi-offset BPF_ST/STX
  2026-04-13 21:58 [PATCH bpf-next v2 0/2] bpf: arg tracking for imprecise/multi-offset BPF_ST/STX Eduard Zingerman
@ 2026-04-13 21:58 ` Eduard Zingerman
  2026-04-13 22:35   ` bot+bpf-ci
  2026-04-13 22:42   ` Alexei Starovoitov
  2026-04-13 21:58 ` [PATCH bpf-next v2 2/2] selftests/bpf: " Eduard Zingerman
  1 sibling, 2 replies; 9+ messages in thread
From: Eduard Zingerman @ 2026-04-13 21:58 UTC (permalink / raw)
  To: bpf, ast, andrii; +Cc: daniel, martin.lau, kernel-team, yonghong.song, eddyz87

BPF_STX through ARG_IMPRECISE dst should be recognized as a local
spill and join at_stack with the written value. For example,
consider the following situation:

   // r1 = ARG_IMPRECISE{mask=BIT(0)|BIT(1)}
   *(u64 *)(r1 + 0) = r8

Here the analysis should produce an equivalent of

  at_stack[*] = join(old, r8)

BPF_ST through multi-offset or imprecise dst should join at_stack with
none instead of overwriting the slots. For example, consider the
following situation:

   // r1 = ARG_IMPRECISE{mask=BIT(0)|BIT(1)}
   *(u64 *)(r1 + 0) = 0

Here the analysis should produce an equivalent of

  at_stack[*r1] = join(old, none).

Move the definition of clear_overlapping_stack_slots() so that
__arg_track_join() is visible to it. Remove the OFF_IMPRECISE constant
to avoid having two ways to express an imprecise offset. Only
'offset-imprecise {frame=N, cnt=0}' remains.
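
As a rough illustration of why the old gate missed such stores, the
following userspace sketch compares the previous destination check with
the new one. struct toy_track and the toy_* helpers are hypothetical
stand-ins for struct arg_track and can_be_local_fp(); only the shape of
the two predicates is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_ARG_IMPRECISE (-3)	/* same value as ARG_IMPRECISE in the patch */
#define TOY_BIT(n) (1u << (n))

/* Hypothetical, trimmed-down stand-in for struct arg_track. */
struct toy_track {
	int frame;		/* frame index, or TOY_ARG_IMPRECISE */
	unsigned int mask;	/* candidate frames when frame is imprecise */
};

/* Gate used before the patch for BPF_ST/BPF_STX destinations. */
static bool toy_old_gate(int depth, bool is_fp_reg, const struct toy_track *at)
{
	return is_fp_reg || at->frame == depth;
}

/* Gate after the patch, following the shape of can_be_local_fp(). */
static bool toy_new_gate(int depth, bool is_fp_reg, const struct toy_track *at)
{
	return is_fp_reg || at->frame == depth ||
	       (at->frame == TOY_ARG_IMPRECISE && (at->mask & TOY_BIT(depth)));
}
```

For r1 = ARG_IMPRECISE{mask=BIT(0)|BIT(1)} and depth 1, the old gate
rejects the store while the new one accepts it; at depth 2 both reject
it, because frame 2 is not among the candidates.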

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 kernel/bpf/liveness.c | 110 ++++++++++++++++++++++++++++----------------------
 1 file changed, 61 insertions(+), 49 deletions(-)

diff --git a/kernel/bpf/liveness.c b/kernel/bpf/liveness.c
index 1fb4c511db5a..23d19147f123 100644
--- a/kernel/bpf/liveness.c
+++ b/kernel/bpf/liveness.c
@@ -574,7 +574,7 @@ static int print_instances(struct bpf_verifier_env *env)
  *
  *   precise {frame=N, off=V}      -- known absolute frame index and byte offset
  *        |
- *   offset-imprecise {frame=N, off=OFF_IMPRECISE}
+ *   offset-imprecise {frame=N, cnt=0}
  *        |                        -- known frame identity, unknown offset
  *   fully-imprecise {frame=ARG_IMPRECISE, mask=bitmask}
  *                                 -- unknown frame identity; .mask is a
@@ -607,8 +607,6 @@ enum arg_track_state {
 	ARG_IMPRECISE	= -3,	/* lost identity; .mask is arg bitmask */
 };
 
-#define OFF_IMPRECISE	S16_MIN	/* arg identity known but offset unknown */
-
 /* Track callee stack slots fp-8 through fp-512 (64 slots of 8 bytes each) */
 #define MAX_ARG_SPILL_SLOTS 64
 
@@ -622,28 +620,6 @@ static bool arg_is_fp(const struct arg_track *at)
 	return at->frame >= 0 || at->frame == ARG_IMPRECISE;
 }
 
-/*
- * Clear all tracked callee stack slots overlapping the byte range
- * [off, off+sz-1] where off is a negative FP-relative offset.
- */
-static void clear_overlapping_stack_slots(struct arg_track *at_stack, s16 off, u32 sz)
-{
-	struct arg_track none = { .frame = ARG_NONE };
-
-	if (off == OFF_IMPRECISE) {
-		for (int i = 0; i < MAX_ARG_SPILL_SLOTS; i++)
-			at_stack[i] = none;
-		return;
-	}
-	for (int i = 0; i < MAX_ARG_SPILL_SLOTS; i++) {
-		int slot_start = -((i + 1) * 8);
-		int slot_end = slot_start + 8;
-
-		if (slot_start < off + (int)sz && slot_end > off)
-			at_stack[i] = none;
-	}
-}
-
 static void verbose_arg_track(struct bpf_verifier_env *env, struct arg_track *at)
 {
 	int i;
@@ -863,16 +839,15 @@ static void arg_track_alu64(struct arg_track *dst, const struct arg_track *src)
 	*dst = arg_join_imprecise(*dst, *src);
 }
 
-static s16 arg_add(s16 off, s64 delta)
+static bool arg_add(s16 off, s64 delta, s16 *out)
 {
 	s64 res;
 
-	if (off == OFF_IMPRECISE)
-		return OFF_IMPRECISE;
 	res = (s64)off + delta;
-	if (res < S16_MIN + 1 || res > S16_MAX)
-		return OFF_IMPRECISE;
-	return res;
+	if (res < S16_MIN || res > S16_MAX)
+		return true;
+	*out = res;
+	return false;
 }
 
 static void arg_padd(struct arg_track *at, s64 delta)
@@ -882,9 +857,9 @@ static void arg_padd(struct arg_track *at, s64 delta)
 	if (at->off_cnt == 0)
 		return;
 	for (i = 0; i < at->off_cnt; i++) {
-		s16 new_off = arg_add(at->off[i], delta);
+		s16 new_off;
 
-		if (new_off == OFF_IMPRECISE) {
+		if (arg_add(at->off[i], delta, &new_off)) {
 			at->off_cnt = 0;
 			return;
 		}
@@ -899,8 +874,6 @@ static void arg_padd(struct arg_track *at, s64 delta)
  */
 static int fp_off_to_slot(s16 off)
 {
-	if (off == OFF_IMPRECISE)
-		return -1;
 	if (off >= 0 || off < -(int)(MAX_ARG_SPILL_SLOTS * 8))
 		return -1;
 	if (off % 8)
@@ -930,9 +903,11 @@ static struct arg_track fill_from_stack(struct bpf_insn *insn,
 		return imp;
 
 	for (i = 0; i < cnt; i++) {
-		s16 fp_off = arg_add(at_out[reg].off[i], insn->off);
-		int slot = fp_off_to_slot(fp_off);
+		s16 fp_off, slot;
 
+		if (arg_add(at_out[reg].off[i], insn->off, &fp_off))
+			return imp;
+		slot = fp_off_to_slot(fp_off);
 		if (slot < 0)
 			return imp;
 		result = __arg_track_join(result, at_stack_out[slot]);
@@ -968,9 +943,12 @@ static void spill_to_stack(struct bpf_insn *insn, struct arg_track *at_out,
 		return;
 	}
 	for (i = 0; i < cnt; i++) {
-		s16 fp_off = arg_add(at_out[reg].off[i], insn->off);
-		int slot = fp_off_to_slot(fp_off);
+		s16 fp_off;
+		int slot;
 
+		if (arg_add(at_out[reg].off[i], insn->off, &fp_off))
+			continue;
+		slot = fp_off_to_slot(fp_off);
 		if (slot < 0)
 			continue;
 		if (cnt == 1)
@@ -980,6 +958,32 @@ static void spill_to_stack(struct bpf_insn *insn, struct arg_track *at_out,
 	}
 }
 
+/*
+ * Clear all tracked callee stack slots overlapping the byte range
+ * [off, off+sz-1] where off is a negative FP-relative offset.
+ */
+static void clear_overlapping_stack_slots(struct arg_track *at_stack, s16 off, u32 sz, int cnt)
+{
+	struct arg_track none = { .frame = ARG_NONE };
+
+	if (cnt == 0) {
+		for (int i = 0; i < MAX_ARG_SPILL_SLOTS; i++)
+			at_stack[i] = __arg_track_join(at_stack[i], none);
+		return;
+	}
+	for (int i = 0; i < MAX_ARG_SPILL_SLOTS; i++) {
+		int slot_start = -((i + 1) * 8);
+		int slot_end = slot_start + 8;
+
+		if (slot_start < off + (int)sz && slot_end > off) {
+			if (cnt == 1)
+				at_stack[i] = none;
+			else
+				at_stack[i] = __arg_track_join(at_stack[i], none);
+		}
+	}
+}
+
 /*
  * Clear stack slots overlapping all possible FP offsets in @reg.
  */
@@ -990,18 +994,22 @@ static void clear_stack_for_all_offs(struct bpf_insn *insn,
 	int cnt, i;
 
 	if (reg == BPF_REG_FP) {
-		clear_overlapping_stack_slots(at_stack_out, insn->off, sz);
+		clear_overlapping_stack_slots(at_stack_out, insn->off, sz, 1);
 		return;
 	}
 	cnt = at_out[reg].off_cnt;
 	if (cnt == 0) {
-		clear_overlapping_stack_slots(at_stack_out, OFF_IMPRECISE, sz);
+		clear_overlapping_stack_slots(at_stack_out, 0, sz, cnt);
 		return;
 	}
 	for (i = 0; i < cnt; i++) {
-		s16 fp_off = arg_add(at_out[reg].off[i], insn->off);
+		s16 fp_off;
 
-		clear_overlapping_stack_slots(at_stack_out, fp_off, sz);
+		if (arg_add(at_out[reg].off[i], insn->off, &fp_off)) {
+			clear_overlapping_stack_slots(at_stack_out, 0, sz, 0);
+			break;
+		}
+		clear_overlapping_stack_slots(at_stack_out, fp_off, sz, cnt);
 	}
 }
 
@@ -1042,6 +1050,12 @@ static void arg_track_log(struct bpf_verifier_env *env, struct bpf_insn *insn, i
 		verbose(env, "\n");
 }
 
+static bool can_be_local_fp(int depth, int regno, struct arg_track *at)
+{
+	return regno == BPF_REG_FP || at->frame == depth ||
+	       (at->frame == ARG_IMPRECISE && (at->mask & BIT(depth)));
+}
+
 /*
  * Pure dataflow transfer function for arg_track state.
  * Updates at_out[] based on how the instruction modifies registers.
@@ -1111,8 +1125,7 @@ static void arg_track_xfer(struct bpf_verifier_env *env, struct bpf_insn *insn,
 			at_out[r] = none;
 	} else if (class == BPF_LDX) {
 		u32 sz = bpf_size_to_bytes(BPF_SIZE(insn->code));
-		bool src_is_local_fp = insn->src_reg == BPF_REG_FP || src->frame == depth ||
-				       (src->frame == ARG_IMPRECISE && (src->mask & BIT(depth)));
+		bool src_is_local_fp = can_be_local_fp(depth, insn->src_reg, src);
 
 		/*
 		 * Reload from callee stack: if src is current-frame FP-derived
@@ -1147,7 +1160,7 @@ static void arg_track_xfer(struct bpf_verifier_env *env, struct bpf_insn *insn,
 		bool dst_is_local_fp;
 
 		/* Track spills to current-frame FP-derived callee stack */
-		dst_is_local_fp = insn->dst_reg == BPF_REG_FP || dst->frame == depth;
+		dst_is_local_fp = can_be_local_fp(depth, insn->dst_reg, dst);
 		if (dst_is_local_fp && BPF_MODE(insn->code) == BPF_MEM)
 			spill_to_stack(insn, at_out, insn->dst_reg,
 				       at_stack_out, src, sz);
@@ -1166,7 +1179,7 @@ static void arg_track_xfer(struct bpf_verifier_env *env, struct bpf_insn *insn,
 		}
 	} else if (class == BPF_ST && BPF_MODE(insn->code) == BPF_MEM) {
 		u32 sz = bpf_size_to_bytes(BPF_SIZE(insn->code));
-		bool dst_is_local_fp = insn->dst_reg == BPF_REG_FP || dst->frame == depth;
+		bool dst_is_local_fp = can_be_local_fp(depth, insn->dst_reg, dst);
 
 		/* BPF_ST to FP-derived dst: clear overlapping stack slots */
 		if (dst_is_local_fp)
@@ -1316,8 +1329,7 @@ static int record_load_store_access(struct bpf_verifier_env *env,
 		resolved.off_cnt = ptr->off_cnt;
 		resolved.frame = ptr->frame;
 		for (oi = 0; oi < ptr->off_cnt; oi++) {
-			resolved.off[oi] = arg_add(ptr->off[oi], insn->off);
-			if (resolved.off[oi] == OFF_IMPRECISE) {
+			if (arg_add(ptr->off[oi], insn->off, &resolved.off[oi])) {
 				resolved.off_cnt = 0;
 				break;
 			}

-- 
2.53.0

* [PATCH bpf-next v2 2/2] selftests/bpf: arg tracking for imprecise/multi-offset BPF_ST/STX
  2026-04-13 21:58 [PATCH bpf-next v2 0/2] bpf: arg tracking for imprecise/multi-offset BPF_ST/STX Eduard Zingerman
  2026-04-13 21:58 ` [PATCH bpf-next v2 1/2] bpf: fix " Eduard Zingerman
@ 2026-04-13 21:58 ` Eduard Zingerman
  1 sibling, 0 replies; 9+ messages in thread
From: Eduard Zingerman @ 2026-04-13 21:58 UTC (permalink / raw)
  To: bpf, ast, andrii; +Cc: daniel, martin.lau, kernel-team, yonghong.song, eddyz87

Add test cases for clear_stack_for_all_offs and dst_is_local_fp
handling of multi-offset and ARG_IMPRECISE stack pointers:

- st_imm_join_with_multi_off: BPF_ST through multi-offset dst should
  join at_stack with none instead of overwriting both candidate slots.
- st_imm_join_with_imprecise_off: BPF_ST through offset-imprecise dst
  should join at_stack with none instead of clearing all slots.
- st_imm_join_with_single_off: a canary checking that BPF_ST with a
  known offset overwrites the slot instead of joining.
- imprecise_dst_spill_join: BPF_STX through ARG_IMPRECISE dst should
  be recognized as a local spill and join at_stack with the written
  value.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 .../selftests/bpf/progs/verifier_live_stack.c      | 194 +++++++++++++++++++++
 1 file changed, 194 insertions(+)

diff --git a/tools/testing/selftests/bpf/progs/verifier_live_stack.c b/tools/testing/selftests/bpf/progs/verifier_live_stack.c
index b7a9fa10e84d..536e214d0376 100644
--- a/tools/testing/selftests/bpf/progs/verifier_live_stack.c
+++ b/tools/testing/selftests/bpf/progs/verifier_live_stack.c
@@ -2647,3 +2647,197 @@ __naked void spill_join_with_imprecise_off(void)
 	"exit;"
 	::: __clobber_all);
 }
+
+/*
+ * Same as spill_join_with_multi_off but the write is BPF_ST (store
+ * immediate) instead of BPF_STX. BPF_ST goes through
+ * clear_stack_for_all_offs() rather than spill_to_stack(), and that
+ * path also needs to join instead of overwriting.
+ *
+ *   fp-8  = &fp-24
+ *   fp-16 = &fp-32
+ *   r1 = fp-8 or fp-16 (two offsets from branch)
+ *   *(u64 *)(r1 + 0) = 0        -- BPF_ST with immediate
+ *   r0 = *(u64 *)(r10 - 16)     -- fill from fp-16
+ *   r0 = *(u64 *)(r0 + 0)       -- deref: should produce use
+ */
+SEC("socket")
+__log_level(2)
+__failure
+__msg("15: (7a) *(u64 *)(r1 +0) = 0	fp-8: fp0-24 -> fp0-24|fp0+0	fp-16: fp0-32 -> fp0-32|fp0+0")
+__msg("17: (79) r0 = *(u64 *)(r0 +0)         ; use: fp0-32")
+__naked void st_imm_join_with_multi_off(void)
+{
+	asm volatile (
+	"*(u64 *)(r10 - 24) = 0;"
+	"*(u64 *)(r10 - 32) = 0;"
+	"r1 = r10;"
+	"r1 += -24;"
+	"*(u64 *)(r10 - 8) = r1;"
+	"r1 = r10;"
+	"r1 += -32;"
+	"*(u64 *)(r10 - 16) = r1;"
+	/* create r1 with two candidate offsets: fp-8 or fp-16 */
+	"call %[bpf_get_prandom_u32];"
+	"if r0 == 0 goto 1f;"
+	"r1 = r10;"
+	"r1 += -8;"
+	"goto 2f;"
+"1:"
+	"r1 = r10;"
+	"r1 += -16;"
+"2:"
+	/* BPF_ST: store immediate through multi-offset r1 */
+	"*(u64 *)(r1 + 0) = 0;"
+	/* read back fp-16 and deref */
+	"r0 = *(u64 *)(r10 - 16);"
+	"r0 = *(u64 *)(r0 + 0);"
+	"r0 = 0;"
+	"exit;"
+	:: __imm(bpf_get_prandom_u32)
+	: __clobber_all);
+}
+
+/*
+ * Check that BPF_ST with a known offset fully overwrites stack slot
+ * from the arg tracking point of view.
+ */
+SEC("socket")
+__log_level(2)
+__success
+__msg("5: (7a) *(u64 *)(r1 +0) = 0	fp-8: fp0-16 -> _{{$}}")
+__naked void st_imm_join_with_single_off(void)
+{
+	asm volatile (
+	"r2 = r10;"
+	"r2 += -16;"
+	"*(u64 *)(r10 - 8) = r2;"
+	"r1 = r10;"
+	"r1 += -8;"
+	"*(u64 *)(r1 + 0) = 0;"
+	"r0 = 0;"
+	"exit;"
+	:: __imm(bpf_get_prandom_u32)
+	: __clobber_all);
+}
+
+/*
+ * Same as spill_join_with_imprecise_off but the write is BPF_ST.
+ * Use "r2 = -8; r1 += r2" to make arg tracking lose offset
+ * precision while the main verifier keeps r1 as fixed-offset.
+ *
+ *   fp-8  = &fp-24
+ *   fp-16 = &fp-32
+ *   r1 = fp-8 (imprecise to arg tracking)
+ *   *(u64 *)(r1 + 0) = 0        -- BPF_ST with immediate
+ *   r0 = *(u64 *)(r10 - 16)     -- fill from fp-16
+ *   r0 = *(u64 *)(r0 + 0)       -- deref: should produce use
+ */
+SEC("socket")
+__log_level(2)
+__success
+__msg("13: (79) r0 = *(u64 *)(r0 +0)         ; use: fp0-32")
+__naked void st_imm_join_with_imprecise_off(void)
+{
+	asm volatile (
+	"*(u64 *)(r10 - 24) = 0;"
+	"*(u64 *)(r10 - 32) = 0;"
+	"r1 = r10;"
+	"r1 += -24;"
+	"*(u64 *)(r10 - 8) = r1;"
+	"r1 = r10;"
+	"r1 += -32;"
+	"*(u64 *)(r10 - 16) = r1;"
+	/* r1 = fp-8 but arg tracking sees off_cnt == 0 */
+	"r1 = r10;"
+	"r2 = -8;"
+	"r1 += r2;"
+	/* store immediate through imprecise r1 */
+	"*(u64 *)(r1 + 0) = 0;"
+	/* read back fp-16 */
+	"r0 = *(u64 *)(r10 - 16);"
+	/* deref: should produce use */
+	"r0 = *(u64 *)(r0 + 0);"
+	"r0 = 0;"
+	"exit;"
+	::: __clobber_all);
+}
+
+/*
+ * Test that spilling through an ARG_IMPRECISE pointer joins with
+ * existing at_stack values. Subprog receives r1 = fp0-24 and
+ * r2 = map_value, creates an ARG_IMPRECISE pointer by joining caller
+ * and callee FP on two branches.
+ *
+ * Setup: callee spills &fp1-16 to fp1-8 (precise, tracked).
+ * Then writes map_value through ARG_IMPRECISE r1 — on path A
+ * this hits fp1-8, on path B it hits caller stack.
+ * Since spill_to_stack is skipped for ARG_IMPRECISE dst,
+ * fp1-8 tracking isn't joined with none.
+ *
+ * Expected after the imprecise write:
+ * - arg tracking should show fp1-8 = fp1-16|fp1+0 (joined with none)
+ * - read from fp1-8 and deref should produce use for fp1-16
+ * - write through it should NOT produce def for fp1-16
+ */
+SEC("socket")
+__log_level(2)
+__success
+__msg("26: (79) r0 = *(u64 *)(r10 -8) // r1=IMP3 r6=fp0-24 r7=fp1-16 fp-8=fp1-16|fp1+0")
+__naked void imprecise_dst_spill_join(void)
+{
+	asm volatile (
+	"*(u64 *)(r10 - 24) = 0;"
+	/* map lookup for a valid non-FP pointer */
+	"*(u32 *)(r10 - 32) = 0;"
+	"r1 = %[map] ll;"
+	"r2 = r10;"
+	"r2 += -32;"
+	"call %[bpf_map_lookup_elem];"
+	"if r0 == 0 goto 1f;"
+	/* r1 = &caller_fp-24, r2 = map_value */
+	"r1 = r10;"
+	"r1 += -24;"
+	"r2 = r0;"
+	"call imprecise_dst_spill_join_sub;"
+"1:"
+	"r0 = 0;"
+	"exit;"
+	:: __imm_addr(map),
+	   __imm(bpf_map_lookup_elem)
+	: __clobber_all);
+}
+
+static __used __naked void imprecise_dst_spill_join_sub(void)
+{
+	asm volatile (
+	/* r6 = &caller_fp-24 (frame=0), r8 = map_value */
+	"r6 = r1;"
+	"r8 = r2;"
+	/* spill &fp1-16 to fp1-8: at_stack[0] = fp1-16 */
+	"*(u64 *)(r10 - 16) = 0;"
+	"r7 = r10;"
+	"r7 += -16;"
+	"*(u64 *)(r10 - 8) = r7;"
+	/* branch to create ARG_IMPRECISE pointer */
+	"call %[bpf_get_prandom_u32];"
+	/* path B: r1 = caller fp-24 (frame=0) */
+	"r1 = r6;"
+	"if r0 == 0 goto 1f;"
+	/* path A: r1 = callee fp-8 (frame=1) */
+	"r1 = r10;"
+	"r1 += -8;"
+"1:"
+	/* r1 = ARG_IMPRECISE{mask=BIT(0)|BIT(1)}.
+	 * Write map_value (non-FP) through r1. On path A this overwrites fp1-8.
+	 * Should join at_stack[0] with none: fp1-16|fp1+0.
+	 */
+	"*(u64 *)(r1 + 0) = r8;"
+	/* read fp1-8: should be fp1-16|fp1+0 (joined) */
+	"r0 = *(u64 *)(r10 - 8);"
+	"*(u64 *)(r0 + 0) = 42;"
+	"r0 = 0;"
+	"exit;"
+	:: __imm(bpf_get_prandom_u32)
+	: __clobber_all);
+}

-- 
2.53.0

* Re: [PATCH bpf-next v2 1/2] bpf: fix arg tracking for imprecise/multi-offset BPF_ST/STX
  2026-04-13 21:58 ` [PATCH bpf-next v2 1/2] bpf: fix " Eduard Zingerman
@ 2026-04-13 22:35   ` bot+bpf-ci
  2026-04-13 22:39     ` Eduard Zingerman
  2026-04-13 22:42   ` Alexei Starovoitov
  1 sibling, 1 reply; 9+ messages in thread
From: bot+bpf-ci @ 2026-04-13 22:35 UTC (permalink / raw)
  To: eddyz87, bpf, ast, andrii
  Cc: daniel, martin.lau, kernel-team, yonghong.song, eddyz87, ast,
	andrii, daniel, martin.lau, eddyz87, yonghong.song, clm,
	ihor.solodrai

commit 944d3bf7ec135889f575f3b8ae4612684896737f
Author: Eduard Zingerman <eddyz87@gmail.com>

bpf: fix arg tracking for imprecise/multi-offset BPF_ST/STX

This commit fixes the arg tracking dataflow analysis to use weak
updates (join) instead of strong updates (overwrite) when storing
through imprecise or multi-offset pointers, and extends dst_is_local_fp
to recognize ARG_IMPRECISE destinations for BPF_STX and BPF_ST.

This appears to fix bugs introduced in bf0c571f7feb ("bpf: introduce
forward arg-tracking dataflow analysis"). Should this commit include a
Fixes: tag?

Fixes: bf0c571f7feb ("bpf: introduce forward arg-tracking dataflow analysis")


---
AI reviewed your patch. Please fix the bug or reply by email explaining why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md

CI run summary: https://github.com/kernel-patches/bpf/actions/runs/24369343210

* Re: [PATCH bpf-next v2 1/2] bpf: fix arg tracking for imprecise/multi-offset BPF_ST/STX
  2026-04-13 22:35   ` bot+bpf-ci
@ 2026-04-13 22:39     ` Eduard Zingerman
  0 siblings, 0 replies; 9+ messages in thread
From: Eduard Zingerman @ 2026-04-13 22:39 UTC (permalink / raw)
  To: bot+bpf-ci, bpf, ast, andrii
  Cc: daniel, martin.lau, kernel-team, yonghong.song, martin.lau, clm,
	ihor.solodrai

On Mon, 2026-04-13 at 22:35 +0000, bot+bpf-ci@kernel.org wrote:
> commit 944d3bf7ec135889f575f3b8ae4612684896737f
> Author: Eduard Zingerman <eddyz87@gmail.com>
> 
> bpf: fix arg tracking for imprecise/multi-offset BPF_ST/STX
> 
> This commit fixes the arg tracking dataflow analysis to use weak
> updates (join) instead of strong updates (overwrite) when storing
> through imprecise or multi-offset pointers, and extends dst_is_local_fp
> to recognize ARG_IMPRECISE destinations for BPF_STX and BPF_ST.
> 
> This appears to fix bugs introduced in bf0c571f7feb ("bpf: introduce
> forward arg-tracking dataflow analysis"). Should this commit include a
> Fixes: tag?
> 
> Fixes: bf0c571f7feb ("bpf: introduce forward arg-tracking dataflow analysis")

The suggested Fixes tag is correct.

[...]

* Re: [PATCH bpf-next v2 1/2] bpf: fix arg tracking for imprecise/multi-offset BPF_ST/STX
  2026-04-13 21:58 ` [PATCH bpf-next v2 1/2] bpf: fix " Eduard Zingerman
  2026-04-13 22:35   ` bot+bpf-ci
@ 2026-04-13 22:42   ` Alexei Starovoitov
  2026-04-13 22:57     ` Eduard Zingerman
  1 sibling, 1 reply; 9+ messages in thread
From: Alexei Starovoitov @ 2026-04-13 22:42 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Kernel Team, Yonghong Song

On Mon, Apr 13, 2026 at 2:58 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> [...]
>
> @@ -863,16 +839,15 @@ static void arg_track_alu64(struct arg_track *dst, const struct arg_track *src)
>         *dst = arg_join_imprecise(*dst, *src);
>  }
>
> -static s16 arg_add(s16 off, s64 delta)
> +static bool arg_add(s16 off, s64 delta, s16 *out)
>  {
>         s64 res;
>
> -       if (off == OFF_IMPRECISE)
> -               return OFF_IMPRECISE;
>         res = (s64)off + delta;
> -       if (res < S16_MIN + 1 || res > S16_MAX)
> -               return OFF_IMPRECISE;
> -       return res;
> +       if (res < S16_MIN || res > S16_MAX)
> +               return true;
> +       *out = res;
> +       return false;

Nice. It's almost check_add_overflow().
Maybe something like:
  s16 d = delta;
  if (d != delta)
    return true;
  return check_add_overflow(off, d, out);

I'm curious whether asm will be better.

* Re: [PATCH bpf-next v2 1/2] bpf: fix arg tracking for imprecise/multi-offset BPF_ST/STX
  2026-04-13 22:42   ` Alexei Starovoitov
@ 2026-04-13 22:57     ` Eduard Zingerman
  0 siblings, 0 replies; 9+ messages in thread
From: Eduard Zingerman @ 2026-04-13 22:57 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Kernel Team, Yonghong Song

On Mon, 2026-04-13 at 15:42 -0700, Alexei Starovoitov wrote:
> On Mon, Apr 13, 2026 at 2:58 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > [...]
> >
> > @@ -863,16 +839,15 @@ static void arg_track_alu64(struct arg_track *dst, const struct arg_track *src)
> >         *dst = arg_join_imprecise(*dst, *src);
> >  }
> > 
> > -static s16 arg_add(s16 off, s64 delta)
> > +static bool arg_add(s16 off, s64 delta, s16 *out)
> >  {
> >         s64 res;
> > 
> > -       if (off == OFF_IMPRECISE)
> > -               return OFF_IMPRECISE;
> >         res = (s64)off + delta;
> > -       if (res < S16_MIN + 1 || res > S16_MAX)
> > -               return OFF_IMPRECISE;
> > -       return res;
> > +       if (res < S16_MIN || res > S16_MAX)
> > +               return true;
> > +       *out = res;
> > +       return false;
> 
> Nice. It's almost check_add_overflow().
> Maybe something like:
>   s16 d = delta;
>   if (d != delta)
>     return true;
>   return check_add_overflow(off, d, out);

I modeled the API after check_add_overflow(), but using it directly
didn't occur to me for some reason.

I don't think the s16 cast is necessary: check_add_overflow() results in
a call to __builtin_add_overflow(), which is documented as:

  > Built-in Function: bool __builtin_add_overflow (type1 a, type2 b, type3 *res)
  >   ...
  >   These built-in functions promote the first two operands into
  >   infinite precision signed type and perform addition on those
  >   promoted operands. The result is then cast to the type the third
  >   pointer argument points to and stored there. If the stored result is
  >   equal to the infinite precision result, the built-in functions
  >   return false, otherwise they return true. As the addition is
  >   performed in infinite signed precision, these built-in functions
  >   have fully defined behavior for all argument values.

> I'm curious whether asm will be better.

Disasm looks identical.
I'll respin with check_add_overflow(), fixes tag and a test nit from sashiko.
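
For reference, a minimal userspace sketch of what arg_add() could look
like once it relies on the builtin. This is not the code from the respin:
the s16/s64 typedefs are local stand-ins for the kernel types, the
assert() is only there for the sketch, and the builtin is called directly
on the assumption that check_add_overflow() forwards its arguments to it
unchanged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the kernel's fixed-width types (assumption). */
typedef int16_t s16;
typedef int64_t s64;

/*
 * Add delta to off and store the sum in *out.
 * Returns true if the exact sum does not fit in s16, false otherwise,
 * the same convention as check_add_overflow().
 */
static bool arg_add(s16 off, s64 delta, s16 *out)
{
	assert(out);
	/*
	 * Both operands are promoted to infinite precision before the
	 * addition, so a 64-bit delta needs no explicit narrowing cast.
	 */
	return __builtin_add_overflow(off, delta, out);
}
```

With this, arg_add(-8, -8, &r) returns false and leaves r == -16, while
arg_add(INT16_MAX, 1, &r) returns true.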

* [PATCH bpf-next v2 0/2] bpf: arg tracking for imprecise/multi-offset BPF_ST/STX
@ 2026-04-13 23:30 Eduard Zingerman
  2026-04-15 16:00 ` patchwork-bot+netdevbpf
  0 siblings, 1 reply; 9+ messages in thread
From: Eduard Zingerman @ 2026-04-13 23:30 UTC (permalink / raw)
  To: bpf, ast, andrii; +Cc: daniel, martin.lau, kernel-team, yonghong.song, eddyz87

When the static arg tracking analysis encounters a store through a
pointer with imprecise or multi-offset destination, it must use weak
updates (join) instead of strong updates (overwrite) for the affected
at_stack slots. At runtime only one slot is actually written; the
others retain their old values.

Two cases are addressed:
- BPF_STX, handled by spill_to_stack(). It was gated on
  `dst_is_local_fp = (frame == depth)`, which missed ARG_IMPRECISE
  pointers entirely.
- BPF_ST, handled by clear_stack_for_all_offs(). It delegates to
  clear_overlapping_stack_slots() which unconditionally set
  `at_stack[i] = none`. Change to `at_stack[i] = join(old, none)`
  when multiple candidate slots exist (cnt != 1), so that untouched
  slots preserve their tracked values.
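
The difference between the two update kinds can be sketched with a toy
model. Everything below is hypothetical and is not the kernel's
arg_track representation: a slot is modeled as a bitmask of the values
it may hold, join is set union, and the candidate slots are passed as an
explicit list. The toy also ignores the fully imprecise case
(cnt == 0 in the real code), where every slot is a candidate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_SLOTS 4

/* Hypothetical abstract value: a bitmask of what a slot may hold. */
enum {
	TOY_NONE = 1u << 0,	/* slot may hold an untracked value */
	TOY_ARG1 = 1u << 1,	/* slot may hold a pointer derived from arg 1 */
};

/* Join is set union: the result covers both inputs. */
static uint8_t toy_join(uint8_t a, uint8_t b)
{
	return a | b;
}

/*
 * Model a store of an untracked value through a pointer that may
 * target any of the cnt slots listed in cand[].
 */
static void toy_store_none(uint8_t *at_stack, const int *cand, size_t cnt)
{
	assert(at_stack && cand);
	for (size_t i = 0; i < cnt; i++) {
		int slot = cand[i];

		assert(slot >= 0 && slot < TOY_SLOTS);
		if (cnt == 1)
			/* Strong update: the slot is definitely overwritten. */
			at_stack[slot] = TOY_NONE;
		else
			/* Weak update: the slot may keep its old value. */
			at_stack[slot] = toy_join(at_stack[slot], TOY_NONE);
	}
}
```

With a single candidate the old value is dropped. With two candidates
each of them ends up as TOY_NONE | TOY_ARG1, which records that the slot
may still hold the argument-derived pointer.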

No veristat diff compared to current master when tested on selftests,
sched_ext, cilium and a set of Meta internal programs.

This addresses issues reported by sashiko for patch #7 in [1].

[1] https://sashiko.dev/#/patchset/20260410-patch-set-v4-0-5d4eecb343db%40gmail.com

Changelog:
v2 -> v3:
- Use check_add_overflow() in arg_add() (Alexei).
- Add missing fixes tag (CI bot).
- Remove unused __imm in the selftest (sashiko).
v1 -> v2:
- Delete the OFF_IMPRECISE constant, always rely on
  arg_track->cnt == 0 as a marker that the offset is imprecise
  (Alexei).
- Squash all patches together to simplify backporting to
  'bpf' branch (Alexei).

v1: https://lore.kernel.org/bpf/20260413-stacklive-fixes-v1-0-9f48a9999d6e@gmail.com/T/
v2: https://lore.kernel.org/bpf/20260413-stacklive-fixes-v2-0-ff91c4f8d273@gmail.com/T/
---
Eduard Zingerman (2):
      bpf: fix arg tracking for imprecise/multi-offset BPF_ST/STX
      selftests/bpf: arg tracking for imprecise/multi-offset BPF_ST/STX

 kernel/bpf/liveness.c                              | 114 ++++++------
 .../selftests/bpf/progs/verifier_live_stack.c      | 193 +++++++++++++++++++++
 2 files changed, 255 insertions(+), 52 deletions(-)
---
base-commit: 71b500afd2f7336f5b6c6026f2af546fc079be26
change-id: 20260413-stacklive-fixes-42e258cf0397

* Re: [PATCH bpf-next v2 0/2] bpf: arg tracking for imprecise/multi-offset BPF_ST/STX
  2026-04-13 23:30 [PATCH bpf-next v2 0/2] bpf: " Eduard Zingerman
@ 2026-04-15 16:00 ` patchwork-bot+netdevbpf
  0 siblings, 0 replies; 9+ messages in thread
From: patchwork-bot+netdevbpf @ 2026-04-15 16:00 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, ast, andrii, daniel, martin.lau, kernel-team, yonghong.song

Hello:

This series was applied to bpf/bpf.git (master)
by Alexei Starovoitov <ast@kernel.org>:

On Mon, 13 Apr 2026 16:30:51 -0700 you wrote:
> When the static arg tracking analysis encounters a store through a
> pointer with imprecise or multi-offset destination, it must use weak
> updates (join) instead of strong updates (overwrite) for the affected
> at_stack slots. At runtime only one slot is actually written; the
> others retain their old values.
> 
> Two cases are addressed:
> - BPF_STX, handled by spill_to_stack(). It was gated on
>   `dst_is_local_fp = (frame == depth)`, which missed ARG_IMPRECISE
>   pointers entirely.
> - BPF_ST, handled by clear_stack_for_all_offs(). It delegates to
>   clear_overlapping_stack_slots() which unconditionally set
>   `at_stack[i] = none`. Change to `at_stack[i] = join(old, none)`
>   when multiple candidate slots exist (cnt != 1), so that untouched
>   slots preserve their tracked values.
> 
> [...]

Here is the summary with links:
  - [bpf-next,v2,1/2] bpf: fix arg tracking for imprecise/multi-offset BPF_ST/STX
    https://git.kernel.org/bpf/bpf/c/ecdd4fd8a54c
  - [bpf-next,v2,2/2] selftests/bpf: arg tracking for imprecise/multi-offset BPF_ST/STX
    https://git.kernel.org/bpf/bpf/c/d97cc8fc997c

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html


