[RFC PATCH 6.1.y 0/2] bpf: backport scalar not-equal tracking fixes

* [RFC PATCH 6.1.y 0/2] bpf: backport scalar not-equal tracking fixes
@ 2026-06-01 18:03 Zhenzhong Wu
  2026-06-01 18:03 ` [RFC PATCH 6.1.y 1/2] bpf: drop knowledge-losing __reg_combine_{32,64}_into_{64,32} logic Zhenzhong Wu
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Zhenzhong Wu @ 2026-06-01 18:03 UTC
  To: bpf
  Cc: netdev, linux-kernel, ast, daniel, john.fastabend, andrii,
	martin.lau, song, yonghong.song, kpsingh, sdf, haoluo, jolsa,
	menglong8.dong, eddyz87, shung-hsi.yu, tamird

Hi BPF maintainers,

This RFC backports two BPF verifier scalar range-tracking fixes to 6.1.y.
The series is intended to fix a verifier state-pruning issue where an
impossible scalar path can be kept while the real success path is pruned.

This is a verifier scalar range-tracking issue, not a helper-specific
issue. The visible failure is that the verifier can prune the real success
continuation, which should not be skipped, and keep only an impossible one.

In the reproducer, the traced function returns 15 at runtime, but the
verifier keeps the path where r7 is treated as 0, hard-wires the opposite
branch, and the program reports the error branch.

The minimized reproducer uses fexit/bpf_get_func_ret only because it
provides a compact way to create the interesting register flow: one scalar
in r0 for the helper status, and another scalar loaded from the stack for
the traced function return value. The issue is not specific to
bpf_get_func_ret() itself.

Because bpf_get_func_ret() was added in v5.17, this particular reproducer
applies directly to 6.1.y. I have not built a 5.15.y-compatible reproducer.

The relevant verifier-log bytecode from the reproducer is below. The later
instructions only store r7 into a map so user space can observe which
branch the verifier kept.

  15: (85) call bpf_get_func_ret#184    ; R0_w=scalar() fp-8_w=mmmmmmmm
  16: (79) r7 = *(u64 *)(r10 -8)        ; R7_w=scalar() R10=fp0
  17: (15) if r0 == 0x0 goto pc+1       ; R0_w=scalar()
  18: (bf) r7 = r0                      ; R0=scalar(id=1) R7=scalar(id=1)
  19: (55) if r0 != 0x0 goto pc+6       ; R0=0
  20: (67) r7 <<= 32                    ; R7_w=0
  21: (77) r7 >>= 32                    ; R7_w=0
  22: (b7) r1 = 1                       ; R1_w=1
  23: (55) if r7 != 0xf goto pc+1

The failure mechanism is:

  1. The program checks "if r0 == 0". The jump target is the success path,
     and the fallthrough path is the failure path and should imply r0 != 0.

  2. On v6.1.91, the verifier does not record that r0 != 0 fact for the
     fallthrough path. The following "r7 = r0" then gives r0 and r7 the
     same scalar id while both are still treated as possibly zero.

  3. At the later "if r0 != 0" check, the verifier still thinks r0 may be
     zero, so it explores the fallthrough path of that JNE. That path means
     r0 == 0, and because r7 shares the same scalar id, r7 is narrowed to
     zero as well. This is an impossible path: it came from the earlier
     failure path that should have implied r0 != 0.

  4. That impossible continuation reaches the return-value comparison with
     r7 == 0 and can make the verifier keep only the wrong branch. When the
     real success path is analyzed later, state pruning considers it safe
     against the earlier cached verifier state, so the real continuation is
     not explored.

The relevant pruning point is that regsafe()/states_equal() accepted the
real success-path state against an earlier cached state where r0 was an
imprecise scalar and r7 constraints were loose enough to cover the current
r7.

After confirming the mechanism, I ran git bisect with this minimized C
reproducer as the test case. The bisect ran from the affected 6.7.y
behavior to the fixed v6.8 behavior, which narrowed the fix to the
v6.7..v6.8 window:

  https://gist.github.com/swananan/165cca6008f6c81870a28aa7a445d5ea

The bisect identified the upstream fix as:

  d028f87517d6775dccff4ddbca2740826f9e53f1
  bpf: make the verifier tracks the "not equal" for regs

For 6.1.y, applying d028f87517d6 alone is not sufficient. The older
verifier code also needs the range-preservation semantics from:

  9e314f5d8682e1fe6ac214fb34580a238b6fd3c4
  bpf: drop knowledge-losing __reg_combine_{32,64}_into_{64,32} logic

Without that semantic prerequisite, the old range-combining logic can still
discard the refined bounds after the verifier learns them.

The 6.1.y adaptation is split as follows:

  - patch 1 carries the 6.1.y-relevant part of 9e314f5d8682 by removing the
    knowledge-losing __reg_combine_{32,64}_into_{64,32} paths and using
    reg_bounds_sync() after conditional refinement;
  - patch 2 carries d028f87517d6 in the older reg_set_min_max() layout. In
    newer kernels, reg_set_min_max() refines the fallthrough branch through
    rev_opcode(opcode), so the fallthrough branch of BPF_JEQ is handled by
    the BPF_JNE refinement. In 6.1.y that split does not exist, so the same
    not-equal fact is expressed directly on BPF_JEQ's false_reg and
    BPF_JNE's true_reg.

Observed results with that reproducer:

  v6.1.91:               REPRO: BAD  (ran=1 error=1)
  v6.7.12:               REPRO: BAD  (ran=1 error=1)
  v6.8:                  REPRO: GOOD (ran=1 error=0)
  v6.1.91 + this series: REPRO: GOOD (ran=1 error=0)

Because this touches shared verifier scalar range logic, I am sending it as
RFC and would appreciate BPF maintainer guidance on whether this 6.1.y
semantic backport should be carried and whether the split in this series is
reasonable. The same issue should also be relevant to 6.6.y, which still
has the older verifier logic and predates the v6.8 fix, but this RFC only
includes the 6.1.y backport.

Zhenzhong Wu (2):
  bpf: drop knowledge-losing __reg_combine_{32,64}_into_{64,32} logic
  bpf: make the verifier tracks the "not equal" for regs

 kernel/bpf/verifier.c | 92 +++++++++++++++++++------------------------
 1 file changed, 40 insertions(+), 52 deletions(-)

base-commit: 228da13e907e2b46b7222cfc35290fbfad920bef
-- 
2.43.0


* [RFC PATCH 6.1.y 1/2] bpf: drop knowledge-losing __reg_combine_{32,64}_into_{64,32} logic
  2026-06-01 18:03 [RFC PATCH 6.1.y 0/2] bpf: backport scalar not-equal tracking fixes Zhenzhong Wu
@ 2026-06-01 18:03 ` Zhenzhong Wu
  2026-06-01 18:04 ` [RFC PATCH 6.1.y 2/2] bpf: make the verifier tracks the "not equal" for regs Zhenzhong Wu
  2026-06-02  5:47 ` [RFC PATCH 6.1.y 0/2] bpf: backport scalar not-equal tracking fixes Shung-Hsi Yu
  2 siblings, 0 replies; 8+ messages in thread
From: Zhenzhong Wu @ 2026-06-01 18:03 UTC
  To: bpf
  Cc: netdev, linux-kernel, ast, daniel, john.fastabend, andrii,
	martin.lau, song, yonghong.song, kpsingh, sdf, haoluo, jolsa,
	menglong8.dong, eddyz87, shung-hsi.yu, tamird

[ Upstream commit 9e314f5d8682e1fe6ac214fb34580a238b6fd3c4 ]

When performing 32-bit conditional operation operating on lower 32 bits
of a full 64-bit register, register full value isn't changed. We just
potentially gain new knowledge about that register's lower 32 bits.

Unfortunately, __reg_combine_{32,64}_into_{64,32} logic that
reg_set_min_max() performs as a last step, can lose information in some
cases due to __mark_reg64_unbounded() and __reg_assign_32_into_64().
That's bad and completely unnecessary. Especially __reg_assign_32_into_64()
looks completely out of place here, because we are not performing
zero-extending subregister assignment during conditional jump.

So this patch replaced __reg_combine_* with just a normal
reg_bounds_sync() which will do a proper job of deriving u64/s64 bounds
from u32/s32, and vice versa (among all other combinations).

__reg_combine_64_into_32() is also used in one more place,
coerce_reg_to_size(), while handling 1- and 2-byte register loads.
Looking into this, it seems like besides marking subregister as
unbounded before performing reg_bounds_sync(), we were also performing
deduction of smin32/smax32 and umin32/umax32 bounds from respective
smin/smax and umin/umax bounds. It's now redundant as reg_bounds_sync()
performs all the same logic more generically (e.g., without unnecessary
assumption that upper 32 bits of full register should be zero).

Long story short, we remove __reg_combine_64_into_32() completely, and
coerce_reg_to_size() now only does resetting subreg to unbounded and then
performing reg_bounds_sync() to recover as much information as possible
from 64-bit umin/umax and smin/smax bounds, set explicitly in
coerce_reg_to_size() earlier.

Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Link: https://lore.kernel.org/r/20231102033759.2541186-10-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
[ zhenzhong: adapt to 6.1.y verifier.c layout ]
Signed-off-by: Zhenzhong Wu <jt26wzz@gmail.com>
---
 kernel/bpf/verifier.c | 60 ++++++-------------------------------------
 1 file changed, 8 insertions(+), 52 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index d8d3616..5e029d1 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -1577,51 +1577,6 @@ static void __reg_assign_32_into_64(struct bpf_reg_state *reg)
 	}
 }
 
-static void __reg_combine_32_into_64(struct bpf_reg_state *reg)
-{
-	/* special case when 64-bit register has upper 32-bit register
-	 * zeroed. Typically happens after zext or <<32, >>32 sequence
-	 * allowing us to use 32-bit bounds directly,
-	 */
-	if (tnum_equals_const(tnum_clear_subreg(reg->var_off), 0)) {
-		__reg_assign_32_into_64(reg);
-	} else {
-		/* Otherwise the best we can do is push lower 32bit known and
-		 * unknown bits into register (var_off set from jmp logic)
-		 * then learn as much as possible from the 64-bit tnum
-		 * known and unknown bits. The previous smin/smax bounds are
-		 * invalid here because of jmp32 compare so mark them unknown
-		 * so they do not impact tnum bounds calculation.
-		 */
-		__mark_reg64_unbounded(reg);
-	}
-	reg_bounds_sync(reg);
-}
-
-static bool __reg64_bound_s32(s64 a)
-{
-	return a >= S32_MIN && a <= S32_MAX;
-}
-
-static bool __reg64_bound_u32(u64 a)
-{
-	return a >= U32_MIN && a <= U32_MAX;
-}
-
-static void __reg_combine_64_into_32(struct bpf_reg_state *reg)
-{
-	__mark_reg32_unbounded(reg);
-	if (__reg64_bound_s32(reg->smin_value) && __reg64_bound_s32(reg->smax_value)) {
-		reg->s32_min_value = (s32)reg->smin_value;
-		reg->s32_max_value = (s32)reg->smax_value;
-	}
-	if (__reg64_bound_u32(reg->umin_value) && __reg64_bound_u32(reg->umax_value)) {
-		reg->u32_min_value = (u32)reg->umin_value;
-		reg->u32_max_value = (u32)reg->umax_value;
-	}
-	reg_bounds_sync(reg);
-}
-
 /* Mark a register as having a completely unknown (scalar) value. */
 static void __mark_reg_unknown(const struct bpf_verifier_env *env,
 			       struct bpf_reg_state *reg)
@@ -4660,9 +4615,10 @@ static void coerce_reg_to_size(struct bpf_reg_state *reg, int size)
 	 * values are also truncated so we push 64-bit bounds into
 	 * 32-bit bounds. Above were truncated < 32-bits already.
 	 */
-	if (size >= 4)
-		return;
-	__reg_combine_64_into_32(reg);
+	if (size < 4) {
+		__mark_reg32_unbounded(reg);
+		reg_bounds_sync(reg);
+	}
 }
 
 static bool bpf_map_is_rdonly(const struct bpf_map *map)
@@ -10114,13 +10070,13 @@ static void reg_set_min_max(struct bpf_reg_state *true_reg,
 					     tnum_subreg(false_32off));
 		true_reg->var_off = tnum_or(tnum_clear_subreg(true_64off),
 					    tnum_subreg(true_32off));
-		__reg_combine_32_into_64(false_reg);
-		__reg_combine_32_into_64(true_reg);
+		reg_bounds_sync(false_reg);
+		reg_bounds_sync(true_reg);
 	} else {
 		false_reg->var_off = false_64off;
 		true_reg->var_off = true_64off;
-		__reg_combine_64_into_32(false_reg);
-		__reg_combine_64_into_32(true_reg);
+		reg_bounds_sync(false_reg);
+		reg_bounds_sync(true_reg);
 	}
 }
 
-- 
2.43.0


* [RFC PATCH 6.1.y 2/2] bpf: make the verifier tracks the "not equal" for regs
  2026-06-01 18:03 [RFC PATCH 6.1.y 0/2] bpf: backport scalar not-equal tracking fixes Zhenzhong Wu
  2026-06-01 18:03 ` [RFC PATCH 6.1.y 1/2] bpf: drop knowledge-losing __reg_combine_{32,64}_into_{64,32} logic Zhenzhong Wu
@ 2026-06-01 18:04 ` Zhenzhong Wu
  2026-06-02  5:47 ` [RFC PATCH 6.1.y 0/2] bpf: backport scalar not-equal tracking fixes Shung-Hsi Yu
  2 siblings, 0 replies; 8+ messages in thread
From: Zhenzhong Wu @ 2026-06-01 18:04 UTC
  To: bpf
  Cc: netdev, linux-kernel, ast, daniel, john.fastabend, andrii,
	martin.lau, song, yonghong.song, kpsingh, sdf, haoluo, jolsa,
	menglong8.dong, eddyz87, shung-hsi.yu, tamird

[ Upstream commit d028f87517d6775dccff4ddbca2740826f9e53f1 ]

We can derive some new information for BPF_JNE in regs_refine_cond_op().
Take following code for example:

  /* The type of "a" is u32 */
  if (a > 0 && a < 100) {
    /* the range of the register for a is [0, 99], not [1, 99],
     * and will cause the following error:
     *
     *   invalid zero-sized read
     *
     * as a can be 0.
     */
    bpf_skb_store_bytes(skb, xx, xx, a, 0);
  }

In the code above, "a > 0" will be compiled to "jmp xxx if a == 0". In the
TRUE branch, the dst_reg will be marked as known to 0. However, in the
fallthrough(FALSE) branch, the dst_reg will not be handled, which makes
the [min, max] for a is [0, 99], not [1, 99].

For BPF_JNE, we can reduce the range of the dst reg if the src reg is a
const and is exactly the edge of the dst reg.

Signed-off-by: Menglong Dong <menglong8.dong@gmail.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Link: https://lore.kernel.org/r/20231219134800.1550388-2-menglong8.dong@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
[ zhenzhong: adapt to 6.1.y reg_set_min_max() layout. The upstream
  change lives in regs_refine_cond_op(); 6.1.y still refines true/false
  branch states in reg_set_min_max(), so apply the not-equal range
  exclusion to BPF_JEQ's false_reg and BPF_JNE's true_reg there. ]
Signed-off-by: Zhenzhong Wu <jt26wzz@gmail.com>
---
 kernel/bpf/verifier.c | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 5e029d1..e51f44b 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -9954,18 +9954,50 @@ static void reg_set_min_max(struct bpf_reg_state *true_reg,
 		if (is_jmp32) {
 			__mark_reg32_known(true_reg, val32);
 			true_32off = tnum_subreg(true_reg->var_off);
+			if (false_reg->u32_min_value == val32)
+				false_reg->u32_min_value++;
+			if (false_reg->u32_max_value == val32)
+				false_reg->u32_max_value--;
+			if (false_reg->s32_min_value == sval32)
+				false_reg->s32_min_value++;
+			if (false_reg->s32_max_value == sval32)
+				false_reg->s32_max_value--;
 		} else {
 			___mark_reg_known(true_reg, val);
 			true_64off = true_reg->var_off;
+			if (false_reg->umin_value == val)
+				false_reg->umin_value++;
+			if (false_reg->umax_value == val)
+				false_reg->umax_value--;
+			if (false_reg->smin_value == sval)
+				false_reg->smin_value++;
+			if (false_reg->smax_value == sval)
+				false_reg->smax_value--;
 		}
 		break;
 	case BPF_JNE:
 		if (is_jmp32) {
 			__mark_reg32_known(false_reg, val32);
 			false_32off = tnum_subreg(false_reg->var_off);
+			if (true_reg->u32_min_value == val32)
+				true_reg->u32_min_value++;
+			if (true_reg->u32_max_value == val32)
+				true_reg->u32_max_value--;
+			if (true_reg->s32_min_value == sval32)
+				true_reg->s32_min_value++;
+			if (true_reg->s32_max_value == sval32)
+				true_reg->s32_max_value--;
 		} else {
 			___mark_reg_known(false_reg, val);
 			false_64off = false_reg->var_off;
+			if (true_reg->umin_value == val)
+				true_reg->umin_value++;
+			if (true_reg->umax_value == val)
+				true_reg->umax_value--;
+			if (true_reg->smin_value == sval)
+				true_reg->smin_value++;
+			if (true_reg->smax_value == sval)
+				true_reg->smax_value--;
 		}
 		break;
 	case BPF_JSET:
-- 
2.43.0


* Re: [RFC PATCH 6.1.y 0/2] bpf: backport scalar not-equal tracking fixes
  2026-06-01 18:03 [RFC PATCH 6.1.y 0/2] bpf: backport scalar not-equal tracking fixes Zhenzhong Wu
  2026-06-01 18:03 ` [RFC PATCH 6.1.y 1/2] bpf: drop knowledge-losing __reg_combine_{32,64}_into_{64,32} logic Zhenzhong Wu
  2026-06-01 18:04 ` [RFC PATCH 6.1.y 2/2] bpf: make the verifier tracks the "not equal" for regs Zhenzhong Wu
@ 2026-06-02  5:47 ` Shung-Hsi Yu
  2026-06-02  6:42   ` Shung-Hsi Yu
  2 siblings, 1 reply; 8+ messages in thread
From: Shung-Hsi Yu @ 2026-06-02  5:47 UTC
  To: Zhenzhong Wu, eddyz87
  Cc: stable, Paul Chaignon, bpf, netdev, linux-kernel, ast, daniel,
	john.fastabend, andrii, martin.lau, song, yonghong.song, kpsingh,
	sdf, haoluo, jolsa, menglong8.dong, tamird

Hi Zhenzhong,

Thanks for looking at the stable kernel branch!

Since this patchset is intended for stable 6.1, I'd suggest also Cc'ing
stable@vger.kernel.org even though this is an RFC (ideally with
'PATCH stable ...' as the subject prefix, but that's minor), so that
the stable team is aware.

On Tue, Jun 02, 2026 at 02:03:58AM +0800, Zhenzhong Wu wrote:
> Hi BPF maintainers,
> 
> This RFC backports two BPF verifier scalar range-tracking fixes to 6.1.y.
> The series is intended to fix a verifier state-pruning issue where an
> impossible scalar path can be kept while the real success path is pruned.
> 
> This is a verifier scalar range-tracking issue, not a helper-specific
> issue.
> The visible failure is that the verifier can prune the real success
> continuation, which should not be skipped, and keep only an impossible one.
...

This sounds somewhat similar to the issue fixed in "backport of iterator
and callback handling fixes" for stable 6.6[1] by @Eduard. Could you try
testing on the latest stable 6.6.y as well, and see if you can reproduce
the issue there?

Also, per stable policy[2], we have to backport the patches in the series
to 6.6 first if we want them in 6.1 anyway:

  When using option 2 or 3 you can ask for your change to be included in specific
  stable series. When doing so, ensure the fix or an equivalent is applicable,
  submitted, or already present in all newer stable trees still supported. This is
  meant to prevent regressions that users might later encounter on updating...

Cheers,
Shung-Hsi Yu

1: https://lore.kernel.org/stable/20240125001554.25287-1-eddyz87@gmail.com/
2: https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html


* Re: [RFC PATCH 6.1.y 0/2] bpf: backport scalar not-equal tracking fixes
  2026-06-02  5:47 ` [RFC PATCH 6.1.y 0/2] bpf: backport scalar not-equal tracking fixes Shung-Hsi Yu
@ 2026-06-02  6:42   ` Shung-Hsi Yu
  2026-06-02  9:17     ` Shung-Hsi Yu
  0 siblings, 1 reply; 8+ messages in thread
From: Shung-Hsi Yu @ 2026-06-02  6:42 UTC
  To: Zhenzhong Wu
  Cc: stable, Paul Chaignon, bpf, netdev, linux-kernel, ast, daniel,
	john.fastabend, andrii, martin.lau, song, yonghong.song, kpsingh,
	haoluo, jolsa, menglong8.dong, tamird, eddyz87

On Tue, Jun 02, 2026 at 01:47:01PM +0800, Shung-Hsi Yu wrote:
...
> On Tue, Jun 02, 2026 at 02:03:58AM +0800, Zhenzhong Wu wrote:
> > Hi BPF maintainers,
> > 
> > This RFC backports two BPF verifier scalar range-tracking fixes to 6.1.y.
> > The series is intended to fix a verifier state-pruning issue where an
> > impossible scalar path can be kept while the real success path is pruned.
> > 
> > This is a verifier scalar range-tracking issue, not a helper-specific
> > issue.
> > The visible failure is that the verifier can prune the real success
> > continuation, which should not be skipped, and keep only an impossible one.
> ...
> 
> This sounds somewhat similar to the issue fixed in "backport of iterator
> and callback handling fixes" for stable 6.6[1] by @Eduard. Could you try
> to test on the latest stable 6.6.y as well and see if you can reproduce
> the issue there?
...

My mistake: your reproducer doesn't use an iterator or a callback, so this
is probably not fixed in stable 6.6. I'll take a closer look at this later
this week.

> 1: https://lore.kernel.org/stable/20240125001554.25287-1-eddyz87@gmail.com/
> 2: https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html


* Re: [RFC PATCH 6.1.y 0/2] bpf: backport scalar not-equal tracking fixes
  2026-06-02  6:42   ` Shung-Hsi Yu
@ 2026-06-02  9:17     ` Shung-Hsi Yu
  2026-06-02 17:25       ` Zhenzhong Wu
  0 siblings, 1 reply; 8+ messages in thread
From: Shung-Hsi Yu @ 2026-06-02  9:17 UTC
  To: Zhenzhong Wu
  Cc: stable, Paul Chaignon, bpf, netdev, linux-kernel, ast, daniel,
	john.fastabend, andrii, martin.lau, song, yonghong.song, kpsingh,
	haoluo, jolsa, menglong8.dong, tamird, eddyz87

On Tue, Jun 02, 2026 at 02:42:35PM +0800, Shung-Hsi Yu wrote:
> On Tue, Jun 02, 2026 at 01:47:01PM +0800, Shung-Hsi Yu wrote:
> ...
> > On Tue, Jun 02, 2026 at 02:03:58AM +0800, Zhenzhong Wu wrote:
> > > Hi BPF maintainers,
> > > 
> > > This RFC backports two BPF verifier scalar range-tracking fixes to 6.1.y.
> > > The series is intended to fix a verifier state-pruning issue where an
> > > impossible scalar path can be kept while the real success path is pruned.
> > > 
> > > This is a verifier scalar range-tracking issue, not a helper-specific
> > > issue.
> > > The visible failure is that the verifier can prune the real success
> > > continuation, which should not be skipped, and keep only an impossible one.
> > ...
> > 
> > This sounds somewhat similar to the issue fixed in "backport of iterator
> > and callback handling fixes" for stable 6.6[1] by @Eduard. Could you try
> > > to test on the latest stable 6.6.y as well and see if you can reproduce
> > the issue there?
> ...
> 
> My mistake, the reproducer you had doesn't use iterator or callback, so
> probably not fixed in stable 6.6. I'll take a better look at this later
> this week.

Two more ideas besides testing on the latest stable 6.6.

1. Can you try testing on bpf-next, but with commit d028f87517d6 'bpf:
   make the verifier tracks the "not equal" for regs' reverted? My
   concern is that commit d028f87517d6 may not address the root cause
   of the incorrect state pruning here.

   If the reproducer _fails_ to reproduce the issue even with commit
   d028f87517d6 reverted, then the root cause was possibly fixed by
   another commit further down the line.

2. Have you considered adding your reproducer to BPF selftests? It would
   be very useful to have in stable (though it needs to land in bpf-next
   first).

> > 1: https://lore.kernel.org/stable/20240125001554.25287-1-eddyz87@gmail.com/
> > 2: https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html


* Re: [RFC PATCH 6.1.y 0/2] bpf: backport scalar not-equal tracking fixes
  2026-06-02  9:17     ` Shung-Hsi Yu
@ 2026-06-02 17:25       ` Zhenzhong Wu
  2026-06-06  9:19         ` Shung-Hsi Yu
  0 siblings, 1 reply; 8+ messages in thread
From: Zhenzhong Wu @ 2026-06-02 17:25 UTC
  To: Shung-Hsi Yu
  Cc: stable, Paul Chaignon, bpf, netdev, linux-kernel, ast, daniel,
	john.fastabend, andrii, martin.lau, song, yonghong.song, kpsingh,
	haoluo, jolsa, menglong8.dong, tamird, eddyz87

Hi Shung-Hsi,

Thanks, that makes sense.

I was mixing up two different things here: the BPF docs say not to add
"Cc: stable@vger.kernel.org" to the patch description as a stable tag, and
to ask the BPF maintainers to queue stable fixes instead. Cc'ing stable@ in
the email headers for awareness is a separate matter. Thanks for pointing
this out.

Thanks also for pointing out the 6.6.y requirement. I'll make sure v2 takes
the stable ordering requirement into account before targeting 6.1.y.

I ran the suggested checks with the same reproducer, where BAD means the
program ran and observed the unexpected error, and GOOD means no error was
observed:

- latest 6.6.y, v6.6.142 (924b4a879cbb): BAD
- bpf-next at b93c55b4932d: GOOD
- bpf-next with the d028f87517d6 JNE refinement reverted: still GOOD

So the issue still reproduces on the latest 6.6.y, but d028f87517d6 alone
does not explain why bpf-next passes. I'll do more narrowing and update the
candidate backport set accordingly.

I'm also happy to add a BPF selftest for this. I plan to send a v2 series
later this week.

BR,
Zhenzhong



* Re: [RFC PATCH 6.1.y 0/2] bpf: backport scalar not-equal tracking fixes
  2026-06-02 17:25       ` Zhenzhong Wu
@ 2026-06-06  9:19         ` Shung-Hsi Yu
  0 siblings, 0 replies; 8+ messages in thread
From: Shung-Hsi Yu @ 2026-06-06  9:19 UTC
  To: Zhenzhong Wu
  Cc: stable, Paul Chaignon, bpf, netdev, linux-kernel, ast, daniel,
	john.fastabend, andrii, martin.lau, song, yonghong.song, kpsingh,
	haoluo, jolsa, menglong8.dong, tamird, eddyz87

Just want to send out a quick reply after looking at this.

On Wed, Jun 03, 2026 at 01:25:15AM +0800, Zhenzhong Wu wrote:
> Hi Shung-Hsi,
...
> I ran the suggested checks with the same reproducer, where BAD means the
> program ran and observed the unexpected error, and GOOD means no error was
> observed:
> 
> - latest 6.6.y, v6.6.142 (924b4a879cbb): BAD
> - bpf-next at b93c55b4932d: GOOD
> - bpf-next with the d028f87517d6 JNE refinement reverted: still GOOD
> 
> So the issue still reproduces on the latest 6.6.y, but d028f87517d6 alone
> does not explain why bpf-next passes. I'll do more narrowing and update the
> candidate backport set accordingly.
...

I think it possibly comes down to commit 4bf79f9be434e ("bpf: Track
equal scalars history on per-instruction level") added in v6.12. Without
that, the precise mark wasn't propagated (for scalars with the same ID),
and that likely made the state comparison (invalidly) go through.

Shung-Hsi

