[PATCH stable 6.6.y v2 0/3] bpf: backport scalar not-equal tracking fixes

* [PATCH stable 6.6.y v2 0/3] bpf: backport scalar not-equal tracking fixes
@ 2026-06-07 17:09 Zhenzhong Wu
  2026-06-07 17:09 ` [PATCH stable 6.6.y v2 1/3] bpf: drop knowledge-losing __reg_combine_{32,64}_into_{64,32} logic Zhenzhong Wu
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Zhenzhong Wu @ 2026-06-07 17:09 UTC (permalink / raw)
  To: bpf
  Cc: netdev, linux-kernel, ast, daniel, john.fastabend, andrii,
	martin.lau, song, yonghong.song, kpsingh, sdf, haoluo, jolsa,
	menglong8.dong, eddyz87, shung-hsi.yu, stable, mykolal, tamird

Hi,

This series backports two BPF verifier scalar range-tracking fixes to
6.6.y and adds a selftest. It fixes a verifier state-pruning issue where
an impossible linked-scalar path can be kept while the real success path is
pruned.

The issue lies in verifier scalar state tracking, not in helper-specific behavior.
A helper return value in r0 and another scalar can become linked by scalar
id on one branch. If the verifier does not preserve the not-equal fact on
the right branch edge, a later check can let it explore an impossible
continuation, narrow the linked scalar to the wrong value, and prune the
real success path against an earlier cached state. The program is accepted
by the verifier but then reports the wrong branch outcome at runtime.

The original visible failure was found in Rust-generated eBPF around helper
calls. Rust match lowering can keep a helper return value and a scalar
filled through a by-reference helper argument in the same enum-style control
flow. That makes it easy for the verifier-visible scalar values to become
linked by scalar id.

The relevant verifier-log bytecode from the original fexit reproducer is
below. The later instructions only store r7 into a map so user space can
observe which branch the verifier kept.

  15: (85) call bpf_get_func_ret#184    ; R0_w=scalar() fp-8_w=mmmmmmmm
  16: (79) r7 = *(u64 *)(r10 -8)        ; R7_w=scalar() R10=fp0
  17: (15) if r0 == 0x0 goto pc+1       ; R0_w=scalar()
  18: (bf) r7 = r0                      ; R0=scalar(id=1) R7=scalar(id=1)
  19: (55) if r0 != 0x0 goto pc+6       ; R0=0
  20: (67) r7 <<= 32                    ; R7_w=0
  21: (77) r7 >>= 32                    ; R7_w=0
  22: (b7) r1 = 1                       ; R1_w=1
  23: (55) if r7 != 0xf goto pc+1

The failure mechanism is:

  1. The program checks "if r0 == 0". The jump target is the success path,
     and the fallthrough path is the failure path and should imply r0 != 0.

  2. On affected kernels, the verifier does not record that r0 != 0 fact for
     the fallthrough path. The following "r7 = r0" then gives r0 and r7 the
     same scalar id while both are still treated as possibly zero.

  3. At the later "if r0 != 0" check, the verifier still thinks r0 may be
     zero, so it explores the fallthrough path of that JNE. That path means
     r0 == 0, and because r7 shares the same scalar id, r7 is narrowed to
     zero as well. This is an impossible path: it came from the earlier
     failure path that should have implied r0 != 0.

  4. That impossible continuation reaches the return-value comparison with
     r7 == 0 and can make the verifier keep only the wrong branch. When the
     real success path is analyzed later, state pruning considers it safe
     against the earlier cached verifier state, so the real continuation is
     not explored.
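
The steps above can be sketched with a toy interval model. This is an
illustration only, not verifier code: struct toy_reg, toy_may_equal(),
toy_set_linked_const() and toy_explores_impossible_path() are hypothetical
names, and the real verifier tracks far more state (signed and 32-bit
bounds, tnums, precision marks).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a scalar register: an unsigned range plus a link id. */
struct toy_reg {
	uint32_t id;	/* 0 means "not linked to any other register" */
	uint64_t umin;
	uint64_t umax;
};

/* Can the register still hold val? If not, the "== val" edge is dead. */
static bool toy_may_equal(const struct toy_reg *reg, uint64_t val)
{
	return reg->umin <= val && val <= reg->umax;
}

/* Narrow every register that shares the given id to the constant val. */
static void toy_set_linked_const(struct toy_reg *regs, size_t n,
				 uint32_t id, uint64_t val)
{
	assert(id != 0);
	for (size_t i = 0; i < n; i++) {
		if (regs[i].id == id) {
			regs[i].umin = val;
			regs[i].umax = val;
		}
	}
}

/*
 * Model the fallthrough of "if r0 == 0", then "r7 = r0", then the
 * fallthrough of "if r0 != 0". r0_umin is what the first branch left
 * behind: 0 when the not-equal fact is dropped, 1 when it is kept.
 * Returns true if the impossible r0 == 0 continuation is explored,
 * and in that case stores the narrowed r7 in *r7_out.
 */
static bool toy_explores_impossible_path(uint64_t r0_umin,
					 struct toy_reg *r7_out)
{
	struct toy_reg regs[2] = {
		{ .id = 1, .umin = r0_umin, .umax = UINT64_MAX },	/* r0 */
		{ .id = 1, .umin = r0_umin, .umax = UINT64_MAX },	/* r7 = r0 */
	};

	if (!toy_may_equal(&regs[0], 0))
		return false;	/* the r0 == 0 edge is known to be dead */

	toy_set_linked_const(regs, 2, regs[0].id, 0);
	*r7_out = regs[1];
	return true;
}
```

Calling toy_explores_impossible_path(0, &r7) follows the bogus edge and
leaves r7 as the constant 0, which mirrors step 3. With a lower bound of 1
the same edge is rejected before any linked register is narrowed.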

The relevant pruning point is that regsafe()/states_equal() accepted the
real success-path state against an earlier cached state where r0 was an
imprecise scalar and r7 constraints were loose enough to cover the current
r7.

After confirming the mechanism, I used a reproducer with the same verifier
state shape, now captured by the selftest, as the test case for git bisect.
The bisect started from the affected 6.7.y behavior and the fixed v6.8
behavior, and narrowed the fix to the v6.7..v6.8 window. It identified the
upstream fix as:

  d028f87517d6775dccff4ddbca2740826f9e53f1
  bpf: make the verifier tracks the "not equal" for regs

For 6.6.y and older stable verifier code, applying d028f87517d6 alone is
not sufficient. The verifier also needs the range-preservation semantics
from:

  9e314f5d8682e1fe6ac214fb34580a238b6fd3c4
  bpf: drop knowledge-losing __reg_combine_{32,64}_into_{64,32} logic

Without that semantic prerequisite, the old range-combining logic can still
discard the refined bounds after the verifier learns them.
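
As a rough illustration of what "discarding refined bounds" means, here is
a toy model of one of the two removed helpers. It is a sketch under
simplifying assumptions, not the kernel code: struct toy_bounds,
toy_old_combine_32_into_64() and toy_sync() are hypothetical names, only
unsigned bounds are modelled, and the real reg_bounds_sync() derives much
more than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct toy_bounds {
	uint64_t umin, umax;		/* 64-bit unsigned range */
	uint32_t u32_min, u32_max;	/* range of the low 32 bits */
	bool upper_known_zero;		/* stand-in for the tnum check on the upper half */
};

/*
 * Shape of the removed 32-into-64 step: unless the upper half is known
 * to be zero, the existing 64-bit range is forgotten.
 */
static struct toy_bounds toy_old_combine_32_into_64(struct toy_bounds b)
{
	if (b.upper_known_zero) {
		b.umin = b.u32_min;
		b.umax = b.u32_max;
	} else {
		b.umin = 0;
		b.umax = UINT64_MAX;
	}
	return b;
}

/*
 * Knowledge-preserving alternative: never widen, only tighten. When the
 * whole 64-bit range fits in 32 bits, the value equals its low half, so
 * the two ranges can be intersected.
 */
static struct toy_bounds toy_sync(struct toy_bounds b)
{
	assert(b.umin <= b.umax && b.u32_min <= b.u32_max);
	if (b.umax <= UINT32_MAX) {
		if (b.u32_min > b.umin)
			b.umin = b.u32_min;
		if (b.u32_max < b.umax)
			b.umax = b.u32_max;
	}
	return b;
}
```

Starting from a register with umin = 1 (already known to be non-zero) and
a 32-bit range of [0, 15], the old-style step resets umin to 0, while the
sync-style step keeps umin = 1.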

The new selftest uses bpf_skb_load_bytes() only to create a helper status in
r0 and run through the normal tc test-run path. It reproduces the verifier
state shape without requiring fexit attach or bpf_get_func_ret().

I would like this fix to be applied to the supported 6.6.y, 6.1.y,
5.15.y, and 5.10.y stable trees. This v2 targets 6.6.y first for stable
ordering. The same issue is also reproducible on 6.1.y, 5.15.y, and
5.10.y, but those trees need separate older-layout adaptations.

Targeted BPF selftest/reproducer results are:

  For 5.10.y and 5.15.y, I used the same minimized reproducer bytecode in
  QEMU because those trees still use the older test_verifier framework.

  v5.10.258:                         FAIL
  v5.10.258 + equivalent backport:   PASS
  v5.15.209:                         FAIL
  v5.15.209 + equivalent backport:   PASS
  v6.1.91:                           FAIL
  v6.1.91 + RFC backport series:     PASS
  v6.6.142:                          FAIL
  v6.6.142 + this series:            PASS
  v6.7.12:                           FAIL
  v6.8:                              PASS

I also checked bpf-next: bpf-next passes even when the d028f87517d6 JNE
refinement is reverted, because newer kernels also have the later
4bf79f9be434e ("bpf: Track equal scalars history on per-instruction level")
precision-tracking change. I did not use 4bf79f9be434e as the stable
backport base because it is a broader jmp_history/precision-tracking change
for linked scalars. For 6.6.y this series keeps the smaller stable backport
path that directly follows the bisected fix: preserve scalar bounds after
conditional refinement, then add the not-equal range refinement in the older
reg_set_min_max() layout.

Changes since RFC v1:
  - drop RFC;
  - state the intended stable targets and keep 6.6.y first for stable
    ordering;
  - add a BPF selftest covering the failure;
  - add 5.10.y and 5.15.y reproducer validation;
  - document why Rust-generated eBPF can naturally create this state shape;
  - note that the later 4bf79f9be434e precision-tracking change is why
    bpf-next can pass independently.

RFC v1:
  https://lore.kernel.org/r/20260601180400.1381736-1-jt26wzz@gmail.com/

Thanks to Shung-Hsi Yu for reviewing the RFC, pointing out that 6.6.y
should be handled first for stable ordering, and noting that bpf-next is
also protected by the later 4bf79f9be434e ("bpf: Track equal scalars
history on per-instruction level") precision-tracking change.

Zhenzhong Wu (3):
  bpf: drop knowledge-losing __reg_combine_{32,64}_into_{64,32} logic
  bpf: make the verifier tracks the "not equal" for regs
  selftests/bpf: add helper retval linked scalar pruning test

 kernel/bpf/verifier.c                         | 92 ++++++++-----------
 .../selftests/bpf/progs/verifier_reg_equal.c  | 35 +++++++
 2 files changed, 75 insertions(+), 52 deletions(-)

base-commit: 924b4a879cbb75aef37c160b955b92f6894b11a4
-- 
2.43.0

* [PATCH stable 6.6.y v2 1/3] bpf: drop knowledge-losing __reg_combine_{32,64}_into_{64,32} logic
  2026-06-07 17:09 [PATCH stable 6.6.y v2 0/3] bpf: backport scalar not-equal tracking fixes Zhenzhong Wu
@ 2026-06-07 17:09 ` Zhenzhong Wu
  2026-06-07 17:09 ` [PATCH stable 6.6.y v2 2/3] bpf: make the verifier tracks the "not equal" for regs Zhenzhong Wu
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 7+ messages in thread
From: Zhenzhong Wu @ 2026-06-07 17:09 UTC (permalink / raw)
  To: bpf
  Cc: netdev, linux-kernel, ast, daniel, john.fastabend, andrii,
	martin.lau, song, yonghong.song, kpsingh, sdf, haoluo, jolsa,
	menglong8.dong, eddyz87, shung-hsi.yu, stable, mykolal, tamird

From: Andrii Nakryiko <andrii@kernel.org>

[ Upstream commit 9e314f5d8682e1fe6ac214fb34580a238b6fd3c4 ]

When performing 32-bit conditional operation operating on lower 32 bits
of a full 64-bit register, register full value isn't changed. We just
potentially gain new knowledge about that register's lower 32 bits.

Unfortunately, __reg_combine_{32,64}_into_{64,32} logic that
reg_set_min_max() performs as a last step, can lose information in some
cases due to __mark_reg64_unbounded() and __reg_assign_32_into_64().
That's bad and unnecessary. Especially __reg_assign_32_into_64() looks
out of place here, because we are not performing zero-extending
subregister assignment during conditional jump.

Replace __reg_combine_* with reg_bounds_sync(), which derives u64/s64
bounds from u32/s32 and vice versa.

For coerce_reg_to_size(), reset subreg bounds for 1- and 2-byte loads and
then use reg_bounds_sync() to recover as much information as possible.

Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Link: https://lore.kernel.org/r/20231102033759.2541186-10-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
[ zhenzhong: backport to 6.6.y verifier.c layout. ]
Signed-off-by: Zhenzhong Wu <jt26wzz@gmail.com>
---
 kernel/bpf/verifier.c | 60 ++++++-------------------------------------
 1 file changed, 8 insertions(+), 52 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 0d90236d0..5f94bff12 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -2448,51 +2448,6 @@ static void __reg_assign_32_into_64(struct bpf_reg_state *reg)
 	}
 }
 
-static void __reg_combine_32_into_64(struct bpf_reg_state *reg)
-{
-	/* special case when 64-bit register has upper 32-bit register
-	 * zeroed. Typically happens after zext or <<32, >>32 sequence
-	 * allowing us to use 32-bit bounds directly,
-	 */
-	if (tnum_equals_const(tnum_clear_subreg(reg->var_off), 0)) {
-		__reg_assign_32_into_64(reg);
-	} else {
-		/* Otherwise the best we can do is push lower 32bit known and
-		 * unknown bits into register (var_off set from jmp logic)
-		 * then learn as much as possible from the 64-bit tnum
-		 * known and unknown bits. The previous smin/smax bounds are
-		 * invalid here because of jmp32 compare so mark them unknown
-		 * so they do not impact tnum bounds calculation.
-		 */
-		__mark_reg64_unbounded(reg);
-	}
-	reg_bounds_sync(reg);
-}
-
-static bool __reg64_bound_s32(s64 a)
-{
-	return a >= S32_MIN && a <= S32_MAX;
-}
-
-static bool __reg64_bound_u32(u64 a)
-{
-	return a >= U32_MIN && a <= U32_MAX;
-}
-
-static void __reg_combine_64_into_32(struct bpf_reg_state *reg)
-{
-	__mark_reg32_unbounded(reg);
-	if (__reg64_bound_s32(reg->smin_value) && __reg64_bound_s32(reg->smax_value)) {
-		reg->s32_min_value = (s32)reg->smin_value;
-		reg->s32_max_value = (s32)reg->smax_value;
-	}
-	if (__reg64_bound_u32(reg->umin_value) && __reg64_bound_u32(reg->umax_value)) {
-		reg->u32_min_value = (u32)reg->umin_value;
-		reg->u32_max_value = (u32)reg->umax_value;
-	}
-	reg_bounds_sync(reg);
-}
-
 /* Mark a register as having a completely unknown (scalar) value. */
 static void __mark_reg_unknown(const struct bpf_verifier_env *env,
 			       struct bpf_reg_state *reg)
@@ -6164,9 +6119,10 @@ static void coerce_reg_to_size(struct bpf_reg_state *reg, int size)
 	 * values are also truncated so we push 64-bit bounds into
 	 * 32-bit bounds. Above were truncated < 32-bits already.
 	 */
-	if (size >= 4)
-		return;
-	__reg_combine_64_into_32(reg);
+	if (size < 4) {
+		__mark_reg32_unbounded(reg);
+		reg_bounds_sync(reg);
+	}
 }
 
 static void set_sext64_default_val(struct bpf_reg_state *reg, int size)
@@ -14329,13 +14285,13 @@ static void reg_set_min_max(struct bpf_reg_state *true_reg,
 					     tnum_subreg(false_32off));
 		true_reg->var_off = tnum_or(tnum_clear_subreg(true_64off),
 					    tnum_subreg(true_32off));
-		__reg_combine_32_into_64(false_reg);
-		__reg_combine_32_into_64(true_reg);
+		reg_bounds_sync(false_reg);
+		reg_bounds_sync(true_reg);
 	} else {
 		false_reg->var_off = false_64off;
 		true_reg->var_off = true_64off;
-		__reg_combine_64_into_32(false_reg);
-		__reg_combine_64_into_32(true_reg);
+		reg_bounds_sync(false_reg);
+		reg_bounds_sync(true_reg);
 	}
 }
 
-- 
2.43.0

* [PATCH stable 6.6.y v2 2/3] bpf: make the verifier tracks the "not equal" for regs
  2026-06-07 17:09 [PATCH stable 6.6.y v2 0/3] bpf: backport scalar not-equal tracking fixes Zhenzhong Wu
  2026-06-07 17:09 ` [PATCH stable 6.6.y v2 1/3] bpf: drop knowledge-losing __reg_combine_{32,64}_into_{64,32} logic Zhenzhong Wu
@ 2026-06-07 17:09 ` Zhenzhong Wu
  2026-06-07 17:09 ` [PATCH stable 6.6.y v2 3/3] selftests/bpf: add helper retval linked scalar pruning test Zhenzhong Wu
  2026-06-08 10:11 ` [PATCH stable 6.6.y v2 0/3] bpf: backport scalar not-equal tracking fixes Shung-Hsi Yu
  3 siblings, 0 replies; 7+ messages in thread
From: Zhenzhong Wu @ 2026-06-07 17:09 UTC (permalink / raw)
  To: bpf
  Cc: netdev, linux-kernel, ast, daniel, john.fastabend, andrii,
	martin.lau, song, yonghong.song, kpsingh, sdf, haoluo, jolsa,
	menglong8.dong, eddyz87, shung-hsi.yu, stable, mykolal, tamird

From: Menglong Dong <menglong8.dong@gmail.com>

[ Upstream commit d028f87517d6775dccff4ddbca2740826f9e53f1 ]

We can derive useful information for BPF_JNE when one side is a constant
and the constant is exactly at the edge of the other register range.

For example, a > 0 can be compiled as a jump if a == 0. The equal branch
marks the register as known zero, but the fallthrough branch also needs to
preserve that the register is not zero. Without this, the range can remain
[0, max] and later verifier state pruning can keep an impossible scalar
path.

The upstream fix lives in regs_refine_cond_op(). The 6.6.y verifier still
uses the older reg_set_min_max() layout, so express the same branch-edge
refinement there: for BPF_JEQ, preserve the known-equal true branch and
exclude the constant from false_reg; for BPF_JNE, preserve the known-equal
false branch and exclude the constant from true_reg.

Signed-off-by: Menglong Dong <menglong8.dong@gmail.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Link: https://lore.kernel.org/r/20231219134800.1550388-2-menglong8.dong@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
[ zhenzhong: backport to 6.6.y reg_set_min_max() layout. ]
Signed-off-by: Zhenzhong Wu <jt26wzz@gmail.com>
---
 kernel/bpf/verifier.c | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 5f94bff12..de4f46796 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -14169,18 +14169,50 @@ static void reg_set_min_max(struct bpf_reg_state *true_reg,
 		if (is_jmp32) {
 			__mark_reg32_known(true_reg, val32);
 			true_32off = tnum_subreg(true_reg->var_off);
+			if (false_reg->u32_min_value == val32)
+				false_reg->u32_min_value++;
+			if (false_reg->u32_max_value == val32)
+				false_reg->u32_max_value--;
+			if (false_reg->s32_min_value == sval32)
+				false_reg->s32_min_value++;
+			if (false_reg->s32_max_value == sval32)
+				false_reg->s32_max_value--;
 		} else {
 			___mark_reg_known(true_reg, val);
 			true_64off = true_reg->var_off;
+			if (false_reg->umin_value == val)
+				false_reg->umin_value++;
+			if (false_reg->umax_value == val)
+				false_reg->umax_value--;
+			if (false_reg->smin_value == sval)
+				false_reg->smin_value++;
+			if (false_reg->smax_value == sval)
+				false_reg->smax_value--;
 		}
 		break;
 	case BPF_JNE:
 		if (is_jmp32) {
 			__mark_reg32_known(false_reg, val32);
 			false_32off = tnum_subreg(false_reg->var_off);
+			if (true_reg->u32_min_value == val32)
+				true_reg->u32_min_value++;
+			if (true_reg->u32_max_value == val32)
+				true_reg->u32_max_value--;
+			if (true_reg->s32_min_value == sval32)
+				true_reg->s32_min_value++;
+			if (true_reg->s32_max_value == sval32)
+				true_reg->s32_max_value--;
 		} else {
 			___mark_reg_known(false_reg, val);
 			false_64off = false_reg->var_off;
+			if (true_reg->umin_value == val)
+				true_reg->umin_value++;
+			if (true_reg->umax_value == val)
+				true_reg->umax_value--;
+			if (true_reg->smin_value == sval)
+				true_reg->smin_value++;
+			if (true_reg->smax_value == sval)
+				true_reg->smax_value--;
 		}
 		break;
 	case BPF_JSET:
-- 
2.43.0

* [PATCH stable 6.6.y v2 3/3] selftests/bpf: add helper retval linked scalar pruning test
  2026-06-07 17:09 [PATCH stable 6.6.y v2 0/3] bpf: backport scalar not-equal tracking fixes Zhenzhong Wu
  2026-06-07 17:09 ` [PATCH stable 6.6.y v2 1/3] bpf: drop knowledge-losing __reg_combine_{32,64}_into_{64,32} logic Zhenzhong Wu
  2026-06-07 17:09 ` [PATCH stable 6.6.y v2 2/3] bpf: make the verifier tracks the "not equal" for regs Zhenzhong Wu
@ 2026-06-07 17:09 ` Zhenzhong Wu
  2026-06-08 10:11 ` [PATCH stable 6.6.y v2 0/3] bpf: backport scalar not-equal tracking fixes Shung-Hsi Yu
  3 siblings, 0 replies; 7+ messages in thread
From: Zhenzhong Wu @ 2026-06-07 17:09 UTC (permalink / raw)
  To: bpf
  Cc: netdev, linux-kernel, ast, daniel, john.fastabend, andrii,
	martin.lau, song, yonghong.song, kpsingh, sdf, haoluo, jolsa,
	menglong8.dong, eddyz87, shung-hsi.yu, stable, mykolal, tamird

Add a verifier test case covering a pruning bug where a helper return
value and another scalar become linked by scalar id on one path. A later
branch can then let the verifier explore an impossible continuation and
prune the real success path.

The test uses bpf_skb_load_bytes() to create a helper return value in R0
and a scalar derived from the tc test packet length. It then links the two
scalars on one path and checks that the later branch keeps the reachable
success path.

Signed-off-by: Zhenzhong Wu <jt26wzz@gmail.com>
---
 .../selftests/bpf/progs/verifier_reg_equal.c  | 35 +++++++++++++++++++
 1 file changed, 35 insertions(+)

diff --git a/tools/testing/selftests/bpf/progs/verifier_reg_equal.c b/tools/testing/selftests/bpf/progs/verifier_reg_equal.c
index dc1d8c30f..269b2af50 100644
--- a/tools/testing/selftests/bpf/progs/verifier_reg_equal.c
+++ b/tools/testing/selftests/bpf/progs/verifier_reg_equal.c
@@ -1,6 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0
 
 #include <linux/bpf.h>
+#include <stddef.h>
 #include <bpf/bpf_helpers.h>
 #include "bpf_misc.h"
 
@@ -55,4 +56,38 @@ l1_%=:	exit;						\
 	: __clobber_all);
 }
 
+SEC("tc")
+__description("helper retval linked scalar pruning")
+__success __retval(0)
+__naked void helper_retval_linked_scalar_pruning(void)
+{
+	asm volatile ("					\
+	r7 = *(u32 *)(r1 + %[__sk_buff_data_end]);	\
+	r5 = *(u32 *)(r1 + %[__sk_buff_data]);		\
+	r7 -= r5;					\
+	r2 = 0;						\
+	r3 = r10;					\
+	r3 += -8;					\
+	r4 = 1;						\
+	call %[bpf_skb_load_bytes];			\
+	r6 = 1;						\
+	if r0 == 0 goto l0_%=;				\
+	r7 = r0;					\
+l0_%=:	if r0 != 0 goto l1_%=;				\
+	r7 <<= 32;					\
+	r7 >>= 32;					\
+	r6 = 1;						\
+	if r7 != %[test_data_len] goto l1_%=;		\
+	r0 = 0;						\
+	exit;						\
+l1_%=:	r0 = r6;					\
+	exit;						\
+"	:
+	: __imm(bpf_skb_load_bytes),
+	  __imm_const(__sk_buff_data, offsetof(struct __sk_buff, data)),
+	  __imm_const(__sk_buff_data_end, offsetof(struct __sk_buff, data_end)),
+	  __imm_const(test_data_len, TEST_DATA_LEN)
+	: __clobber_all);
+}
+
 char _license[] SEC("license") = "GPL";
-- 
2.43.0

* Re: [PATCH stable 6.6.y v2 0/3] bpf: backport scalar not-equal tracking fixes
  2026-06-07 17:09 [PATCH stable 6.6.y v2 0/3] bpf: backport scalar not-equal tracking fixes Zhenzhong Wu
                   ` (2 preceding siblings ...)
  2026-06-07 17:09 ` [PATCH stable 6.6.y v2 3/3] selftests/bpf: add helper retval linked scalar pruning test Zhenzhong Wu
@ 2026-06-08 10:11 ` Shung-Hsi Yu
  2026-06-10 15:46   ` Zhenzhong Wu
  3 siblings, 1 reply; 7+ messages in thread
From: Shung-Hsi Yu @ 2026-06-08 10:11 UTC (permalink / raw)
  To: Zhenzhong Wu
  Cc: bpf, netdev, linux-kernel, ast, daniel, john.fastabend, andrii,
	martin.lau, song, yonghong.song, kpsingh, sdf, haoluo, jolsa,
	menglong8.dong, eddyz87, stable, mykolal, tamird

Hi Zhenzhong,

On Mon, Jun 08, 2026 at 01:09:55AM +0800, Zhenzhong Wu wrote:
> Hi,
> 
> This series backports two BPF verifier scalar range-tracking fixes to
> 6.6.y and adds a selftest. It fixes a verifier state-pruning issue where
> an impossible linked-scalar path can be kept while the real success path is
> pruned.
...
>   15: (85) call bpf_get_func_ret#184    ; R0_w=scalar() fp-8_w=mmmmmmmm
>   16: (79) r7 = *(u64 *)(r10 -8)        ; R7_w=scalar() R10=fp0
>   17: (15) if r0 == 0x0 goto pc+1       ; R0_w=scalar()
>   18: (bf) r7 = r0                      ; R0=scalar(id=1) R7=scalar(id=1)
>   19: (55) if r0 != 0x0 goto pc+6       ; R0=0
>   20: (67) r7 <<= 32                    ; R7_w=0
>   21: (77) r7 >>= 32                    ; R7_w=0
>   22: (b7) r1 = 1                       ; R1_w=1
>   23: (55) if r7 != 0xf goto pc+1
...
> I also checked bpf-next: bpf-next passes even when the d028f87517d6 JNE
> refinement is reverted, because newer kernels also have the later
> 4bf79f9be434e ("bpf: Track equal scalars history on per-instruction level")
> precision-tracking change. I did not use 4bf79f9be434e as the stable
> backport base because it is a broader jmp_history/precision-tracking change
> for linked scalars. For 6.6.y this series keeps the smaller stable backport
> path that directly follows the bisected fix: preserve scalar bounds after
> conditional refinement, then add the not-equal range refinement in the older
> reg_set_min_max() layout.
...

To be honest I have not figured everything out yet, but I would much
prefer that we backport commit 4bf79f9be434e ("bpf: Track equal scalars
history on per-instruction level") to address the issue instead. While
'bpf: make the verifier tracks the "not equal" for regs' itself is
self-contained and reasonable, "bpf: drop knowledge-losing
__reg_combine_{32,64}_into_{64,32} logic" comes from a much larger
series[1], and taking that out of context seems rather risky[2].

More importantly, 'bpf: make the verifier tracks the "not equal" for
regs' does not address the root cause of the issue; it merely masks it
by making the two states different enough that they are no longer
considered equal. That works for the Rust-specific case you have, but
won't work if the value is slightly different (e.g. "r0 == 1" followed
by "r0 != 1").
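
The limitation can be seen in a toy interval model. This is a sketch, not
the verifier's code: struct toy_range and toy_exclude_const() are
hypothetical names, only unsigned 64-bit bounds are modelled, and leaving
a single-value range untouched is a simplification made for this example.

```c
#include <assert.h>
#include <stdint.h>

struct toy_range {
	uint64_t umin;
	uint64_t umax;
};

/*
 * Record "reg != val" in a plain [umin, umax] interval. An interval has
 * no holes, so the fact can only be stored when val sits on an edge.
 */
static struct toy_range toy_exclude_const(struct toy_range r, uint64_t val)
{
	assert(r.umin <= r.umax);
	if (r.umin == r.umax)
		return r;	/* constant register: left alone in this toy */
	if (r.umin == val)
		r.umin++;
	else if (r.umax == val)
		r.umax--;
	return r;
}
```

Excluding 0 from [0, UINT64_MAX] gives [1, UINT64_MAX], so the two states
end up different. Excluding 1 from the same range changes nothing, because
1 is not on an edge, and the not-equal fact is lost again.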

The root cause of the problem has already been stated by you:

> The relevant pruning point is that regsafe()/states_equal() accepted the
> real success-path state against an earlier cached state where r0 was an
> imprecise scalar and r7 constraints were loose enough to cover the current
> r7.

Looking at the verifier log you have, in the impossible path we have
r0.id == r7.id from instruction 18, whereas the real success path (which
skips instruction 18) does not have that relationship. The two should
therefore be considered different, and that seems to be just what "bpf:
track find_equal_scalars history on per-instruction level" solves by
having the correct precise mark.
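
A toy model of why the precise mark matters for this comparison is below.
It is a sketch of the idea only, not the kernel's regsafe(): all toy_*
names are hypothetical, the register values are illustrative, and the
assumption that an imprecise cached scalar is accepted without looking at
its range or id is a simplification of what the thread describes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_IDMAP_SIZE 8

struct toy_scalar {
	uint32_t id;	/* 0 means "not linked" */
	uint64_t umin;
	uint64_t umax;
	bool precise;
};

struct toy_idmap {
	uint32_t old_id[TOY_IDMAP_SIZE];
	uint32_t cur_id[TOY_IDMAP_SIZE];
	uint32_t next_tmp;	/* temporary ids for unlinked current registers */
};

static void toy_idmap_init(struct toy_idmap *m)
{
	for (int i = 0; i < TOY_IDMAP_SIZE; i++)
		m->old_id[i] = m->cur_id[i] = 0;
	m->next_tmp = UINT32_C(1) << 31;
}

/* Old ids must map one-to-one onto current ids across all registers. */
static bool toy_check_ids(uint32_t old_id, uint32_t cur_id, struct toy_idmap *m)
{
	if (old_id == 0)
		return true;
	if (cur_id == 0)
		cur_id = m->next_tmp++;	/* unlinked: give it a unique id */
	for (int i = 0; i < TOY_IDMAP_SIZE; i++) {
		if (m->old_id[i] == 0) {
			m->old_id[i] = old_id;
			m->cur_id[i] = cur_id;
			return true;
		}
		if (m->old_id[i] == old_id)
			return m->cur_id[i] == cur_id;
		if (m->cur_id[i] == cur_id)
			return false;
	}
	return false;	/* map full: be conservative */
}

static bool toy_scalar_safe(const struct toy_scalar *old,
			    const struct toy_scalar *cur, struct toy_idmap *m)
{
	if (!old->precise)
		return true;	/* imprecise cached scalar: accepted as-is */
	return old->umin <= cur->umin && cur->umax <= old->umax &&
	       toy_check_ids(old->id, cur->id, m);
}

/* Compare {r0, r7} of a cached state against {r0, r7} of the current one. */
static bool toy_state_safe(const struct toy_scalar old[2],
			   const struct toy_scalar cur[2])
{
	struct toy_idmap m;

	toy_idmap_init(&m);
	for (int i = 0; i < 2; i++)
		if (!toy_scalar_safe(&old[i], &cur[i], &m))
			return false;
	return true;
}
```

With the cached r0 and r7 sharing id 1 but left imprecise, a current state
with unlinked r0 and r7 is accepted and pruned. Once both cached registers
carry the precise mark, the id mapping is checked and the same current
state is rejected, so the real path would be explored.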

Could you give backporting the full "bpf: track find_equal_scalars history on
per-instruction level" series[3] a try? For 6.6 it should be doable, and
hopefully for 6.1, too, but not too sure about earlier ones. If you prefer I
work on it I can also give it a try later this week.

As for the selftest, it would need to be sent separately and by itself
to bpf-next, and picked up there, before it can be backported to stable.
I suggest you look at [4] and have your test placed similarly, and
mention that your test specifically tests a Rust/Aya pattern.

Thanks,
Shung-Hsi

1: https://lore.kernel.org/r/20231102033759.2541186-1-andrii@kernel.org
2: https://lore.kernel.org/bpf/20260601182508.29C811F00893@smtp.kernel.org/
3: https://lore.kernel.org/bpf/20240718202357.1746514-1-eddyz87@gmail.com/
4: https://lore.kernel.org/bpf/20240718202357.1746514-4-eddyz87@gmail.com/

[...]

* Re: [PATCH stable 6.6.y v2 0/3] bpf: backport scalar not-equal tracking fixes
  2026-06-08 10:11 ` [PATCH stable 6.6.y v2 0/3] bpf: backport scalar not-equal tracking fixes Shung-Hsi Yu
@ 2026-06-10 15:46   ` Zhenzhong Wu
  2026-06-11  6:47     ` Shung-Hsi Yu
  0 siblings, 1 reply; 7+ messages in thread
From: Zhenzhong Wu @ 2026-06-10 15:46 UTC (permalink / raw)
  To: Shung-Hsi Yu
  Cc: bpf, netdev, linux-kernel, ast, daniel, john.fastabend, andrii,
	martin.lau, song, yonghong.song, kpsingh, sdf, haoluo, jolsa,
	menglong8.dong, eddyz87, stable, mykolal, tamird

Hi Shung-Hsi,

> More importantly, 'bpf: make the verifier tracks the "not equal" for
> regs' does not address the root cause of the issue; it merely masks it
> by making the two states different enough that they are no longer
> considered equal. That works for the Rust-specific case you have, but
> won't work if the value is slightly different (e.g. "r0 == 1" followed
> by "r0 != 1").

Thanks for spelling this out. I now see that I did not fully
understand the point behind your suggested bpf-next-with-d028-reverted
check.

I was treating the not-equal refinement and the linked-scalar precision
issue as two ways to break the same failure chain, and chose the
d028-based path because it was smaller and easier for me to reason
about. With the `r0 == 1` variant, it became clear to me that this only
fixes the zero-valued branch shape from my original reproducer, while
the underlying linked-scalar pruning issue remains.

> Could you give backporting the full "bpf: track find_equal_scalars history on
> per-instruction level" series[3] a try? For 6.6 it should be doable, and
> hopefully for 6.1, too, but not too sure about earlier ones. If you prefer I
> work on it I can also give it a try later this week.

Sure, I will prepare v3 based on that series for 6.6.y, and then work
on the 6.1.y adaptation separately.

I tried applying the series starting from 6.1.y and still hit some
issues that need adaptation. 5.15.y and 5.10.y appear to need more
surrounding verifier changes, so they may be harder, but I will still
try to work through them. If I run into anything I am unsure about, I
will raise it early.

> As for the selftest, it would need to be sent separately and by itself
> to bpf-next, and picked up there, before it can be backported to stable.
> I suggest you look at [4] and have your test placed similarly, and
> mention that your test specifically tests a Rust/Aya pattern.

Thanks, I will send the selftest to bpf-next separately. I will also
change the test to use the `r0 == 1` / `r0 != 1` shape, so it covers
the broader linked-scalar pruning issue instead of only the original
zero-valued case.

Thanks again for the detailed explanation. I have only recently started
digging into the verifier implementation details, so this was very helpful!

BR,
Zhenzhong


* Re: [PATCH stable 6.6.y v2 0/3] bpf: backport scalar not-equal tracking fixes
  2026-06-10 15:46   ` Zhenzhong Wu
@ 2026-06-11  6:47     ` Shung-Hsi Yu
  0 siblings, 0 replies; 7+ messages in thread
From: Shung-Hsi Yu @ 2026-06-11  6:47 UTC (permalink / raw)
  To: Zhenzhong Wu
  Cc: bpf, netdev, linux-kernel, ast, daniel, john.fastabend, andrii,
	martin.lau, song, yonghong.song, kpsingh, sdf, haoluo, jolsa,
	menglong8.dong, eddyz87, stable, mykolal, tamird

On Wed, Jun 10, 2026 at 11:46:18PM +0800, Zhenzhong Wu wrote:
> > More importantly, 'bpf: make the verifier tracks the "not equal" for
> > regs' does not address the root cause of the issue. It merely masks the
> > issue by making the two states different enough that they are no longer
> > equal, which works for the Rust-specific case you have, but won't work
> > if the value was slightly different (e.g. "r0 == 1" followed by "r0 !=
> > 1").
> 
> Thanks for spelling this out. I now see that I did not fully
> understand the point behind your suggested bpf-next-with-d028-reverted
> check.
> 
> I was treating the not-equal refinement and the linked-scalar precision
> issue as two ways to break the same failure chain, and chose the
> d028-based path because it was smaller and easier for me to reason
> about. With the `r0 == 1` variant, it became clear to me that this only
> fixes the zero-valued branch shape from my original reproducer, while
> the underlying linked-scalar pruning issue remains.
> 
> > Could you give backporting the full "bpf: track find_equal_scalars history on
> > per-instruction level" series[3] a try? For 6.6 it should be doable, and
> > hopefully for 6.1, too, but not too sure about earlier ones. If you prefer I
> > work on it I can also give it a try later this week.
> 
> Sure, I will prepare v3 based on that series for 6.6.y, and then work
> on the 6.1.y adaptation separately.
> 
> I tried applying the series starting from 6.1.y and still hit some
> issues that need adaptation. 5.15.y and 5.10.y appear to need more
> surrounding verifier changes, so they may be harder, but I will still
> try to work through them. If I run into anything I am unsure about, I
> will raise it earlier.

Thanks. Yeah, besides the requirement that a patch be backported to 6.6 before
the same patch will be accepted in 6.1, I personally find it much easier to
backport to a newer stable first to build understanding, before moving on to
older ones; hopefully you'll find that starting with 6.6 helps, too.

> > As for the selftest, it would need to be sent separately and by itself
> > to bpf-next, and picked up there, before it can be backported to stable.
> > I suggest you look at [4] and have your test placed similarly, and
> > mention that your test specifically tests a Rust/Aya pattern.
> 
> Thanks, I will send the selftest to bpf-next separately. I will also
> change the test to use the `r0 == 1` / `r0 != 1` shape, so it covers
> the broader linked-scalar pruning issue instead of only the original
> zero-valued case.

Actually, I had thought it better to keep the `r0 == 0` / `r0 != 0` shape,
since that seems to be the pattern the compiler produces. But now that I think
about it, using that shape in bpf-next means that the impossible path will get
min=1 from the not-equal refinement, and thus precision won't matter.

In that case, using the `r0 == 1` / `r0 != 1` shape is indeed probably better.
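
To illustrate why the constant matters, here is a minimal sketch. It is not
the verifier's code: `struct urange` and `refine_ne()` are hypothetical names,
and the model assumes the not-equal refinement only trims a constant that sits
on a boundary of an unsigned range. Under that assumption, "!= 0" on an
unbounded scalar raises the minimum to 1, while "!= 1" leaves the range
untouched, so only the latter shape still depends on precision tracking.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a scalar's unsigned bounds. */
struct urange {
	uint64_t umin;
	uint64_t umax;
};

/*
 * Apply the fact "reg != val" to an unsigned range.
 * A range can only exclude val when val is one of its end points;
 * a value strictly inside the range cannot be expressed as a hole.
 */
static struct urange refine_ne(struct urange r, uint64_t val)
{
	assert(r.umin <= r.umax);

	if (r.umin == r.umax)
		return r;	/* single value: nothing to trim in this sketch */
	if (r.umin == val)
		r.umin++;	/* e.g. [0, max] with "!= 0" becomes [1, max] */
	else if (r.umax == val)
		r.umax--;	/* symmetric case at the upper end */
	return r;		/* val inside the range: bounds unchanged */
}
```

With an unbounded scalar `{ 0, UINT64_MAX }`, `refine_ne(r, 0)` yields umin=1,
whereas `refine_ne(r, 1)` returns the range unchanged.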

> Thanks again for the detailed explanation. I have only recently started
> digging into the verifier implementation details, so this was very helpful!
...

Happy to help!
Shung-Hsi

Thread overview: 7+ messages
2026-06-07 17:09 [PATCH stable 6.6.y v2 0/3] bpf: backport scalar not-equal tracking fixes Zhenzhong Wu
2026-06-07 17:09 ` [PATCH stable 6.6.y v2 1/3] bpf: drop knowledge-losing __reg_combine_{32,64}_into_{64,32} logic Zhenzhong Wu
2026-06-07 17:09 ` [PATCH stable 6.6.y v2 2/3] bpf: make the verifier tracks the "not equal" for regs Zhenzhong Wu
2026-06-07 17:09 ` [PATCH stable 6.6.y v2 3/3] selftests/bpf: add helper retval linked scalar pruning test Zhenzhong Wu
2026-06-08 10:11 ` [PATCH stable 6.6.y v2 0/3] bpf: backport scalar not-equal tracking fixes Shung-Hsi Yu
2026-06-10 15:46   ` Zhenzhong Wu
2026-06-11  6:47     ` Shung-Hsi Yu