public inbox for bpf@vger.kernel.org
* [PATCH v2 bpf-next 1/2] bpf: Avoid one round of bounds deduction
@ 2026-03-07  0:01 Paul Chaignon
  2026-03-07  0:03 ` [PATCH v2 bpf-next 2/2] selftests/bpf: Test case for refinement improvement using 64b bounds Paul Chaignon
  2026-03-10  1:07 ` [PATCH v2 bpf-next 1/2] bpf: Avoid one round of bounds deduction Eduard Zingerman
  0 siblings, 2 replies; 13+ messages in thread
From: Paul Chaignon @ 2026-03-07  0:01 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Eduard Zingerman, Harishankar Vishwanathan, Shung-Hsi Yu

In commit 5dbb19b16ac49 ("bpf: Add third round of bounds deduction"), I
added a new round of bounds deduction because two rounds were not enough
to converge to a fixed point. This commit slightly refactors the bounds
deduction logic such that two rounds are enough.

In [1], Eduard noticed that after we improved the refinement logic, a
third call to the bounds deduction (__reg_deduce_bounds) was needed to
converge to a fixed point. More specifically, we needed this third call
to improve the s64 range using the s32 range. We added the third call
and postponed a more detailed analysis of the refinement logic.

I've been looking into this more recently. To help, I wrote a high-level
sequence of all the refinements carried out in reg_bounds_sync. u64 ->
s32 means we used the u64 ranges to improve the s32 ranges.

    /* __update_reg_bounds: */
    tnum -> {s32, u32, s64, u64}
    /* __reg_deduce_bounds: */
    for (3 times) {
        /* __reg32_deduce_bounds: */
        u64 -> {u32, s32}
        s64 -> {u32, s32}
        u64 -> s32
        s64 -> s32
        u32 -> s32
        s32 -> u32
        /* __reg64_deduce_bounds: */
        u64 -> s64
        s64 -> u64
        u64 + s64 -> {u64, s64}
        /* __reg_deduce_mixed_bounds: */
        u32 -> u64
        u32 -> s64
        s32 + s64 -> {s64, u64, tnum}
    }
    /* __reg_bound_offset: */
    {u64, u32} -> tnum
    /* __update_reg_bounds: */
    tnum -> {s32, u32, s64, u64}

From this, we can observe that we first improve the 32-bit ranges from
the 64-bit ranges in __reg32_deduce_bounds, then improve the 64-bit
ranges on their own in __reg64_deduce_bounds. Intuitively, if we were to
improve the 64-bit ranges on their own *before* we use them to improve
the 32-bit ranges, we may reach a fixed point earlier. Put differently,
we would need to move the *64 -> *32 refinements to the beginning of
__reg_deduce_mixed_bounds. (The logic would then also better match the
function names, but that's secondary.)

After testing, this small refactoring does allow us to drop one call to
__reg_deduce_bounds. Without this refactoring, the test
"verifier_bounds/bounds deduction cross sign boundary, negative
overlap" fails when removing one call to __reg_deduce_bounds. This
patch therefore implements that bit of refactoring and reverts commit
5dbb19b16ac49 ("bpf: Add third round of bounds deduction").

In some cases, this change can even improve precision a little bit, as
illustrated in the new selftest in the next patch.

As expected, this change didn't have any impact on the number of
instructions processed when running it through the Cilium complexity
test suite [2].

Link: https://lore.kernel.org/bpf/aIKtSK9LjQXB8FLY@mail.gmail.com/ [1]
Link: https://pchaigno.github.io/test-verifier-complexity.html [2]
Acked-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
---
Changes in v2:
  - Updated description to mention potential precision improvement and
    to clarify the sequence of refinements (Shung-Hsi).
  - Added the second patch.
  - Rebased.

 kernel/bpf/verifier.c | 138 +++++++++++++++++++++---------------------
 1 file changed, 69 insertions(+), 69 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index d92cf2821657..792017d87ead 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -2450,74 +2450,6 @@ static void __update_reg_bounds(struct bpf_reg_state *reg)
 /* Uses signed min/max values to inform unsigned, and vice-versa */
 static void __reg32_deduce_bounds(struct bpf_reg_state *reg)
 {
-	/* If upper 32 bits of u64/s64 range don't change, we can use lower 32
-	 * bits to improve our u32/s32 boundaries.
-	 *
-	 * E.g., the case where we have upper 32 bits as zero ([10, 20] in
-	 * u64) is pretty trivial, it's obvious that in u32 we'll also have
-	 * [10, 20] range. But this property holds for any 64-bit range as
-	 * long as upper 32 bits in that entire range of values stay the same.
-	 *
-	 * E.g., u64 range [0x10000000A, 0x10000000F] ([4294967306, 4294967311]
-	 * in decimal) has the same upper 32 bits throughout all the values in
-	 * that range. As such, lower 32 bits form a valid [0xA, 0xF] ([10, 15])
-	 * range.
-	 *
-	 * Note also, that [0xA, 0xF] is a valid range both in u32 and in s32,
-	 * following the rules outlined below about u64/s64 correspondence
-	 * (which equally applies to u32 vs s32 correspondence). In general it
-	 * depends on actual hexadecimal values of 32-bit range. They can form
-	 * only valid u32, or only valid s32 ranges in some cases.
-	 *
-	 * So we use all these insights to derive bounds for subregisters here.
-	 */
-	if ((reg->umin_value >> 32) == (reg->umax_value >> 32)) {
-		/* u64 to u32 casting preserves validity of low 32 bits as
-		 * a range, if upper 32 bits are the same
-		 */
-		reg->u32_min_value = max_t(u32, reg->u32_min_value, (u32)reg->umin_value);
-		reg->u32_max_value = min_t(u32, reg->u32_max_value, (u32)reg->umax_value);
-
-		if ((s32)reg->umin_value <= (s32)reg->umax_value) {
-			reg->s32_min_value = max_t(s32, reg->s32_min_value, (s32)reg->umin_value);
-			reg->s32_max_value = min_t(s32, reg->s32_max_value, (s32)reg->umax_value);
-		}
-	}
-	if ((reg->smin_value >> 32) == (reg->smax_value >> 32)) {
-		/* low 32 bits should form a proper u32 range */
-		if ((u32)reg->smin_value <= (u32)reg->smax_value) {
-			reg->u32_min_value = max_t(u32, reg->u32_min_value, (u32)reg->smin_value);
-			reg->u32_max_value = min_t(u32, reg->u32_max_value, (u32)reg->smax_value);
-		}
-		/* low 32 bits should form a proper s32 range */
-		if ((s32)reg->smin_value <= (s32)reg->smax_value) {
-			reg->s32_min_value = max_t(s32, reg->s32_min_value, (s32)reg->smin_value);
-			reg->s32_max_value = min_t(s32, reg->s32_max_value, (s32)reg->smax_value);
-		}
-	}
-	/* Special case where upper bits form a small sequence of two
-	 * sequential numbers (in 32-bit unsigned space, so 0xffffffff to
-	 * 0x00000000 is also valid), while lower bits form a proper s32 range
-	 * going from negative numbers to positive numbers. E.g., let's say we
-	 * have s64 range [-1, 1] ([0xffffffffffffffff, 0x0000000000000001]).
-	 * Possible s64 values are {-1, 0, 1} ({0xffffffffffffffff,
-	 * 0x0000000000000000, 0x00000000000001}). Ignoring upper 32 bits,
-	 * we still get a valid s32 range [-1, 1] ([0xffffffff, 0x00000001]).
-	 * Note that it doesn't have to be 0xffffffff going to 0x00000000 in
-	 * upper 32 bits. As a random example, s64 range
-	 * [0xfffffff0fffffff0; 0xfffffff100000010], forms a valid s32 range
-	 * [-16, 16] ([0xfffffff0; 0x00000010]) in its 32 bit subregister.
-	 */
-	if ((u32)(reg->umin_value >> 32) + 1 == (u32)(reg->umax_value >> 32) &&
-	    (s32)reg->umin_value < 0 && (s32)reg->umax_value >= 0) {
-		reg->s32_min_value = max_t(s32, reg->s32_min_value, (s32)reg->umin_value);
-		reg->s32_max_value = min_t(s32, reg->s32_max_value, (s32)reg->umax_value);
-	}
-	if ((u32)(reg->smin_value >> 32) + 1 == (u32)(reg->smax_value >> 32) &&
-	    (s32)reg->smin_value < 0 && (s32)reg->smax_value >= 0) {
-		reg->s32_min_value = max_t(s32, reg->s32_min_value, (s32)reg->smin_value);
-		reg->s32_max_value = min_t(s32, reg->s32_max_value, (s32)reg->smax_value);
-	}
 	/* if u32 range forms a valid s32 range (due to matching sign bit),
 	 * try to learn from that
 	 */
@@ -2672,6 +2604,75 @@ static void __reg64_deduce_bounds(struct bpf_reg_state *reg)
 
 static void __reg_deduce_mixed_bounds(struct bpf_reg_state *reg)
 {
+	/* If upper 32 bits of u64/s64 range don't change, we can use lower 32
+	 * bits to improve our u32/s32 boundaries.
+	 *
+	 * E.g., the case where we have upper 32 bits as zero ([10, 20] in
+	 * u64) is pretty trivial, it's obvious that in u32 we'll also have
+	 * [10, 20] range. But this property holds for any 64-bit range as
+	 * long as upper 32 bits in that entire range of values stay the same.
+	 *
+	 * E.g., u64 range [0x10000000A, 0x10000000F] ([4294967306, 4294967311]
+	 * in decimal) has the same upper 32 bits throughout all the values in
+	 * that range. As such, lower 32 bits form a valid [0xA, 0xF] ([10, 15])
+	 * range.
+	 *
+	 * Note also, that [0xA, 0xF] is a valid range both in u32 and in s32,
+	 * following the rules outlined below about u64/s64 correspondence
+	 * (which equally applies to u32 vs s32 correspondence). In general it
+	 * depends on actual hexadecimal values of 32-bit range. They can form
+	 * only valid u32, or only valid s32 ranges in some cases.
+	 *
+	 * So we use all these insights to derive bounds for subregisters here.
+	 */
+	if ((reg->umin_value >> 32) == (reg->umax_value >> 32)) {
+		/* u64 to u32 casting preserves validity of low 32 bits as
+		 * a range, if upper 32 bits are the same
+		 */
+		reg->u32_min_value = max_t(u32, reg->u32_min_value, (u32)reg->umin_value);
+		reg->u32_max_value = min_t(u32, reg->u32_max_value, (u32)reg->umax_value);
+
+		if ((s32)reg->umin_value <= (s32)reg->umax_value) {
+			reg->s32_min_value = max_t(s32, reg->s32_min_value, (s32)reg->umin_value);
+			reg->s32_max_value = min_t(s32, reg->s32_max_value, (s32)reg->umax_value);
+		}
+	}
+	if ((reg->smin_value >> 32) == (reg->smax_value >> 32)) {
+		/* low 32 bits should form a proper u32 range */
+		if ((u32)reg->smin_value <= (u32)reg->smax_value) {
+			reg->u32_min_value = max_t(u32, reg->u32_min_value, (u32)reg->smin_value);
+			reg->u32_max_value = min_t(u32, reg->u32_max_value, (u32)reg->smax_value);
+		}
+		/* low 32 bits should form a proper s32 range */
+		if ((s32)reg->smin_value <= (s32)reg->smax_value) {
+			reg->s32_min_value = max_t(s32, reg->s32_min_value, (s32)reg->smin_value);
+			reg->s32_max_value = min_t(s32, reg->s32_max_value, (s32)reg->smax_value);
+		}
+	}
+	/* Special case where upper bits form a small sequence of two
+	 * sequential numbers (in 32-bit unsigned space, so 0xffffffff to
+	 * 0x00000000 is also valid), while lower bits form a proper s32 range
+	 * going from negative numbers to positive numbers. E.g., let's say we
+	 * have s64 range [-1, 1] ([0xffffffffffffffff, 0x0000000000000001]).
+	 * Possible s64 values are {-1, 0, 1} ({0xffffffffffffffff,
+	 * 0x0000000000000000, 0x00000000000001}). Ignoring upper 32 bits,
+	 * we still get a valid s32 range [-1, 1] ([0xffffffff, 0x00000001]).
+	 * Note that it doesn't have to be 0xffffffff going to 0x00000000 in
+	 * upper 32 bits. As a random example, s64 range
+	 * [0xfffffff0fffffff0; 0xfffffff100000010], forms a valid s32 range
+	 * [-16, 16] ([0xfffffff0; 0x00000010]) in its 32 bit subregister.
+	 */
+	if ((u32)(reg->umin_value >> 32) + 1 == (u32)(reg->umax_value >> 32) &&
+	    (s32)reg->umin_value < 0 && (s32)reg->umax_value >= 0) {
+		reg->s32_min_value = max_t(s32, reg->s32_min_value, (s32)reg->umin_value);
+		reg->s32_max_value = min_t(s32, reg->s32_max_value, (s32)reg->umax_value);
+	}
+	if ((u32)(reg->smin_value >> 32) + 1 == (u32)(reg->smax_value >> 32) &&
+	    (s32)reg->smin_value < 0 && (s32)reg->smax_value >= 0) {
+		reg->s32_min_value = max_t(s32, reg->s32_min_value, (s32)reg->smin_value);
+		reg->s32_max_value = min_t(s32, reg->s32_max_value, (s32)reg->smax_value);
+	}
+
 	/* Try to tighten 64-bit bounds from 32-bit knowledge, using 32-bit
 	 * values on both sides of 64-bit range in hope to have tighter range.
 	 * E.g., if r1 is [0x1'00000000, 0x3'80000000], and we learn from
@@ -2764,7 +2765,6 @@ static void reg_bounds_sync(struct bpf_reg_state *reg)
 	/* We might have learned something about the sign bit. */
 	__reg_deduce_bounds(reg);
 	__reg_deduce_bounds(reg);
-	__reg_deduce_bounds(reg);
 	/* We might have learned some bits from the bounds. */
 	__reg_bound_offset(reg);
 	/* Intersecting with the old var_off might have improved our bounds
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v2 bpf-next 2/2] selftests/bpf: Test case for refinement improvement using 64b bounds
  2026-03-07  0:01 [PATCH v2 bpf-next 1/2] bpf: Avoid one round of bounds deduction Paul Chaignon
@ 2026-03-07  0:03 ` Paul Chaignon
  2026-03-10  1:07 ` [PATCH v2 bpf-next 1/2] bpf: Avoid one round of bounds deduction Eduard Zingerman
  1 sibling, 0 replies; 13+ messages in thread
From: Paul Chaignon @ 2026-03-07  0:03 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Eduard Zingerman, Harishankar Vishwanathan, Shung-Hsi Yu

This new selftest demonstrates the improvement of bounds refinement from
the previous patch. It is inspired by a set of reg_bounds_sync inputs
generated using CBMC [1] by Shung-Hsi:

  reg.smin_value=0x8000000000000002
  reg.smax_value=2
  reg.umin_value=2
  reg.umax_value=19
  reg.s32_min_value=2
  reg.s32_max_value=3
  reg.u32_min_value=2
  reg.u32_max_value=3

reg_bounds_sync returns R=[2; 3] without the previous patch, and R=2
with it. __reg64_deduce_bounds is able to derive that u64=2, but before
the previous patch, those bounds are overwritten in
__reg_deduce_mixed_bounds using the 32-bit bounds.

To arrive at these reg_bounds_sync inputs, we first bound the 32-bit
value to [2; 3]. We can then upper-bound s64 without impacting u64. At
that point, the refinement to u64=2 doesn't happen because the ranges
still overlap in two places:

  0  umin=2                             umax=0xff..ff00..03   U64_MAX
  |  [xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx]      |
  |----------------------------|------------------------------|
  |xx]                           [xxxxxxxxxxxxxxxxxxxxxxxxxxxx|
  0  smax=2                      smin=0x800..02               -1

With an upper-bound check at value 19, we can reach the above inputs for
reg_bounds_sync. At that point, the refinement to u64=2 happens and,
because it isn't overwritten by __reg_deduce_mixed_bounds anymore,
reg_bounds_sync returns with R=2.

The test validates this result by including an illegal instruction in
the (dead) r0 != 2 branch.

Link: https://github.com/shunghsiyu/reg_bounds_sync-review/ [1]
Co-developed-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Signed-off-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
---
 .../selftests/bpf/progs/verifier_bounds.c     | 33 +++++++++++++++++++
 1 file changed, 33 insertions(+)

diff --git a/tools/testing/selftests/bpf/progs/verifier_bounds.c b/tools/testing/selftests/bpf/progs/verifier_bounds.c
index 4588e5b55b41..fd377d3a27d4 100644
--- a/tools/testing/selftests/bpf/progs/verifier_bounds.c
+++ b/tools/testing/selftests/bpf/progs/verifier_bounds.c
@@ -2000,4 +2000,37 @@ __naked void bounds_refinement_multiple_overlaps(void *ctx)
 	: __clobber_all);
 }
 
+/* After instruction 3, the u64 and s64 ranges look as follows:
+ * 0  umin=2                             umax=0xff..ff00..03   U64_MAX
+ * |  [xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx]      |
+ * |----------------------------|------------------------------|
+ * |xx]                           [xxxxxxxxxxxxxxxxxxxxxxxxxxxx|
+ * 0  smax=2                      smin=0x800..02               -1
+ *
+ * The two ranges can't be refined because they overlap in two places. Once we
+ * add an upper bound to u64 at instruction 4, the refinement can happen. This
+ * test validates that this refinement does happen and is not overwritten by
+ * the less-precise 32-bit ranges.
+ */
+SEC("socket")
+__description("bounds refinement: 64bits ranges not overwritten by 32bits ranges")
+__msg("3: (65) if r0 s> 0x2 {{.*}} R0=scalar(smin=0x8000000000000002,smax=2,umin=smin32=umin32=2,umax=0xffffffff00000003,smax32=umax32=3")
+__msg("4: (25) if r0 > 0x13 {{.*}} R0=2")
+__success __log_level(2)
+__naked void bounds_refinement_64bits_not_overwritten(void *ctx)
+{
+	asm volatile("			\
+	call %[bpf_get_prandom_u32];	\
+	if w0 < 2 goto +5;		\
+	if w0 > 3 goto +4;		\
+	if r0 s> 2 goto +3;		\
+	if r0 > 19 goto +2;		\
+	if r0 == 2 goto +1;		\
+	r10 = 0;			\
+	exit;				\
+"	:
+	: __imm(bpf_get_prandom_u32)
+	: __clobber_all);
+}
+
 char _license[] SEC("license") = "GPL";
-- 
2.43.0



* Re: [PATCH v2 bpf-next 1/2] bpf: Avoid one round of bounds deduction
  2026-03-07  0:01 [PATCH v2 bpf-next 1/2] bpf: Avoid one round of bounds deduction Paul Chaignon
  2026-03-07  0:03 ` [PATCH v2 bpf-next 2/2] selftests/bpf: Test case for refinement improvement using 64b bounds Paul Chaignon
@ 2026-03-10  1:07 ` Eduard Zingerman
  2026-03-10  1:30   ` Eduard Zingerman
  1 sibling, 1 reply; 13+ messages in thread
From: Eduard Zingerman @ 2026-03-10  1:07 UTC (permalink / raw)
  To: Paul Chaignon, bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Harishankar Vishwanathan, Shung-Hsi Yu

On Sat, 2026-03-07 at 01:01 +0100, Paul Chaignon wrote:
> In commit 5dbb19b16ac49 ("bpf: Add third round of bounds deduction"), I
> added a new round of bounds deduction because two rounds were not enough
> to converge to a fixed point. This commit slightly refactors the bounds
> deduction logic such that two rounds are enough.
>
> In [1], Eduard noticed that after we improved the refinement logic, a
> third call to the bounds deduction (__reg_deduce_bounds) was needed to
> converge to a fixed point. More specifically, we needed this third call
> to improve the s64 range using the s32 range. We added the third call
> and postponed a more detailed analysis of the refinement logic.
>
> I've been looking into this more recently. To help, I wrote a high level
> sequence of all the refinements carried out in reg_bounds_sync. u64 ->
> s32 means we used the u64 ranges to improve the s32 ranges.
>
>     /* __update_reg_bounds: */
>     tnum -> {s32, u32, s64, u64}
>     /* __reg_deduce_bounds: */
>     for (3 times) {
>         /* __reg32_deduce_bounds: */
>         u64 -> {u32, s32}
>         s64 -> {u32, s32}
>         u64 -> s32
>         s64 -> s32
>         u32 -> s32
>         s32 -> u32
>         /* __reg64_deduce_bounds: */
>         u64 -> s64
>         s64 -> u64
>         u64 + s64 -> {u64, s64}
>         /* __reg_deduce_mixed_bounds: */
>         u32 -> u64
>         u32 -> s64
>         s32 + s64 -> {s64, u64, tnum}
>     }
>     /* __reg_bound_offset: */
>     {u64, u32} -> tnum
>     /* __update_reg_bounds: */
>     tnum -> {s32, u32, s64, u64}
>
> From this, we can observe that we first improve the 32bit ranges from
> the 64bit ranges in __reg32_deduce_bounds, then improve the 64bit
> ranges on their own in __reg64_deduce_bounds. Intuitively, if we were to
> improve the 64bit ranges on their own *before* we use them to improve
> the 32bit ranges, we may reach a fixed point earlier. Said otherwise, we
> would need to move the *64 -> *32 refinements to the beginning of
> __reg_deduce_mixed_bounds. (The logic would then also better match the
> function names, but that's secondary.)
>
> After testing, this small refactoring does allow us to lose one call to
> __reg_deduce_bounds. Without this refactoring, the test
> "verifier_bounds/bounds deduction cross sign boundary, negative
> overlap" fails when removing one call to __reg_deduce_bounds. This
> patch therefore implements that bit of refactoring and reverts commit
> 5dbb19b16ac49 ("bpf: Add third round of bounds deduction").
>
> In some cases, this change can even improve precision a little bit, as
> illustrated in the new selftest in the next patch.
>
> As expected, this change didn't have any impact on the number of
> instructions processed when running it through the Cilium complexity
> test suite [2].
>
> Link: https://lore.kernel.org/bpf/aIKtSK9LjQXB8FLY@mail.gmail.com/ [1]
> Link: https://pchaigno.github.io/test-verifier-complexity.html [2]
> Acked-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
> Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>

I wish we had better names for the functions in question, e.g. like in [1].

Since cbmc is all the rage now, I tried to check whether the new and old
definitions of the reg_bounds_sync() function are equivalent.
See [2], where I copied the relevant definitions (except for the tnum
part) and added a simple test to see if the old and new functions
compute the same result. Running this test seems to show some
differences, e.g.:

  Values extracted from log:
    s32: [-1741665249,  1482110531]  v_s32=-1202410177
    u32: [ 1358815199,  3092723295]  v_u32=3092557119
    s64: [  -509161414271901871,   9223372036728946880]  v_s64=9223372035652365631
    u64: [  9223372034449554622,  17937582659561010782]  v_u64=9223372035652365631

  Running C code directly:
  input:
    s32: [0x98304c1f, 0x58573643]
    u32: [0x50fddfdf, 0xb857365f]
    s64: [0xf8ef186430fcdf51, 0x7ffffffff88000c0]
    u64: [0x7fffffff70a33cbe, 0xf8ef18643857365e]

  old:
    s32: [0x98304c1f, 0xb857365f]
    u32: [0x98304c1f, 0xb857365f]
    s64: [0x7fffffff98304c1f, 0x7fffffffb857365f]
    u64: [0x7fffffff98304c1f, 0x7fffffffb857365f]

  new:
    s32: [0x98304c1f, 0x58573643]
    u32: [0x70a33cbe, 0xb857365f]
    s64: [0x7fffffff70a33cbe, 0x7fffffffb857365f]
    u64: [0x7fffffff70a33cbe, 0x7fffffffb857365f]

Of course, I might have some bugs in the test definition, but if the
definition is correct, I'm inclined to agree with Ihor here. If we
can't mathematically prove that the sequence converges in a fixed
number of steps, putting a loop on top of it seems to be a reasonable
thing to do.

[1] https://github.com/eddyz87/bpf/tree/deduce-bounds-reshuffle
[2] https://github.com/eddyz87/deduce-bounds-verif

[...]


* Re: [PATCH v2 bpf-next 1/2] bpf: Avoid one round of bounds deduction
  2026-03-10  1:07 ` [PATCH v2 bpf-next 1/2] bpf: Avoid one round of bounds deduction Eduard Zingerman
@ 2026-03-10  1:30   ` Eduard Zingerman
  2026-03-10  5:53     ` Eduard Zingerman
  0 siblings, 1 reply; 13+ messages in thread
From: Eduard Zingerman @ 2026-03-10  1:30 UTC (permalink / raw)
  To: Paul Chaignon, bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Harishankar Vishwanathan, Shung-Hsi Yu

On Mon, 2026-03-09 at 18:07 -0700, Eduard Zingerman wrote:

[...]

> Of-course, I might have some bugs in the test definition, but in case
> the definition is correct, I think I'm inclined to agree with Ihor
> here. If we can't mathematically prove that the sequence converges in
> a fixed number of steps, putting a loop on top of it seem to be a
> reasonable thing to do.
> 
> [1] https://github.com/eddyz87/bpf/tree/deduce-bounds-reshuffle
> [2] https://github.com/eddyz87/deduce-bounds-verif
> 
> [...]

Pushed an update to the test to check if new bounds are tighter than
old bounds (instead of equivalent). Still get a failure log:

$ ./parse-log.sh out2.log
Values extracted from log:
  s32: [-1686110209,  1907957760]  v_s32=-1411416065
  u32: [ 1746894847,  4030210048]  v_u32=2883551231
  s64: [ -4350478693995380738,   2172132303447261182]  v_s64=-4350478693150261249
  u64: [  2172132302103053322,  14096265381705949184]  v_u64=14096265380559290367

Running C code directly:
input:
  s32: [0x9b7fffff, 0x71b92000] [-1686110209, 1907957760]
  u32: [0x681f7fff, 0xf0382000] [1746894847, 4030210048]
  s64: [0xc39ffead797ffffe, 0x1e24f6d2501ffffe] [-4350478693995380738, 2172132303447261182]
  u64: [0x1e24f6d20001040a, 0xc39ffeadf0382000] [2172132302103053322, 14096265381705949184]

old:
  s32: [0x9b7fffff, 0xf0382000] [-1686110209, -264757248]
  u32: [0x9b7fffff, 0xf0382000] [2608857087, 4030210048]
  s64: [0xc39ffead9b7fffff, 0xc39ffeadf0382000] [-4350478693424955393, -4350478692003602432]
  u64: [0xc39ffead9b7fffff, 0xc39ffeadf0382000] [14096265380284596223, 14096265381705949184]

new:
  s32: [0x9b7fffff, 0x71b92000] [-1686110209, 1907957760]
  u32: [0x797ffffe, 0xf0382000] [2038431742, 4030210048]
  s64: [0xc39ffead797ffffe, 0xc39ffeadf0382000] [-4350478693995380738, -4350478692003602432]
  u64: [0xc39ffead797ffffe, 0xc39ffeadf0382000] [14096265379714170878, 14096265381705949184]


* Re: [PATCH v2 bpf-next 1/2] bpf: Avoid one round of bounds deduction
  2026-03-10  1:30   ` Eduard Zingerman
@ 2026-03-10  5:53     ` Eduard Zingerman
  2026-03-10  7:56       ` Shung-Hsi Yu
  0 siblings, 1 reply; 13+ messages in thread
From: Eduard Zingerman @ 2026-03-10  5:53 UTC (permalink / raw)
  To: Paul Chaignon, bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Harishankar Vishwanathan, Shung-Hsi Yu

On Mon, 2026-03-09 at 18:30 -0700, Eduard Zingerman wrote:
> On Mon, 2026-03-09 at 18:07 -0700, Eduard Zingerman wrote:
> 
> [...]
> 
> > Of-course, I might have some bugs in the test definition, but in case
> > the definition is correct, I think I'm inclined to agree with Ihor
> > here. If we can't mathematically prove that the sequence converges in
> > a fixed number of steps, putting a loop on top of it seem to be a
> > reasonable thing to do.
> > 
> > [1] https://github.com/eddyz87/bpf/tree/deduce-bounds-reshuffle
> > [2] https://github.com/eddyz87/deduce-bounds-verif
> > 
> > [...]
> 
> Pushed an update to the test to check if new bounds are tighter than
> old bounds (instead of equivalent). Still get a failure log:
> 
> $ ./parse-log.sh out2.log
> Values extracted from log:
>   s32: [-1686110209,  1907957760]  v_s32=-1411416065
>   u32: [ 1746894847,  4030210048]  v_u32=2883551231
>   s64: [ -4350478693995380738,   2172132303447261182]  v_s64=-4350478693150261249
>   u64: [  2172132302103053322,  14096265381705949184]  v_u64=14096265380559290367
> 
> Running C code directly:
> input:
>   s32: [0x9b7fffff, 0x71b92000] [-1686110209, 1907957760]
>   u32: [0x681f7fff, 0xf0382000] [1746894847, 4030210048]
>   s64: [0xc39ffead797ffffe, 0x1e24f6d2501ffffe] [-4350478693995380738, 2172132303447261182]
>   u64: [0x1e24f6d20001040a, 0xc39ffeadf0382000] [2172132302103053322, 14096265381705949184]
> 
> old:
>   s32: [0x9b7fffff, 0xf0382000] [-1686110209, -264757248]
>   u32: [0x9b7fffff, 0xf0382000] [2608857087, 4030210048]
>   s64: [0xc39ffead9b7fffff, 0xc39ffeadf0382000] [-4350478693424955393, -4350478692003602432]
>   u64: [0xc39ffead9b7fffff, 0xc39ffeadf0382000] [14096265380284596223, 14096265381705949184]
> 
> new:
>   s32: [0x9b7fffff, 0x71b92000] [-1686110209, 1907957760]
>   u32: [0x797ffffe, 0xf0382000] [2038431742, 4030210048]
>   s64: [0xc39ffead797ffffe, 0xc39ffeadf0382000] [-4350478693995380738, -4350478692003602432]
>   u64: [0xc39ffead797ffffe, 0xc39ffeadf0382000] [14096265379714170878, 14096265381705949184]

I added back third round of __reg_deduce_bounds():

  diff --git a/deduce_bounds_new.c b/deduce_bounds_new.c
  index b0bd8c8..dc498b2 100644
  --- a/deduce_bounds_new.c
  +++ b/deduce_bounds_new.c
  @@ -326,4 +326,5 @@ void reg_bounds_sync_new(struct bpf_reg_state *reg)
   {
          __reg_deduce_bounds(reg);
          __reg_deduce_bounds(reg);
  +       __reg_deduce_bounds(reg);
   }

And run cbmc with the following assertions (see [1]):

	assert(a.smin_value == b.smin_value);
	assert(a.smax_value == b.smax_value);
	assert(a.umin_value == b.umin_value);
	assert(a.umax_value == b.umax_value);
	
	assert(a.s32_min_value == b.s32_min_value);
	assert(a.s32_max_value == b.s32_max_value);
	assert(a.u32_min_value == b.u32_min_value);
	assert(a.u32_max_value == b.u32_max_value);

This failed quickly with an example, showing different results for old
and new orderings, new being tighter.

And then with the following assertion:

	bool old_better = a.smin_value < b.smin_value ||
			  a.smax_value > b.smax_value ||
			  a.umin_value < b.umin_value ||
			  a.umax_value > b.umax_value ||
			  a.s32_min_value < b.s32_min_value ||
			  a.s32_max_value > b.s32_max_value ||
			  a.u32_min_value < b.u32_min_value ||
			  a.u32_max_value > b.u32_max_value;
	assert(!old_better);

This finished in 20 minutes claiming successful verification.
Hence, I conclude that the new ordering is strictly better than the old
ordering if there are three rounds of deductions.

I suggest we should keep three rounds of deductions and adopt the new
ordering (or play with orderings some more to choose which one is
better under 3 rounds).

[1] https://github.com/eddyz87/deduce-bounds-verif/tree/experiment


* Re: [PATCH v2 bpf-next 1/2] bpf: Avoid one round of bounds deduction
  2026-03-10  5:53     ` Eduard Zingerman
@ 2026-03-10  7:56       ` Shung-Hsi Yu
  2026-03-10 19:45         ` Eduard Zingerman
  0 siblings, 1 reply; 13+ messages in thread
From: Shung-Hsi Yu @ 2026-03-10  7:56 UTC (permalink / raw)
  To: Paul Chaignon
  Cc: Eduard Zingerman, bpf, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Harishankar Vishwanathan

On Mon, Mar 09, 2026 at 10:53:50PM -0700, Eduard Zingerman wrote:
> On Mon, 2026-03-09 at 18:30 -0700, Eduard Zingerman wrote:
> > On Mon, 2026-03-09 at 18:07 -0700, Eduard Zingerman wrote:
> > [...]
> > 
> > > Of-course, I might have some bugs in the test definition, but in case
> > > the definition is correct, I think I'm inclined to agree with Ihor
> > > here. If we can't mathematically prove that the sequence converges in
> > > a fixed number of steps, putting a loop on top of it seem to be a
> > > reasonable thing to do.

Got some evidence that 3 may just be the magic number (of course,
assuming no bugs in the test definition, as well as the
representativeness caveat; I will try to further cross-check mine with
[1][2]).

More below.

> > > [1] https://github.com/eddyz87/bpf/tree/deduce-bounds-reshuffle
> > > [2] https://github.com/eddyz87/deduce-bounds-verif
> > > 
> > > [...]
> > 
> > Pushed an update to the test to check if new bounds are tighter than
> > old bounds (instead of equivalent). Still get a failure log:
> > 
[...]
> 
> I added back third round of __reg_deduce_bounds():
> 
>   diff --git a/deduce_bounds_new.c b/deduce_bounds_new.c
>   index b0bd8c8..dc498b2 100644
>   --- a/deduce_bounds_new.c
>   +++ b/deduce_bounds_new.c
>   @@ -326,4 +326,5 @@ void reg_bounds_sync_new(struct bpf_reg_state *reg)
>    {
>           __reg_deduce_bounds(reg);
>           __reg_deduce_bounds(reg);
>   +       __reg_deduce_bounds(reg);
>    }
> 
> And run cbmc with the following assertions (see [1]):
> 
> 	assert(a.smin_value == b.smin_value);
> 	assert(a.smax_value == b.smax_value);
> 	assert(a.umin_value == b.umin_value);
> 	assert(a.umax_value == b.umax_value);
> 	
> 	assert(a.s32_min_value == b.s32_min_value);
> 	assert(a.s32_max_value == b.s32_max_value);
> 	assert(a.u32_min_value == b.u32_min_value);
> 	assert(a.u32_max_value == b.u32_max_value);
> 
> This failed quickly with an example, showing different results for old
> and new orderings, new being tighter.
> 
> And then with the following assertion:
> 
> 	bool old_better = a.smin_value < b.smin_value ||
> 			  a.smax_value > b.smax_value ||
> 			  a.umin_value < b.umin_value ||
> 			  a.umax_value > b.umax_value ||
> 			  a.s32_min_value < b.s32_min_value ||
> 			  a.s32_max_value > b.s32_max_value ||
> 			  a.u32_min_value < b.u32_min_value ||
> 			  a.u32_max_value > b.u32_max_value;
> 	assert(!old_better);
> 
> This finished in 20 minutes claiming successful verification.
> Hence, I conclude that new ordering is strictly better than the old
> ordering if there are three rounds of deductions.
> 
> I suggest we should keep three rounds of deductions and adopt the new
> ordering (or play with orderings some more to choose which one is
> better under 3 rounds).

Agree that perhaps we still need to keep 3 rounds of deduction after
poking at this further.

I tried to run an updated cbmc script[a], manually editing the file to
change the number of __reg_deduce_bounds calls made, and check whether
an extra call would further change reg_state. For
__reg_deduce_bounds_old, it needs 3 calls to pass verification, and for
__reg_deduce_bounds_new, it still needs 3.

Gist below:

  static void reg_bounds_sync_new_loop(struct bpf_reg_state *current) {
  	struct bpf_reg_state next;
  	int i;
  	__update_reg_bounds(current);
  	/* Taking any call away from below makes verification fail */
  	__reg_deduce_bounds_new(current);
  	__reg_deduce_bounds_new(current);
  	__reg_deduce_bounds_new(current);
  	/* Try to see if an extra round of __reg_deduce_bounds_new further
  	 * changes anything.
  	 */
  	{
  		next = *current;
  		__reg_deduce_bounds_new(&next);
          /* check equivalence */
  		assert(next.var_off.value == current->var_off.value);
  		...
  	}
  	__reg_bound_offset(current);
  	__update_reg_bounds(current);
  }

At a quick glance, 2 rounds of __reg_deduce_bounds_new do seem to come
closer to a fixed point than 2 rounds of __reg_deduce_bounds_old.
However, for the following case it fails to converge in 2 rounds.
Counter-example generated with [b].

Input:

  actual=8
  reg.var_off.value=0
  reg.var_off.mask=0xfffffffffffffff8
  reg.umin_value=1
  reg.umax_value=0xffffffffffffffe9
  reg.smin_value=-23
  reg.smax_value=24
  reg.u32_min_value=8
  reg.u32_max_value=0x80000008
  reg.s32_min_value=-24
  reg.s32_max_value=23
  
After 2 rounds of __reg_deduce_bounds_new:

  current.var_off.value=0
  current.var_off.mask=24
  current.umin_value=8
  current.umax_value=23
  current.smin_value=8
  current.smax_value=23
  current.u32_min_value=8
  current.u32_max_value=24
  current.s32_min_value=8
  current.s32_max_value=23

After 3 rounds of __reg_deduce_bounds_new:
  
  next.var_off.value=0
  next.var_off.mask=24
  next.umin_value=8
  next.umax_value=23
  next.smin_value=8
  next.smax_value=23
  next.u32_min_value=8
  next.u32_max_value=23 <--- different
  next.s32_min_value=8
  next.s32_max_value=23

a: https://github.com/shunghsiyu/reg_bounds_sync-review/tree/reg_deduce_bounds-convergence
b: https://github.com/shunghsiyu/reg_bounds_sync-review/blob/reg_deduce_bounds-convergence/reg_deduce_bounds_new-counter-example.c

> [1] https://github.com/eddyz87/deduce-bounds-verif/tree/experiment

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 bpf-next 1/2] bpf: Avoid one round of bounds deduction
  2026-03-10  7:56       ` Shung-Hsi Yu
@ 2026-03-10 19:45         ` Eduard Zingerman
  2026-03-12 18:35           ` Paul Chaignon
  0 siblings, 1 reply; 13+ messages in thread
From: Eduard Zingerman @ 2026-03-10 19:45 UTC (permalink / raw)
  To: Shung-Hsi Yu, Paul Chaignon
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Harishankar Vishwanathan

On Tue, 2026-03-10 at 15:56 +0800, Shung-Hsi Yu wrote:

[...]

> Agree that perhaps we still need to keep 3 rounds of deduction after
> poking at this further.
> 
> I tried to run an updated cbmc script[a], manually editing the file to
> change the number of __reg_deduce_bounds calls made, and check whether
> an extra call would further change reg_state. For
> __reg_deduce_bounds_old, it needs 3 calls to pass verification, and for
> __reg_deduce_bounds_new, it still needs 3.

Using the same logic I played with orderings a bit:
- For the new ordering, 3 and 4 deduction rounds are proven to be identical.
- For the ordering in [1], 2 and 3 deduction rounds are proven to be identical,
  1 round is not enough.

So, I think [2] is the way to go:

static void __reg_deduce_bounds(struct bpf_reg_state *reg)
{
        deduce_bounds_64_from_64(reg);
        deduce_bounds_32_from_64(reg);
        deduce_bounds_32_from_32(reg);
        deduce_bounds_64_from_32(reg);
}

With __reg_deduce_bounds() done 2 times.

[1] https://github.com/eddyz87/deduce-bounds-verif/blob/76755f763f9282b74ca4e2251f83767502a98e5e/deduce_bounds_new.c#L321
[2] https://github.com/eddyz87/bpf/tree/deduce-bounds-reshuffle

[...]


* Re: [PATCH v2 bpf-next 1/2] bpf: Avoid one round of bounds deduction
  2026-03-10 19:45         ` Eduard Zingerman
@ 2026-03-12 18:35           ` Paul Chaignon
  2026-03-13  2:17             ` Shung-Hsi Yu
  0 siblings, 1 reply; 13+ messages in thread
From: Paul Chaignon @ 2026-03-12 18:35 UTC (permalink / raw)
  To: Eduard Zingerman, Shung-Hsi Yu
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Harishankar Vishwanathan

On Tue, Mar 10, 2026 at 12:45:39PM -0700, Eduard Zingerman wrote:
> On Tue, 2026-03-10 at 15:56 +0800, Shung-Hsi Yu wrote:
> 
> [...]
> 
> > Agree that perhaps we still need to keep 3 rounds of deduction after
> > poking at this further.
> > 
> > I tried to run an updated cbmc script[a], manually editing the file to
> > change the number of __reg_deduce_bounds calls made, and check whether
> > an extra call would further change reg_state. For
> > __reg_deduce_bounds_old, it needs 3 calls to pass verification, and for
> > __reg_deduce_bounds_new, it still needs 3.
> 
> Using the same logic I played with orderings a bit:
> - For the new ordering, 3 and 4 deduction rounds are proven to be identical.
> - For the ordering in [1], 2 and 3 deduction rounds are proven to be identical,

Nice!

>   1 round is not enough.
> 
> So, I think [2] is the way to go:
> 
> static void __reg_deduce_bounds(struct bpf_reg_state *reg)
> {
>         deduce_bounds_64_from_64(reg);
>         deduce_bounds_32_from_64(reg);
>         deduce_bounds_32_from_32(reg);
>         deduce_bounds_64_from_32(reg);
> }
> 
> With __reg_deduce_bounds() done 2 times.

Thanks Eduard and Shung-Hsi for the amazing reviews and contributions!
I'll send a v3 with:
1. The renaming commit from Eduard.
2. The initial reshuffle + additional reshuffle from [1].
3. The removal of one __reg_deduce_bounds.
4. The selftest and I'll see if I can craft a second selftest from the
   new inputs you shared Shung-Hsi.

> 
> [1] https://github.com/eddyz87/deduce-bounds-verif/blob/76755f763f9282b74ca4e2251f83767502a98e5e/deduce_bounds_new.c#L321
> [2] https://github.com/eddyz87/bpf/tree/deduce-bounds-reshuffle
> 
> [...]


* Re: [PATCH v2 bpf-next 1/2] bpf: Avoid one round of bounds deduction
  2026-03-12 18:35           ` Paul Chaignon
@ 2026-03-13  2:17             ` Shung-Hsi Yu
  2026-03-13  4:54               ` Eduard Zingerman
  2026-03-13 10:45               ` Paul Chaignon
  0 siblings, 2 replies; 13+ messages in thread
From: Shung-Hsi Yu @ 2026-03-13  2:17 UTC (permalink / raw)
  To: Paul Chaignon
  Cc: Eduard Zingerman, bpf, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Harishankar Vishwanathan

On Thu, Mar 12, 2026 at 07:35:15PM +0100, Paul Chaignon wrote:
> On Tue, Mar 10, 2026 at 12:45:39PM -0700, Eduard Zingerman wrote:
> > On Tue, 2026-03-10 at 15:56 +0800, Shung-Hsi Yu wrote:
> > 
> > [...]
> > 
> > > Agree that perhaps we still need to keep 3 rounds of deduction after
> > > poking at this further.
> > > 
> > > I tried to run an updated cbmc script[a], manually editing the file to
> > > change the number of __reg_deduce_bounds calls made, and check whether
> > > an extra call would further change reg_state. For
> > > __reg_deduce_bounds_old, it needs 3 calls to pass verification, and for
> > > __reg_deduce_bounds_new, it still needs 3.
> > 
> > Using the same logic I played with orderings a bit:
> > - For the new ordering, 3 and 4 deduction rounds are proven to be identical.
> > - For the ordering in [1], 2 and 3 deduction rounds are proven to be identical,
> 
> Nice!

\o/

> >   1 round is not enough.
> > 
> > So, I think [2] is the way to go:
> > 
> > static void __reg_deduce_bounds(struct bpf_reg_state *reg)
> > {
> >         deduce_bounds_64_from_64(reg);
> >         deduce_bounds_32_from_64(reg);
> >         deduce_bounds_32_from_32(reg);
> >         deduce_bounds_64_from_32(reg);
> > }
> > 
> > With __reg_deduce_bounds() done 2 times.

Can confirm that's the case on my side as well.

> Thanks Eduard and Shung-Hsi for the amazing reviews and contributions!
> I'll send a v3 with:
> 1. The renaming commit from Eduard.
> 2. The initial reshuffle + additional reshuffle from [1].
> 3. The removal of one __reg_deduce_bounds.
> 4. The selftest and I'll see if I can craft a second selftest from the
>    new inputs you shared Shung-Hsi.

Perhaps I can try to see if I can come up with a better example, from
the difference between initial reshuffle vs additional reshuffle.

The problem is that I'm operating on a simplified assumption of "not
being able to converge to a fixed point is bad because it could lead to
a bounds invariant violation". So taking this chance to ask, does that
all tie back to the is_scalar_branch_taken and reg_set_min_max (or was
it regs_refine_cond_op) difference?

Also, when we fail to reach a fixed point, it does not become a problem
immediately, but only becomes an invariant violation somewhere down the
line when we again do bounds deduction?

I haven't actually looked into an invariant violation myself, so
details are rather hazy.

> > [1] https://github.com/eddyz87/deduce-bounds-verif/blob/76755f763f9282b74ca4e2251f83767502a98e5e/deduce_bounds_new.c#L321
> > [2] https://github.com/eddyz87/bpf/tree/deduce-bounds-reshuffle
> > 
> > [...]


* Re: [PATCH v2 bpf-next 1/2] bpf: Avoid one round of bounds deduction
  2026-03-13  2:17             ` Shung-Hsi Yu
@ 2026-03-13  4:54               ` Eduard Zingerman
  2026-03-17  5:52                 ` Shung-Hsi Yu
  2026-03-13 10:45               ` Paul Chaignon
  1 sibling, 1 reply; 13+ messages in thread
From: Eduard Zingerman @ 2026-03-13  4:54 UTC (permalink / raw)
  To: Shung-Hsi Yu, Paul Chaignon
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Harishankar Vishwanathan

On Fri, 2026-03-13 at 10:17 +0800, Shung-Hsi Yu wrote:

[...]

> The problem is that I'm operating on a simplified assumption of "not
> being able to converge to a fixed point is bad because it could lead
> to a bounds invariant violation". So taking this chance to ask, does
> that all tie back to the is_scalar_branch_taken and reg_set_min_max
> (or was it regs_refine_cond_op) difference?
> 
> Also, when we fail to reach a fixed point, it does not become a
> problem immediately, but only becomes an invariant violation somewhere
> down the line when we again do bounds deduction?
> 
> I haven't actually looked into an invariant violation myself, so
> details are rather hazy.

I don't think failing to reach fixed point causes invariant violation.
The root cause for invariant violation is that decisions in
is_scalar_branch_taken() for e.g. >=, <= are taken based on ranges,
but additional information in tnum is not taken into account.
Later reg_set_min_max() will call reg_bounds_sync() and it might turn
out that updated ranges do not intersect with tnums.
The inverse might happen for tnum based decisions in
is_scalar_branch_taken().

So, if we don't reach fixed point in reg_bounds_sync() we just leave
some precision on the table w/o other negative consequences,
as far as I understand.

One solution for invariants violation looks as follows:
- Adjust is_scalar_branch_taken():
  - Do not make predictions directly, instead apply range and tnum
    changes corresponding to the operation (e.g. for true branch of
    the `rX <= 0` operation intersect `rX` range with [-∞, 0]).
  - Call reg_set_min_max() and check if resulting registers are in a
    valid state. If they are for some branch but not another a
    prediction can be made.
- Adjust check_cond_jmp_op() to use registers computed by
  is_scalar_branch_taken() for true and false verifier states.
> - Adjust reg_bounds_sync() to consistently bail out if an invariant
>   violation is encountered. I think it has a few places with an
>   implicit assumption that ranges/tnums must overlap.

I have a stale branch with these changes, need to pull my act together
and finish with those changes.


* Re: [PATCH v2 bpf-next 1/2] bpf: Avoid one round of bounds deduction
  2026-03-13  2:17             ` Shung-Hsi Yu
  2026-03-13  4:54               ` Eduard Zingerman
@ 2026-03-13 10:45               ` Paul Chaignon
  2026-03-17  6:03                 ` Shung-Hsi Yu
  1 sibling, 1 reply; 13+ messages in thread
From: Paul Chaignon @ 2026-03-13 10:45 UTC (permalink / raw)
  To: Shung-Hsi Yu
  Cc: Eduard Zingerman, bpf, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Harishankar Vishwanathan

On Fri, Mar 13, 2026 at 10:17:52AM +0800, Shung-Hsi Yu wrote:
> On Thu, Mar 12, 2026 at 07:35:15PM +0100, Paul Chaignon wrote:
> > On Tue, Mar 10, 2026 at 12:45:39PM -0700, Eduard Zingerman wrote:
> > > On Tue, 2026-03-10 at 15:56 +0800, Shung-Hsi Yu wrote:

[...]

> > Thanks Eduard and Shung-Hsi for the amazing reviews and contributions!
> > I'll send a v3 with:
> > 1. The renaming commit from Eduard.
> > 2. The initial reshuffle + additional reshuffle from [1].
> > 3. The removal of one __reg_deduce_bounds.
> > 4. The selftest and I'll see if I can craft a second selftest from the
> >    new inputs you shared Shung-Hsi.
> 
> Perhaps I can try to see if I can come up with a better example, from
> the difference between initial reshuffle vs additional reshuffle.

If you have another example, that'd be great. I tried to convert the
inputs you gave into a selftest, but I'm starting to suspect it's not
possible to reach those inputs.

I initially tried manually. For the first example, the trick was to
write all (or most) jumps needed for the different bounds and then find
the ordering that puts us in the exact case generated with CBMC. After
failing to find that manually, I tried the brute force approach:
generating all orderings of those jumps with a selftest that fails if we
reach the state we're looking for. That also failed.

> 
> The problem is that I'm operating on a simplified assumption of "not
> being able to converge to a fixed point is bad because it could lead
> to a bounds invariant violation". So taking this chance to ask, does
> that all tie back to the is_scalar_branch_taken and reg_set_min_max
> (or was it regs_refine_cond_op) difference?

As Eduard answered, not reaching a fixed point in the refinement doesn't
lead to invariant violations. It's just a missed opportunity to get more
precise bounds. In general, Agni tells us the refinement functions are
sound so if they output ill-formed bounds it's because they were given
ill-formed bounds as input.

> 
> Also, when we fail to reach a fixed point, it does not become a
> problem immediately, but only becomes an invariant violation somewhere
> down the line when we again do bounds deduction?
> 
> I haven't actually looked into an invariant violation myself, so
> details are rather hazy.
> 
> > > [1] https://github.com/eddyz87/deduce-bounds-verif/blob/76755f763f9282b74ca4e2251f83767502a98e5e/deduce_bounds_new.c#L321
> > > [2] https://github.com/eddyz87/bpf/tree/deduce-bounds-reshuffle
> > > 
> > > [...]


* Re: [PATCH v2 bpf-next 1/2] bpf: Avoid one round of bounds deduction
  2026-03-13  4:54               ` Eduard Zingerman
@ 2026-03-17  5:52                 ` Shung-Hsi Yu
  0 siblings, 0 replies; 13+ messages in thread
From: Shung-Hsi Yu @ 2026-03-17  5:52 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: Paul Chaignon, bpf, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Harishankar Vishwanathan

On Thu, Mar 12, 2026 at 09:54:19PM -0700, Eduard Zingerman wrote:
> On Fri, 2026-03-13 at 10:17 +0800, Shung-Hsi Yu wrote:
> 
> [...]
> 
> > The problem is that I'm operating on a simplified assumption of "not
> > being able to converge to a fixed point is bad because it could lead
> > to a bounds invariant violation". So taking this chance to ask, does
> > that all tie back to the is_scalar_branch_taken and reg_set_min_max
> > (or was it regs_refine_cond_op) difference?
> > 
> > Also, when we fail to reach a fixed point, it does not become a
> > problem immediately, but only becomes an invariant violation
> > somewhere down the line when we again do bounds deduction?
> > 
> > I haven't actually looked into an invariant violation myself, so
> > details are rather hazy.
> 
> I don't think failing to reach fixed point causes invariant violation.
> The root cause for invariant violation is that decisions in
> is_scalar_branch_taken() for e.g. >=, <= are taken based on ranges,
> but additional information in tnum is not taken into account.
> Later reg_set_min_max() will call reg_bounds_sync() and it might turn
> out that updated ranges do not intersect with tnums.
> The inverse might happen for tnum based decisions in
> is_scalar_branch_taken().
> 
> So, if we don't reach fixed point in reg_bounds_sync() we just leave
> some precision on the table w/o other negative consequences,
> as far as I understand.

Understood. Then my assumption was wrong.

> One solution for invariants violation looks as follows:
> - Adjust is_scalar_branch_taken():
>   - Do not make predictions directly, instead apply range and tnum
>     changes corresponding to the operation (e.g. for true branch of
>     the `rX <= 0` operation intersect `rX` range with [-∞, 0]).
>   - Call reg_set_min_max() and check if resulting registers are in a
>     valid state. If they are for some branch but not another a
>     prediction can be made.
> - Adjust check_cond_jmp_op() to use registers computed by
>   is_scalar_branch_taken() for true and false verifier states.
> > - Adjust reg_bounds_sync() to consistently bail out if an invariant
> >   violation is encountered. I think it has a few places with an
> >   implicit assumption that ranges/tnums must overlap.

Thanks for the explanations! I did see Paul and Harishankar had a
patchset that tackles this; I'll look into that.

> I have a stale branch with these changes, need to pull my act together
> and finish with those changes.


* Re: [PATCH v2 bpf-next 1/2] bpf: Avoid one round of bounds deduction
  2026-03-13 10:45               ` Paul Chaignon
@ 2026-03-17  6:03                 ` Shung-Hsi Yu
  0 siblings, 0 replies; 13+ messages in thread
From: Shung-Hsi Yu @ 2026-03-17  6:03 UTC (permalink / raw)
  To: Paul Chaignon
  Cc: Eduard Zingerman, bpf, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Harishankar Vishwanathan

On Fri, Mar 13, 2026 at 11:45:56AM +0100, Paul Chaignon wrote:
> On Fri, Mar 13, 2026 at 10:17:52AM +0800, Shung-Hsi Yu wrote:
> > On Thu, Mar 12, 2026 at 07:35:15PM +0100, Paul Chaignon wrote:
> > > On Tue, Mar 10, 2026 at 12:45:39PM -0700, Eduard Zingerman wrote:
> > > > On Tue, 2026-03-10 at 15:56 +0800, Shung-Hsi Yu wrote:
> 
> [...]
> 
> > > Thanks Eduard and Shung-Hsi for the amazing reviews and contributions!
> > > I'll send a v3 with:
> > > 1. The renaming commit from Eduard.
> > > 2. The initial reshuffle + additional reshuffle from [1].
> > > 3. The removal of one __reg_deduce_bounds.
> > > 4. The selftest and I'll see if I can craft a second selftest from the
> > >    new inputs you shared Shung-Hsi.
> > 
> > Perhaps I can try to see if I can come up with a better example, from
> > the difference between initial reshuffle vs additional reshuffle.
> 
> If you have another example, that'd be great. I tried to convert the
> inputs you gave into a selftest, but I'm starting to suspect it's not
> possible to reach those inputs.
> 
> I initially tried manually. For the first example, the trick was to
> write all (or most) jumps needed for the different bounds and then find
> the ordering that puts us in the exact case generated with CBMC. After
> failing to find that manually, I tried the brute force approach:
> generating all orderings of those jumps with a selftest that fails if we
> reach the state we're looking for. That also failed.

I haven't toyed further with the counter-examples yet, but given what
you've tried, I agree that it is highly likely not possible to reach
such inputs. So perhaps that additional reshuffle was not needed for
the current codebase after all (but still, we're much better off with
it merged).

> > 
> > The problem is that I'm operating on a simplified assumption of "not
> > being able to converge to a fixed point is bad because it could lead
> > to a bounds invariant violation". So taking this chance to ask, does
> > that all tie back to the is_scalar_branch_taken and reg_set_min_max
> > (or was it regs_refine_cond_op) difference?
> 
> As Eduard answered, not reaching a fixed point in the refinement doesn't
> lead to invariant violations. It's just a missed opportunity to get more
> precise bounds. In general, Agni tells us the refinement functions are
> sound so if they output ill-formed bounds it's because they were given
> ill-formed bounds as input.

Thanks for confirming as well! I was wondering whether the bounds
deduction patches were needed in stable too, but it looks like there
shouldn't be any security implications without them.

> > 
> > Also, when we fail to reach a fixed point, it does not become a
> > problem immediately, but only becomes an invariant violation
> > somewhere down the line when we again do bounds deduction?
> > 
> > I haven't actually looked into an invariant violation myself, so
> > details are rather hazy.
> > 
> > > > [1] https://github.com/eddyz87/deduce-bounds-verif/blob/76755f763f9282b74ca4e2251f83767502a98e5e/deduce_bounds_new.c#L321
> > > > [2] https://github.com/eddyz87/bpf/tree/deduce-bounds-reshuffle
> > > > 
> > > > [...]


end of thread

Thread overview: 13+ messages:
2026-03-07  0:01 [PATCH v2 bpf-next 1/2] bpf: Avoid one round of bounds deduction Paul Chaignon
2026-03-07  0:03 ` [PATCH v2 bpf-next 2/2] selftests/bpf: Test case for refinement improvement using 64b bounds Paul Chaignon
2026-03-10  1:07 ` [PATCH v2 bpf-next 1/2] bpf: Avoid one round of bounds deduction Eduard Zingerman
2026-03-10  1:30   ` Eduard Zingerman
2026-03-10  5:53     ` Eduard Zingerman
2026-03-10  7:56       ` Shung-Hsi Yu
2026-03-10 19:45         ` Eduard Zingerman
2026-03-12 18:35           ` Paul Chaignon
2026-03-13  2:17             ` Shung-Hsi Yu
2026-03-13  4:54               ` Eduard Zingerman
2026-03-17  5:52                 ` Shung-Hsi Yu
2026-03-13 10:45               ` Paul Chaignon
2026-03-17  6:03                 ` Shung-Hsi Yu
