[PATCH] bpf: Simplify tnum

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

* [PATCH] bpf: Simplify tnum_step()
@ 2026-03-18 17:19 Hao Sun
  2026-03-19  5:22 ` Shung-Hsi Yu
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Hao Sun @ 2026-03-18 17:19 UTC (permalink / raw)
  To: bpf
  Cc: ast, daniel, andrii, eddyz87, john.fastabend, martin.lau,
	linux-kernel, sunhao.th

Simplify tnum_step() from a 10-variable algorithm into a straight
line sequence of bitwise operations.

tnum_step(): Given a tnum `(tval, tmask)` where `tval & tmask == 0`,
and a value `z` with `tval ≤ z < (tval | tmask)`, find the smallest
`r > z`, a tnum-satisfying value, i.e., `r & ~tmask == tval`.

Every tnum-satisfying value has the form tval | s where s is a subset
of tmask bits (s & ~tmask == 0).  Since tval and tmask are disjoint:

    tval | s  =  tval + s

Similarly z = tval + d where d = z - tval, so r > z becomes:

    tval + s  >  tval + d
    s > d

The problem reduces to: find the smallest s, a subset of tmask, such
that s > d.

Notice that `s` must be a subset of tmask, the problem now is simplified.

The mask bits of `d` form a "counter" that we want to increment by one,
but the counter has gaps at the fixed-bit positions.  A normal +1 would
stop at the first 0-bit it meets; we need it to skip over fixed-bit
gaps and land on the next mask bit.

Step 1 -- plug the gaps:

    d | carry_mask | ~tmask

  - ~tmask fills all fixed-bit positions with 1.
  - carry_mask = (1 << fls64(d & ~tmask)) - 1 fills all positions
    (including mask positions) below the highest non-mask bit of d.

After this, the only remaining 0s are mask bits above the highest
non-mask bit of d where d is also 0 -- exactly the positions where
the carry can validly land.

Step 2 -- increment:

    (d | carry_mask | ~tmask) + 1

Adding 1 flips all trailing 1s to 0 and sets the first 0 to 1.  Since
every gap has been plugged, that first 0 is guaranteed to be a mask bit
above all non-mask bits of d.

Step 3 -- mask:

    ((d | carry_mask | ~tmask) + 1) & tmask

Strip the scaffolding, keeping only mask bits.  Call the result inc.

Step 4 -- result:

    tval | inc

Reattach the fixed bits.

A simple 8-bit example:
    tmask:        1  1  0  1  0  1  1  0
    d:            1  0  1  0  0  0  1  0     (d = 162)
                        ^
                        non-mask 1 at bit 5

With carry_mask = 0b00111111 (smeared from bit 5):

    d|carry|~tm   1  0  1  1  1  1  1  1
    + 1           1  1  0  0  0  0  0  0
    & tmask       1  1  0  0  0  0  0  0

The patch passes my local test: test_verifier, test_prog for
`-t verifier` and `-t reg_bounds`.

Signed-off-by: Hao Sun <hao.sun@inf.ethz.ch>
---
The original algorithm is not intuitive to me, let me know if you
spot any inconsistency.

A Lean4 proof for the correctness of the algorithm is also available
in case anyone is interested:
	[1] https://pastebin.com/raw/czHKiyY0

 kernel/bpf/tnum.c | 32 +++++---------------------------
 1 file changed, 5 insertions(+), 27 deletions(-)

diff --git a/kernel/bpf/tnum.c b/kernel/bpf/tnum.c
index 4abc359b3db0..aa35d4355216 100644
--- a/kernel/bpf/tnum.c
+++ b/kernel/bpf/tnum.c
@@ -286,8 +286,7 @@ struct tnum tnum_bswap64(struct tnum a)
  */
 u64 tnum_step(struct tnum t, u64 z)
 {
-	u64 tmax, j, p, q, r, s, v, u, w, res;
-	u8 k;
+	u64 tmax, d, carry_mask, inc;
 
 	tmax = t.value | t.mask;
 
@@ -299,29 +298,8 @@ u64 tnum_step(struct tnum t, u64 z)
 	if (z < t.value)
 		return t.value;
 
-	/* keep t's known bits, and match all unknown bits to z */
-	j = t.value | (z & t.mask);
-
-	if (j > z) {
-		p = ~z & t.value & ~t.mask;
-		k = fls64(p); /* k is the most-significant 0-to-1 flip */
-		q = U64_MAX << k;
-		r = q & z; /* positions > k matched to z */
-		s = ~q & t.value; /* positions <= k matched to t.value */
-		v = r | s;
-		res = v;
-	} else {
-		p = z & ~t.value & ~t.mask;
-		k = fls64(p); /* k is the most-significant 1-to-0 flip */
-		q = U64_MAX << k;
-		r = q & t.mask & z; /* unknown positions > k, matched to z */
-		s = q & ~t.mask; /* known positions > k, set to 1 */
-		v = r | s;
-		/* add 1 to unknown positions > k to make value greater than z */
-		u = v + (1ULL << k);
-		/* extract bits in unknown positions > k from u, rest from t.value */
-		w = (u & t.mask) | t.value;
-		res = w;
-	}
-	return res;
+	d = z - t.value;
+	carry_mask = (1ULL << fls64(d & ~t.mask)) - 1;
+	inc = ((d | carry_mask | ~t.mask) + 1) & t.mask;
+	return t.value | inc;
 }
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH] bpf: Simplify tnum_step()
  2026-03-18 17:19 [PATCH] bpf: Simplify tnum_step() Hao Sun
@ 2026-03-19  5:22 ` Shung-Hsi Yu
  2026-03-19  9:01   ` Hao Sun
  2026-03-19  7:24 ` Eduard Zingerman
  2026-03-19  8:17 ` Kumar Kartikeya Dwivedi
  2 siblings, 1 reply; 13+ messages in thread
From: Shung-Hsi Yu @ 2026-03-19  5:22 UTC (permalink / raw)
  To: Hao Sun
  Cc: Paul Chaignon, Harishankar Vishwanathan, bpf, ast, daniel, andrii,
	eddyz87, john.fastabend, martin.lau, linux-kernel

Cc Harishankar and Paul since they've sent the original patchset that
added tnum_step().

On Wed, Mar 18, 2026 at 06:19:06PM +0100, Hao Sun wrote:
> Simplify tnum_step() from a 10-variable algorithm into a straight
> line sequence of bitwise operations.

Nit: my brain is a bit fried trying to go through current implementation
and this new one in parallel. Would be nice to get a short tldr;
something like

  The trick is realizing the problem can be simplified to finding a
  submask of t.mask that is greater than (z - t.value).

> tnum_step(): Given a tnum `(tval, tmask)` where `tval & tmask == 0`,
> and a value `z` with `tval ≤ z < (tval | tmask)`, find the smallest
> `r > z`, a tnum-satisfying value, i.e., `r & ~tmask == tval`.
> 
> Every tnum-satisfying value has the form tval | s where s is a subset
> of tmask bits (s & ~tmask == 0).  Since tval and tmask are disjoint:
> 
>     tval | s  =  tval + s

This seems the key to simplification, quite interested at how you came
across this.

> Similarly z = tval + d where d = z - tval, so r > z becomes:
> 
>     tval + s  >  tval + d
>     s > d
> 
> The problem reduces to: find the smallest s, a subset of tmask, such
> that s > d.
> 
> Notice that `s` must be a subset of tmask, the problem now is simplified.
...
> A Lean4 proof for the correctness of the algorithm is also available
> in case anyone is interested:
> 	[1] https://pastebin.com/raw/czHKiyY0

Cool! First time Lean4 proof showing up BPF mailing list.

...
> -		res = w;
> -	}
> -	return res;
> +	d = z - t.value;

A bit of comment explaining would be nice.

> +	carry_mask = (1ULL << fls64(d & ~t.mask)) - 1;
> +	inc = ((d | carry_mask | ~t.mask) + 1) & t.mask;

Maybe break out the calculation of inc, give it a name (next_submask?),
and have it as preceding patch? It seems generic enough that I thought
would have been implemented already, but bitmap_scatter() was the
closest I could find.

Perhaps could be submitted to include/linux/bitops.h in the future.

...

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] bpf: Simplify tnum_step()
  2026-03-19  5:22 ` Shung-Hsi Yu
@ 2026-03-19  9:01   ` Hao Sun
  2026-03-19  9:35     ` Paul Chaignon
  0 siblings, 1 reply; 13+ messages in thread
From: Hao Sun @ 2026-03-19  9:01 UTC (permalink / raw)
  To: Shung-Hsi Yu
  Cc: Paul Chaignon, Harishankar Vishwanathan, bpf, ast, daniel, andrii,
	eddyz87, john.fastabend, martin.lau, linux-kernel

[...]
> > -             res = w;
> > -     }
> > -     return res;
> > +     d = z - t.value;
>
> A bit of comment explaining would be nice.
>

The commit message is self-contained; anyone interested can git blame.
But I am unsure. If a detailed description is preferred, I can add a
comment to v2.

> > +     carry_mask = (1ULL << fls64(d & ~t.mask)) - 1;
> > +     inc = ((d | carry_mask | ~t.mask) + 1) & t.mask;
>
> Maybe break out the calculation of inc, give it a name (next_submask?),

Maybe we can break this down as:

d = z - t.value;
carry_mask = (1ULL << fls64(d & ~t.mask)) - 1;
filled = d | carry_mask | ~t.mask;
inc = (filled + 1) & t.mask;

`next_submask` is not precise, as we ored `~t.mask` into it.
It's always hard to name the intermediate result :)

> and have it as preceding patch? It seems generic enough that I thought
> would have been implemented already, but bitmap_scatter() was the
> closest I could find.
>
> Perhaps could be submitted to include/linux/bitops.h in the future.
>

The calculation above is direct, adding a bitmap helper intros an indirection.
For me, it means I would have to check out the meaning of the helper
and then get back to understanding the algorithm.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] bpf: Simplify tnum_step()
  2026-03-19  9:01   ` Hao Sun
@ 2026-03-19  9:35     ` Paul Chaignon
  2026-03-19 13:12       ` Hao Sun
  2026-03-20  2:04       ` Shung-Hsi Yu
  0 siblings, 2 replies; 13+ messages in thread
From: Paul Chaignon @ 2026-03-19  9:35 UTC (permalink / raw)
  To: Hao Sun
  Cc: Shung-Hsi Yu, Harishankar Vishwanathan, bpf, ast, daniel, andrii,
	eddyz87, john.fastabend, martin.lau, linux-kernel

On Thu, Mar 19, 2026 at 10:01:31AM +0100, Hao Sun wrote:

Thanks for the cc Shung-Hsi. We were discussing it with Hari who's
planning to double check soundness.

> [...]
> > > -             res = w;
> > > -     }
> > > -     return res;
> > > +     d = z - t.value;
> >
> > A bit of comment explaining would be nice.
> >
> 
> The commit message is self-contained; anyone interested can git blame.
> But I am unsure. If a detailed description is preferred, I can add a
> comment to v2.

Not disagreeing about the role of git blame, but it's not a replacement
for code comments either. Having just the intuition in comments can help
a lot; as you mention below, it's one less indirection ;)

In its current form, I find the proof harder to follow than the
previous, very verbose version. Maybe there's a good middle ground?

> 
> > > +     carry_mask = (1ULL << fls64(d & ~t.mask)) - 1;
> > > +     inc = ((d | carry_mask | ~t.mask) + 1) & t.mask;
> >
> > Maybe break out the calculation of inc, give it a name (next_submask?),
> 
> Maybe we can break this down as:
> 
> d = z - t.value;
> carry_mask = (1ULL << fls64(d & ~t.mask)) - 1;
> filled = d | carry_mask | ~t.mask;
> inc = (filled + 1) & t.mask;
> 
> `next_submask` is not precise, as we ored `~t.mask` into it.
> It's always hard to name the intermediate result :)
> 
> > and have it as preceding patch? It seems generic enough that I thought
> > would have been implemented already, but bitmap_scatter() was the
> > closest I could find.
> >
> > Perhaps could be submitted to include/linux/bitops.h in the future.
> >
> 
> The calculation above is direct, adding a bitmap helper intros an indirection.
> For me, it means I would have to check out the meaning of the helper
> and then get back to understanding the algorithm.

I think I agree with Hao in this case. If we can reuse existing helpers,
great, but otherwise it's short enough that it's probably easier to keep
inlined here.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] bpf: Simplify tnum_step()
  2026-03-19  9:35     ` Paul Chaignon
@ 2026-03-19 13:12       ` Hao Sun
  2026-03-19 13:24         ` Paul Chaignon
  2026-03-20  2:04       ` Shung-Hsi Yu
  1 sibling, 1 reply; 13+ messages in thread
From: Hao Sun @ 2026-03-19 13:12 UTC (permalink / raw)
  To: Paul Chaignon
  Cc: Shung-Hsi Yu, Harishankar Vishwanathan, bpf, ast, daniel, andrii,
	eddyz87, john.fastabend, martin.lau, linux-kernel

On Thu, Mar 19, 2026 at 10:35 AM Paul Chaignon <paul.chaignon@gmail.com> wrote:
>
> On Thu, Mar 19, 2026 at 10:01:31AM +0100, Hao Sun wrote:
>
> Thanks for the cc Shung-Hsi. We were discussing it with Hari who's
> planning to double check soundness.

Great, let me know the result when the verification is done; I will then
send the v2.

>
> > [...]
> > > > -             res = w;
> > > > -     }
> > > > -     return res;
> > > > +     d = z - t.value;
> > >
> > > A bit of comment explaining would be nice.
> > >
> >
> > The commit message is self-contained; anyone interested can git blame.
> > But I am unsure. If a detailed description is preferred, I can add a
> > comment to v2.
>
> Not disagreeing about the role of git blame, but it's not a replacement
> for code comments either. Having just the intuition in comments can help
> a lot; as you mention below, it's one less indirection ;)
>
> In its current form, I find the proof harder to follow than the
> previous, very verbose version. Maybe there's a good middle ground?
>

By `proof`, you are referring to the description/commit message,  not
the Lean4 proof, right? I was a bit confused here.

[...]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] bpf: Simplify tnum_step()
  2026-03-19 13:12       ` Hao Sun
@ 2026-03-19 13:24         ` Paul Chaignon
  0 siblings, 0 replies; 13+ messages in thread
From: Paul Chaignon @ 2026-03-19 13:24 UTC (permalink / raw)
  To: Hao Sun
  Cc: Shung-Hsi Yu, Harishankar Vishwanathan, bpf, ast, daniel, andrii,
	eddyz87, john.fastabend, martin.lau, linux-kernel

On Thu, Mar 19, 2026 at 02:12:54PM +0100, Hao Sun wrote:
> On Thu, Mar 19, 2026 at 10:35 AM Paul Chaignon <paul.chaignon@gmail.com> wrote:
> >
> > On Thu, Mar 19, 2026 at 10:01:31AM +0100, Hao Sun wrote:
> >
> > Thanks for the cc Shung-Hsi. We were discussing it with Hari who's
> > planning to double check soundness.
> 
> Great, let me know the result when the verification is done; I will then
> send the v2.
> 
> >
> > > [...]
> > > > > -             res = w;
> > > > > -     }
> > > > > -     return res;
> > > > > +     d = z - t.value;
> > > >
> > > > A bit of comment explaining would be nice.
> > > >
> > >
> > > The commit message is self-contained; anyone interested can git blame.
> > > But I am unsure. If a detailed description is preferred, I can add a
> > > comment to v2.
> >
> > Not disagreeing about the role of git blame, but it's not a replacement
> > for code comments either. Having just the intuition in comments can help
> > a lot; as you mention below, it's one less indirection ;)
> >
> > In its current form, I find the proof harder to follow than the
> > previous, very verbose version. Maybe there's a good middle ground?
> >
> 
> By `proof`, you are referring to the description/commit message,  not
> the Lean4 proof, right? I was a bit confused here.

Seems I thought one word and wrote another :/ I meant the code, the
tnum_step function.

> 
> [...]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] bpf: Simplify tnum_step()
  2026-03-19  9:35     ` Paul Chaignon
  2026-03-19 13:12       ` Hao Sun
@ 2026-03-20  2:04       ` Shung-Hsi Yu
  1 sibling, 0 replies; 13+ messages in thread
From: Shung-Hsi Yu @ 2026-03-20  2:04 UTC (permalink / raw)
  To: Hao Sun
  Cc: Paul Chaignon, Harishankar Vishwanathan, bpf, ast, daniel, andrii,
	eddyz87, john.fastabend, martin.lau, linux-kernel

On Thu, Mar 19, 2026 at 10:35:06AM +0100, Paul Chaignon wrote:
> On Thu, Mar 19, 2026 at 10:01:31AM +0100, Hao Sun wrote:
...
> > > > +     carry_mask = (1ULL << fls64(d & ~t.mask)) - 1;
> > > > +     inc = ((d | carry_mask | ~t.mask) + 1) & t.mask;
> > >
> > > Maybe break out the calculation of inc, give it a name (next_submask?),
> > 
> > Maybe we can break this down as:
> > 
> > d = z - t.value;
> > carry_mask = (1ULL << fls64(d & ~t.mask)) - 1;
> > filled = d | carry_mask | ~t.mask;
> > inc = (filled + 1) & t.mask;
> > 
> > `next_submask` is not precise, as we ored `~t.mask` into it.

I was thinking of something like this: 

	static inline u64 refactored(u64 mask, u64 d)
	{
		u64 carry_mask;
	
		carry_mask = (1ULL << fls64(d & ~mask)) - 1;
		return ((d | carry_mask | ~mask) + 1) & mask;
	}

	u64 tnum_step(struct tnum t, u64 z)
	{
		...
		d = z - t.value;
		inc = refactored(t.mask, d);
		return t.value | inc;
	}

For me I find having to grapple 1) new way to calculate tnum_step and 2)
finding a submask of tmask > d, both at the same time rather difficult.
Hence the suggestion for separation.

> > It's always hard to name the intermediate result :)

Indeed, and let me not even try to name the function here.

> > > and have it as preceding patch? It seems generic enough that I thought
> > > would have been implemented already, but bitmap_scatter() was the
> > > closest I could find.
> > >
> > > Perhaps could be submitted to include/linux/bitops.h in the future.
> > 
> > The calculation above is direct, adding a bitmap helper intros an indirection.
> > For me, it means I would have to check out the meaning of the helper
> > and then get back to understanding the algorithm.
> 
> I think I agree with Hao in this case. If we can reuse existing helpers,
> great, but otherwise it's short enough that it's probably easier to keep
> inlined here.

Okay, I can now slightly see that it does introducing indirection, make
sense to keep as-is.

Acked-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] bpf: Simplify tnum_step()
  2026-03-18 17:19 [PATCH] bpf: Simplify tnum_step() Hao Sun
  2026-03-19  5:22 ` Shung-Hsi Yu
@ 2026-03-19  7:24 ` Eduard Zingerman
  2026-03-19  9:02   ` Hao Sun
  2026-03-19  8:17 ` Kumar Kartikeya Dwivedi
  2 siblings, 1 reply; 13+ messages in thread
From: Eduard Zingerman @ 2026-03-19  7:24 UTC (permalink / raw)
  To: Hao Sun, bpf
  Cc: ast, daniel, andrii, john.fastabend, martin.lau, linux-kernel

On Wed, 2026-03-18 at 18:19 +0100, Hao Sun wrote:
> Simplify tnum_step() from a 10-variable algorithm into a straight
> line sequence of bitwise operations.
> 
> tnum_step(): Given a tnum `(tval, tmask)` where `tval & tmask == 0`,
> and a value `z` with `tval ≤ z < (tval | tmask)`, find the smallest
> `r > z`, a tnum-satisfying value, i.e., `r & ~tmask == tval`.
> 
> Every tnum-satisfying value has the form tval | s where s is a subset
> of tmask bits (s & ~tmask == 0).  Since tval and tmask are disjoint:
> 
>     tval | s  =  tval + s
> 
> Similarly z = tval + d where d = z - tval, so r > z becomes:
> 
>     tval + s  >  tval + d
>     s > d
> 
> The problem reduces to: find the smallest s, a subset of tmask, such
> that s > d.
> 
> Notice that `s` must be a subset of tmask, the problem now is simplified.
> 
> The mask bits of `d` form a "counter" that we want to increment by one,
> but the counter has gaps at the fixed-bit positions.  A normal +1 would
> stop at the first 0-bit it meets; we need it to skip over fixed-bit
> gaps and land on the next mask bit.
> 
> Step 1 -- plug the gaps:
> 
>     d | carry_mask | ~tmask
> 
>   - ~tmask fills all fixed-bit positions with 1.
>   - carry_mask = (1 << fls64(d & ~tmask)) - 1 fills all positions
>     (including mask positions) below the highest non-mask bit of d.
> 
> After this, the only remaining 0s are mask bits above the highest
> non-mask bit of d where d is also 0 -- exactly the positions where
> the carry can validly land.
> 
> Step 2 -- increment:
> 
>     (d | carry_mask | ~tmask) + 1
> 
> Adding 1 flips all trailing 1s to 0 and sets the first 0 to 1.  Since
> every gap has been plugged, that first 0 is guaranteed to be a mask bit
> above all non-mask bits of d.
> 
> Step 3 -- mask:
> 
>     ((d | carry_mask | ~tmask) + 1) & tmask
> 
> Strip the scaffolding, keeping only mask bits.  Call the result inc.
> 
> Step 4 -- result:
> 
>     tval | inc
> 
> Reattach the fixed bits.
> 
> A simple 8-bit example:
>     tmask:        1  1  0  1  0  1  1  0
>     d:            1  0  1  0  0  0  1  0     (d = 162)
>                         ^
>                         non-mask 1 at bit 5
> 
> With carry_mask = 0b00111111 (smeared from bit 5):
> 
>     d|carry|~tm   1  0  1  1  1  1  1  1
>     + 1           1  1  0  0  0  0  0  0
>     & tmask       1  1  0  0  0  0  0  0
> 
> The patch passes my local test: test_verifier, test_prog for
> `-t verifier` and `-t reg_bounds`.
> 
> Signed-off-by: Hao Sun <hao.sun@inf.ethz.ch>

I hacked a cbmc test in [1] and the checker says that the new function
performs according to specification (and identically to old function).

[1] https://github.com/eddyz87/tnum-step-verif/blob/master/main.c

Acked-by: Eduard Zingerman <eddyz87@gmail.com>

[...]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] bpf: Simplify tnum_step()
  2026-03-19  7:24 ` Eduard Zingerman
@ 2026-03-19  9:02   ` Hao Sun
  0 siblings, 0 replies; 13+ messages in thread
From: Hao Sun @ 2026-03-19  9:02 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, ast, daniel, andrii, john.fastabend, martin.lau,
	linux-kernel

[...]
>
> I hacked a cbmc test in [1] and the checker says that the new function
> performs according to specification (and identically to old function).
>
> [1] https://github.com/eddyz87/tnum-step-verif/blob/master/main.c
>

Thanks, great to know this additional confirmation!

> Acked-by: Eduard Zingerman <eddyz87@gmail.com>
>
> [...]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] bpf: Simplify tnum_step()
  2026-03-18 17:19 [PATCH] bpf: Simplify tnum_step() Hao Sun
  2026-03-19  5:22 ` Shung-Hsi Yu
  2026-03-19  7:24 ` Eduard Zingerman
@ 2026-03-19  8:17 ` Kumar Kartikeya Dwivedi
  2026-03-19  9:06   ` Hao Sun
  2 siblings, 1 reply; 13+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2026-03-19  8:17 UTC (permalink / raw)
  To: Hao Sun
  Cc: bpf, ast, daniel, andrii, eddyz87, john.fastabend, martin.lau,
	linux-kernel

On Wed, 18 Mar 2026 at 19:21, Hao Sun <sunhao.th@gmail.com> wrote:
>
> Simplify tnum_step() from a 10-variable algorithm into a straight
> line sequence of bitwise operations.
>
> tnum_step(): Given a tnum `(tval, tmask)` where `tval & tmask == 0`,
> and a value `z` with `tval ≤ z < (tval | tmask)`, find the smallest
> `r > z`, a tnum-satisfying value, i.e., `r & ~tmask == tval`.
>
> Every tnum-satisfying value has the form tval | s where s is a subset
> of tmask bits (s & ~tmask == 0).  Since tval and tmask are disjoint:
>
>     tval | s  =  tval + s
>
> Similarly z = tval + d where d = z - tval, so r > z becomes:
>
>     tval + s  >  tval + d
>     s > d
>
> The problem reduces to: find the smallest s, a subset of tmask, such
> that s > d.
>
> Notice that `s` must be a subset of tmask, the problem now is simplified.
>
> The mask bits of `d` form a "counter" that we want to increment by one,
> but the counter has gaps at the fixed-bit positions.  A normal +1 would
> stop at the first 0-bit it meets; we need it to skip over fixed-bit
> gaps and land on the next mask bit.
>
> Step 1 -- plug the gaps:
>
>     d | carry_mask | ~tmask
>
>   - ~tmask fills all fixed-bit positions with 1.
>   - carry_mask = (1 << fls64(d & ~tmask)) - 1 fills all positions
>     (including mask positions) below the highest non-mask bit of d.
>
> After this, the only remaining 0s are mask bits above the highest
> non-mask bit of d where d is also 0 -- exactly the positions where
> the carry can validly land.
>
> Step 2 -- increment:
>
>     (d | carry_mask | ~tmask) + 1
>
> Adding 1 flips all trailing 1s to 0 and sets the first 0 to 1.  Since
> every gap has been plugged, that first 0 is guaranteed to be a mask bit
> above all non-mask bits of d.
>
> Step 3 -- mask:
>
>     ((d | carry_mask | ~tmask) + 1) & tmask
>
> Strip the scaffolding, keeping only mask bits.  Call the result inc.
>
> Step 4 -- result:
>
>     tval | inc
>
> Reattach the fixed bits.
>
> A simple 8-bit example:
>     tmask:        1  1  0  1  0  1  1  0
>     d:            1  0  1  0  0  0  1  0     (d = 162)
>                         ^
>                         non-mask 1 at bit 5
>
> With carry_mask = 0b00111111 (smeared from bit 5):
>
>     d|carry|~tm   1  0  1  1  1  1  1  1
>     + 1           1  1  0  0  0  0  0  0
>     & tmask       1  1  0  0  0  0  0  0
>
> The patch passes my local test: test_verifier, test_prog for
> `-t verifier` and `-t reg_bounds`.
>
> Signed-off-by: Hao Sun <hao.sun@inf.ethz.ch>
> ---
> The original algorithm is not intuitive to me, let me know if you
> spot any inconsistency.
>
> A Lean4 proof for the correctness of the algorithm is also available
> in case anyone is interested:
>         [1] https://pastebin.com/raw/czHKiyY0

IMO it is worth it to include this proof inline in the commit log,
since links are fragile.
It's not that big, and I think it's more useful to have it inline than not.

[...]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] bpf: Simplify tnum_step()
  2026-03-19  8:17 ` Kumar Kartikeya Dwivedi
@ 2026-03-19  9:06   ` Hao Sun
  2026-03-19 17:38     ` Eduard Zingerman
  0 siblings, 1 reply; 13+ messages in thread
From: Hao Sun @ 2026-03-19  9:06 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: bpf, ast, daniel, andrii, eddyz87, john.fastabend, martin.lau,
	linux-kernel

On Thu, Mar 19, 2026 at 9:18 AM Kumar Kartikeya Dwivedi
<memxor@gmail.com> wrote:
>
> On Wed, 18 Mar 2026 at 19:21, Hao Sun <sunhao.th@gmail.com> wrote:
> > [...]
> > in case anyone is interested:
> >         [1] https://pastebin.com/raw/czHKiyY0
>
> IMO it is worth it to include this proof inline in the commit log,
> since links are fragile.
> It's not that big, and I think it's more useful to have it inline than not.
>

The only concern is that the proof mainly uses `bv_decide`, which does not
provide much insight. But it's not big, I will inline it.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] bpf: Simplify tnum_step()
  2026-03-19  9:06   ` Hao Sun
@ 2026-03-19 17:38     ` Eduard Zingerman
  2026-03-19 19:41       ` Hao Sun
  0 siblings, 1 reply; 13+ messages in thread
From: Eduard Zingerman @ 2026-03-19 17:38 UTC (permalink / raw)
  To: Hao Sun, Kumar Kartikeya Dwivedi
  Cc: bpf, ast, daniel, andrii, john.fastabend, martin.lau,
	linux-kernel

On Thu, 2026-03-19 at 10:06 +0100, Hao Sun wrote:
> On Thu, Mar 19, 2026 at 9:18 AM Kumar Kartikeya Dwivedi
> <memxor@gmail.com> wrote:
> > 
> > On Wed, 18 Mar 2026 at 19:21, Hao Sun <sunhao.th@gmail.com> wrote:
> > > [...]
> > > in case anyone is interested:
> > >         [1] https://pastebin.com/raw/czHKiyY0
> > 
> > IMO it is worth it to include this proof inline in the commit log,
> > since links are fragile.
> > It's not that big, and I think it's more useful to have it inline than not.
> > 
> 
> The only concern is that the proof mainly uses `bv_decide`, which does not
> provide much insight. But it's not big, I will inline it.

I agree with you, not sure it would provide much signal, tbh.
As far as I understand `bv_decide` means: SAT-solver, please do the magic :)

A more interesting discussion would be to have some model-checker
based tests in the selftests, but Alexei was not excited last time we
talked about that.

If we have enough interested people, we can pick a checker and
maintain a "shadow copy" of relevant functions and data structures in
some repo e.g. on github + current proofs/tests. To have a starting
point for future changes.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] bpf: Simplify tnum_step()
  2026-03-19 17:38     ` Eduard Zingerman
@ 2026-03-19 19:41       ` Hao Sun
  0 siblings, 0 replies; 13+ messages in thread
From: Hao Sun @ 2026-03-19 19:41 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: Kumar Kartikeya Dwivedi, bpf, ast, daniel, andrii, john.fastabend,
	martin.lau, linux-kernel

On Thu, Mar 19, 2026 at 6:38 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Thu, 2026-03-19 at 10:06 +0100, Hao Sun wrote:
> > On Thu, Mar 19, 2026 at 9:18 AM Kumar Kartikeya Dwivedi
> > <memxor@gmail.com> wrote:
> > >
> > > On Wed, 18 Mar 2026 at 19:21, Hao Sun <sunhao.th@gmail.com> wrote:
> > > > [...]
> > > > in case anyone is interested:
> > > >         [1] https://pastebin.com/raw/czHKiyY0
> > >
> > > IMO it is worth it to include this proof inline in the commit log,
> > > since links are fragile.
> > > It's not that big, and I think it's more useful to have it inline than not.
> > >
> >
> > The only concern is that the proof mainly uses `bv_decide`, which does not
> > provide much insight. But it's not big, I will inline it.
>
> I agree with you, not sure it would provide much signal, tbh.
> As far as I understand `bv_decide` means: SAT-solver, please do the magic :)
>
> A more interesting discussion would be to have some model-checker
> based tests in the selftests, but Alexei was not excited last time we
> talked about that.
>
> If we have enough interested people, we can pick a checker and
> maintain a "shadow copy" of relevant functions and data structures in
> some repo e.g. on github + current proofs/tests. To have a starting
> point for future changes.

I actually have an early prototype for this (developed a few months ago):
https://github.com/SunHao-0/eBPF_verification

The tool extracts the **unmodified** functions relevant to range analysis from
`verifier.c` and `tnum.c` into a self-contained `range_analysis.c`
file (no kernel
dependencies). The extraction is done with a simple approach: maintaining
a list of functions of interest, then extracting by lines. The
extracted code can
then be verified independently.

The specification is defined in `verifier_spec.c` linked with
`range_analysis.c`,
cmbc and bitwuzla are used for bounded model checking. The workflow is
simple enough for the original goal.

But the solver later became a bottleneck as the target function grows larger.
This required more modular verification, breaking down the verification
granularity. Nevertheless, it's a practical setup.

I am recently more interested in a Lean4-based solution: we can embed a
very small subset of C (large enough for the verifier's algorithm) as DSL
into Lean4, so that we can still use the same code. Then we can define
the spec for the function (similar to the theorem provided in the link), and
prove the soundness (even other interesting properties) using Lean4
tractic mode, which is more powerful (and becomes necessary once
the property is nontrivial). AI can certainly help with writing the proof:).
Future changes can then go through the proof; one interesting point
is that we only need to review the spec, Lean4 checks the proof, and
the trust here is Lean4's kernel, relatively small and robust.

If either of the above is of interest, we can certainly develop and maintain.

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2026-03-20  2:05 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-03-18 17:19 [PATCH] bpf: Simplify tnum_step() Hao Sun
2026-03-19  5:22 ` Shung-Hsi Yu
2026-03-19  9:01   ` Hao Sun
2026-03-19  9:35     ` Paul Chaignon
2026-03-19 13:12       ` Hao Sun
2026-03-19 13:24         ` Paul Chaignon
2026-03-20  2:04       ` Shung-Hsi Yu
2026-03-19  7:24 ` Eduard Zingerman
2026-03-19  9:02   ` Hao Sun
2026-03-19  8:17 ` Kumar Kartikeya Dwivedi
2026-03-19  9:06   ` Hao Sun
2026-03-19 17:38     ` Eduard Zingerman
2026-03-19 19:41       ` Hao Sun

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox