BPF List
* [PATCH v1 0/3] Add tnum_scast helper
@ 2025-11-25 12:56 Dimitar Kanaliev
  2025-11-25 12:56 ` [PATCH v1 1/3] bpf: Introduce tnum_scast as a tnum native sign extension helper Dimitar Kanaliev
                   ` (3 more replies)
  0 siblings, 4 replies; 15+ messages in thread
From: Dimitar Kanaliev @ 2025-11-25 12:56 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, John Fastabend,
	Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Mykola Lysenko, Shung-Hsi Yu, Dimitar Kanaliev

Hello,

This patch series introduces a new tnum helper function, tnum_scast(),
which performs sign extension on tnums.

The goal of this series is to simplify (and unify) the sign extension
operations performed on tnums inside the verifier. The changes also
improve the precision with which s64/u64 boundaries are tracked, since we
can deduce said ranges more accurately: the series removes the
assignments to the worst-case {S,U}64_{MIN,MAX} values.

There are a bunch of other places in which existing sign extensions can
be replaced with the new primitive, but I thought I'd start off with
register coercion as a minimal self-contained example.

The first patch in the series introduces tnum_scast. The second patch
makes use of tnum_scast to simplify the logic in coerce_reg_to_size_sx
and coerce_subreg_to_size_sx. The last patch introduces some more tests
related to sign extension.

Changelog:
	v0 -> v1:
	- Simplified tnum_scast() implementation to use native s64
	arithmetic shifts for sign extension instead of manual bit masking.
	- Refactored coerce_{reg,subreg}_to_size_sx in verifier to rely
	on __update_reg_bounds() instead of the previously introduced
	manual logic. Removed some dead code for set_sext{32,64} values.
	- Removed irrelevant tests, added one that fails at base without
	our changes.

Dimitar Kanaliev (3):
  bpf: Introduce tnum_scast as a tnum native sign extension helper
  bpf: verifier: Simplify register sign extension with tnum_scast
  selftests/bpf: Add verifier bounds checks for sign extension

 include/linux/tnum.h                          |   3 +
 kernel/bpf/tnum.c                             |  13 ++
 kernel/bpf/verifier.c                         | 150 ++++--------------
 .../selftests/bpf/progs/verifier_movsx.c      |  19 +++
 4 files changed, 65 insertions(+), 120 deletions(-)

-- 
2.43.0


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v1 1/3] bpf: Introduce tnum_scast as a tnum native sign extension helper
  2025-11-25 12:56 [PATCH v1 0/3] Add tnum_scast helper Dimitar Kanaliev
@ 2025-11-25 12:56 ` Dimitar Kanaliev
  2025-11-25 13:22   ` bot+bpf-ci
  2025-11-25 12:56 ` [PATCH v1 2/3] bpf: verifier: Simplify register sign extension with tnum_scast Dimitar Kanaliev
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 15+ messages in thread
From: Dimitar Kanaliev @ 2025-11-25 12:56 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, John Fastabend,
	Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Mykola Lysenko, Shung-Hsi Yu, Dimitar Kanaliev

This patch introduces a new helper function, tnum_scast(), which
sign-extends a tnum from a smaller integer size to the full 64-bit BPF
register width.

This is achieved by utilizing the native sign-extension behavior of signed
64-bit integers. By casting the value and mask to s64, shifting left to
align the target sign bit with the 64-bit MSB, and then performing an
arithmetic right shift, the sign bit is automatically propagated to the
upper bits.

For the mask, this works because if the sign bit is unknown (1), the
arithmetic shift propagates 1s (making the upper bits unknown). If it is
known (0), it propagates 0s (keeping the upper bits known).

a) When the sign bit is known:
Assume a tnum with value = 0xFF, mask = 0x00, size = 1, which corresponds
to an 8-bit subregister of value 0xFF (-1 in 8 bits).
  s = 64 - 8 = 56
  value = ((s64)0xFF << 56) >> 56; // 0xFF...FF (-1)
  mask  = ((s64)0x00 << 56) >> 56; // 0x00...00

Because the sign bit is known to be 1, we sign-extend with 1s. The
resulting tnum is (0xFFFFFFFFFFFFFFFF, 0x0000000000000000).

b) When the sign bit is unknown:
Assume a tnum with value = 0x7F, mask = 0x80, size = 1.
  s = 56
  value = ((s64)0x7F << 56) >> 56; // 0x00...7F
  mask  = ((s64)0x80 << 56) >> 56; // 0xFF...80

The lower 8 bits can be 0x7F or 0xFF. The mask sign bit was 1 (unknown),
so the arithmetic shift propagated 1s, making all higher 56 bits unknown.
In 64-bit form, this tnum correctly represents the range from
0x000000000000007F (+127) to 0xFFFFFFFFFFFFFFFF (-1).
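
To check these examples outside the kernel, here is a minimal userspace
sketch (not part of the patch; struct tnum and scast() below are
standalone stand-ins for the kernel definitions, with the left shift done
on the u64 so the standalone build avoids signed-overflow UB):

  #include <assert.h>
  #include <stdint.h>

  struct tnum { uint64_t value; uint64_t mask; };

  /* Standalone mirror of the proposed tnum_scast() */
  static struct tnum scast(struct tnum a, uint8_t size)
  {
          uint8_t s = 64 - size * 8;

          if (size >= 8)
                  return a;
          a.value = (uint64_t)((int64_t)(a.value << s) >> s);
          a.mask = (uint64_t)((int64_t)(a.mask << s) >> s);
          return a;
  }

  int main(void)
  {
          /* a) known sign bit: (0xFF; 0x00) becomes the constant -1 */
          struct tnum a = scast((struct tnum){ 0xFF, 0x00 }, 1);

          assert(a.value == 0xFFFFFFFFFFFFFFFFULL && a.mask == 0);

          /* b) unknown sign bit: the upper 56 bits become unknown */
          struct tnum b = scast((struct tnum){ 0x7F, 0x80 }, 1);

          assert(b.value == 0x7F && b.mask == 0xFFFFFFFFFFFFFF80ULL);
          return 0;
  }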

Signed-off-by: Dimitar Kanaliev <dimitar.kanaliev@siteground.com>
---
 include/linux/tnum.h |  3 +++
 kernel/bpf/tnum.c    | 13 +++++++++++++
 2 files changed, 16 insertions(+)

diff --git a/include/linux/tnum.h b/include/linux/tnum.h
index c52b862dad45..ed18ee1148b6 100644
--- a/include/linux/tnum.h
+++ b/include/linux/tnum.h
@@ -63,6 +63,9 @@ struct tnum tnum_union(struct tnum t1, struct tnum t2);
 /* Return @a with all but the lowest @size bytes cleared */
 struct tnum tnum_cast(struct tnum a, u8 size);
 
+/* Return @a sign-extended from @size bytes */
+struct tnum tnum_scast(struct tnum a, u8 size);
+
 /* Returns true if @a is a known constant */
 static inline bool tnum_is_const(struct tnum a)
 {
diff --git a/kernel/bpf/tnum.c b/kernel/bpf/tnum.c
index f8e70e9c3998..eabcec2ebc26 100644
--- a/kernel/bpf/tnum.c
+++ b/kernel/bpf/tnum.c
@@ -199,6 +199,19 @@ struct tnum tnum_cast(struct tnum a, u8 size)
 	return a;
 }
 
+struct tnum tnum_scast(struct tnum a, u8 size)
+{
+	u8 s = 64 - size * 8;
+	u64 value, mask;
+
+	if (size >= 8)
+		return a;
+
+	value = ((s64)a.value << s) >> s;
+	mask = ((s64)a.mask << s) >> s;
+	return TNUM(value, mask);
+}
+
 bool tnum_is_aligned(struct tnum a, u64 size)
 {
 	if (!size)
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v1 2/3] bpf: verifier: Simplify register sign extension with tnum_scast
  2025-11-25 12:56 [PATCH v1 0/3] Add tnum_scast helper Dimitar Kanaliev
  2025-11-25 12:56 ` [PATCH v1 1/3] bpf: Introduce tnum_scast as a tnum native sign extension helper Dimitar Kanaliev
@ 2025-11-25 12:56 ` Dimitar Kanaliev
  2025-11-25 13:22   ` bot+bpf-ci
  2025-12-01 23:49   ` Eduard Zingerman
  2025-11-25 12:56 ` [PATCH v1 3/3] selftests/bpf: Add verifier bounds checks for sign extension Dimitar Kanaliev
  2025-11-26  9:04 ` [PATCH v1 0/3] Add tnum_scast helper Shung-Hsi Yu
  3 siblings, 2 replies; 15+ messages in thread
From: Dimitar Kanaliev @ 2025-11-25 12:56 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, John Fastabend,
	Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Mykola Lysenko, Shung-Hsi Yu, Dimitar Kanaliev

This patch refactors the verifier's sign-extension logic for narrow
register values to use the new tnum_scast helper.

Previously, coerce_reg_to_size_sx and coerce_subreg_to_size_sx employed
manual logic to determine bounds, sometimes falling back to loose ranges
when sign bits were uncertain.

We simplify said logic by delegating the bounds calculation to
tnum_scast() plus the existing bounds synchronization logic:

1. The register's tnum is updated via tnum_scast().
2. The signed bounds (smin/smax) are reset to the maximum theoretical
   range for the target size.
3. The unsigned bounds are reset to the full register width.
4. __update_reg_bounds() is called.

By invoking __update_reg_bounds(), the verifier automatically calculates
the intersection between the theoretical signed range and the bitwise info
in reg->var_off. This ensures bounds are as tight as possible without
requiring custom logic in the coercion functions.
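
As a worked example (the same case the selftest in the last patch
exercises), take a register with var_off = (0x0; 0x100) sign-extended
from size == 1:

  /* 1. tnum_scast(var_off, 1): bit 7 (the sign bit) is known 0 and
   *    bit 8 is shifted out, so var_off becomes the constant (0x0; 0x0).
   * 2./3. Bounds reset: smin/smax = [S8_MIN, S8_MAX] and
   *    umin/umax = [0, U64_MAX].
   * 4. __update_reg_bounds(): intersects those ranges with var_off,
   *    tightening every bound to exactly 0.
   */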

This commit also removes set_sext64_default_val() and
set_sext32_default_val() as they are no longer used.

Signed-off-by: Dimitar Kanaliev <dimitar.kanaliev@siteground.com>
---
 kernel/bpf/verifier.c | 150 +++++++++---------------------------------
 1 file changed, 30 insertions(+), 120 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 766695491bc5..c9a6bf85b4ad 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -6876,147 +6876,57 @@ static void coerce_reg_to_size(struct bpf_reg_state *reg, int size)
 	reg_bounds_sync(reg);
 }
 
-static void set_sext64_default_val(struct bpf_reg_state *reg, int size)
-{
-	if (size == 1) {
-		reg->smin_value = reg->s32_min_value = S8_MIN;
-		reg->smax_value = reg->s32_max_value = S8_MAX;
-	} else if (size == 2) {
-		reg->smin_value = reg->s32_min_value = S16_MIN;
-		reg->smax_value = reg->s32_max_value = S16_MAX;
-	} else {
-		/* size == 4 */
-		reg->smin_value = reg->s32_min_value = S32_MIN;
-		reg->smax_value = reg->s32_max_value = S32_MAX;
-	}
-	reg->umin_value = reg->u32_min_value = 0;
-	reg->umax_value = U64_MAX;
-	reg->u32_max_value = U32_MAX;
-	reg->var_off = tnum_unknown;
-}
-
 static void coerce_reg_to_size_sx(struct bpf_reg_state *reg, int size)
 {
-	s64 init_s64_max, init_s64_min, s64_max, s64_min, u64_cval;
-	u64 top_smax_value, top_smin_value;
-	u64 num_bits = size * 8;
+	s64 smin_value, smax_value;
 
-	if (tnum_is_const(reg->var_off)) {
-		u64_cval = reg->var_off.value;
-		if (size == 1)
-			reg->var_off = tnum_const((s8)u64_cval);
-		else if (size == 2)
-			reg->var_off = tnum_const((s16)u64_cval);
-		else
-			/* size == 4 */
-			reg->var_off = tnum_const((s32)u64_cval);
-
-		u64_cval = reg->var_off.value;
-		reg->smax_value = reg->smin_value = u64_cval;
-		reg->umax_value = reg->umin_value = u64_cval;
-		reg->s32_max_value = reg->s32_min_value = u64_cval;
-		reg->u32_max_value = reg->u32_min_value = u64_cval;
+	if (size >= 8)
 		return;
-	}
 
-	top_smax_value = ((u64)reg->smax_value >> num_bits) << num_bits;
-	top_smin_value = ((u64)reg->smin_value >> num_bits) << num_bits;
+	reg->var_off = tnum_scast(reg->var_off, size);
 
-	if (top_smax_value != top_smin_value)
-		goto out;
+	smin_value = -(1LL << (size * 8 - 1));
+	smax_value = (1LL << (size * 8 - 1)) - 1;
 
-	/* find the s64_min and s64_min after sign extension */
-	if (size == 1) {
-		init_s64_max = (s8)reg->smax_value;
-		init_s64_min = (s8)reg->smin_value;
-	} else if (size == 2) {
-		init_s64_max = (s16)reg->smax_value;
-		init_s64_min = (s16)reg->smin_value;
-	} else {
-		init_s64_max = (s32)reg->smax_value;
-		init_s64_min = (s32)reg->smin_value;
-	}
-
-	s64_max = max(init_s64_max, init_s64_min);
-	s64_min = min(init_s64_max, init_s64_min);
+	reg->smin_value = smin_value;
+	reg->smax_value = smax_value;
 
-	/* both of s64_max/s64_min positive or negative */
-	if ((s64_max >= 0) == (s64_min >= 0)) {
-		reg->s32_min_value = reg->smin_value = s64_min;
-		reg->s32_max_value = reg->smax_value = s64_max;
-		reg->u32_min_value = reg->umin_value = s64_min;
-		reg->u32_max_value = reg->umax_value = s64_max;
-		reg->var_off = tnum_range(s64_min, s64_max);
-		return;
-	}
+	reg->s32_min_value = (s32)smin_value;
+	reg->s32_max_value = (s32)smax_value;
 
-out:
-	set_sext64_default_val(reg, size);
-}
-
-static void set_sext32_default_val(struct bpf_reg_state *reg, int size)
-{
-	if (size == 1) {
-		reg->s32_min_value = S8_MIN;
-		reg->s32_max_value = S8_MAX;
-	} else {
-		/* size == 2 */
-		reg->s32_min_value = S16_MIN;
-		reg->s32_max_value = S16_MAX;
-	}
+	reg->umin_value = 0;
+	reg->umax_value = U64_MAX;
 	reg->u32_min_value = 0;
 	reg->u32_max_value = U32_MAX;
-	reg->var_off = tnum_subreg(tnum_unknown);
+
+	__update_reg_bounds(reg);
 }
 
 static void coerce_subreg_to_size_sx(struct bpf_reg_state *reg, int size)
 {
-	s32 init_s32_max, init_s32_min, s32_max, s32_min, u32_val;
-	u32 top_smax_value, top_smin_value;
-	u32 num_bits = size * 8;
-
-	if (tnum_is_const(reg->var_off)) {
-		u32_val = reg->var_off.value;
-		if (size == 1)
-			reg->var_off = tnum_const((s8)u32_val);
-		else
-			reg->var_off = tnum_const((s16)u32_val);
+	s32 smin_value, smax_value;
 
-		u32_val = reg->var_off.value;
-		reg->s32_min_value = reg->s32_max_value = u32_val;
-		reg->u32_min_value = reg->u32_max_value = u32_val;
+	if (size >= 4)
 		return;
-	}
 
-	top_smax_value = ((u32)reg->s32_max_value >> num_bits) << num_bits;
-	top_smin_value = ((u32)reg->s32_min_value >> num_bits) << num_bits;
+	reg->var_off = tnum_subreg(tnum_scast(reg->var_off, size));
 
-	if (top_smax_value != top_smin_value)
-		goto out;
+	smin_value = -(1 << (size * 8 - 1));
+	smax_value = (1 << (size * 8 - 1)) - 1;
 
-	/* find the s32_min and s32_min after sign extension */
-	if (size == 1) {
-		init_s32_max = (s8)reg->s32_max_value;
-		init_s32_min = (s8)reg->s32_min_value;
-	} else {
-		/* size == 2 */
-		init_s32_max = (s16)reg->s32_max_value;
-		init_s32_min = (s16)reg->s32_min_value;
-	}
-	s32_max = max(init_s32_max, init_s32_min);
-	s32_min = min(init_s32_max, init_s32_min);
-
-	if ((s32_min >= 0) == (s32_max >= 0)) {
-		reg->s32_min_value = s32_min;
-		reg->s32_max_value = s32_max;
-		reg->u32_min_value = (u32)s32_min;
-		reg->u32_max_value = (u32)s32_max;
-		reg->var_off = tnum_subreg(tnum_range(s32_min, s32_max));
-		return;
-	}
+	reg->s32_min_value = smin_value;
+	reg->s32_max_value = smax_value;
 
-out:
-	set_sext32_default_val(reg, size);
+	reg->u32_min_value = 0;
+	reg->u32_max_value = U32_MAX;
+
+	__update_reg32_bounds(reg);
+
+	reg->umin_value = reg->u32_min_value;
+	reg->umax_value = reg->u32_max_value;
+
+	reg->smin_value = reg->umin_value;
+	reg->smax_value = reg->umax_value;
 }
 
 static bool bpf_map_is_rdonly(const struct bpf_map *map)
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v1 3/3] selftests/bpf: Add verifier bounds checks for sign extension
  2025-11-25 12:56 [PATCH v1 0/3] Add tnum_scast helper Dimitar Kanaliev
  2025-11-25 12:56 ` [PATCH v1 1/3] bpf: Introduce tnum_scast as a tnum native sign extension helper Dimitar Kanaliev
  2025-11-25 12:56 ` [PATCH v1 2/3] bpf: verifier: Simplify register sign extension with tnum_scast Dimitar Kanaliev
@ 2025-11-25 12:56 ` Dimitar Kanaliev
  2025-11-26  9:04 ` [PATCH v1 0/3] Add tnum_scast helper Shung-Hsi Yu
  3 siblings, 0 replies; 15+ messages in thread
From: Dimitar Kanaliev @ 2025-11-25 12:56 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, John Fastabend,
	Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Mykola Lysenko, Shung-Hsi Yu, Dimitar Kanaliev

This patch adds a new test case to validate the improved register bounds
tracking logic.

We perform the sequence:

  call bpf_get_prandom_u32;
  r1 &= 0x100;
  r1 = (s8)r1;

After the bitwise AND, `r1` is either 0 or 256 (0x100).
If 0: The lower 8 bits are 0.
If 256: The bit at index 8 is set, but the lower 8 bits are 0.

Since the cast to s8 only considers bits 0-7, the set bit at index 8 is
truncated. In both cases, the sign bit (bit 7) is 0, so the
result is exactly 0.
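
The same truncation in plain C (illustrative only; the out-of-range
conversion to s8 is implementation-defined in ISO C, but wraps as shown
on the compilers the kernel supports):

  #include <assert.h>
  #include <stdint.h>

  int main(void)
  {
          uint64_t r1 = 0x100;   /* bit 8 set, bits 0-7 clear */
          int8_t v = (int8_t)r1; /* the cast keeps bits 0-7 only */

          assert(v == 0);        /* sign bit (bit 7) is 0 */
          return 0;
  }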

With the coercion logic before this series:
  1: (bf) r1 = r0
    ; R0=scalar(id=1) R1=scalar(id=1)
  2: (57) r1 &= 256
    ; R1=scalar(...,var_off=(0x0; 0x100))
  3: (bf) r1 = (s8)r1
    ; R1=scalar(smin=smin32=-128,smax=smax32=127)

With our changes:
  1: (bf) r1 = r0
    ; R0=scalar(id=1) R1=scalar(id=1)
  2: (57) r1 &= 256
    ; R1=scalar(...,var_off=(0x0; 0x100))
  3: (bf) r1 = (s8)r1
    ; R1=0

Signed-off-by: Dimitar Kanaliev <dimitar.kanaliev@siteground.com>
---
 .../selftests/bpf/progs/verifier_movsx.c      | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/tools/testing/selftests/bpf/progs/verifier_movsx.c b/tools/testing/selftests/bpf/progs/verifier_movsx.c
index a4d8814eb5ed..df7ad41af172 100644
--- a/tools/testing/selftests/bpf/progs/verifier_movsx.c
+++ b/tools/testing/selftests/bpf/progs/verifier_movsx.c
@@ -339,6 +339,25 @@ label_%=: 	                                        \
 	: __clobber_all);
 }
 
+SEC("socket")
+__description("MOV64SX, S8, upper bits truncation")
+__log_level(2)
+__msg("R1={{P?}}0")
+__success __success_unpriv __retval(0)
+__naked void mov64sx_s8_truncated_range(void)
+{
+	asm volatile ("                                      \
+	call %[bpf_get_prandom_u32];                         \
+	r1 = r0;                                             \
+	r1 &= 0x100;                                         \
+	r1 = (s8)r1;                                         \
+	r0 = 0;                                              \
+	exit;                                                \
+"	:
+	: __imm(bpf_get_prandom_u32)
+	: __clobber_all);
+}
+
 #else
 
 SEC("socket")
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 1/3] bpf: Introduce tnum_scast as a tnum native sign extension helper
  2025-11-25 12:56 ` [PATCH v1 1/3] bpf: Introduce tnum_scast as a tnum native sign extension helper Dimitar Kanaliev
@ 2025-11-25 13:22   ` bot+bpf-ci
  2025-11-26  8:56     ` Shung-Hsi Yu
  0 siblings, 1 reply; 15+ messages in thread
From: bot+bpf-ci @ 2025-11-25 13:22 UTC (permalink / raw)
  To: dimitar.kanaliev, bpf
  Cc: ast, daniel, john.fastabend, andrii, martin.lau, eddyz87, song,
	yonghong.song, kpsingh, sdf, haoluo, jolsa, mykolal, shung-hsi.yu,
	dimitar.kanaliev, ast, andrii, daniel, martin.lau, eddyz87,
	yonghong.song, clm, ihor.solodrai


> diff --git a/include/linux/tnum.h b/include/linux/tnum.h
> index c52b862da..ed18ee114 100644
> --- a/include/linux/tnum.h
> +++ b/include/linux/tnum.h
> @@ -63,6 +63,9 @@ struct tnum tnum_union(struct tnum t1, struct tnum t2);
>  /* Return @a with all but the lowest @size bytes cleared */
>  struct tnum tnum_cast(struct tnum a, u8 size);
>
> +/* Return @a sign-extended from @size bytes */
> +struct tnum tnum_scast(struct tnum a, u8 size);
> +
>  /* Returns true if @a is a known constant */
>  static inline bool tnum_is_const(struct tnum a)
>  {
> diff --git a/kernel/bpf/tnum.c b/kernel/bpf/tnum.c
> index f8e70e9c3..eabcec2eb 100644
> --- a/kernel/bpf/tnum.c
> +++ b/kernel/bpf/tnum.c
> @@ -199,6 +199,19 @@ struct tnum tnum_cast(struct tnum a, u8 size)
>  	return a;
>  }
>
> +struct tnum tnum_scast(struct tnum a, u8 size)
> +{
> +	u8 s = 64 - size * 8;
> +	u64 value, mask;
> +
> +	if (size >= 8)
> +		return a;
> +
> +	value = ((s64)a.value << s) >> s;
> +	mask = ((s64)a.mask << s) >> s;
                                    ^^

Can this invoke undefined behavior when size is 0? When size==0,
s becomes 64, and shifting a 64-bit value by 64 bits is undefined
behavior according to the C standard. The guard only checks size >= 8,
allowing size==0 to reach the shift operations.

The similar function tnum_cast() handles size==0 by producing a zero
value (since (1ULL << 0) - 1 == 0), which is well-defined. Should
tnum_scast() also handle size==0 explicitly, perhaps by adding
"size == 0 ||" to the guard condition?

> +	return TNUM(value, mask);
> +}
> +
>  bool tnum_is_aligned(struct tnum a, u64 size)
>  {
>  	if (!size)


---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md

CI run summary: https://github.com/kernel-patches/bpf/actions/runs/19670538430

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 2/3] bpf: verifier: Simplify register sign extension with tnum_scast
  2025-11-25 12:56 ` [PATCH v1 2/3] bpf: verifier: Simplify register sign extension with tnum_scast Dimitar Kanaliev
@ 2025-11-25 13:22   ` bot+bpf-ci
  2025-11-26 10:32     ` Dimitar Kanaliev
  2025-12-01 23:49   ` Eduard Zingerman
  1 sibling, 1 reply; 15+ messages in thread
From: bot+bpf-ci @ 2025-11-25 13:22 UTC (permalink / raw)
  To: dimitar.kanaliev, bpf
  Cc: ast, daniel, john.fastabend, andrii, martin.lau, eddyz87, song,
	yonghong.song, kpsingh, sdf, haoluo, jolsa, mykolal, shung-hsi.yu,
	dimitar.kanaliev, ast, andrii, daniel, martin.lau, eddyz87,
	yonghong.song, clm, ihor.solodrai


> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 766695491..c9a6bf85b 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c

[ ... ]

> @@ -6876,147 +6876,57 @@ static void coerce_reg_to_size(struct bpf_reg_state *reg, int size)
>  	reg_bounds_sync(reg);
>  }
>
> -static void set_sext64_default_val(struct bpf_reg_state *reg, int size)
> -{

[ ... ]

> -}
> -
>  static void coerce_reg_to_size_sx(struct bpf_reg_state *reg, int size)
>  {

[ ... ]

>  }
>
> -static void set_sext32_default_val(struct bpf_reg_state *reg, int size)
> -{

[ ... ]

> -}
> -
>  static void coerce_subreg_to_size_sx(struct bpf_reg_state *reg, int size)
>  {
> -	s32 init_s32_max, init_s32_min, s32_max, s32_min, u32_val;
> -	u32 top_smax_value, top_smin_value;
> -	u32 num_bits = size * 8;
> -
> -	if (tnum_is_const(reg->var_off)) {
> -		u32_val = reg->var_off.value;
> -		if (size == 1)
> -			reg->var_off = tnum_const((s8)u32_val);
> -		else
> -			reg->var_off = tnum_const((s16)u32_val);
> +	s32 smin_value, smax_value;
>
> -		u32_val = reg->var_off.value;
> -		reg->s32_min_value = reg->s32_max_value = u32_val;
> -		reg->u32_min_value = reg->u32_max_value = u32_val;
> +	if (size >= 4)
>  		return;
> -	}
>
> -	top_smax_value = ((u32)reg->s32_max_value >> num_bits) << num_bits;
> -	top_smin_value = ((u32)reg->s32_min_value >> num_bits) << num_bits;
> +	reg->var_off = tnum_subreg(tnum_scast(reg->var_off, size));
>
> -	if (top_smax_value != top_smin_value)
> -		goto out;
> +	smin_value = -(1 << (size * 8 - 1));
> +	smax_value = (1 << (size * 8 - 1)) - 1;
>
> -	/* find the s32_min and s32_min after sign extension */
> -	if (size == 1) {
> -		init_s32_max = (s8)reg->s32_max_value;
> -		init_s32_min = (s8)reg->s32_min_value;
> -	} else {
> -		/* size == 2 */
> -		init_s32_max = (s16)reg->s32_max_value;
> -		init_s32_min = (s16)reg->s32_min_value;
> -	}
> -	s32_max = max(init_s32_max, init_s32_min);
> -	s32_min = min(init_s32_max, init_s32_min);
> -
> -	if ((s32_min >= 0) == (s32_max >= 0)) {
> -		reg->s32_min_value = s32_min;
> -		reg->s32_max_value = s32_max;
> -		reg->u32_min_value = (u32)s32_min;
> -		reg->u32_max_value = (u32)s32_max;
> -		reg->var_off = tnum_subreg(tnum_range(s32_min, s32_max));
> -		return;
> -	}
> +	reg->s32_min_value = smin_value;
> +	reg->s32_max_value = smax_value;
>
> -out:
> -	set_sext32_default_val(reg, size);
> +	reg->u32_min_value = 0;
> +	reg->u32_max_value = U32_MAX;
> +
> +	__update_reg32_bounds(reg);
> +
> +	reg->umin_value = reg->u32_min_value;
> +	reg->umax_value = reg->u32_max_value;
> +
> +	reg->smin_value = reg->umin_value;
> +	reg->smax_value = reg->umax_value;
                        ^^^^^^^^
Can reg->smin_value and reg->smax_value be correctly set from unsigned
values here? For sign-extended negative values, the 64-bit unsigned
bounds (umin/umax) will be positive after zero-extension, but the 64-bit
signed bounds (smin/smax) should remain negative.

Consider an 8-bit value -1 (0xFF) that gets sign-extended:
- After sign extension: s32_min_value = s32_max_value = -1 (0xFFFFFFFF)
- After __update_reg32_bounds: u32_min_value = u32_max_value = 0xFFFFFFFF
- After zero-extension to 64-bit: umin_value = umax_value = 0x00000000FFFFFFFF
- Then: smin_value = 0x00000000FFFFFFFF (interpreted as +4294967295)

But the correct 64-bit signed value should be -1 (0xFFFFFFFFFFFFFFFF),
not +4294967295.

The pattern in __reg_assign_32_into_64() at verifier.c:2771-2772 shows
the correct approach for propagating signed 32-bit bounds to 64-bit:

    reg->smin_value = reg->s32_min_value;
    reg->smax_value = reg->s32_max_value;

This uses implicit sign-extension from s32 to s64. Should
coerce_subreg_to_size_sx() follow the same pattern?

>  }
>
>  static bool bpf_map_is_rdonly(const struct bpf_map *map)


---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md

CI run summary: https://github.com/kernel-patches/bpf/actions/runs/19670538430

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 1/3] bpf: Introduce tnum_scast as a tnum native sign extension helper
  2025-11-25 13:22   ` bot+bpf-ci
@ 2025-11-26  8:56     ` Shung-Hsi Yu
  2025-12-01  7:43       ` Dimitar Kanaliev
  0 siblings, 1 reply; 15+ messages in thread
From: Shung-Hsi Yu @ 2025-11-26  8:56 UTC (permalink / raw)
  To: dimitar.kanaliev
  Cc: bot+bpf-ci, bpf, ast, daniel, john.fastabend, andrii, martin.lau,
	eddyz87, song, yonghong.song, kpsingh, sdf, haoluo, jolsa,
	mykolal, martin.lau, clm, ihor.solodrai

On Tue, Nov 25, 2025 at 01:22:13PM +0000, bot+bpf-ci@kernel.org wrote:
> > diff --git a/include/linux/tnum.h b/include/linux/tnum.h
> > index c52b862da..ed18ee114 100644
> > --- a/include/linux/tnum.h
> > +++ b/include/linux/tnum.h
> > @@ -63,6 +63,9 @@ struct tnum tnum_union(struct tnum t1, struct tnum t2);
> >  /* Return @a with all but the lowest @size bytes cleared */
> >  struct tnum tnum_cast(struct tnum a, u8 size);
> >
> > +/* Return @a sign-extended from @size bytes */
> > +struct tnum tnum_scast(struct tnum a, u8 size);
> > +
> >  /* Returns true if @a is a known constant */
> >  static inline bool tnum_is_const(struct tnum a)
> >  {
> > diff --git a/kernel/bpf/tnum.c b/kernel/bpf/tnum.c
> > index f8e70e9c3..eabcec2eb 100644
> > --- a/kernel/bpf/tnum.c
> > +++ b/kernel/bpf/tnum.c
> > @@ -199,6 +199,19 @@ struct tnum tnum_cast(struct tnum a, u8 size)
> >  	return a;
> >  }
> >
> > +struct tnum tnum_scast(struct tnum a, u8 size)
> > +{
> > +	u8 s = 64 - size * 8;
> > +	u64 value, mask;
> > +
> > +	if (size >= 8)
> > +		return a;

The unsigned tnum_cast() doesn't have this check. I think we want to be
consistent here, either check size in both or none of them.

Given your next patch added the size check in coerce_reg_to_size_sx(), I
suggest we simply drop this.

> > +	value = ((s64)a.value << s) >> s;
> > +	mask = ((s64)a.mask << s) >> s;
>                                     ^^
> 
> Can this invoke undefined behavior when size is 0? When size==0,
> s becomes 64, and shifting a 64-bit value by 64 bits is undefined
> behavior according to the C standard. The guard only checks size >= 8,
> allowing size==0 to reach the shift operations.
> 
> The similar function tnum_cast() handles size==0 by producing a zero
> value (since (1ULL << 0) - 1 == 0), which is well-defined. Should
> tnum_scast() also handle size==0 explicitly, perhaps by adding
> "size == 0 ||" to the guard condition?

IIUC size should always be one of the values returned by
bpf_size_to_bytes() with either BPF_W, BPF_H, or BPF_B (guaranteed by
bpf_opcode_in_insntable and check_mem_access), and thus can't be 0.
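
(For reference, a paraphrase of that helper; only BPF_DW yields 8:)

  static int bpf_size_to_bytes(int bpf_size)
  {
          if (bpf_size == BPF_B)
                  return 1;
          if (bpf_size == BPF_H)
                  return 2;
          if (bpf_size == BPF_W)
                  return 4;
          if (bpf_size == BPF_DW)
                  return 8;
          return -EINVAL;
  }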

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 0/3] Add tnum_scast helper
  2025-11-25 12:56 [PATCH v1 0/3] Add tnum_scast helper Dimitar Kanaliev
                   ` (2 preceding siblings ...)
  2025-11-25 12:56 ` [PATCH v1 3/3] selftests/bpf: Add verifier bounds checks for sign extension Dimitar Kanaliev
@ 2025-11-26  9:04 ` Shung-Hsi Yu
  3 siblings, 0 replies; 15+ messages in thread
From: Shung-Hsi Yu @ 2025-11-26  9:04 UTC (permalink / raw)
  To: Dimitar Kanaliev
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, John Fastabend,
	Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Mykola Lysenko

On Tue, Nov 25, 2025 at 02:56:31PM +0200, Dimitar Kanaliev wrote:
[...]
> Changelog:
> 	v0 -> v1:
> 	- Simplified tnum_scast() implementation to use native s64
> 	arithmetic shifts for sign extension instead of manual bit masking.
> 	- Refactored coerce_{reg,subreg}_to_size_sx in verifier to rely
> 	on __update_reg_bounds() instead of the previously introduced
> 	manual logic. Removed some dead code for set_sext{32,64} values.

Nit: you might want to point out who requested the change in the
changelog for clarity, e.g.

 - Simplified tnum_scast() implementation to use native s64 arithmetic
   shifts for sign extension instead of manual bit masking (Eduard)

> 	- Removed irrelevant tests, added one that fails at base without
> 	our changes.
[...]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 2/3] bpf: verifier: Simplify register sign extension with tnum_scast
  2025-11-25 13:22   ` bot+bpf-ci
@ 2025-11-26 10:32     ` Dimitar Kanaliev
  0 siblings, 0 replies; 15+ messages in thread
From: Dimitar Kanaliev @ 2025-11-26 10:32 UTC (permalink / raw)
  To: bot+bpf-ci
  Cc: bpf, ast, daniel, john.fastabend, andrii, martin.lau, eddyz87,
	song, yonghong.song, kpsingh, sdf, haoluo, jolsa, mykolal,
	shung-hsi.yu, martin.lau, clm, ihor.solodrai

On Tue, Nov 25, 2025 at 3:22 PM <bot+bpf-ci@kernel.org> wrote:
>
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index 766695491..c9a6bf85b 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
>
> [ ... ]
>
> > @@ -6876,147 +6876,57 @@ static void coerce_reg_to_size(struct bpf_reg_state *reg, int size)
> >       reg_bounds_sync(reg);
> >  }
> >
> > -static void set_sext64_default_val(struct bpf_reg_state *reg, int size)
> > -{
>
> [ ... ]
>
> > -}
> > -
> >  static void coerce_reg_to_size_sx(struct bpf_reg_state *reg, int size)
> >  {
>
> [ ... ]
>
> >  }
> >
> > -static void set_sext32_default_val(struct bpf_reg_state *reg, int size)
> > -{
>
> [ ... ]
>
> > -}
> > -
> >  static void coerce_subreg_to_size_sx(struct bpf_reg_state *reg, int size)
> >  {
> > -     s32 init_s32_max, init_s32_min, s32_max, s32_min, u32_val;
> > -     u32 top_smax_value, top_smin_value;
> > -     u32 num_bits = size * 8;
> > -
> > -     if (tnum_is_const(reg->var_off)) {
> > -             u32_val = reg->var_off.value;
> > -             if (size == 1)
> > -                     reg->var_off = tnum_const((s8)u32_val);
> > -             else
> > -                     reg->var_off = tnum_const((s16)u32_val);
> > +     s32 smin_value, smax_value;
> >
> > -             u32_val = reg->var_off.value;
> > -             reg->s32_min_value = reg->s32_max_value = u32_val;
> > -             reg->u32_min_value = reg->u32_max_value = u32_val;
> > +     if (size >= 4)
> >               return;
> > -     }
> >
> > -     top_smax_value = ((u32)reg->s32_max_value >> num_bits) << num_bits;
> > -     top_smin_value = ((u32)reg->s32_min_value >> num_bits) << num_bits;
> > +     reg->var_off = tnum_subreg(tnum_scast(reg->var_off, size));
> >
> > -     if (top_smax_value != top_smin_value)
> > -             goto out;
> > +     smin_value = -(1 << (size * 8 - 1));
> > +     smax_value = (1 << (size * 8 - 1)) - 1;
> >
> > -     /* find the s32_min and s32_min after sign extension */
> > -     if (size == 1) {
> > -             init_s32_max = (s8)reg->s32_max_value;
> > -             init_s32_min = (s8)reg->s32_min_value;
> > -     } else {
> > -             /* size == 2 */
> > -             init_s32_max = (s16)reg->s32_max_value;
> > -             init_s32_min = (s16)reg->s32_min_value;
> > -     }
> > -     s32_max = max(init_s32_max, init_s32_min);
> > -     s32_min = min(init_s32_max, init_s32_min);
> > -
> > -     if ((s32_min >= 0) == (s32_max >= 0)) {
> > -             reg->s32_min_value = s32_min;
> > -             reg->s32_max_value = s32_max;
> > -             reg->u32_min_value = (u32)s32_min;
> > -             reg->u32_max_value = (u32)s32_max;
> > -             reg->var_off = tnum_subreg(tnum_range(s32_min, s32_max));
> > -             return;
> > -     }
> > +     reg->s32_min_value = smin_value;
> > +     reg->s32_max_value = smax_value;
> >
> > -out:
> > -     set_sext32_default_val(reg, size);
> > +     reg->u32_min_value = 0;
> > +     reg->u32_max_value = U32_MAX;
> > +
> > +     __update_reg32_bounds(reg);
> > +
> > +     reg->umin_value = reg->u32_min_value;
> > +     reg->umax_value = reg->u32_max_value;
> > +
> > +     reg->smin_value = reg->umin_value;
> > +     reg->smax_value = reg->umax_value;
>                         ^^^^^^^^
> Can reg->smin_value and reg->smax_value be correctly set from unsigned
> values here? For sign-extended negative values, the 64-bit unsigned
> bounds (umin/umax) will be positive after zero-extension, but the 64-bit
> signed bounds (smin/smax) should remain negative.
>
> Consider an 8-bit value -1 (0xFF) that gets sign-extended:
> - After sign extension: s32_min_value = s32_max_value = -1 (0xFFFFFFFF)
> - After __update_reg32_bounds: u32_min_value = u32_max_value = 0xFFFFFFFF
> - After zero-extension to 64-bit: umin_value = umax_value = 0x00000000FFFFFFFF
> - Then: smin_value = 0x00000000FFFFFFFF (interpreted as +4294967295)
>
> But the correct 64-bit signed value should be -1 (0xFFFFFFFFFFFFFFFF),
> not +4294967295.
>
> The pattern in __reg_assign_32_into_64() at verifier.c:2771-2772 shows
> the correct approach for propagating signed 32-bit bounds to 64-bit:
>
>     reg->smin_value = reg->s32_min_value;
>     reg->smax_value = reg->s32_max_value;
>
> This uses implicit sign-extension from s32 to s64. Should
> coerce_subreg_to_size_sx() follow the same pattern?
>
Correct me if I'm wrong, but my assumption here is that for ALU ops on wX
registers we need to sign extend to 32 bits and then zero extend to the
whole 64-bit range, so zeroing the upper 32 bits means every possible
runtime value is in the range

    0 <= value <= U32_MAX

which implies

    s64(value) == u64(value)

for the 64-bit view of the register after zero extension?

This is what the spec from the docs says on the topic:

    The MOVSX instruction does a move operation with sign extension.
    {MOVSX, X, ALU} sign extends 8-bit and 16-bit operands into 32-bit
    operands, and zeroes the remaining upper 32 bits. {MOVSX, X, ALU64}
    sign extends 8-bit, 16-bit, and 32-bit operands into 64-bit operands.
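
A quick standalone check of that reading (illustrative only; plain
userspace C, the variable names are mine):

  #include <assert.h>
  #include <stdint.h>

  int main(void)
  {
          /* {MOVSX, X, ALU}: sign extend s8 into 32 bits, then
           * zero extend the 32-bit result into the 64-bit register.
           */
          int8_t src = -1;
          uint32_t w1 = (uint32_t)(int32_t)src; /* 0xFFFFFFFF */
          uint64_t r1 = (uint64_t)w1;           /* 0x00000000FFFFFFFF */

          assert(r1 == 0xFFFFFFFFULL); /* upper 32 bits are zero */
          assert((int64_t)r1 >= 0);    /* s64 view == u64 view   */
          return 0;
  }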

> >  }
> >
> >  static bool bpf_map_is_rdonly(const struct bpf_map *map)
>
>
> ---
> AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
> See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
>
> CI run summary: https://github.com/kernel-patches/bpf/actions/runs/19670538430

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 1/3] bpf: Introduce tnum_scast as a tnum native sign extension helper
  2025-11-26  8:56     ` Shung-Hsi Yu
@ 2025-12-01  7:43       ` Dimitar Kanaliev
  2025-12-15  2:40         ` Shung-Hsi Yu
  0 siblings, 1 reply; 15+ messages in thread
From: Dimitar Kanaliev @ 2025-12-01  7:43 UTC (permalink / raw)
  To: Shung-Hsi Yu
  Cc: bot+bpf-ci, bpf, ast, daniel, john.fastabend, andrii, martin.lau,
	eddyz87, song, yonghong.song, kpsingh, sdf, haoluo, jolsa,
	mykolal, martin.lau, clm, ihor.solodrai

On Wed, Nov 26, 2025 at 10:56 AM Shung-Hsi Yu <shung-hsi.yu@suse.com> wrote:
>
> On Tue, Nov 25, 2025 at 01:22:13PM +0000, bot+bpf-ci@kernel.org wrote:
> > > diff --git a/include/linux/tnum.h b/include/linux/tnum.h
> > > index c52b862da..ed18ee114 100644
> > > --- a/include/linux/tnum.h
> > > +++ b/include/linux/tnum.h
> > > @@ -63,6 +63,9 @@ struct tnum tnum_union(struct tnum t1, struct tnum t2);
> > >  /* Return @a with all but the lowest @size bytes cleared */
> > >  struct tnum tnum_cast(struct tnum a, u8 size);
> > >
> > > +/* Return @a sign-extended from @size bytes */
> > > +struct tnum tnum_scast(struct tnum a, u8 size);
> > > +
> > >  /* Returns true if @a is a known constant */
> > >  static inline bool tnum_is_const(struct tnum a)
> > >  {
> > > diff --git a/kernel/bpf/tnum.c b/kernel/bpf/tnum.c
> > > index f8e70e9c3..eabcec2eb 100644
> > > --- a/kernel/bpf/tnum.c
> > > +++ b/kernel/bpf/tnum.c
> > > @@ -199,6 +199,19 @@ struct tnum tnum_cast(struct tnum a, u8 size)
> > >     return a;
> > >  }
> > >
> > > +struct tnum tnum_scast(struct tnum a, u8 size)
> > > +{
> > > +   u8 s = 64 - size * 8;
> > > +   u64 value, mask;
> > > +
> > > +   if (size >= 8)
> > > +           return a;
>
> The unsigned tnum_cast() doesn't have this check. I think we want to be
> consistent here, either check size in both or none of them.
>
> Given your next patch added the size check in coerce_reg_to_size_sx(), I
> suggest we simply drop this.
>
> > > +   value = ((s64)a.value << s) >> s;
> > > +   mask = ((s64)a.mask << s) >> s;
> >                                     ^^
> >
> > Can this invoke undefined behavior when size is 0? When size==0,
> > s becomes 64, and shifting a 64-bit value by 64 bits is undefined
> > behavior according to the C standard. The guard only checks size >= 8,
> > allowing size==0 to reach the shift operations.
> >
> > The similar function tnum_cast() handles size==0 by producing a zero
> > value (since (1ULL << 0) - 1 == 0), which is well-defined. Should
> > tnum_scast() also handle size==0 explicitly, perhaps by adding
> > "size == 0 ||" to the guard condition?
>
> IIUC size should always be one of the values returned by
> bpf_size_to_bytes() with either BPF_W, BPF_H, or BPF_B (guaranteed by
> bpf_opcode_in_insntable and check_mem_access), and thus can't be 0.

Ack, thanks. I'm leaning towards removing the size validation from both
tnum_scast and the coerce_{reg,subreg}_to_size_sx code, since the latter
is guarded as you pointed out, and the check doesn't really exist in the
current implementation either.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 2/3] bpf: verifier: Simplify register sign extension with tnum_scast
  2025-11-25 12:56 ` [PATCH v1 2/3] bpf: verifier: Simplify register sign extension with tnum_scast Dimitar Kanaliev
  2025-11-25 13:22   ` bot+bpf-ci
@ 2025-12-01 23:49   ` Eduard Zingerman
  2025-12-02 10:53     ` Dimitar Kanaliev
  1 sibling, 1 reply; 15+ messages in thread
From: Eduard Zingerman @ 2025-12-01 23:49 UTC (permalink / raw)
  To: Dimitar Kanaliev, bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, John Fastabend,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, Mykola Lysenko,
	Shung-Hsi Yu

On Tue, 2025-11-25 at 14:56 +0200, Dimitar Kanaliev wrote:

[...]

> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 766695491bc5..c9a6bf85b4ad 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -6876,147 +6876,57 @@ static void coerce_reg_to_size(struct bpf_reg_state *reg, int size)
>  	reg_bounds_sync(reg);
>  }
>  
> -static void set_sext64_default_val(struct bpf_reg_state *reg, int size)
> -{
> -	if (size == 1) {
> -		reg->smin_value = reg->s32_min_value = S8_MIN;
> -		reg->smax_value = reg->s32_max_value = S8_MAX;
> -	} else if (size == 2) {
> -		reg->smin_value = reg->s32_min_value = S16_MIN;
> -		reg->smax_value = reg->s32_max_value = S16_MAX;
> -	} else {
> -		/* size == 4 */
> -		reg->smin_value = reg->s32_min_value = S32_MIN;
> -		reg->smax_value = reg->s32_max_value = S32_MAX;
> -	}
> -	reg->umin_value = reg->u32_min_value = 0;
> -	reg->umax_value = U64_MAX;
> -	reg->u32_max_value = U32_MAX;
> -	reg->var_off = tnum_unknown;
> -}
> -
>  static void coerce_reg_to_size_sx(struct bpf_reg_state *reg, int size)
>  {
> -	s64 init_s64_max, init_s64_min, s64_max, s64_min, u64_cval;
> -	u64 top_smax_value, top_smin_value;
> -	u64 num_bits = size * 8;
> +	s64 smin_value, smax_value;
>  
> -	if (tnum_is_const(reg->var_off)) {
> -		u64_cval = reg->var_off.value;
> -		if (size == 1)
> -			reg->var_off = tnum_const((s8)u64_cval);
> -		else if (size == 2)
> -			reg->var_off = tnum_const((s16)u64_cval);
> -		else
> -			/* size == 4 */
> -			reg->var_off = tnum_const((s32)u64_cval);
> -
> -		u64_cval = reg->var_off.value;
> -		reg->smax_value = reg->smin_value = u64_cval;
> -		reg->umax_value = reg->umin_value = u64_cval;
> -		reg->s32_max_value = reg->s32_min_value = u64_cval;
> -		reg->u32_max_value = reg->u32_min_value = u64_cval;
> +	if (size >= 8)
>  		return;
> -	}
>  
> -	top_smax_value = ((u64)reg->smax_value >> num_bits) << num_bits;
> -	top_smin_value = ((u64)reg->smin_value >> num_bits) << num_bits;
> +	reg->var_off = tnum_scast(reg->var_off, size);
>  
> -	if (top_smax_value != top_smin_value)
> -		goto out;
> +	smin_value = -(1LL << (size * 8 - 1));
> +	smax_value = (1LL << (size * 8 - 1)) - 1;
>  
> -	/* find the s64_min and s64_min after sign extension */
> -	if (size == 1) {
> -		init_s64_max = (s8)reg->smax_value;
> -		init_s64_min = (s8)reg->smin_value;
> -	} else if (size == 2) {
> -		init_s64_max = (s16)reg->smax_value;
> -		init_s64_min = (s16)reg->smin_value;
> -	} else {
> -		init_s64_max = (s32)reg->smax_value;
> -		init_s64_min = (s32)reg->smin_value;
> -	}
> -
> -	s64_max = max(init_s64_max, init_s64_min);
> -	s64_min = min(init_s64_max, init_s64_min);
> +	reg->smin_value = smin_value;
> +	reg->smax_value = smax_value;
>  
> -	/* both of s64_max/s64_min positive or negative */
> -	if ((s64_max >= 0) == (s64_min >= 0)) {
> -		reg->s32_min_value = reg->smin_value = s64_min;
> -		reg->s32_max_value = reg->smax_value = s64_max;
> -		reg->u32_min_value = reg->umin_value = s64_min;
> -		reg->u32_max_value = reg->umax_value = s64_max;
> -		reg->var_off = tnum_range(s64_min, s64_max);
> -		return;
> -	}
> +	reg->s32_min_value = (s32)smin_value;
> +	reg->s32_max_value = (s32)smax_value;
>  
> -out:
> -	set_sext64_default_val(reg, size);
> -}

Assume that size == 1, s64_min = 0b000, s64_max == 0b100.
This corresponds to tnum with value == 0b000 and mask == 0b111.
Old algorithm computes more precise range in this situation.
Old:

  0: (85) call bpf_get_prandom_u32#7    ; R0=scalar()
  1: (25) if r0 > 0x4 goto pc+2         ; R0=scalar(smin=smin32=0,smax=umax=smax32=umax32=4,var_off=(0x0; 0x7))
  2: (7b) *(u64 *)(r10 -8) = r0         ; R0=scalar(id=1,smin=smin32=0,smax=umax=smax32=umax32=4,var_off=(0x0; 0x7)) ...
  3: (91) r0 = *(s8 *)(r10 -8)          ; R0=scalar(id=1,smin=smin32=0,smax=umax=smax32=umax32=4,var_off=(0x0; 0x7)) ...
  4: (b7) r0 = 0                        ; R0=0
  5: (95) exit

New:

  0: (85) call bpf_get_prandom_u32#7    ; R0=scalar()
  1: (25) if r0 > 0x4 goto pc+2         ; R0=scalar(smin=smin32=0,smax=umax=smax32=umax32=4,var_off=(0x0; 0x7))
  2: (7b) *(u64 *)(r10 -8) = r0         ; R0=scalar(id=1,smin=smin32=0,smax=umax=smax32=umax32=4,var_off=(0x0; 0x7)) ...
  3: (91) r0 = *(s8 *)(r10 -8)          ; R0=scalar(id=1,smin=smin32=0,smax=umax=smax32=umax32=7,var_off=(0x0; 0x7)) ...
  4: (b7) r0 = 0                        ; R0=0
  5: (95) exit

Note that range for R0 at (3) is 0..4 for old algorithm and 0..7 for
new algorithm.

Can we keep both algorithms, by e.g. replacing the set_sext64_default_val()
implementation with tnum_scast(), and adding tnum_scast() in
coerce_reg_to_size_sx()?

In general, for such kinds of patch-sets it is interesting to see how
much precision is gained/lost with the change. It shouldn't be hard to
collect such data for e.g. the complete s8 range by writing a small
user-space program that enumerates the s8 x s8 range and applies both
the old and new range computations.

[...]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 2/3] bpf: verifier: Simplify register sign extension with tnum_scast
  2025-12-01 23:49   ` Eduard Zingerman
@ 2025-12-02 10:53     ` Dimitar Kanaliev
  2025-12-02 18:03       ` Eduard Zingerman
  2025-12-04  6:50       ` Shung-Hsi Yu
  0 siblings, 2 replies; 15+ messages in thread
From: Dimitar Kanaliev @ 2025-12-02 10:53 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, John Fastabend,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, Mykola Lysenko,
	Shung-Hsi Yu

On Tue, Dec 2, 2025 at 1:50 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Tue, 2025-11-25 at 14:56 +0200, Dimitar Kanaliev wrote:
>
> [...]
>
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index 766695491bc5..c9a6bf85b4ad 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -6876,147 +6876,57 @@ static void coerce_reg_to_size(struct bpf_reg_state *reg, int size)
> >       reg_bounds_sync(reg);
> >  }
> >
> > -static void set_sext64_default_val(struct bpf_reg_state *reg, int size)
> > -{
> > -     if (size == 1) {
> > -             reg->smin_value = reg->s32_min_value = S8_MIN;
> > -             reg->smax_value = reg->s32_max_value = S8_MAX;
> > -     } else if (size == 2) {
> > -             reg->smin_value = reg->s32_min_value = S16_MIN;
> > -             reg->smax_value = reg->s32_max_value = S16_MAX;
> > -     } else {
> > -             /* size == 4 */
> > -             reg->smin_value = reg->s32_min_value = S32_MIN;
> > -             reg->smax_value = reg->s32_max_value = S32_MAX;
> > -     }
> > -     reg->umin_value = reg->u32_min_value = 0;
> > -     reg->umax_value = U64_MAX;
> > -     reg->u32_max_value = U32_MAX;
> > -     reg->var_off = tnum_unknown;
> > -}
> > -
> >  static void coerce_reg_to_size_sx(struct bpf_reg_state *reg, int size)
> >  {
> > -     s64 init_s64_max, init_s64_min, s64_max, s64_min, u64_cval;
> > -     u64 top_smax_value, top_smin_value;
> > -     u64 num_bits = size * 8;
> > +     s64 smin_value, smax_value;
> >
> > -     if (tnum_is_const(reg->var_off)) {
> > -             u64_cval = reg->var_off.value;
> > -             if (size == 1)
> > -                     reg->var_off = tnum_const((s8)u64_cval);
> > -             else if (size == 2)
> > -                     reg->var_off = tnum_const((s16)u64_cval);
> > -             else
> > -                     /* size == 4 */
> > -                     reg->var_off = tnum_const((s32)u64_cval);
> > -
> > -             u64_cval = reg->var_off.value;
> > -             reg->smax_value = reg->smin_value = u64_cval;
> > -             reg->umax_value = reg->umin_value = u64_cval;
> > -             reg->s32_max_value = reg->s32_min_value = u64_cval;
> > -             reg->u32_max_value = reg->u32_min_value = u64_cval;
> > +     if (size >= 8)
> >               return;
> > -     }
> >
> > -     top_smax_value = ((u64)reg->smax_value >> num_bits) << num_bits;
> > -     top_smin_value = ((u64)reg->smin_value >> num_bits) << num_bits;
> > +     reg->var_off = tnum_scast(reg->var_off, size);
> >
> > -     if (top_smax_value != top_smin_value)
> > -             goto out;
> > +     smin_value = -(1LL << (size * 8 - 1));
> > +     smax_value = (1LL << (size * 8 - 1)) - 1;
> >
> > -     /* find the s64_min and s64_min after sign extension */
> > -     if (size == 1) {
> > -             init_s64_max = (s8)reg->smax_value;
> > -             init_s64_min = (s8)reg->smin_value;
> > -     } else if (size == 2) {
> > -             init_s64_max = (s16)reg->smax_value;
> > -             init_s64_min = (s16)reg->smin_value;
> > -     } else {
> > -             init_s64_max = (s32)reg->smax_value;
> > -             init_s64_min = (s32)reg->smin_value;
> > -     }
> > -
> > -     s64_max = max(init_s64_max, init_s64_min);
> > -     s64_min = min(init_s64_max, init_s64_min);
> > +     reg->smin_value = smin_value;
> > +     reg->smax_value = smax_value;
> >
> > -     /* both of s64_max/s64_min positive or negative */
> > -     if ((s64_max >= 0) == (s64_min >= 0)) {
> > -             reg->s32_min_value = reg->smin_value = s64_min;
> > -             reg->s32_max_value = reg->smax_value = s64_max;
> > -             reg->u32_min_value = reg->umin_value = s64_min;
> > -             reg->u32_max_value = reg->umax_value = s64_max;
> > -             reg->var_off = tnum_range(s64_min, s64_max);
> > -             return;
> > -     }
> > +     reg->s32_min_value = (s32)smin_value;
> > +     reg->s32_max_value = (s32)smax_value;
> >
> > -out:
> > -     set_sext64_default_val(reg, size);
> > -}
>
> Assume that size == 1, s64_min = 0b000, s64_max == 0b100.
> This corresponds to tnum with value == 0b000 and mask == 0b111.
> Old algorithm computes more precise range in this situation.
> Old:
>
>   0: (85) call bpf_get_prandom_u32#7    ; R0=scalar()
>   1: (25) if r0 > 0x4 goto pc+2         ; R0=scalar(smin=smin32=0,smax=umax=smax32=umax32=4,var_off=(0x0; 0x7))
>   2: (7b) *(u64 *)(r10 -8) = r0         ; R0=scalar(id=1,smin=smin32=0,smax=umax=smax32=umax32=4,var_off=(0x0; 0x7)) ...
>   3: (91) r0 = *(s8 *)(r10 -8)          ; R0=scalar(id=1,smin=smin32=0,smax=umax=smax32=umax32=4,var_off=(0x0; 0x7)) ...
>   4: (b7) r0 = 0                        ; R0=0
>   5: (95) exit
>
> New:
>
>   0: (85) call bpf_get_prandom_u32#7    ; R0=scalar()
>   1: (25) if r0 > 0x4 goto pc+2         ; R0=scalar(smin=smin32=0,smax=umax=smax32=umax32=4,var_off=(0x0; 0x7))
>   2: (7b) *(u64 *)(r10 -8) = r0         ; R0=scalar(id=1,smin=smin32=0,smax=umax=smax32=umax32=4,var_off=(0x0; 0x7)) ...
>   3: (91) r0 = *(s8 *)(r10 -8)          ; R0=scalar(id=1,smin=smin32=0,smax=umax=smax32=umax32=7,var_off=(0x0; 0x7)) ...
>   4: (b7) r0 = 0                        ; R0=0
>   5: (95) exit
>
> Note that range for R0 at (3) is 0..4 for old algorithm and 0..7 for
> new algorithm.
>
> Can we keep both algorithms, by e.g. replacing the set_sext64_default_val()
> implementation with tnum_scast(), and adding tnum_scast() in
> coerce_reg_to_size_sx()?
>
> In general, for such kinds of patch-sets it is interesting to see how
> much precision is gained/lost with the change. It shouldn't be hard to
> collect such data for e.g. the complete s8 range by writing a small
> user-space program that enumerates the s8 x s8 range and applies both
> the old and new range computations.
>
> [...]

I was mostly focused on preserving the info from tnum for sparse ranges, so
I kind of forgot about continuous ranges entirely.
As per your suggestion, I plucked out anything relevant from the kernel,
and compared the smax / smin values for the entire [-1024, 1024] range in a
loop, like so:

  struct bpf_reg_state r_old, r_new;

  init_reg(&r_old, start, end);
  r_new = r_old;

  coerce_reg_to_size_sx_old(&r_old, size);
  coerce_reg_to_size_sx_new(&r_new, size);

  s64 range_old = r_old.smax_value - r_old.smin_value;
  s64 range_new = r_new.smax_value - r_new.smin_value;

  if (range_old < range_new) { ...

In these continuous ranges, the old implementation is much better:

  [-1024, 1024]:
  Old Better: 128016
  New Better: 0
  Equal: 1972209

So I ended up drafting this:

  static void coerce_reg_to_size_sx(struct bpf_reg_state *reg, int size)
  {
    s64 smin_value, smax_value;
    u64 num_bits = size * 8;
    u64 top_smax_value, top_smin_value;

    reg->var_off = tnum_scast(reg->var_off, size);

    top_smax_value = ((u64)reg->smax_value >> num_bits) << num_bits;
    top_smin_value = ((u64)reg->smin_value >> num_bits) << num_bits;

    if (top_smax_value == top_smin_value) {
        if (size == 1) {
            smin_value = (s8)reg->smin_value;
            smax_value = (s8)reg->smax_value;
        } else if (size == 2) {
            smin_value = (s16)reg->smin_value;
            smax_value = (s16)reg->smax_value;
        } else {
            smin_value = (s32)reg->smin_value;
            smax_value = (s32)reg->smax_value;
        }
    } else {
        smin_value = -(1LL << (num_bits - 1));
        smax_value = (1LL << (num_bits - 1)) - 1;
    }

    reg->smin_value = smin_value;
    reg->smax_value = smax_value;

    reg->umin_value = 0;
    reg->umax_value = U64_MAX;

    reg->s32_min_value = (s32)smin_value;
    reg->s32_max_value = (s32)smax_value;
    reg->u32_min_value = 0;
    reg->u32_max_value = U32_MAX;

    __update_reg_bounds(reg);
  }

I always perform tnum_scast() in order to preserve the bitwise info, but
attempt the old numeric logic first. If the range fits into the target
size, we preserve the existing numeric bounds. If not, we fall back to
the type limits and let __update_reg_bounds() reconstruct the range from
var_off. The implementation is similar for the subreg variant.

Rerunning the comparison for the same range looks much better; we now
consistently see precision gains in the cases where the original
implementation bails out via the goto:

  [-1024, 1024]:
  Old Better: 0
  New Better: 131072
  Equal: 1969153

I also went through the CI; the existing selftest in the series still
covers the change.

wdyt?

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 2/3] bpf: verifier: Simplify register sign extension with tnum_scast
  2025-12-02 10:53     ` Dimitar Kanaliev
@ 2025-12-02 18:03       ` Eduard Zingerman
  2025-12-04  6:50       ` Shung-Hsi Yu
  1 sibling, 0 replies; 15+ messages in thread
From: Eduard Zingerman @ 2025-12-02 18:03 UTC (permalink / raw)
  To: Dimitar Kanaliev
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, John Fastabend,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, Mykola Lysenko,
	Shung-Hsi Yu

On Tue, 2025-12-02 at 12:53 +0200, Dimitar Kanaliev wrote:
> On Tue, Dec 2, 2025 at 1:50 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > 
> > On Tue, 2025-11-25 at 14:56 +0200, Dimitar Kanaliev wrote:
> > 
> > [...]
> > 
> > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > > index 766695491bc5..c9a6bf85b4ad 100644
> > > --- a/kernel/bpf/verifier.c
> > > +++ b/kernel/bpf/verifier.c
> > > @@ -6876,147 +6876,57 @@ static void coerce_reg_to_size(struct bpf_reg_state *reg, int size)
> > >       reg_bounds_sync(reg);
> > >  }
> > > 
> > > -static void set_sext64_default_val(struct bpf_reg_state *reg, int size)
> > > -{
> > > -     if (size == 1) {
> > > -             reg->smin_value = reg->s32_min_value = S8_MIN;
> > > -             reg->smax_value = reg->s32_max_value = S8_MAX;
> > > -     } else if (size == 2) {
> > > -             reg->smin_value = reg->s32_min_value = S16_MIN;
> > > -             reg->smax_value = reg->s32_max_value = S16_MAX;
> > > -     } else {
> > > -             /* size == 4 */
> > > -             reg->smin_value = reg->s32_min_value = S32_MIN;
> > > -             reg->smax_value = reg->s32_max_value = S32_MAX;
> > > -     }
> > > -     reg->umin_value = reg->u32_min_value = 0;
> > > -     reg->umax_value = U64_MAX;
> > > -     reg->u32_max_value = U32_MAX;
> > > -     reg->var_off = tnum_unknown;
> > > -}
> > > -
> > >  static void coerce_reg_to_size_sx(struct bpf_reg_state *reg, int size)
> > >  {
> > > -     s64 init_s64_max, init_s64_min, s64_max, s64_min, u64_cval;
> > > -     u64 top_smax_value, top_smin_value;
> > > -     u64 num_bits = size * 8;
> > > +     s64 smin_value, smax_value;
> > > 
> > > -     if (tnum_is_const(reg->var_off)) {
> > > -             u64_cval = reg->var_off.value;
> > > -             if (size == 1)
> > > -                     reg->var_off = tnum_const((s8)u64_cval);
> > > -             else if (size == 2)
> > > -                     reg->var_off = tnum_const((s16)u64_cval);
> > > -             else
> > > -                     /* size == 4 */
> > > -                     reg->var_off = tnum_const((s32)u64_cval);
> > > -
> > > -             u64_cval = reg->var_off.value;
> > > -             reg->smax_value = reg->smin_value = u64_cval;
> > > -             reg->umax_value = reg->umin_value = u64_cval;
> > > -             reg->s32_max_value = reg->s32_min_value = u64_cval;
> > > -             reg->u32_max_value = reg->u32_min_value = u64_cval;
> > > +     if (size >= 8)
> > >               return;
> > > -     }
> > > 
> > > -     top_smax_value = ((u64)reg->smax_value >> num_bits) << num_bits;
> > > -     top_smin_value = ((u64)reg->smin_value >> num_bits) << num_bits;
> > > +     reg->var_off = tnum_scast(reg->var_off, size);
> > > 
> > > -     if (top_smax_value != top_smin_value)
> > > -             goto out;
> > > +     smin_value = -(1LL << (size * 8 - 1));
> > > +     smax_value = (1LL << (size * 8 - 1)) - 1;
> > > 
> > > -     /* find the s64_min and s64_min after sign extension */
> > > -     if (size == 1) {
> > > -             init_s64_max = (s8)reg->smax_value;
> > > -             init_s64_min = (s8)reg->smin_value;
> > > -     } else if (size == 2) {
> > > -             init_s64_max = (s16)reg->smax_value;
> > > -             init_s64_min = (s16)reg->smin_value;
> > > -     } else {
> > > -             init_s64_max = (s32)reg->smax_value;
> > > -             init_s64_min = (s32)reg->smin_value;
> > > -     }
> > > -
> > > -     s64_max = max(init_s64_max, init_s64_min);
> > > -     s64_min = min(init_s64_max, init_s64_min);
> > > +     reg->smin_value = smin_value;
> > > +     reg->smax_value = smax_value;
> > > 
> > > -     /* both of s64_max/s64_min positive or negative */
> > > -     if ((s64_max >= 0) == (s64_min >= 0)) {
> > > -             reg->s32_min_value = reg->smin_value = s64_min;
> > > -             reg->s32_max_value = reg->smax_value = s64_max;
> > > -             reg->u32_min_value = reg->umin_value = s64_min;
> > > -             reg->u32_max_value = reg->umax_value = s64_max;
> > > -             reg->var_off = tnum_range(s64_min, s64_max);
> > > -             return;
> > > -     }
> > > +     reg->s32_min_value = (s32)smin_value;
> > > +     reg->s32_max_value = (s32)smax_value;
> > > 
> > > -out:
> > > -     set_sext64_default_val(reg, size);
> > > -}
> > 
> > Assume that size == 1, s64_min == 0b000, s64_max == 0b100.
> > This corresponds to a tnum with value == 0b000 and mask == 0b111.
> > The old algorithm computes a more precise range in this situation.
> > Old:
> > 
> >   0: (85) call bpf_get_prandom_u32#7    ; R0=scalar()
> >   1: (25) if r0 > 0x4 goto pc+2         ; R0=scalar(smin=smin32=0,smax=umax=smax32=umax32=4,var_off=(0x0; 0x7))
> >   2: (7b) *(u64 *)(r10 -8) = r0         ; R0=scalar(id=1,smin=smin32=0,smax=umax=smax32=umax32=4,var_off=(0x0; 0x7)) ...
> >   3: (91) r0 = *(s8 *)(r10 -8)          ; R0=scalar(id=1,smin=smin32=0,smax=umax=smax32=umax32=4,var_off=(0x0; 0x7)) ...
> >   4: (b7) r0 = 0                        ; R0=0
> >   5: (95) exit
> > 
> > New:
> > 
> >   0: (85) call bpf_get_prandom_u32#7    ; R0=scalar()
> >   1: (25) if r0 > 0x4 goto pc+2         ; R0=scalar(smin=smin32=0,smax=umax=smax32=umax32=4,var_off=(0x0; 0x7))
> >   2: (7b) *(u64 *)(r10 -8) = r0         ; R0=scalar(id=1,smin=smin32=0,smax=umax=smax32=umax32=4,var_off=(0x0; 0x7)) ...
> >   3: (91) r0 = *(s8 *)(r10 -8)          ; R0=scalar(id=1,smin=smin32=0,smax=umax=smax32=umax32=7,var_off=(0x0; 0x7)) ...
> >   4: (b7) r0 = 0                        ; R0=0
> >   5: (95) exit
> > 
> > Note that the range for R0 at (3) is 0..4 for the old algorithm and
> > 0..7 for the new one.
> > 
> > Can we keep both algorithms, e.g. by replacing the set_sext64_default_val()
> > implementation with tnum_scast(), i.e. adding a tnum_scast() call in
> > coerce_reg_to_size_sx()?
> > 
> > In general, for such kinds of patch-sets it is interesting to see how
> > much precision is gained/lost with the change. It shouldn't be hard to
> > collect such data for e.g. the complete s8 range by writing a small
> > user-space program that enumerates the s8 x s8 range and applies both
> > the old and new range computations.
> > 
> > [...]
> 
> I was mostly focused on preserving the info from the tnum for sparse ranges,
> so I kind of forgot about continuous ranges entirely.
> As per your suggestion, I plucked out anything relevant from the kernel
> and compared the smax/smin values for the entire [-1024, 1024] range in a
> loop, like so:
> 
>   struct bpf_reg_state r_old, r_new;
> 
>   init_reg(&r_old, start, end);
>   r_new = r_old;
> 
>   coerce_reg_to_size_sx_old(&r_old, size);
>   coerce_reg_to_size_sx_new(&r_new, size);
> 
>   s64 range_old = r_old.smax_value - r_old.smin_value;
>   s64 range_new = r_new.smax_value - r_new.smin_value;
> 
>   if (range_old < range_new) { ...
> 
> In these continuous ranges, the old implementation is much better:
> 
>   [-1024, 1024]:
>   Old Better: 128016
>   New Better: 0
>   Equal: 1972209
> 
> So I ended up drafting this:
> 
>   static void coerce_reg_to_size_sx(struct bpf_reg_state *reg, int size)
>   {
>     s64 smin_value, smax_value;
>     u64 num_bits = size * 8;
>     u64 top_smax_value, top_smin_value;
> 
>     reg->var_off = tnum_scast(reg->var_off, size);
> 
>     top_smax_value = ((u64)reg->smax_value >> num_bits) << num_bits;
>     top_smin_value = ((u64)reg->smin_value >> num_bits) << num_bits;
> 
>     if (top_smax_value == top_smin_value) {
>             if (size == 1) {
>                 smin_value = (s8)reg->smin_value;
>                 smax_value = (s8)reg->smax_value;
>             } else if (size == 2) {
>                 smin_value = (s16)reg->smin_value;
>                 smax_value = (s16)reg->smax_value;
>             } else {
>                 smin_value = (s32)reg->smin_value;
>                 smax_value = (s32)reg->smax_value;
>             }
>     } else {
>         smin_value = -(1LL << (num_bits - 1));
>         smax_value = (1LL << (num_bits - 1)) - 1;
>     }

The current implementation has the following part:

         s64_max = max(init_s64_max, init_s64_min);
         s64_min = min(init_s64_max, init_s64_min);

And I think this part is necessary. E.g. consider that
smin_value == 0x00 and smax_value == 0x80; then with the suggested
implementation:

  clang-repl> #include <stdio.h>
  clang-repl> printf("%ld\n", (long)(char)(0x80UL));
  -128

The sign of smin_value will stay positive, while the sign of smax_value
will become negative, so the bounds end up out of order.
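
A minimal sketch of the fix on top of your draft (size == 1 arm only,
names as in your snippet): truncate both bounds first, then reorder
them with min()/max(), mirroring what the current code does:

  if (size == 1) {
          s64 lo = (s8)reg->smin_value;
          s64 hi = (s8)reg->smax_value;

          /* Sign extension does not preserve the order of the 64-bit
           * bounds (0x00 stays 0, but 0x80 becomes -128), so reorder.
           */
          smin_value = min(lo, hi);
          smax_value = max(lo, hi);
  }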

When respinning for v2, could you please also provide a link to a
repository with the test harness you used to check the [-1024, 1024]
range? (E.g. push it to GitHub.)
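
In case it helps, a rough sketch of the harness I had in mind is below
(init_reg() and the two coerce_reg_to_size_sx_{old,new}() functions are
hypothetical stand-ins for the code plucked out of verifier.c, s64 for
int64_t):

  #include <stdio.h>

  int main(void)
  {
          long old_better = 0, new_better = 0, equal = 0;

          for (long start = -1024; start <= 1024; start++) {
                  for (long end = start; end <= 1024; end++) {
                          struct bpf_reg_state r_old, r_new;

                          /* init_reg(): set smin/smax to [start, end]
                           * and derive var_off + unsigned bounds.
                           */
                          init_reg(&r_old, start, end);
                          r_new = r_old;

                          coerce_reg_to_size_sx_old(&r_old, 1);
                          coerce_reg_to_size_sx_new(&r_new, 1);

                          s64 range_old = r_old.smax_value - r_old.smin_value;
                          s64 range_new = r_new.smax_value - r_new.smin_value;

                          if (range_old < range_new)
                                  old_better++;
                          else if (range_new < range_old)
                                  new_better++;
                          else
                                  equal++;
                  }
          }

          printf("Old Better: %ld\nNew Better: %ld\nEqual: %ld\n",
                 old_better, new_better, equal);
          return 0;
  }

The counters then add up to the totals you posted: 2049 * 2050 / 2 ==
2100225 ordered pairs in both runs.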

> 
>     reg->smin_value = smin_value;
>     reg->smax_value = smax_value;
> 
>     reg->umin_value = 0;
>     reg->umax_value = U64_MAX;
> 
>     reg->s32_min_value = (s32)smin_value;
>     reg->s32_max_value = (s32)smax_value;
>     reg->u32_min_value = 0;
>     reg->u32_max_value = U32_MAX;
> 
>     __update_reg_bounds(reg);
> }
> 
> I'm always performing tnum_scast in order to preserve the bitwise
> info, but I attempt to use the old numeric logic first. If the range fits
> into the target size, we preserve the existing numeric bounds. If not, we
> fall back to the type limits and let __update_reg_bounds reconstruct the
> range from var_off. The implementation is similar for the subreg variant.
> 
> Rerunning the comparison for the same range looks much better; we now
> consistently see precision gains in the cases where the original
> implementation bails out via the goto:
> 
>   [-1024, 1024]:
>   Old Better: 0
>   New Better: 131072
>   Equal: 1969153
> 
> I also went through the CI; the existing selftest in the series still
> covers the change.
> 
> wdyt?

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 2/3] bpf: verifier: Simplify register sign extension with tnum_scast
  2025-12-02 10:53     ` Dimitar Kanaliev
  2025-12-02 18:03       ` Eduard Zingerman
@ 2025-12-04  6:50       ` Shung-Hsi Yu
  1 sibling, 0 replies; 15+ messages in thread
From: Shung-Hsi Yu @ 2025-12-04  6:50 UTC (permalink / raw)
  To: Dimitar Kanaliev
  Cc: Eduard Zingerman, bpf, Alexei Starovoitov, Daniel Borkmann,
	John Fastabend, Andrii Nakryiko, Martin KaFai Lau, Song Liu,
	Yonghong Song, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Mykola Lysenko

On Tue, Dec 02, 2025 at 12:53:45PM +0200, Dimitar Kanaliev wrote:
> On Tue, Dec 2, 2025 at 1:50 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > On Tue, 2025-11-25 at 14:56 +0200, Dimitar Kanaliev wrote:
[...]
> > Assume that size == 1, s64_min == 0b000, s64_max == 0b100.
> > This corresponds to a tnum with value == 0b000 and mask == 0b111.
> > The old algorithm computes a more precise range in this situation.
> > Old:
> >
> >   0: (85) call bpf_get_prandom_u32#7    ; R0=scalar()
> >   1: (25) if r0 > 0x4 goto pc+2         ; R0=scalar(smin=smin32=0,smax=umax=smax32=umax32=4,var_off=(0x0; 0x7))
> >   2: (7b) *(u64 *)(r10 -8) = r0         ; R0=scalar(id=1,smin=smin32=0,smax=umax=smax32=umax32=4,var_off=(0x0; 0x7)) ...
> >   3: (91) r0 = *(s8 *)(r10 -8)          ; R0=scalar(id=1,smin=smin32=0,smax=umax=smax32=umax32=4,var_off=(0x0; 0x7)) ...
> >   4: (b7) r0 = 0                        ; R0=0
> >   5: (95) exit
> >
> > New:
> >
> >   0: (85) call bpf_get_prandom_u32#7    ; R0=scalar()
> >   1: (25) if r0 > 0x4 goto pc+2         ; R0=scalar(smin=smin32=0,smax=umax=smax32=umax32=4,var_off=(0x0; 0x7))
> >   2: (7b) *(u64 *)(r10 -8) = r0         ; R0=scalar(id=1,smin=smin32=0,smax=umax=smax32=umax32=4,var_off=(0x0; 0x7)) ...
> >   3: (91) r0 = *(s8 *)(r10 -8)          ; R0=scalar(id=1,smin=smin32=0,smax=umax=smax32=umax32=7,var_off=(0x0; 0x7)) ...
> >   4: (b7) r0 = 0                        ; R0=0
> >   5: (95) exit
> >
> > Note that the range for R0 at (3) is 0..4 for the old algorithm and
> > 0..7 for the new one.
> >
> > Can we keep both algorithms, e.g. by replacing the set_sext64_default_val()
> > implementation with tnum_scast(), i.e. adding a tnum_scast() call in
> > coerce_reg_to_size_sx()?
> >
[...]
> 
> So I ended up drafting this:
> 
>   static void coerce_reg_to_size_sx(struct bpf_reg_state *reg, int size)
>   {
>     s64 smin_value, smax_value;
>     u64 num_bits = size * 8;
>     u64 top_smax_value, top_smin_value;
> 
>     reg->var_off = tnum_scast(reg->var_off, size);
> 
>     top_smax_value = ((u64)reg->smax_value >> num_bits) << num_bits;
>     top_smin_value = ((u64)reg->smin_value >> num_bits) << num_bits;
> 
>     if (top_smax_value == top_smin_value) {
>             if (size == 1) {
>                 smin_value = (s8)reg->smin_value;
>                 smax_value = (s8)reg->smax_value;
>             } else if (size == 2) {
>                 smin_value = (s16)reg->smin_value;
>                 smax_value = (s16)reg->smax_value;
>             } else {
>                 smin_value = (s32)reg->smin_value;
>                 smax_value = (s32)reg->smax_value;
>             }
>     } else {
>         smin_value = -(1LL << (num_bits - 1));
>         smax_value = (1LL << (num_bits - 1)) - 1;
>     }
> 
>     reg->smin_value = smin_value;
>     reg->smax_value = smax_value;
> 
>     reg->umin_value = 0;
>     reg->umax_value = U64_MAX;
> 
>     reg->s32_min_value = (s32)smin_value;
>     reg->s32_max_value = (s32)smax_value;
>     reg->u32_min_value = 0;
>     reg->u32_max_value = U32_MAX;
> 
>     __update_reg_bounds(reg);

I'm rather unsure about keeping the __update_reg_bounds() call here; not
that it is incorrect, just that it is too convenient to throw in and
would take a lot of head scratching in the future over why it's there.

Digging in a bit, it seems like it might be because in the ALU64 case
(e.g. "R1 = (s8, s16, s32)R2") the bounds were not synced after
calling coerce_reg_to_size_sx(), unlike coerce_subreg_to_size_sx(),
which is followed by zext_32_to_64() and reg_bounds_sync().

Given we have var_off at hand, maybe we can just get the unsigned ranges
from there?

    reg->umin_value = reg->var_off.value;
    reg->umax_value = reg->var_off.value | reg->var_off.mask;

    /* Should be the same as using tnum_subreg(reg->var_off) to get u32
     * ranges.
     */
    reg->u32_min_value = (u32)reg->umin_value;
    reg->u32_max_value = (u32)reg->umax_value;
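
As a quick sanity check of the comment above, a user-space sketch
(tnum_subreg() copied out of kernel/bpf/tnum.c, u32/u64 as uintN_t) for
an arbitrary tnum t:

    /* Truncation distributes over bitwise OR, so casting the u64
     * bounds is the same as deriving them from tnum_subreg(t).
     */
    struct tnum sub = tnum_subreg(t);

    assert((u32)t.value == (u32)sub.value);
    assert((u32)(t.value | t.mask) == (u32)(sub.value | sub.mask));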

> }
> 
> I'm always performing tnum_scast in order to preserve the bitwise
> info, but I attempt to use the old numeric logic first. If the range fits
> into the target size, we preserve the existing numeric bounds. If not, we
> fall back to the type limits and let __update_reg_bounds reconstruct the
> range from var_off. The implementation is similar for the subreg variant.
> 
> Rerunning the comparison for the same range looks much better; we now
> consistently see precision gains in the cases where the original
> implementation bails out via the goto:
> 
>   [-1024, 1024]:
>   Old Better: 0
>   New Better: 131072
>   Equal: 1969153
> 
> I also went through the CI; the existing selftest in the series still
> covers the change.
> 
> wdyt?

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 1/3] bpf: Introduce tnum_scast as a tnum native sign extension helper
  2025-12-01  7:43       ` Dimitar Kanaliev
@ 2025-12-15  2:40         ` Shung-Hsi Yu
  0 siblings, 0 replies; 15+ messages in thread
From: Shung-Hsi Yu @ 2025-12-15  2:40 UTC (permalink / raw)
  To: Dimitar Kanaliev
  Cc: bot+bpf-ci, bpf, ast, daniel, john.fastabend, andrii, martin.lau,
	eddyz87, song, yonghong.song, kpsingh, sdf, haoluo, jolsa,
	mykolal, martin.lau, clm, ihor.solodrai

On Mon, Dec 01, 2025 at 09:43:32AM +0200, Dimitar Kanaliev wrote:
> On Wed, Nov 26, 2025 at 10:56 AM Shung-Hsi Yu <shung-hsi.yu@suse.com> wrote:
> > On Tue, Nov 25, 2025 at 01:22:13PM +0000, bot+bpf-ci@kernel.org wrote:
> > > > diff --git a/include/linux/tnum.h b/include/linux/tnum.h
> > > > index c52b862da..ed18ee114 100644
> > > > --- a/include/linux/tnum.h
> > > > +++ b/include/linux/tnum.h
> > > > @@ -63,6 +63,9 @@ struct tnum tnum_union(struct tnum t1, struct tnum t2);
> > > >  /* Return @a with all but the lowest @size bytes cleared */
> > > >  struct tnum tnum_cast(struct tnum a, u8 size);
> > > >
> > > > +/* Return @a sign-extended from @size bytes */
> > > > +struct tnum tnum_scast(struct tnum a, u8 size);
> > > > +
> > > >  /* Returns true if @a is a known constant */
> > > >  static inline bool tnum_is_const(struct tnum a)
> > > >  {
> > > > diff --git a/kernel/bpf/tnum.c b/kernel/bpf/tnum.c
> > > > index f8e70e9c3..eabcec2eb 100644
> > > > --- a/kernel/bpf/tnum.c
> > > > +++ b/kernel/bpf/tnum.c
> > > > @@ -199,6 +199,19 @@ struct tnum tnum_cast(struct tnum a, u8 size)
> > > >     return a;
> > > >  }
> > > >
> > > > +struct tnum tnum_scast(struct tnum a, u8 size)
> > > > +{
> > > > +   u8 s = 64 - size * 8;
> > > > +   u64 value, mask;
[...]
> > > > +   value = ((s64)a.value << s) >> s;
> > > > +   mask = ((s64)a.mask << s) >> s;
> > >                                     ^^
> > >
> > > Can this invoke undefined behavior when size is 0? When size==0,
> > > s becomes 64, and shifting a 64-bit value by 64 bits is undefined
> > > behavior according to the C standard. The guard only checks size >= 8,
> > > allowing size==0 to reach the shift operations.
[...]

Looking into this again I realized that while size can't be 0, we could
still have undefined behavior here. The first case is a left shift on a
negative value, for example when:

     size =  4
        s = 32
  a.value = -1 (0xffffffffffffffff)
  a.mask  =  0 (0x0000000000000000)

When calculating `a.value << s`, the s64 overflows because we're
doing -1 << 32. Similarly for `a.mask << s`.

IIUC another one is when a left shift causes a positive
integer to become a negative one, e.g. (s64)255 << 56 ==
-72057594037927936. Less sure about this one.

In short, it seems best to avoid a left shift on a signed value by moving
the cast outside of the left shift; there shouldn't be any difference
when it comes to the intent. We're only interested in the arithmetic
right shift, since that is where the sign extension happens anyway.

  value = (s64)(a.value << s) >> s;
  mask = (s64)(a.mask << s) >> s;
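
Folding that into the helper from patch 1, the whole function would
read roughly as below (a sketch, assuming the posted size >= 8 guard
simply returns the tnum unchanged):

  struct tnum tnum_scast(struct tnum a, u8 size)
  {
          u8 s = 64 - size * 8;
          u64 value, mask;

          if (size >= 8)
                  return a;

          /* Shift left as u64 (well-defined), then arithmetic right
           * shift as s64 so the narrow sign bit of value/mask is
           * replicated into the upper bits.
           */
          value = (s64)(a.value << s) >> s;
          mask = (s64)(a.mask << s) >> s;

          return TNUM(value, mask);
  }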

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2025-12-15  2:41 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-11-25 12:56 [PATCH v1 0/3] Add tnum_scast helper Dimitar Kanaliev
2025-11-25 12:56 ` [PATCH v1 1/3] bpf: Introduce tnum_scast as a tnum native sign extension helper Dimitar Kanaliev
2025-11-25 13:22   ` bot+bpf-ci
2025-11-26  8:56     ` Shung-Hsi Yu
2025-12-01  7:43       ` Dimitar Kanaliev
2025-12-15  2:40         ` Shung-Hsi Yu
2025-11-25 12:56 ` [PATCH v1 2/3] bpf: verifier: Simplify register sign extension with tnum_scast Dimitar Kanaliev
2025-11-25 13:22   ` bot+bpf-ci
2025-11-26 10:32     ` Dimitar Kanaliev
2025-12-01 23:49   ` Eduard Zingerman
2025-12-02 10:53     ` Dimitar Kanaliev
2025-12-02 18:03       ` Eduard Zingerman
2025-12-04  6:50       ` Shung-Hsi Yu
2025-11-25 12:56 ` [PATCH v1 3/3] selftests/bpf: Add verifier bounds checks for sign extension Dimitar Kanaliev
2025-11-26  9:04 ` [PATCH v1 0/3] Add tnum_scast helper Shung-Hsi Yu
