Re: [PATCH bpf-next] selftests/bpf: Fix reg_bounds to match new tnum-based refinement

From: Paul Chaignon <paul.chaignon@gmail.com>
To: Eduard Zingerman <eddyz87@gmail.com>
Cc: bpf@vger.kernel.org, Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>,
	Kumar Kartikeya Dwivedi <memxor@gmail.com>,
	Harishankar Vishwanathan <harishankar.vishwanathan@gmail.com>
Subject: Re: [PATCH bpf-next] selftests/bpf: Fix reg_bounds to match new tnum-based refinement
Date: Wed, 8 Apr 2026 22:48:23 +0200
Message-ID: <ada_F2WbRcnOYXWb@mail.gmail.com>
In-Reply-To: <ada9UuSQi2SE2IfB@mail.gmail.com>

On Wed, Apr 08, 2026 at 10:40:50PM +0200, Paul Chaignon wrote:
> Commit efc11a667878 ("bpf: Improve bounds when tnum has a single
> possible value") improved the bounds refinement to detect when the tnum
> and u64 range overlap in a single value (and the bounds can thus be set
> to that value).
> 
> Eduard then noticed that it broke the slow-mode reg_bounds selftests
> because they don't have equivalent logic and are therefore unable to
> refine the bounds as much as the verifier. The following test case
> illustrates this.
> 
>   ACTUAL   TRUE1:  scalar(u64=0xffffffff00000000,u32=0,s64=0xffffffff00000000,s32=0)
>   EXPECTED TRUE1:  scalar(u64=[0xfffffffe00000001; 0xffffffff00000000],u32=0,s64=[0xfffffffe00000001; 0xffffffff00000000],s32=0)
>   [...]
>   #323/1007 reg_bounds_gen_consts_s64_s32/(s64)[0xfffffffe00000001; 0xffffffff00000000] (s32)<op> S64_MIN:FAIL
> 
> with the verifier logs:
> 
>   [...]
>   19: w0 = w6                 ; R0=scalar(smin=0,smax=umax=0xffffffff,
>                                           var_off=(0x0; 0xffffffff))
>                                 R6=scalar(smin=0xfffffffe00000001,smax=0xffffffff00000000,
>                                           umin=0xfffffffe00000001,umax=0xffffffff00000000,
>                                           var_off=(0xfffffffe00000000; 0x1ffffffff))
>   20: w0 = w7                 ; R0=0 R7=0x8000000000000000
>   21: if w6 == w7 goto pc+3
>   [...]
>   from 21 to 25: [...]
>   25: w0 = w6                 ; R0=0 R6=0xffffffff00000000
>                               ;         ^
>                               ;         unexpected refined value
>   26: w0 = w7                 ; R0=0 R7=0x8000000000000000
>   27: exit
> 
> When w6 == w7 is true, the verifier can deduce that R6's tnum is
> equal to (0xfffffffe00000000; 0x100000000) and then use that information
> to refine the bounds: the tnum only overlaps with the u64 range in
> 0xffffffff00000000. The reg_bounds selftest doesn't know about tnums
> and therefore fails to perform the same refinement.
> 
> This issue happens when the tnum carries information that cannot be
> represented in the ranges, as otherwise the selftest could reach the
> same refined value using just the ranges. The tnum thus needs to
> represent non-contiguous values (e.g., R6's tnum above, after the
> condition). The only way this can happen in the reg_bounds selftest is
> at the boundary between the 32-bit and 64-bit ranges. We therefore only
> need to handle that case.
> 
> This patch fixes the selftest refinement logic by checking if the u32
> and u64 ranges overlap in a single value. If so, the ranges can be set
> to that value. We need to handle two cases: either they overlap in
> umin64...
> 
>   u64 values
>   matching u32 range:     xxx        xxx        xxx        xxx
>                       |--------------------------------------|
>   u64 range:          0                xxxxx                 UMAX64
> 
> or in umax64:
> 
>   u64 values
>   matching u32 range:     xxx        xxx        xxx        xxx
>                       |--------------------------------------|
>   u64 range:          0          xxxxx                       UMAX64
> 
> To detect the first case, we decrease umax64 to the maximum value that
> matches the u32 range. If that happens to be umin64, then umin64 is the
> only overlap. We proceed similarly for the second case, increasing
> umin64 to the minimum value that matches the u32 range.
> 
> Note this is similar to how the verifier handles the general case using
> tnum, but we don't need to care about a single-value overlap in the
> middle of the range. That case is not possible when comparing two
> ranges.
> 
> This patch also adds two test cases reproducing this bug as part of the
> normal test runs (without SLOW_TESTS=1).
> 
> Fixes: efc11a667878 ("bpf: Improve bounds when tnum has a single possible value")
> Reported-by: Eduard Zingerman <eddyz87@gmail.com>
> Closes: https://lore.kernel.org/bpf/4e6dd64a162b3cab3635706ae6abfdd0be4db5db.camel@gmail.com/
> Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
> ---

Hi Eduard,

This patch fixes the test case you reported and a couple of variants:

reg_bounds_gen_consts_s64_u32/(s64)[0xfffffffe00000001; 0xffffffff00000000] (u32)<op> S64_MIN
reg_bounds_gen_consts_s64_s32/(s64)[0xfffffffe00000001; 0xffffffff00000000] (s32)<op> S64_MIN
reg_bounds_gen_consts_s64_u32/(s64)[0xfffffffe00000000; 0xfffffffffffffffe] (u32)<op> 0xffffffffffffffff
reg_bounds_gen_consts_s64_s32/(s64)[0xfffffffe00000000; 0xfffffffffffffffe] (s32)<op> 0xffffffffffffffff

but we're not out of the woods yet. While running reg_bounds* tests, I
noticed a few other unrelated failures.
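
For illustration, the single-value overlap check described in the quoted
commit message could be sketched roughly as follows. This is a simplified
sketch, not the code from the patch: struct urange and the helper names
are made up for this example, and it assumes a non-wrapping u32 range
(lo32 <= hi32).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical type for this sketch: an unsigned 64-bit range. */
struct urange {
	uint64_t umin;
	uint64_t umax;
};

/* Largest v <= hi whose low 32 bits lie in [lo32, hi32]; false if none. */
static bool max_matching(uint64_t hi, uint32_t lo32, uint32_t hi32,
			 uint64_t *out)
{
	uint64_t upper = hi & ~0xffffffffULL;
	uint32_t low = (uint32_t)hi;

	assert(lo32 <= hi32);
	if (low >= lo32) {
		*out = upper | (low > hi32 ? hi32 : low);
		return true;
	}
	if (upper == 0)
		return false;
	/* Fall back to the top of the u32 range in the previous 2^32 block. */
	*out = (upper - (1ULL << 32)) | hi32;
	return true;
}

/* Smallest v >= lo whose low 32 bits lie in [lo32, hi32]; false if none. */
static bool min_matching(uint64_t lo, uint32_t lo32, uint32_t hi32,
			 uint64_t *out)
{
	uint64_t upper = lo & ~0xffffffffULL;
	uint32_t low = (uint32_t)lo;

	assert(lo32 <= hi32);
	if (low <= hi32) {
		*out = upper | (low < lo32 ? lo32 : low);
		return true;
	}
	if (upper == 0xffffffff00000000ULL)
		return false;
	/* Move to the bottom of the u32 range in the next 2^32 block. */
	*out = (upper + (1ULL << 32)) | lo32;
	return true;
}

/* Collapse r to one value if the u32 range only matches umin or umax. */
static bool refine_single_overlap(struct urange *r, uint32_t lo32,
				  uint32_t hi32)
{
	uint64_t v;

	if (max_matching(r->umax, lo32, hi32, &v) && v == r->umin) {
		r->umax = r->umin;	/* first case: overlap in umin64 */
		return true;
	}
	if (min_matching(r->umin, lo32, hi32, &v) && v == r->umax) {
		r->umin = r->umax;	/* second case: overlap in umax64 */
		return true;
	}
	return false;
}
```

With the u64 range [0xfffffffe00000001; 0xffffffff00000000] and the u32
range [0; 0] from the reported test case, the second check fires and
both bounds collapse to 0xffffffff00000000, the value the verifier
reports.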

---

reg_bounds_gen_consts_s64_u32/(s64)[0xfffffffe00000002; 0xffffffff00000000] (u32)<op> S64_MIN+1

This one hits an invariant violation on an impossible branch and the
bounds are set to an incorrect value that doesn't match what the test
expects.

  19: w0 = w6                ; R0=scalar(smin=0,smax=umax=0xffffffff,
                                         var_off=(0x0; 0xffffffff))
                               R6=scalar(smin=0xfffffffe00000002,smax=0xffffffff00000000,
                                         umin=0xfffffffe00000002,umax=0xffffffff00000000,
                                         var_off=(0xfffffffe00000000; 0x1ffffffff))
  20: w0 = w7                ; R0=1 R7=0x8000000000000001
  21: if w6 == w7 goto pc+3  ; [...]
  [...]

  from 21 to 25: R0=1 R1=0x8000000000000001 R2=0x8000000000000001 R6=0xffffffff00000001 R7=0x8000000000000001 R10=fp0
  [...]

  ACTUAL   TRUE1:  scalar(u64=0xffffffff00000001,u32=1,s64=0xffffffff00000001,s32=0x1)
  EXPECTED TRUE1:  scalar(u64=[0xfffffffe00000002; 0xffffffff00000000],u32=1,s64=[0xfffffffe00000002; 0xffffffff00000000],s32=0x1)

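W7 is equal to 1 and, given R6's ranges, cannot be equal to W6: the
closest 64-bit values whose lower half is 1 are 0xfffffffe00000001 and
0xffffffff00000001, and both fall outside R6's u64 range. The condition
is therefore always false. On the true branch, the verifier thus
incorrectly refines R6's value to 0xffffffff00000001.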
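
One way to double-check that the branch is impossible is to look for any
value in R6's u64 range whose lower 32 bits equal W7. The helper below
is a hypothetical sketch written for this email; it exists neither in
the selftest nor in the verifier.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if some value in [umin, umax] has its low
 * 32 bits equal to w.
 */
static bool range_has_low32(uint64_t umin, uint64_t umax, uint32_t w)
{
	/* First candidate: same upper half as umin, low half set to w. */
	uint64_t cand = (umin & ~0xffffffffULL) | w;

	if (cand < umin) {
		/* Try the next 2^32 block, unless that would overflow. */
		if (cand > UINT64_MAX - (1ULL << 32))
			return false;
		cand += 1ULL << 32;
	}
	return cand <= umax;
}
```

For the range [0xfffffffe00000002; 0xffffffff00000000] and w = 1, this
returns false, so w6 == w7 can never hold. For the range of the
previous test case, [0xfffffffe00000001; 0xffffffff00000000], and
w = 0, it returns true because 0xffffffff00000000 matches.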

This is a new type of invariant violation (i.e., involving the tnum)
that is not detected by range_bounds_violation(). I expect it will be
handled by Hari's follow-up patchset. I've shared the program with Hari
so it can maybe be used as a selftest. I'm guessing we're fine waiting
for that fix as it's not failing in CI; if not, we could do a quick fix
in the verifier.
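
To make the tnum side of this concrete, here is a rough sketch that
counts how many members of a tnum fall inside a u64 range. Zero members
means the tnum and the range contradict each other. The struct mirrors
the kernel's struct tnum (value, mask), but the function is hypothetical
and brute-forces every subset of the mask, so it only makes sense for
masks with a handful of set bits. It says nothing about how the
follow-up patchset will detect the violation.

```c
#include <stdint.h>

/* Same layout as the kernel's struct tnum: bits set in mask are unknown. */
struct tnum {
	uint64_t value;
	uint64_t mask;
};

/*
 * Hypothetical helper: count the members of t within [umin, umax] and
 * store the last one found in *last. Exponential in the number of set
 * bits in t.mask, so for illustration only.
 */
static uint64_t tnum_members_in_range(struct tnum t, uint64_t umin,
				      uint64_t umax, uint64_t *last)
{
	uint64_t count = 0, sub = 0;

	do {
		uint64_t v = t.value | sub;

		if (v >= umin && v <= umax) {
			count++;
			*last = v;
		}
		/* Step to the next subset of the unknown bits. */
		sub = (sub - t.mask) & t.mask;
	} while (sub != 0);

	return count;
}
```

For the tnum (0xfffffffe00000000; 0x100000000) and the range
[0xfffffffe00000001; 0xffffffff00000000] from the quoted commit message,
exactly one member is in range, 0xffffffff00000000. Assuming that, by
analogy, the tnum on the true branch of the program above is
(0xfffffffe00000001; 0x100000000), no member falls in
[0xfffffffe00000002; 0xffffffff00000000].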

---

reg_bounds_gen_consts_s64_u32/(s64)[0xffffffff00000002; 0] (u32)<op> S64_MIN+1

This one fails because reg_bounds' branch detection logic doesn't match
the kernel's.

  ACTUAL   FALSE1: scalar(u64=[0; U64_MAX],u32=[0; 4294967295],s64=[0xffffffff00000002; 0],s32=[S32_MIN; S32_MAX])
  EXPECTED FALSE1: scalar(u64=[0; U64_MAX],u32=[0; 4294967295],s64=[0xffffffff00000002; 0],s32=[S32_MIN; S32_MAX])
  ACTUAL   FALSE2: scalar(u64=0x8000000000000001,u32=1,s64=S64_MIN+1,s32=0x1)
  EXPECTED FALSE2: scalar(u64=0x8000000000000001,u32=1,s64=S64_MIN+1,s32=0x1)
  ACTUAL   TRUE1:  <not found>
  EXPECTED TRUE1:  scalar(u64=[0xffffffff00000002; 0x7fffffffffffffff],u32=[2147483648; 1],s64=[0xffffffff00000002; 0xffffffff00000001],s32=0x1)
  ACTUAL   TRUE2:  <not found>
  EXPECTED TRUE2:  scalar(u64=0x8000000000000001,u32=1,s64=S64_MIN+1,s32=0x1)

It has been failing with that error since b254c6d816e5 ("bpf: Simulate
branches to prune based on range violations"), but was already failing
before with a different error (unexpected range). Here as well, the root
cause seems to be that the test runs into an invariant violation.

We'll probably need to update reg_bounds's branch prediction logic to
match what the kernel is now doing. I can look into this next.

[...]

Thread overview: 4+ messages
2026-04-08 20:40 [PATCH bpf-next] selftests/bpf: Fix reg_bounds to match new tnum-based refinement Paul Chaignon
2026-04-08 20:48 ` Paul Chaignon [this message]
2026-04-09  5:18   ` Harishankar Vishwanathan
2026-04-12 20:20 ` patchwork-bot+netdevbpf
