* [PATCH v2 bpf-next 0/2] bpf: Recognize special arithmetic shift in the verifier
@ 2025-11-15 2:26 Alexei Starovoitov
2025-11-15 2:26 ` [PATCH v2 bpf-next 1/2] " Alexei Starovoitov
2025-11-15 2:26 ` [PATCH v2 bpf-next 2/2] selftests/bpf: Add tests for s>>=31 and s>>=63 Alexei Starovoitov
0 siblings, 2 replies; 7+ messages in thread
From: Alexei Starovoitov @ 2025-11-15 2:26 UTC (permalink / raw)
To: bpf; +Cc: daniel, andrii, martin.lau, sunhao.th, kernel-team
From: Alexei Starovoitov <ast@kernel.org>
v1->v2:
Use __mark_reg32_known() or __mark_reg_known() for zero too.
Add comment to selftest.
v1:
https://lore.kernel.org/bpf/20251114031039.63852-1-alexei.starovoitov@gmail.com/
Alexei Starovoitov (2):
bpf: Recognize special arithmetic shift in the verifier
selftests/bpf: Add tests for s>>=31 and s>>=63
kernel/bpf/verifier.c | 32 ++++++++++++++
.../selftests/bpf/progs/verifier_subreg.c | 43 +++++++++++++++++++
2 files changed, 75 insertions(+)
--
2.47.3
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH v2 bpf-next 1/2] bpf: Recognize special arithmetic shift in the verifier
2025-11-15 2:26 [PATCH v2 bpf-next 0/2] bpf: Recognize special arithmetic shift in the verifier Alexei Starovoitov
@ 2025-11-15 2:26 ` Alexei Starovoitov
2025-11-15 2:48 ` Eduard Zingerman
2025-11-15 2:26 ` [PATCH v2 bpf-next 2/2] selftests/bpf: Add tests for s>>=31 and s>>=63 Alexei Starovoitov
1 sibling, 1 reply; 7+ messages in thread
From: Alexei Starovoitov @ 2025-11-15 2:26 UTC (permalink / raw)
To: bpf; +Cc: daniel, andrii, martin.lau, sunhao.th, kernel-team
From: Alexei Starovoitov <ast@kernel.org>
cilium bpf_wiregard.bpf.c when compiled with -O1 fails to load
with the following verifier log:
192: (79) r2 = *(u64 *)(r10 -304) ; R2=pkt(r=40) R10=fp0 fp-304=pkt(r=40)
...
227: (85) call bpf_skb_store_bytes#9 ; R0=scalar()
228: (bc) w2 = w0 ; R0=scalar() R2=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff))
229: (c4) w2 s>>= 31 ; R2=scalar(smin=0,smax=umax=0xffffffff,smin32=-1,smax32=0,var_off=(0x0; 0xffffffff))
230: (54) w2 &= -134 ; R2=scalar(smin=0,smax=umax=umax32=0xffffff7a,smax32=0x7fffff7a,var_off=(0x0; 0xffffff7a))
...
232: (66) if w2 s> 0xffffffff goto pc+125 ; R2=scalar(smin=umin=umin32=0x80000000,smax=umax=umax32=0xffffff7a,smax32=-134,var_off=(0x80000000; 0x7fffff7a))
...
238: (79) r4 = *(u64 *)(r10 -304) ; R4=scalar() R10=fp0 fp-304=scalar()
239: (56) if w2 != 0xffffff78 goto pc+210 ; R2=0xffffff78 // -136
...
258: (71) r1 = *(u8 *)(r4 +0)
R4 invalid mem access 'scalar'
The error might confuse most bpf authors, since fp-304 slot had 'pkt'
pointer at insn 192 and became 'scalar' at 238. That happened because
bpf_skb_store_bytes() clears all packet pointers including those in
the stack. On the first glance it might look like a bug in the source
code, since ctx->data pointer should have been reloaded after the call
to bpf_skb_store_bytes().
The relevant part of cilium source code looks like this:
// bpf/lib/nodeport.h
int dsr_set_ipip6()
{
if (ctx_adjust_hroom(...))
return DROP_INVALID; // -134
if (ctx_store_bytes(...))
return DROP_WRITE_ERROR; // -141
return 0;
}
bool dsr_fail_needs_reply(int code)
{
if (code == DROP_FRAG_NEEDED) // -136
return true;
return false;
}
tail_nodeport_ipv6_dsr()
{
ret = dsr_set_ipip6(...);
if (!IS_ERR(ret)) {
...
} else {
if (dsr_fail_needs_reply(ret))
return dsr_reply_icmp6(...);
}
}
The code doesn't have arithmetic shift by 31 and it reloads ctx->data
every time it needs to access it. So it's not a bug in the source code.
The reason is DAGCombiner::foldSelectCCToShiftAnd() LLVM transformation:
// If this is a select where the false operand is zero and the compare is a
// check of the sign bit, see if we can perform the "gzip trick":
// select_cc setlt X, 0, A, 0 -> and (sra X, size(X)-1), A
// select_cc setgt X, 0, A, 0 -> and (not (sra X, size(X)-1)), A
The conditional branch in dsr_set_ipip6() and its return values
are optimized into BPF_ARSH plus BPF_AND:
227: (85) call bpf_skb_store_bytes#9
228: (bc) w2 = w0
229: (c4) w2 s>>= 31 ; R2=scalar(smin=0,smax=umax=0xffffffff,smin32=-1,smax32=0,var_off=(0x0; 0xffffffff))
230: (54) w2 &= -134 ; R2=scalar(smin=0,smax=umax=umax32=0xffffff7a,smax32=0x7fffff7a,var_off=(0x0; 0xffffff7a))
after insn 230 the register w2 can only be 0 or -134,
but the verifier approximates it, since there is no way to
represent two scalars in bpf_reg_state.
After fallthough at insn 232 the w2 can only be -134,
hence the branch at insn
239: (56) if w2 != -136 goto pc+210
should be always taken, and trapping insn 258 should never execute.
LLVM generated correct code, but the verifier follows impossible
path and rejects valid program. To fix this issue recognize this
special LLVM optimization and fork the verifier state.
So after insn 229: (c4) w2 s>>= 31
the verifier has two states to explore:
one with w2 = 0 and another with w2 = 0xffffffff
which makes the verifier accept bpf_wiregard.c
Note there are 20+ such patterns in bpf_wiregard.o compiled
with -O1 and -O2, but they're rarely seen in other production
bpf programs, so push_stack() approach is not a concern.
Reported-by: Hao Sun <sunhao.th@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
kernel/bpf/verifier.c | 32 ++++++++++++++++++++++++++++++++
1 file changed, 32 insertions(+)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 098dd7f21c89..c6e9bf38815a 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -15470,6 +15470,35 @@ static bool is_safe_to_compute_dst_reg_range(struct bpf_insn *insn,
}
}
+static int maybe_fork_scalars(struct bpf_verifier_env *env, struct bpf_insn *insn,
+ struct bpf_reg_state *dst_reg)
+{
+ struct bpf_verifier_state *branch;
+ struct bpf_reg_state *regs;
+ bool alu32;
+
+ if (dst_reg->smin_value == -1 && dst_reg->smax_value == 0)
+ alu32 = false;
+ else if (dst_reg->s32_min_value == -1 && dst_reg->s32_max_value == 0)
+ alu32 = true;
+ else
+ return 0;
+
+ branch = push_stack(env, env->insn_idx + 1, env->insn_idx, false);
+ if (IS_ERR(branch))
+ return PTR_ERR(branch);
+
+ regs = branch->frame[branch->curframe]->regs;
+ if (alu32) {
+ __mark_reg32_known(®s[insn->dst_reg], 0);
+ __mark_reg32_known(dst_reg, -1ull);
+ } else {
+ __mark_reg_known(®s[insn->dst_reg], 0);
+ __mark_reg_known(dst_reg, -1ull);
+ }
+ return 0;
+}
+
/* WARNING: This function does calculations on 64-bit values, but the actual
* execution may occur on 32-bit values. Therefore, things like bitshifts
* need extra checks in the 32-bit case.
@@ -15563,6 +15592,9 @@ static int adjust_scalar_min_max_vals(struct bpf_verifier_env *env,
scalar32_min_max_arsh(dst_reg, &src_reg);
else
scalar_min_max_arsh(dst_reg, &src_reg);
+ ret = maybe_fork_scalars(env, insn, dst_reg);
+ if (ret)
+ return ret;
break;
default:
break;
--
2.47.3
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH v2 bpf-next 2/2] selftests/bpf: Add tests for s>>=31 and s>>=63
2025-11-15 2:26 [PATCH v2 bpf-next 0/2] bpf: Recognize special arithmetic shift in the verifier Alexei Starovoitov
2025-11-15 2:26 ` [PATCH v2 bpf-next 1/2] " Alexei Starovoitov
@ 2025-11-15 2:26 ` Alexei Starovoitov
2025-11-15 2:53 ` Eduard Zingerman
2025-11-15 3:01 ` bot+bpf-ci
1 sibling, 2 replies; 7+ messages in thread
From: Alexei Starovoitov @ 2025-11-15 2:26 UTC (permalink / raw)
To: bpf; +Cc: daniel, andrii, martin.lau, sunhao.th, kernel-team
From: Alexei Starovoitov <ast@kernel.org>
Add tests for special arithmetic shift right.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
.../selftests/bpf/progs/verifier_subreg.c | 43 +++++++++++++++++++
1 file changed, 43 insertions(+)
diff --git a/tools/testing/selftests/bpf/progs/verifier_subreg.c b/tools/testing/selftests/bpf/progs/verifier_subreg.c
index 8613ea160dcd..0b572c067276 100644
--- a/tools/testing/selftests/bpf/progs/verifier_subreg.c
+++ b/tools/testing/selftests/bpf/progs/verifier_subreg.c
@@ -670,4 +670,47 @@ __naked void ldx_w_zero_extend_check(void)
: __clobber_all);
}
+SEC("socket")
+__description("s>>=31")
+__success __success_unpriv __retval(0)
+__naked void arsh_31(void)
+{
+ /* Below is what LLVM generates in cilium's bpf_wiregard.o */
+ asm volatile (" \
+ call %[bpf_get_prandom_u32]; \
+ w2 = w0; \
+ w2 s>>= 31; \
+ w2 &= -134; /* w2 becomes 0 or -134 */ \
+ if w2 s> -1 goto +2; \
+ if w2 != -136 goto +1; \
+ w0 /= 0; \
+ w0 = 0; \
+ exit; \
+" :
+ : __imm(bpf_get_prandom_u32)
+ : __clobber_all);
+}
+
+SEC("socket")
+__description("s>>=63")
+__success __success_unpriv __retval(0)
+__naked void arsh_63(void)
+{
+ /* Copy of arsh_31 with s/w/r/ */
+ asm volatile (" \
+ call %[bpf_get_prandom_u32]; \
+ r2 = r0; \
+ r2 <<= 32; \
+ r2 s>>= 63; \
+ r2 &= -134; \
+ if r2 s> -1 goto +2; \
+ if r2 != -136 goto +1; \
+ r0 /= 0; \
+ r0 = 0; \
+ exit; \
+" :
+ : __imm(bpf_get_prandom_u32)
+ : __clobber_all);
+}
+
char _license[] SEC("license") = "GPL";
--
2.47.3
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH v2 bpf-next 1/2] bpf: Recognize special arithmetic shift in the verifier
2025-11-15 2:26 ` [PATCH v2 bpf-next 1/2] " Alexei Starovoitov
@ 2025-11-15 2:48 ` Eduard Zingerman
0 siblings, 0 replies; 7+ messages in thread
From: Eduard Zingerman @ 2025-11-15 2:48 UTC (permalink / raw)
To: Alexei Starovoitov, bpf
Cc: daniel, andrii, martin.lau, sunhao.th, kernel-team
On Fri, 2025-11-14 at 18:26 -0800, Alexei Starovoitov wrote:
[...]
> 227: (85) call bpf_skb_store_bytes#9
> 228: (bc) w2 = w0
> 229: (c4) w2 s>>= 31 ; R2=scalar(smin=0,smax=umax=0xffffffff,smin32=-1,smax32=0,var_off=(0x0; 0xffffffff))
> 230: (54) w2 &= -134 ; R2=scalar(smin=0,smax=umax=umax32=0xffffff7a,smax32=0x7fffff7a,var_off=(0x0; 0xffffff7a))
>
> after insn 230 the register w2 can only be 0 or -134,
> but the verifier approximates it, since there is no way to
> represent two scalars in bpf_reg_state.
> After fallthough at insn 232 the w2 can only be -134,
> hence the branch at insn
> 239: (56) if w2 != -136 goto pc+210
> should be always taken, and trapping insn 258 should never execute.
> LLVM generated correct code, but the verifier follows impossible
> path and rejects valid program. To fix this issue recognize this
> special LLVM optimization and fork the verifier state.
> So after insn 229: (c4) w2 s>>= 31
> the verifier has two states to explore:
> one with w2 = 0 and another with w2 = 0xffffffff
> which makes the verifier accept bpf_wiregard.c
>
> Note there are 20+ such patterns in bpf_wiregard.o compiled
> with -O1 and -O2, but they're rarely seen in other production
> bpf programs, so push_stack() approach is not a concern.
>
> Reported-by: Hao Sun <sunhao.th@gmail.com>
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> ---
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
[...]
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v2 bpf-next 2/2] selftests/bpf: Add tests for s>>=31 and s>>=63
2025-11-15 2:26 ` [PATCH v2 bpf-next 2/2] selftests/bpf: Add tests for s>>=31 and s>>=63 Alexei Starovoitov
@ 2025-11-15 2:53 ` Eduard Zingerman
2025-11-15 3:01 ` bot+bpf-ci
1 sibling, 0 replies; 7+ messages in thread
From: Eduard Zingerman @ 2025-11-15 2:53 UTC (permalink / raw)
To: Alexei Starovoitov, bpf
Cc: daniel, andrii, martin.lau, sunhao.th, kernel-team
On Fri, 2025-11-14 at 18:26 -0800, Alexei Starovoitov wrote:
> From: Alexei Starovoitov <ast@kernel.org>
>
> Add tests for special arithmetic shift right.
>
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> ---
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
[...]
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v2 bpf-next 2/2] selftests/bpf: Add tests for s>>=31 and s>>=63
2025-11-15 2:26 ` [PATCH v2 bpf-next 2/2] selftests/bpf: Add tests for s>>=31 and s>>=63 Alexei Starovoitov
2025-11-15 2:53 ` Eduard Zingerman
@ 2025-11-15 3:01 ` bot+bpf-ci
2025-11-15 16:00 ` Alexei Starovoitov
1 sibling, 1 reply; 7+ messages in thread
From: bot+bpf-ci @ 2025-11-15 3:01 UTC (permalink / raw)
To: alexei.starovoitov, bpf
Cc: daniel, andrii, martin.lau, sunhao.th, kernel-team, ast, andrii,
daniel, martin.lau, eddyz87, yonghong.song, clm, ihor.solodrai
[-- Attachment #1: Type: text/plain, Size: 2527 bytes --]
> diff --git a/tools/testing/selftests/bpf/progs/verifier_subreg.c b/tools/testing/selftests/bpf/progs/verifier_subreg.c
> index 8613ea160..0b572c067 100644
> --- a/tools/testing/selftests/bpf/progs/verifier_subreg.c
> +++ b/tools/testing/selftests/bpf/progs/verifier_subreg.c
> @@ -670,4 +670,47 @@ __naked void ldx_w_zero_extend_check(void)
> : __clobber_all);
> }
>
> +SEC("socket")
> +__description("s>>=31")
> +__success __success_unpriv __retval(0)
> +__naked void arsh_31(void)
> +{
> + /* Below is what LLVM generates in cilium's bpf_wiregard.o */
> + asm volatile (" \
> + call %[bpf_get_prandom_u32]; \
> + w2 = w0; \
> + w2 s>>= 31; \
> + w2 &= -134; /* w2 becomes 0 or -134 */ \
> + if w2 s> -1 goto +2; \
> + if w2 != -136 goto +1; \
^^^^
Should this be -134 instead of -136? The comment on the previous line
states that w2 becomes either 0 or -134 after the AND operation, so w2
can never equal -136. This means the condition is always true and the
division by zero at "w0 /= 0" is never reachable.
In arsh_31(), after the arithmetic right shift by 31 bits, w2 will be
either 0 (if the sign bit was 0) or -1 (if the sign bit was 1). Then
"w2 &= -134" produces either 0 or -134, never -136.
> + w0 /= 0; \
> + w0 = 0; \
> + exit; \
> +" :
> + : __imm(bpf_get_prandom_u32)
> + : __clobber_all);
> +}
> +
> +SEC("socket")
> +__description("s>>=63")
> +__success __success_unpriv __retval(0)
> +__naked void arsh_63(void)
> +{
> + /* Copy of arsh_31 with s/w/r/ */
> + asm volatile (" \
> + call %[bpf_get_prandom_u32]; \
> + r2 = r0; \
> + r2 <<= 32; \
> + r2 s>>= 63; \
> + r2 &= -134; \
> + if r2 s> -1 goto +2; \
> + if r2 != -136 goto +1; \
^^^^
Same issue here - should this be -134? After the shift and AND
operations, r2 can only be 0 or -134, making the -136 check always true
and the "r0 /= 0" unreachable.
In arsh_63(), after left-shifting by 32 bits and then arithmetic
right-shifting by 63 bits, r2 becomes either 0 or -1. Then "r2 &= -134"
produces either 0 or -134, never -136.
> + r0 /= 0; \
> + r0 = 0; \
> + exit; \
> +" :
> + : __imm(bpf_get_prandom_u32)
> + : __clobber_all);
> +}
> +
> char _license[] SEC("license") = "GPL";
---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/19383126218
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v2 bpf-next 2/2] selftests/bpf: Add tests for s>>=31 and s>>=63
2025-11-15 3:01 ` bot+bpf-ci
@ 2025-11-15 16:00 ` Alexei Starovoitov
0 siblings, 0 replies; 7+ messages in thread
From: Alexei Starovoitov @ 2025-11-15 16:00 UTC (permalink / raw)
To: bot+bpf-ci
Cc: bpf, Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Hao Sun,
Kernel Team, Alexei Starovoitov, Eduard, Yonghong Song,
Chris Mason, Ihor Solodrai
On Fri, Nov 14, 2025 at 7:01 PM <bot+bpf-ci@kernel.org> wrote:
>
> > diff --git a/tools/testing/selftests/bpf/progs/verifier_subreg.c b/tools/testing/selftests/bpf/progs/verifier_subreg.c
> > index 8613ea160..0b572c067 100644
> > --- a/tools/testing/selftests/bpf/progs/verifier_subreg.c
> > +++ b/tools/testing/selftests/bpf/progs/verifier_subreg.c
> > @@ -670,4 +670,47 @@ __naked void ldx_w_zero_extend_check(void)
> > : __clobber_all);
> > }
> >
> > +SEC("socket")
> > +__description("s>>=31")
> > +__success __success_unpriv __retval(0)
> > +__naked void arsh_31(void)
> > +{
> > + /* Below is what LLVM generates in cilium's bpf_wiregard.o */
> > + asm volatile (" \
> > + call %[bpf_get_prandom_u32]; \
> > + w2 = w0; \
> > + w2 s>>= 31; \
> > + w2 &= -134; /* w2 becomes 0 or -134 */ \
> > + if w2 s> -1 goto +2; \
> > + if w2 != -136 goto +1; \
> ^^^^
>
> Should this be -134 instead of -136? The comment on the previous line
> states that w2 becomes either 0 or -134 after the AND operation, so w2
> can never equal -136. This means the condition is always true and the
> division by zero at "w0 /= 0" is never reachable.
exactly :)
As the side note it's nice to see that AI understands bpf assembly
and can do this logical reasoning.
Maybe the verifier should defer to AI for prog validation :)
> In arsh_31(), after the arithmetic right shift by 31 bits, w2 will be
> either 0 (if the sign bit was 0) or -1 (if the sign bit was 1). Then
> "w2 &= -134" produces either 0 or -134, never -136.
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2025-11-15 16:00 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-11-15 2:26 [PATCH v2 bpf-next 0/2] bpf: Recognize special arithmetic shift in the verifier Alexei Starovoitov
2025-11-15 2:26 ` [PATCH v2 bpf-next 1/2] " Alexei Starovoitov
2025-11-15 2:48 ` Eduard Zingerman
2025-11-15 2:26 ` [PATCH v2 bpf-next 2/2] selftests/bpf: Add tests for s>>=31 and s>>=63 Alexei Starovoitov
2025-11-15 2:53 ` Eduard Zingerman
2025-11-15 3:01 ` bot+bpf-ci
2025-11-15 16:00 ` Alexei Starovoitov
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox