From: Yonghong Song <yonghong.song@linux.dev>
To: Vineet Gupta <vineet.gupta@linux.dev>, bpf@gcc.gnu.org
Cc: jose.marchesi@oracle.com, ast@kernel.org,
Eduard Zingerman <eddyz87@gmail.com>,
David Faust <david.faust@oracle.com>, bpf <bpf@vger.kernel.org>
Subject: Re: Verifier aspects of (was Re: gcc [RFC] bpf: don't synthesize sign_extend for ISA v4)
Date: Wed, 1 Jul 2026 10:58:01 -0700 [thread overview]
Message-ID: <1585500e-ad00-427e-91f7-7beb60303a0b@linux.dev> (raw)
In-Reply-To: <992e0697-f0da-4bfd-8178-abb206410e9d@linux.dev>
On 6/30/26 11:05 AM, Vineet Gupta wrote:
> On 4/13/26 11:58 AM, Vineet Gupta wrote:
>> Currently the extendsidi2 expander generates a shift left+right to
>> materialize a sign_extend. This is not needed for ISA v4 which
>> natively supports the sign extending move.
>>
>> There are various other reasons for fixing this:
>>
>> - The shifts are throwaway work since Combine subsequently undoes them
>> anyways, while diminishing potential opportunities for other optim
>> transformations it could have done, given to its own limitations of
>> max 3 -> 2 combinations.
>>
>> - Interim passes scuh as CSE1 can also "see thru" a native
>> sign_extend:DI
>> better than the shifts, again opening up more optimization
>> opportunites.
>>
>> - It also helps slightly with debugging Expand dumps as sign_extend is
>> obvious vs. 2 shifts.
>>
>> The fix itself is easy: just gate the expander on non-availabilty of
>> smov
>> which in turn is keyed off of -msmov or -mcpu.
>>
>> Before After
>> ----------------------------------------+---------------------------------------
>>
>> (insn 8 7 9 2 (set (reg:DI 19 [ _1 ]) | (insn 8 7 9 2 (set (reg:DI
>> 19 [ _1 ])
>> (ashift:DI (subreg:DI (reg:SI 24) 0) | (sign_extend:DI
>> (reg:SI 24)))
>> (const_int 32 [0x20]))) | (nil))
>> (nil)) |
>> (insn 9 8 10 2 (set (reg:DI 19 [ _1 ]) |
>> (ashiftrt:DI (reg:DI 19 [ _1 ]) |
>> (const_int 32 [0x20]))) |
>> (nil)) |
>>
>> This change is clean running bpf.exp testsuite tests.
>>
>> The complication comes from a subtle yet very important aspect, which is
>> in the old regime, expander unconditionally forced src operand to
>> DImode.
>>
>> | operands[1] = gen_lowpart (DImode, operands[1])
>>
>> With the patch, src remains SImode which exposes some latent issues in
>> the backend and/or "impedance mismatch [1]" with the kernel verifier
>> and it
>> reports following additional fails vs. gcc trunk against kernel
>> bpf-next [2]
>>
>> | #12 attach_probe:FAIL
>> | #74 cgroup_xattr:FAIL
>> | #122 file_reader:FAIL
>> | #412 sockopt_inherit:FAIL
>> | #413 sockopt_multi:FAIL
>
> For sockopt_multi, the trunk version we have the following snippet:
>
> 0: (81) r0 = *(s32 *)(r1 +24) ;
> R0=scalar(smin=0xffffffff80000000,smax=0x7fffffff) R1=ctx()
> 1: (55) if r0 != 0x0 goto pc+10 ; R0=0
> ...
> 11: (95) exit
>
> Insn 1 branch is not taken, so upon exit, verifier is happy that r0 == 0.
>
> W/ the patch, codegen changes slightly
>
> 0: (61) r2 = *(u32 *)(r1 +24) ; R1=ctx()
> R2=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff))
> 1: (bf) r0 = (s32)r2 ;
> R0=scalar(smin=0xffffffff80000000,smax=0x7fffffff)
> R2=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff))
> 2: (56) if w2 != 0x0 goto pc+10 ; R2=0
> ...
> 12: (95) exit
>
> Here's verifier can't conclude that r0 is 0 or 1 and complains.
>
> For sign extending move, verifier: check_alu_op () can't establish
> that truncated+extended r2 value (r0) and r2 are and equivalent and
> delinks the scalar ids. So on insn 2 it can deduce that r2 == 0, it
> can't be backpropagated due to delinking of the regs. We initially
> thought this would be a simple "bug fix" in the verifier. But it seems
> this requires additional state tracking - currently all 64-bits are
> tracked with copy_register_state, this need to be broken down into
> separate lo/hi bits with additional link kind similar to BPF_ADD_CONST
> - will that be acceptable ?
>
> What's also interesting is for a reduced version the codegen
> difference between gcc trunk and clang.
> The first load is a regular signed int access so technically it needs
> to be either *(s32 *) or a *(u32*) + (s32).
> Clang seems to be eliding sign extension altogether [1] and only
> generating *(u32 *)
> Any thoughts why clang can do that ?
The clang didn't use s32 becomes there is no need.
Related code:
struct bpf_sockopt {
int level, optname, optlen, retval;
void *optval_end, *optval;
};
...
if (ctx->level != 0 || ctx->optname != 1)
goto out;
In such case, there is NO sign-extenstion need from 32bit to 64bit.
So llvm decides to just use unsigned load (LDW32).
>
> [1] https://godbolt.org/z/fsdrExPWd
>
> Thx,
> -Vineet
>
>> | #415 sockopt_sk:FAIL
>> | #517 uprobe_multi_test:FAIL
>> | #519 usdt:FAIL
>> | #649 verif_scale_pyperf600_iter:FAIL
>>
>> These fallout issues need to be addressed first but as discussed in the
>> bpf patchworks call this morning, I'm sending this out as an RFC.
>>
>> [1] https://devblogs.microsoft.com/oldnewthing/20180123-00/?p=97865
>> [2] 2026-02-18 f620af11c27b ("xsk: avoid double checking against rx
>> queue being full")
>>
>> gcc/ChangeLog:
>>
>> * config/bpf/bpf.md (define_expand extendsidi2): Only emit shifts
>> if !bpf_has_smov.
>>
>> gcc/testsuite/ChangeLog:
>>
>> * gcc.target/bpf/zero-ext.c: Check Expand dumps to ensure
>> sign_extend is present and two shifts are not.
>>
>> Signed-off-by: Vineet Gupta <vineet.gupta@linux.dev>
>> ---
>> gcc/config/bpf/bpf.md | 12 ++++++++----
>> gcc/testsuite/gcc.target/bpf/zero-ext.c | 5 ++++-
>> 2 files changed, 12 insertions(+), 5 deletions(-)
>>
>> diff --git a/gcc/config/bpf/bpf.md b/gcc/config/bpf/bpf.md
>> index a2bceb8998d7..1431bfd225e0 100644
>> --- a/gcc/config/bpf/bpf.md
>> +++ b/gcc/config/bpf/bpf.md
>> @@ -308,16 +308,20 @@
>> ;; Sign-extending a 32-bit value into a 64-bit value is achieved
>> using
>> ;; shifting, with instructions generated by the expand below.
>> +;; Only needed for ISA V3 and prior builds.
>> (define_expand "extendsidi2"
>> [(set (match_operand:DI 0 "register_operand")
>> (sign_extend:DI (match_operand:SI 1 "register_operand")))]
>> ""
>> {
>> - operands[1] = gen_lowpart (DImode, operands[1]);
>> - emit_insn (gen_ashldi3 (operands[0], operands[1], GEN_INT (32)));
>> - emit_insn (gen_ashrdi3 (operands[0], operands[0], GEN_INT (32)));
>> - DONE;
>> + if (!bpf_has_smov)
>> + {
>> + operands[1] = gen_lowpart (DImode, operands[1]);
>> + emit_insn (gen_ashldi3 (operands[0], operands[1], GEN_INT (32)));
>> + emit_insn (gen_ashrdi3 (operands[0], operands[0], GEN_INT (32)));
>> + DONE;
>> + }
>> })
>
>
prev parent reply other threads:[~2026-07-01 17:58 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20260413185834.58706-1-vineet.gupta@linux.dev>
2026-06-30 18:05 ` Verifier aspects of (was Re: gcc [RFC] bpf: don't synthesize sign_extend for ISA v4) Vineet Gupta
2026-07-01 17:58 ` Yonghong Song [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1585500e-ad00-427e-91f7-7beb60303a0b@linux.dev \
--to=yonghong.song@linux.dev \
--cc=ast@kernel.org \
--cc=bpf@gcc.gnu.org \
--cc=bpf@vger.kernel.org \
--cc=david.faust@oracle.com \
--cc=eddyz87@gmail.com \
--cc=jose.marchesi@oracle.com \
--cc=vineet.gupta@linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox