From: Eduard Zingerman <eddyz87@gmail.com>
To: "Jose E. Marchesi" <jose.marchesi@oracle.com>,
Yonghong Song <yhs@meta.com>,
Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>,
bpf@vger.kernel.org, ast@kernel.org, andrii@kernel.org,
daniel@iogearbox.net, kernel-team@fb.com, yhs@fb.com,
david.faust@oracle.com,
James Hilliard <james.hilliard1@gmail.com>
Subject: Re: [RFC bpf-next 0/5] Support for BPF_ST instruction in LLVM C compiler
Date: Fri, 13 Jan 2023 18:47:36 +0200 [thread overview]
Message-ID: <f1e4282bf00aa21a72fc5906f8c3be1ae6c94a5e.camel@gmail.com> (raw)
In-Reply-To: <87a62mhl3m.fsf@oracle.com>
On Fri, 2023-01-13 at 09:53 +0100, Jose E. Marchesi wrote:
> > On 1/12/23 2:27 PM, Alexei Starovoitov wrote:
> > > On Thu, Jan 05, 2023 at 02:07:05PM +0200, Eduard Zingerman wrote:
> > > > On Thu, 2023-01-05 at 11:06 +0100, Jose E. Marchesi wrote:
> > > > > > On Sat, Dec 31, 2022 at 8:31 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > > > > > >
> > > > > > > BPF has two documented (non-atomic) memory store instructions:
> > > > > > >
> > > > > > > BPF_STX: *(size *) (dst_reg + off) = src_reg
> > > > > > > BPF_ST : *(size *) (dst_reg + off) = imm32
> > > > > > >
> > > > > > > Currently LLVM BPF back-end does not emit BPF_ST instruction and does
> > > > > > > not allow one to be specified as inline assembly.
> > > > > > >
> > > > > > > Recently I've been exploring ways to port some of the verifier test
> > > > > > > cases from tools/testing/selftests/bpf/verifier/*.c to use inline assembly
> > > > > > > and machinery provided in tools/testing/selftests/bpf/test_loader.c
> > > > > > > (which should hopefully simplify tests maintenance).
> > > > > > > The BPF_ST instruction is popular in these tests: used in 52 of 94 files.
> > > > > > >
> > > > > > > While it is possible to adjust LLVM to only support BPF_ST for inline
> > > > > > > assembly blocks it seems a bit wasteful. This patch-set contains a set
> > > > > > > of changes to verifier necessary in case when LLVM is allowed to
> > > > > > > freely emit BPF_ST instructions (source code is available here [1]).
> > > > > >
> > > > > > Would we gate LLVM's emitting of BPF_ST for C code behind some new
> > > > > > cpu=v4? What is the benefit for compiler to start automatically emit
> > > > > > such instructions? Such thinking about logistics, if there isn't much
> > > > > > benefit, as BPF application owner I wouldn't bother enabling this
> > > > > > behavior risking regressions on old kernels that don't have these
> > > > > > changes.
> > > > >
> > > > > Hmm, GCC happily generates BPF_ST instructions:
> > > > >
> > > > > $ echo 'int v; void foo () { v = 666; }' | bpf-unknown-none-gcc -O2 -xc -S -o foo.s -
> > > > > $ cat foo.s
> > > > > .file "<stdin>"
> > > > > .text
> > > > > .align 3
> > > > > .global foo
> > > > > .type foo, @function
> > > > > foo:
> > > > > lddw %r0,v
> > > > > stw [%r0+0],666
> > > > > exit
> > > > > .size foo, .-foo
> > > > > .global v
> > > > > .type v, @object
> > > > > .lcomm v,4,4
> > > > > .ident "GCC: (GNU) 12.0.0 20211206 (experimental)"
> > > > >
> > > > > Been doing that since October 2019, I think before the cpu versioning
> > > > > mechanism was got in place?
> > > > >
> > > > > We weren't aware this was problematic. Does the verifier reject such
> > > > > instructions?
> > > >
> > > > Interesting, do BPF selftests generated by GCC pass the same way they
> > > > do if generated by clang?
> > > >
> > > > I had to do the following changes to the verifier to make the
> > > > selftests pass when BPF_ST instruction is allowed for selection:
> > > >
> > > > - patch #1 in this patchset: track values of constants written to
> > > > stack using BPF_ST. Currently these are tracked imprecisely, unlike
> > > > the writes using BPF_STX, e.g.:
> > > > fp[-8] = 42; currently verifier assumes that
> > > > fp[-8]=mmmmmmmm
> > > > after such instruction, where m stands for "misc",
> > > > just a note that something is written at fp[-8].
> > > > r1 = 42; verifier tracks r1=42 after
> > > > this instruction.
> > > > fp[-8] = r1; verifier tracks fp[-8]=42 after this instruction.
> > > >
> > > > So the patch makes both cases equivalent.
> > > > - patch #3 in this patchset: adjusts
> > > > verifier.c:convert_ctx_access()
> > > > to operate on BPF_ST alongside BPF_STX.
> > > > Context parameters for some BPF programs types are "fake"
> > > > data
> > > > structures. The verifier matches all BPF_STX and BPF_LDX
> > > > instructions that operate on pointers to such contexts and rewrites
> > > > these instructions. It might change an offset or add another layer
> > > > of indirection, etc. E.g. see filter.c:bpf_convert_ctx_access().
> > > > (This also implies that verifier forbids writes to non-constant
> > > > offsets inside such structures).
> > > > So the patch extends this logic to also handle BPF_ST.
> > > The patch 3 is necessary to land before llvm starts generating 'st'
> > > for ctx access.
> > > That's clear, but I'm missing why patch 1 is necessary.
> > > Sure, it's making the verifier understand scalar spills with 'st' and
> > > makes 'st' equivalent to 'stx', but I'm missing why it's necessary.
> > > What kind of programs fail to be verified when llvm starts generating 'st' ?
I should have added an example to the summary. There are a few
test_prog tests that fail w/o this patch, namely atomic_bounds,
dynptr, for_each, xdp_noinline, xdp_synproxy.
Here is how atomic_bounds looks:
SEC("fentry/bpf_fentry_test1")
int BPF_PROG(sub, int x)
{
int a = 0;
int b = __sync_fetch_and_add(&a, 1);
/* b is certainly 0 here. Can the verifier tell? */
while (b)
continue;
return 0;
}
Compiled w/o BPF_ST Compiled with BPF_ST
<sub>: <sub>:
w1 = 0x0
*(u32 *)(r10 - 0x4) = r1 *(u32 *)(r10 - 0x4) = 0x0
w1 = 0x1 w1 = 0x1
lock *(u32 *)(r10 - 0x4) += r1 lock *(u32 *)(r10 - 0x4) += r1
if w1 == 0x0 goto +0x1 <LBB0_2> if w1 == 0x0 goto +0x1 <LBB0_2>
<LBB0_1>: <LBB0_1>:
goto -0x1 <LBB0_1> goto -0x1 <LBB0_1>
<LBB0_2>: <LBB0_2>:
w0 = 0x0 w0 = 0x0
exit exit
When compiled with BPF_ST and verified w/o the patch #1 verification log
looks as follows:
0: R1=ctx(off=0,imm=0) R10=fp0
0: (62) *(u32 *)(r10 -4) = 0 ; R10=fp0 fp-8=mmmm????
1: (b4) w1 = 1 ; R1_w=1
2: (c3) r1 = atomic_fetch_add((u32 *)(r10 -4), r1) ; R1_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) R10=fp0 fp-8=mmmm????
3: (16) if w1 == 0x0 goto pc+1 ; R1_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff))
4: (05) goto pc-1
4: (05) goto pc-1
4: (05) goto pc-1
4: (05) goto pc-1
infinite loop detected at insn 4
When compiled w/o BPF_ST and verified w/o the patch #1 verification log
looks as follows:
func#0 @0
reg type unsupported for arg#0 function sub#5
0: R1=ctx(off=0,imm=0) R10=fp0
0: (b4) w1 = 0 ; R1_w=0
1: (63) *(u32 *)(r10 -4) = r1
last_idx 1 first_idx 0
regs=2 stack=0 before 0: (b4) w1 = 0
2: R1_w=0 R10=fp0 fp-8=0000????
2: (b4) w1 = 1 ; R1_w=1
; int b = __sync_fetch_and_add(&a, 1);
3: (c3) r1 = atomic_fetch_add((u32 *)(r10 -4), r1) ; R1_w=P0 R10=fp0 fp-8=mmmm????
4: (16) if w1 == 0x0 goto pc+1
6: R1_w=P0
6: (b4) w0 = 0 ; R0_w=0
7: (95) exit
processed 7 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
The difference comes from the way zero write to `r10-4` is processed,
with BPF_ST it is tracked as `fp-8=mmmm????` after write, without BPF_ST
it is tracked as `fp-8=0000???? after` write.
Which is caused by the way `check_stack_write_fixed_off()` is written.
For the register spills it either saves the complete register state or
STACK_ZERO if register is known to be zero. However, for the BPF_ST it
just saves STACK_MISC. Hence, the patch #1.
> > > Regarind -mcpu=v4.
> > > I think we need to add all of our upcoming instructions as a single flag.
> > > Otherwise we'll have -mcpu=v5,v6,v7 and full combinations of them.
> > > -mcpu=v4 could mean:
> > > - ST
> > > - sign extending loads
> > > - sign extend a register
> > > - 32-bit JA
> > > - proper bswap insns: bswap16, bswap32, bswap64
> > > The sign and 32-bit JA we've discussed earlier.
> > > The bswap was on my wish list forever.
> > > The existing TO_LE, TO_BE insns are really odd from compiler pov.
> > > The compiler should translate bswap IR op into proper bswap insn
> > > just like it does on all cpus.
> > > Maybe add SDIV to -mcpu=v4 as well?
There is also BPF_JSET "PC += off if dst & src" which is not currently
emitted by LLVM backend as far as I can tell.
> >
> > Right, we should add these insns in llvm17 with -mcpu=v4, so we
> > can keep the number of cpu generations minimum.
>
> How do you plan to encode the sign-extend load instructions?
>
> I guess a possibility would be to use one of the available op-mode for
> load instructions that are currently marked as reserved. For example:
>
> IMM = 0b000
> ABS = 0b001
> IND = 0b010
> MEM = 0b011
> SEM = 0b100 <- new
>
> Then we would have the following code fields for sign-extending LDX
> instructions:
>
> op-mode:SEM op-size:{W,H,B,DW} op-class:LDX
I second the Jose's question about encoding for the new instructions.
> - sign extend a register
E.g. like this:
BPF_SEXT = 0xb0
opcode: BPF_SEXT | BPF_X | BPF_ALU
imm32: {8,16,32} // to be consistent with BPF_END insn
or BPF_{B,H,W} // to be consistent with LDX/STX
Sign extend 8,16,32-bit value from src to 64-bit dst register:
src = sext{8,16,32}(dst)
> The sign and 32-bit JA we've discussed earlier.
Could you please share a link for this discussion?
> - proper bswap insns: bswap16, bswap32, bswap64
Could you please extrapolate on this.
Current instructions operate on a single register, e.g.:
dst_reg = htole16(dst_reg).
It looks like this could be reflected in x86 assembly as a single
instruction using either bswap or xchg instructions (see [1] and [2]).
x86/net/bpf_jit_comp.c:1266 can probably be adjusted to use xchg for
16-bit case.
For ARM rev16, rev32, rev64 allow to use the same register for source
and destination ([3]).
[1] https://www.felixcloutier.com/x86/bswap
[2] https://www.felixcloutier.com/x86/xchg
[3] https://developer.arm.com/documentation/ddi0602/2022-06/Base-Instructions/REV16--Reverse-bytes-in-16-bit-halfwords-?lang=en
https://developer.arm.com/documentation/ddi0602/2022-06/Base-Instructions/REV32--Reverse-bytes-in-32-bit-words-?lang=en
https://developer.arm.com/documentation/ddi0602/2022-06/Base-Instructions/REV64--Reverse-Bytes--an-alias-of-REV-?lang=en
next prev parent reply other threads:[~2023-01-13 16:49 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-12-31 16:31 [RFC bpf-next 0/5] Support for BPF_ST instruction in LLVM C compiler Eduard Zingerman
2022-12-31 16:31 ` [RFC bpf-next 1/5] bpf: more precise stack write reasoning for BPF_ST instruction Eduard Zingerman
2022-12-31 16:31 ` [RFC bpf-next 2/5] selftests/bpf: check if verifier tracks constants spilled by BPF_ST_MEM Eduard Zingerman
2022-12-31 16:31 ` [RFC bpf-next 3/5] bpf: allow ctx writes using BPF_ST_MEM instruction Eduard Zingerman
2022-12-31 16:31 ` [RFC bpf-next 4/5] selftests/bpf: test if pointer type is tracked for BPF_ST_MEM Eduard Zingerman
2022-12-31 16:31 ` [RFC bpf-next 5/5] selftests/bpf: don't match exact insn index in expected error message Eduard Zingerman
2023-01-04 22:10 ` [RFC bpf-next 0/5] Support for BPF_ST instruction in LLVM C compiler Andrii Nakryiko
2023-01-05 10:06 ` Jose E. Marchesi
2023-01-05 12:07 ` Eduard Zingerman
2023-01-05 15:07 ` Jose E. Marchesi
2023-01-12 22:27 ` Alexei Starovoitov
2023-01-13 8:02 ` Yonghong Song
2023-01-13 8:53 ` Jose E. Marchesi
2023-01-13 16:47 ` Eduard Zingerman [this message]
2023-01-26 5:49 ` Alexei Starovoitov
2023-01-13 19:23 ` Yonghong Song
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=f1e4282bf00aa21a72fc5906f8c3be1ae6c94a5e.camel@gmail.com \
--to=eddyz87@gmail.com \
--cc=alexei.starovoitov@gmail.com \
--cc=andrii.nakryiko@gmail.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=david.faust@oracle.com \
--cc=james.hilliard1@gmail.com \
--cc=jose.marchesi@oracle.com \
--cc=kernel-team@fb.com \
--cc=yhs@fb.com \
--cc=yhs@meta.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox