From: Jiong Wang <jiong.wang@netronome.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Jiong Wang <jiong.wang@netronome.com>,
Daniel Borkmann <daniel@iogearbox.net>,
bpf@vger.kernel.org, netdev@vger.kernel.org,
oss-drivers@netronome.com
Subject: Re: [PATCH/RFC bpf-next 06/16] bpf: new sysctl "bpf_jit_32bit_opt"
Date: Wed, 27 Mar 2019 19:13:25 +0000 [thread overview]
Message-ID: <874l7oruzu.fsf@netronome.com> (raw)
In-Reply-To: <20190327174530.tyrz335ikudvybi7@ast-mbp>
Alexei Starovoitov writes:
> On Wed, Mar 27, 2019 at 05:18:35PM +0000, Jiong Wang wrote:
>>
>> > On 27 Mar 2019, at 17:17, Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
>> >
>> > On Wed, Mar 27, 2019 at 05:06:01PM +0000, Jiong Wang wrote:
>> >>
>> >>> On 27 Mar 2019, at 17:00, Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
>> >>>
>> >>> On Tue, Mar 26, 2019 at 06:05:29PM +0000, Jiong Wang wrote:
>> >>>> After previous patches, verifier has marked those instructions that really
>> >>>> need zero extension on dst_reg.
>> >>>>
>> >>>> It is then for all back-ends to decide how to use such information to
>> >>>> eliminate unnecessary zero extension codegen during JIT compilation.
>> >>>>
>> >>>> One approach is:
>> >>>> 1. Verifier insert explicit zero extension for those instructions that
>> >>>> need zero extension.
>> >>>> 2. All JIT back-ends do NOT generate zero extension for sub-register
>> >>>> write any more.
>> >>>>
>> >>>> The good thing for this approach is no major change on JIT back-end
>> >>>> interface, all back-ends could get this optimization.
>> >>>>
>> >>>> However, only those back-ends that do not have hardware zero extension
>> >>>> want this optimization. For back-ends like x86_64 and AArch64, there is
>> >>>> hardware support, so this optimization should be disabled.
>> >>>>
>> >>>> This patch introduces new sysctl "bpf_jit_32bit_opt" which is the control
>> >>>> variable for whether the optimization should be enabled.
>> >>>>
>> >>>> It is initialized using target hook bpf_jit_hardware_zext which is default
>> >>>> true, meaning the underlying hardware will do zero extension automatically,
>> >>>> therefore the optimization will be disabled.
>> >>>>
>> >>>> Offload targets do not use this native target hook, instead, they could
>> >>>> get the optimization results using bpf_prog_offload_ops.finalize.
>> >>>>
>> >>>> The user could always enable or disable the optimization by using:
>> >>>>
>> >>>> sysctl net/core/bpf_jit_32bit_opt=[0 | 1]
>> >>>
>> >>> I don't think there should be a sysctl for this.
>> >>
>> >> The sysctl introduced mostly because I think it could be useful for testing.
>> >> For example on x86_64, with this sysctl, we can enable the optimisation and
>> >> can run selftest.
>> >>
>> >> Does this make sense?
>> >>
>> >> Or when one insn is marked, we print verbose info, so the tester could catch
>> >> it from log?
>> >
>> > sysctl in this patch only triggers insertion of shifts.
>> > what kind of testing does it enable on x64?
>> > The writing insn is already 32-bit and hw does zero extend.
>> > These two shifts is always a nop?
>> > a sysctl to test that the verifier inserted shifts in the right place?
>>
>> Yes, that’s the test methodology I am using. Match the instruction sequence after
>> shifts insertion.
>
> I see. I don't think such extra shifts right after hw zero extend will catch much.
> imo it would be better to populate upper 32-bit with random values on x64
> where verifier analysis showed that it's ok to do so.
Sound like a good idea, indeed gives much more stressful test on x64, and
if all tests passed under test_progs + -mattr=+alu32, then could be very
good assurance on the correctness.
> Such extra insns can be inserted by the verifier. Since such debugging
> has run-time cost we'd need a flag to turn it on.
> May be a new flag during prog load instead of sysctl?
OK, I will explore on this line, see if could have a clean solution.
> It can be a global switch inside libbpf, so test_verifier and test_progs
> wouldn't need to pass it everywhere explictly. It would double the test time,
> but it's worth doing always on all archs. Especially on x64.
>
> other thoughts...
> I guess it's ok to stick with shifts for now.
> Introducing new insn would be nice, but we can do it later.
> Changing all jits for this new insn as pre-patch to this set is too much.
+1
> peephole to convert shifts is probably useful regardless.
> bpf backend emits a bunch of useless shifts when alu32 is not used.
> Would be great if x86 jit can optimize it for such lazy users
> (and users who don't upgrade llvm fast enough or don't know about alu32)
Will do some checks on generic eBPF code-gen later to see how much peephole
opportunities there are.
Regards,
Jiong
next prev parent reply other threads:[~2019-03-27 19:13 UTC|newest]
Thread overview: 39+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-03-26 18:05 [PATCH/RFC bpf-next 00/16] bpf: eliminate zero extensions for sub-register writes Jiong Wang
2019-03-26 18:05 ` [PATCH/RFC bpf-next 01/16] bpf: turn "enum bpf_reg_liveness" into bit representation Jiong Wang
2019-03-27 15:44 ` Alexei Starovoitov
2019-03-26 18:05 ` [PATCH/RFC bpf-next 02/16] bpf: refactor propagate_live implementation Jiong Wang
2019-03-26 18:26 ` Jann Horn
2019-03-26 19:45 ` Jiong Wang
2019-03-27 16:35 ` Alexei Starovoitov
2019-03-27 16:44 ` Jiong Wang
2019-03-26 18:05 ` [PATCH/RFC bpf-next 03/16] bpf: split read liveness into REG_LIVE_READ64 and REG_LIVE_READ32 Jiong Wang
2019-03-26 20:21 ` Jann Horn
2019-03-26 20:50 ` Jiong Wang
2019-03-27 16:38 ` Alexei Starovoitov
2019-03-26 18:05 ` [PATCH/RFC bpf-next 04/16] bpf: mark sub-register writes that really need zero extension to high bits Jiong Wang
2019-03-26 18:44 ` Edward Cree
2019-03-26 19:47 ` Jiong Wang
2019-04-05 20:44 ` Jiong Wang
2019-04-06 3:41 ` Alexei Starovoitov
2019-04-06 6:56 ` Jiong Wang
2019-04-07 2:51 ` Alexei Starovoitov
2019-03-27 16:50 ` Alexei Starovoitov
2019-03-27 17:06 ` Jiong Wang
2019-03-26 18:05 ` [PATCH/RFC bpf-next 05/16] bpf: reduce false alarm by refining "enum bpf_arg_type" Jiong Wang
2019-03-26 18:05 ` [PATCH/RFC bpf-next 06/16] bpf: new sysctl "bpf_jit_32bit_opt" Jiong Wang
2019-03-27 17:00 ` Alexei Starovoitov
2019-03-27 17:06 ` Jiong Wang
2019-03-27 17:17 ` Alexei Starovoitov
2019-03-27 17:18 ` Jiong Wang
2019-03-27 17:45 ` Alexei Starovoitov
2019-03-27 19:13 ` Jiong Wang [this message]
2019-03-26 18:05 ` [PATCH/RFC bpf-next 07/16] bpf: insert explicit zero extension instructions when bpf_jit_32bit_opt is true Jiong Wang
2019-03-26 18:05 ` [PATCH/RFC bpf-next 08/16] arm: bpf: eliminate zero extension code-gen Jiong Wang
2019-03-26 18:05 ` [PATCH/RFC bpf-next 09/16] powerpc: " Jiong Wang
2019-03-26 18:05 ` [PATCH/RFC bpf-next 10/16] s390: " Jiong Wang
2019-03-26 18:05 ` [PATCH/RFC bpf-next 11/16] sparc: " Jiong Wang
2019-03-26 18:05 ` [PATCH/RFC bpf-next 12/16] x32: " Jiong Wang
2019-03-26 18:05 ` [PATCH/RFC bpf-next 13/16] riscv: " Jiong Wang
2019-03-26 18:05 ` [PATCH/RFC bpf-next 14/16] nfp: " Jiong Wang
2019-03-26 18:05 ` [PATCH/RFC bpf-next 15/16] selftests: bpf: new field "xlated_insns" for insn scan test after verification Jiong Wang
2019-03-26 18:05 ` [PATCH/RFC bpf-next 16/16] selftests: bpf: unit testcases for zero extension insertion pass Jiong Wang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=874l7oruzu.fsf@netronome.com \
--to=jiong.wang@netronome.com \
--cc=alexei.starovoitov@gmail.com \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=netdev@vger.kernel.org \
--cc=oss-drivers@netronome.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.