From: Gary Lin <glin@suse.com>
To: Daniel Borkmann <daniel@iogearbox.net>
Cc: netdev@vger.kernel.org, bpf@vger.kernel.org,
	Alexei Starovoitov <ast@kernel.org>,
	Eric Dumazet <eric.dumazet@gmail.com>,
	andreas.taschner@suse.com
Subject: Re: [PATCH] bpf,x64: pad NOPs to make images converge more easily
Date: Mon, 14 Dec 2020 16:15:17 +0800
Message-ID: <X9cfFVKMFwtKdbNS@GaryWorkstation>
In-Reply-To: <X9biZvkGPfslPOL4@GaryWorkstation>

On Mon, Dec 14, 2020 at 11:56:22AM +0800, Gary Lin wrote:
> On Fri, Dec 11, 2020 at 09:05:05PM +0100, Daniel Borkmann wrote:
> > On 12/11/20 9:19 AM, Gary Lin wrote:
> > > The x64 bpf jit expects bpf images to converge within the given number
> > > of passes, but it can fail to do so in some corner cases. For example:
> > > 
> > >        l0:     ldh [4]
> > >        l1:     jeq #0x537d, l2, l40
> > >        l2:     ld [0]
> > >        l3:     jeq #0xfa163e0d, l4, l40
> > >        l4:     ldh [12]
> > >        l5:     ldx #0xe
> > >        l6:     jeq #0x86dd, l41, l7
> > >        l8:     ld [x+16]
> > >        l9:     ja 41
> > > 
> > >          [... repeated ja 41 ]
> > > 
> > >        l40:    ja 41
> > >        l41:    ret #0
> > >        l42:    ld #len
> > >        l43:    ret a
> > > 
> > > This bpf program contains 32 "ja 41" instructions which are effectively
> > > NOPs and designed to be replaced with valid code dynamically. Ideally,
> > > bpf jit should optimize those "ja 41" instructions out when translating
> > > the bpf instructions into x86_64 machine code. However, do_jit() can
> > > only remove one "ja 41" for offset==0 on each pass, so it requires at
> > > least 32 runs to eliminate those JMPs and exceeds the current limit of
> > > passes (20). In the end, the program got rejected when BPF_JIT_ALWAYS_ON
> > > is set even though it's legit as a classic socket filter.
> > > 
> > > To make the image more likely to converge within 20 passes, this commit
> > > pads some instructions with NOPs in the last 5 passes:
> > > 
> > > 1. conditional jumps
> > >    A possible size variance comes from the adoption of imm8 JMP. If the
> > >    offset is imm8, we calculate the size difference of this BPF instruction
> > >    between the previous pass and the current pass and fill the gap with NOPs.
> > >    To avoid the recalculation of jump offset, those NOPs are inserted before
> > >    the JMP code, so we have to subtract the 2 bytes of imm8 JMP when
> > >    calculating the NOP number.
> > > 
> > > 2. BPF_JA
> > >    There are two conditions for BPF_JA.
> > >    a.) nop jumps
> > >      If this instruction is not optimized out in the previous pass,
> > >      instead of removing it, we insert the equivalent size of NOPs.
> > >    b.) label jumps
> > >      Similar to conditional jumps, we prepend NOPs right before the JMP
> > >      code.
> > > 
> > > To make the code concise, emit_nops() is modified to use the signed len and
> > > return the number of inserted NOPs.
> > > 
> > > To support bpf-to-bpf, a new flag, padded, is introduced to 'struct bpf_prog'
> > > so that bpf_int_jit_compile() could know if the program is padded or not.
> > 
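
To illustrate the padding rule for imm8 jumps described in the commit
message, here is a minimal user-space sketch. It is not the kernel code:
emit_nops_sketch() and emit_padded_jmp8() are hypothetical names, and the
sketch writes single-byte 0x90 NOPs only, whereas the in-kernel emit_nops()
may choose longer NOP encodings. The point it shows is that the NOPs go
before the 2-byte JMP, so the relative offset (measured from the end of the
instruction) does not have to be recalculated.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the modified emit_nops(): takes a signed
 * length, emits nothing when len <= 0, and returns the number of NOP
 * bytes actually written.
 */
static int emit_nops_sketch(uint8_t **pprog, int len)
{
	uint8_t *prog = *pprog;
	int i;

	for (i = 0; i < len; i++)
		*prog++ = 0x90;		/* single-byte x86 NOP */

	*pprog = prog;
	return len > 0 ? len : 0;
}

/*
 * Emit "jmp rel8" (opcode 0xEB) padded up to prev_size, the size this
 * BPF instruction had in the previous pass. The 2 bytes of the imm8 JMP
 * are subtracted when computing the NOP count.
 * Returns the total number of bytes emitted.
 */
static int emit_padded_jmp8(uint8_t **pprog, int prev_size, int8_t rel)
{
	int nops = emit_nops_sketch(pprog, prev_size - 2);
	uint8_t *prog = *pprog;

	assert(prev_size >= 0);

	*prog++ = 0xEB;			/* JMP rel8 */
	*prog++ = (uint8_t)rel;
	*pprog = prog;

	return nops + 2;
}
```

If the instruction took 5 bytes (imm32 JMP) in the previous pass, the
sketch emits 3 NOPs followed by the 2-byte JMP, so the instruction keeps
its 5-byte footprint and later addresses do not move.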
> > Please also add multiple hand-crafted test cases e.g. for bpf-to-bpf calls into
> > test_verifier (which is part of bpf kselftests) that would exercise this corner
> > case in x86 jit where we would start to nop pad so that there is proper coverage,
> > too.
> > 
> The corner case I had in the commit description is likely to be rejected by
> the verifier because most of those "ja 41" are unreachable instructions.
> Is there any known test case that needs more than 15 passes in x86 jit?
> 
Just an idea: besides the corner case mentioned above, how about making
PADDING_PASSES dynamically configurable (sysfs?) and reusing the existing
test cases? A script could then set PADDING_PASSES to each value from 1 to
20 and run the bpf selftests once per setting. This guarantees that the
padding strategy is exercised under at least some PADDING_PASSES settings.
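
To make the corner case from the commit description concrete, here is a
toy user-space model of the pass loop. It is only a sketch under
simplifying assumptions: the program consists solely of n "ja end" jumps,
every instruction starts with a pessimistic 64-byte estimate, a jump costs
0 bytes when its offset is 0, 2 bytes for imm8 and 5 bytes for imm32, and
"padding" simply keeps an instruction at its previous size. The function
name passes_to_converge() and its parameters are made up for this example
and do not exist in the kernel.

```c
#include <assert.h>
#include <string.h>

#define MAX_INSNS 64

/*
 * Returns the 1-based pass in which the image length stopped changing,
 * or -1 if that did not happen within max_passes.
 * Padding is applied in passes >= pad_from (use pad_from > max_passes
 * to disable it).
 */
static int passes_to_converge(int n, int max_passes, int pad_from)
{
	int size[MAX_INSNS], next[MAX_INSNS];
	int old_len, len, pass, i, j, off;

	assert(n > 0 && n <= MAX_INSNS);

	for (i = 0; i < n; i++)
		size[i] = 64;		/* pessimistic initial estimate */
	old_len = 64 * n;

	for (pass = 1; pass <= max_passes; pass++) {
		len = 0;
		for (i = 0; i < n; i++) {
			/* Jump distance to the end label, computed from
			 * the sizes recorded in the previous pass. */
			off = 0;
			for (j = i + 1; j < n; j++)
				off += size[j];

			if (off == 0)
				next[i] = 0;	/* nop jump, optimized out */
			else if (off <= 127)
				next[i] = 2;	/* imm8 JMP */
			else
				next[i] = 5;	/* imm32 JMP */

			/* Padding: fill the gap to the previous size. */
			if (pass >= pad_from && next[i] < size[i])
				next[i] = size[i];

			len += next[i];
		}
		memcpy(size, next, n * sizeof(size[0]));

		if (len == old_len)
			return pass;
		old_len = len;
	}
	return -1;
}
```

In this model only one more trailing jump collapses per pass, so
passes_to_converge(32, 20, 21) gives up with -1, while the same 32 jumps
converge once padding starts, e.g. passes_to_converge(32, 20, 16) stops at
pass 16 because no instruction is allowed to shrink any further.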

Gary Lin

