Re: [PATCH] bpf, x64: bump the number of passes to 64

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Gary Lin <glin@suse.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>,
	netdev@vger.kernel.org, bpf@vger.kernel.org,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	andreas.taschner@suse.com
Subject: Re: [PATCH] bpf, x64: bump the number of passes to 64
Date: Fri, 4 Dec 2020 11:42:13 +0800	[thread overview]
Message-ID: <20201204034213.GA16653@GaryWorkstation> (raw)
In-Reply-To: <20201203181431.t2l63nifzprxqc26@ast-mbp>

On Thu, Dec 03, 2020 at 10:14:31AM -0800, Alexei Starovoitov wrote:
> On Thu, Dec 03, 2020 at 12:20:38PM +0100, Eric Dumazet wrote:
> > 
> > 
> > On 12/3/20 10:12 AM, Gary Lin wrote:
> > > The x64 bpf jit expects bpf images converge within the given passes, but
> > > it could fail to do so with some corner cases. For example:
> > > 
> > >       l0:     ldh [4]
> > >       l1:     jeq #0x537d, l2, l40
> > >       l2:     ld [0]
> > >       l3:     jeq #0xfa163e0d, l4, l40
> > >       l4:     ldh [12]
> > >       l5:     ldx #0xe
> > >       l6:     jeq #0x86dd, l41, l7
> > >       l8:     ld [x+16]
> > >       l9:     ja 41
> > > 
> > >         [... repeated ja 41 ]
> > > 
> > >       l40:    ja 41
> > >       l41:    ret #0
> > >       l42:    ld #len
> > >       l43:    ret a
> > > 
> > > This bpf program contains 32 "ja 41" instructions which are effectively
> > > NOPs and designed to be replaced with valid code dynamically. Ideally,
> > > bpf jit should optimize those "ja 41" instructions out when translating
> > > the bpf instructions into x86_64 machine code. However, do_jit() can
> > > only remove one "ja 41" for offset==0 on each pass, so it requires at
> > > least 32 runs to eliminate those JMPs and exceeds the current limit of
> > > passes (20). In the end, the program got rejected when BPF_JIT_ALWAYS_ON
> > > is set even though it's legit as a classic socket filter.
> > > 
> > > Since this kind of programs are usually handcrafted rather than
> > > generated by LLVM, those programs tend to be small. To avoid increasing
> > > the complexity of BPF JIT, this commit just bumps the number of passes
> > > to 64 as suggested by Daniel to make it less likely to fail on such cases.
> > > 
> > 
> > Another idea would be to stop trying to reduce size of generated
> > code after a given number of passes have been attempted.
> > 
> > Because even a limit of 64 wont ensure all 'valid' programs can be JITed.
> 
> +1.
> Bumping the limit is not solving anything.
> It only allows bad actors force kernel to spend more time in JIT.
> If we're holding locks the longer looping may cause issues.
> I think JIT is parallel enough, but still it's a concern.
> 
> I wonder how assemblers deal with it?
> They probably face the same issue.
> 
> Instead of going back to 32-bit jumps and suddenly increase image size
> I think we can do nop padding instead.
> After few loops every insn is more or less optimal.
> I think the fix could be something like:
>   if (is_imm8(jmp_offset)) {
>        EMIT2(jmp_cond, jmp_offset);
>        if (loop_cnt > 5) {
>           EMIT N nops
>           where N = addrs[i] - addrs[i - 1]; // not sure about this math.
>           N can be 0 or 4 here.
>           // or may be NOPs should be emitted before EMIT2.
>           // need to think it through
>        }
>   }
This looks promising. Once we switch to nop padding, the image is likely
to converge soon. Maybe we can postpone the padding to the last 5 passes
so that do_jit() could optimize the image a bit more.

> Will something like this work?
> I think that's what you're suggesting, right?
> 
Besides nop padding, the optimization for 0 offset jump also has to be
disabled since it's actually the one causing image shrinking in my case.

Gary Lin

next prev parent reply	other threads:[~2020-12-04  3:43 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-03  9:12 [PATCH] bpf, x64: bump the number of passes to 64 Gary Lin
2020-12-03 11:20 ` Eric Dumazet
2020-12-03 18:14   ` Alexei Starovoitov
2020-12-04  3:42     ` Gary Lin [this message]
2020-12-04 10:15       ` Gary Lin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201204034213.GA16653@GaryWorkstation \
    --to=glin@suse.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=andreas.taschner@suse.com \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=eric.dumazet@gmail.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.