From: Gary Lin <glin@suse.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Network Development <netdev@vger.kernel.org>,
bpf <bpf@vger.kernel.org>,
andreas.taschner@suse.com
Subject: Re: [PATCH RFC] bpf, x64: allow not-converged images when BPF_JIT_ALWAYS_ON is set
Date: Mon, 16 Nov 2020 14:31:50 +0800 [thread overview]
Message-ID: <20201116063150.GU16653@GaryWorkstation> (raw)
In-Reply-To: <CAADnVQ+0m3OJs6eNOyZv4v0PrB3JDxkP=xCK5sbXQpJ9sWqBjw@mail.gmail.com>
On Fri, Nov 13, 2020 at 05:48:31PM -0800, Alexei Starovoitov wrote:
> On Fri, Nov 13, 2020 at 12:40 AM Gary Lin <glin@suse.com> wrote:
> >
> > The x64 bpf jit expects the bpf images converge within the given passes.
> > However there is a corner case:
> >
> > l0: ldh [4]
> > l1: jeq #0x537d, l2, l40
> > l2: ld [0]
> > l3: jeq #0xfa163e0d, l4, l40
> > l4: ldh [12]
> > l5: ldx #0xe
> > l6: jeq #0x86dd, l41, l7
> > l7: jeq #0x800, l8, l41
> > l8: ld [x+16]
> > l9: ja 41
> >
> > [... repeated ja 41 ]
> >
> > l40: ja 41
> > l41: ret #0
> > l42: ld #len
> > l43: ret a
> >
> > The bpf program contains 32 "ja 41" and do_jit() only removes one "ja 41"
> > right before "l41: ret #0" for offset==0 in each pass, so
> > bpf_int_jit_compile() needs to run do_jit() at least 32 times to
> > eliminate those JMP instructions. Since the current max number of passes
> > is 20, the bpf program couldn't converge within 20 passes and got rejected
> > when BPF_JIT_ALWAYS_ON is set even though it's legit as a classic socket
> > filter.
> >
> > A not-converged image may be not optimal but at least the bpf
> > instructions are translated into x64 machine code. Maybe we could just
> > issue a warning instead so that the program is still loaded and the user
> > is also notified.
>
> Non-convergence is not about being optimal. It's about correctness.
> If size is different it likely means that at least one jump has the
> wrong offset.
>
Ah, I see.
> Bumping from 20 to 64 also won't solve it.
> There could be a case where 64 isn't enough either.
True. Increasing the number of passes is just a workaround.
> One of the test_bpf.ko tests can hit any limit, iirc.
Thanks for the pointer. Will look into the tests.
>
> Also we've seen a case where JIT might never converge.
> The iteration N can have size 40, iteration N+1 size 38, iteration N+2 size 40
> and keep oscillating like this.
>
> I think the fix could be is to avoid optimality in size when pass
> number is getting large.
> Like after pass > 10 BPF_JA could always use 32-bit offset regardless
> of actual addrs[i + insn->off] - addrs[i]; difference.
> There could be other solutions too.
>
So the size convergence can be ignored if all BPF_JMPs are translated
into 32-bit offset jmp?
Thanks,
Gary Lin
next prev parent reply other threads:[~2020-11-16 6:32 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-11-13 8:38 [PATCH RFC] bpf, x64: allow not-converged images when BPF_JIT_ALWAYS_ON is set Gary Lin
2020-11-14 1:48 ` Alexei Starovoitov
2020-11-16 6:31 ` Gary Lin [this message]
2020-11-24 3:22 ` [bpf, x64] e491680a9f: kernel-selftests.net.test_bpf.sh.fail kernel test robot
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20201116063150.GU16653@GaryWorkstation \
--to=glin@suse.com \
--cc=alexei.starovoitov@gmail.com \
--cc=andreas.taschner@suse.com \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=netdev@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).