netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
To: Gary Lin <glin@suse.com>
Cc: Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Network Development <netdev@vger.kernel.org>,
	bpf <bpf@vger.kernel.org>,
	andreas.taschner@suse.com
Subject: Re: [PATCH RFC] bpf, x64: allow not-converged images when BPF_JIT_ALWAYS_ON is set
Date: Fri, 13 Nov 2020 17:48:31 -0800	[thread overview]
Message-ID: <CAADnVQ+0m3OJs6eNOyZv4v0PrB3JDxkP=xCK5sbXQpJ9sWqBjw@mail.gmail.com> (raw)
In-Reply-To: <20201113083852.22294-1-glin@suse.com>

On Fri, Nov 13, 2020 at 12:40 AM Gary Lin <glin@suse.com> wrote:
>
> The x64 bpf jit expects the bpf images converge within the given passes.
> However there is a corner case:
>
>   l0:     ldh [4]
>   l1:     jeq #0x537d, l2, l40
>   l2:     ld [0]
>   l3:     jeq #0xfa163e0d, l4, l40
>   l4:     ldh [12]
>   l5:     ldx #0xe
>   l6:     jeq #0x86dd, l41, l7
>   l7:     jeq #0x800, l8, l41
>   l8:     ld [x+16]
>   l9:     ja 41
>
>     [... repeated ja 41 ]
>
>   l40:    ja 41
>   l41:    ret #0
>   l42:    ld #len
>   l43:    ret a
>
> The bpf program contains 32 "ja 41" and do_jit() only removes one "ja 41"
> right before "l41:  ret #0" for offset==0 in each pass, so
> bpf_int_jit_compile() needs to run do_jit() at least 32 times to
> eliminate those JMP instructions. Since the current max number of passes
> is 20, the bpf program couldn't converge within 20 passes and got rejected
> when BPF_JIT_ALWAYS_ON is set even though it's legit as a classic socket
> filter.
>
> A not-converged image may be not optimal but at least the bpf
> instructions are translated into x64 machine code. Maybe we could just
> issue a warning instead so that the program is still loaded and the user
> is also notified.

Non-convergence is not about being optimal. It's about correctness.
If size is different it likely means that at least one jump has the
wrong offset.

Bumping from 20 to 64 also won't solve it.
There could be a case where 64 isn't enough either.
One of the test_bpf.ko tests can hit any limit, iirc.

Also we've seen a case where JIT might never converge.
The iteration N can have size 40, iteration N+1 size 38, iteration N+2 size 40
and keep oscillating like this.

I think the fix could be is to avoid optimality in size when pass
number is getting large.
Like after pass > 10 BPF_JA could always use 32-bit offset regardless
of actual addrs[i + insn->off] - addrs[i]; difference.
There could be other solutions too.

  reply	other threads:[~2020-11-14  1:48 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-13  8:38 [PATCH RFC] bpf, x64: allow not-converged images when BPF_JIT_ALWAYS_ON is set Gary Lin
2020-11-14  1:48 ` Alexei Starovoitov [this message]
2020-11-16  6:31   ` Gary Lin
2020-11-24  3:22 ` [bpf, x64] e491680a9f: kernel-selftests.net.test_bpf.sh.fail kernel test robot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAADnVQ+0m3OJs6eNOyZv4v0PrB3JDxkP=xCK5sbXQpJ9sWqBjw@mail.gmail.com' \
    --to=alexei.starovoitov@gmail.com \
    --cc=andreas.taschner@suse.com \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=glin@suse.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).