BPF List
 help / color / mirror / Atom feed
From: Yonghong Song <yhs@fb.com>
To: "Alexei Starovoitov" <alexei.starovoitov@gmail.com>,
	"Toke Høiland-Jørgensen" <toke@redhat.com>
Cc: Martin KaFai Lau <kafai@fb.com>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Joanne Koong <joannekoong@fb.com>, <bpf@vger.kernel.org>,
	<netdev@vger.kernel.org>, <Kernel-team@fb.com>
Subject: Re: [PATCH bpf-next v2 0/3] Add XDP support for bpf_load_hdr_opt
Date: Tue, 19 Oct 2021 09:02:33 -0700	[thread overview]
Message-ID: <a9314ae9-eab8-5761-a547-5bfb5022dac3@fb.com> (raw)
In-Reply-To: <20211019000058.ghklvg4saybzqk3o@ast-mbp>



On 10/18/21 5:00 PM, Alexei Starovoitov wrote:
> On Sat, Oct 09, 2021 at 12:20:27AM +0200, Toke Høiland-Jørgensen wrote:
>>
>> So if we can't fix the verifier, maybe we could come up with a more
>> general helper for packet parsing? Something like:
>>
>> bpf_for_each_pkt_chunk(ctx, offset, callback_fn, callback_arg)
>> {
>>    ptr = ctx->data + offset;
>>    while (ptr < ctx->data_end) {
>>      offset = callback_fn(ptr, ctx->data_end, callback_arg);
>>      if (offset == 0)
>>        return 0;
>>      ptr += offset;
>>    }
>>    
>>    // out of bounds before callback was done
>>    return -EINVAL;
>> }
> 
> We're starting to work on this since it will be useful not only for
> packet parsing, TLV parsing, but potentially any kind of 'for' loop iteration.
> 
>> This would work for parsing any kind of packet header or TLV-style data
>> without having to teach the kernel about each header type. It'll have
>> quite a bit of overhead if all the callbacks happen via indirect calls,
>> but maybe the verifier can inline the calls (or at least turn them into
>> direct CALL instructions)?
> 
> Right. That's the main downside.
> If the bpf_for_each*() helper is simple enough the verifier can inline it
> similar to map_gen_lookup. In such case the indirect call will be a direct call,
> so the overhead won't be that bad, but it's still a function call and
> static function will have full prologue+epilogue.
> Converting static function into direct jump would be really challenging
> for the verifier and won't provide much benefit, since r6-r9 save/restore
> would need to happen anyway even for such 'inlined' static func, since
> llvm will be freely using r6-r9 for insns inside function body
> assuming that it's a normal function.
> 
> May be there is a way to avoid call overhead with with clang extensions.
> If we want to do:
> int mem_eq(char *p1, char *p2, int size)
> {
>    int i;
>    for (i = 0; i < size; i++)
>      if (p1[i] != p2[i])
>        return 0;
>    return 1;
> }
> 
> With clang extension we might write it as:
> int bpf_mem_eq(char *p1, char *p2, int size)
> {
>    int i = 0;
>    int iter;
> 
>    iter = __builtin_for_until(i, size, ({
>        if (p1[i] != p2[i])
>          goto out;
>    }));
>    out:
>    if (iter != size)
>      return 0;
>    return 1;
> }
> 
> The llvm will generate pseudo insns for this __builtin_for.
> The verifier will analyze the loop body for the range [0, size)
> and replace pseudo insns with normal branches after the verification.
> We might even keep the normal C syntax for loops and use
> llvm HardwareLoops pass to add pseudo insns.
> It's more or less the same ideas for loops we discussed before
> bounded loops were introduced.
> The main problem with bounded loops is that the loop body will
> typically be verified the number of times equal to the number of iterations.
> So for any non-trivial loop such iteration count is not much more
> than 100. The verifier can do scalar evolution analysis, but
> it's likely won't work for many cases and user experience
> will suffer. With __builtin_for the scalar evolution is not necessary,
> since induction variable is one and explicit and its range is explicit too.
> That enables single pass over loop body.
> One might argue that for (i = 0; i < 10000; i += 10) loops are
> necessary too, but instead of complicating the verifier with sparse
> ranges it's better to put that on users that can do:
>    iter = __builtin_for_until(i, 10000 / 10, ({
>        j = i * 10;
>        use j;
>    }));
> Single explicit induction variable makes the verification practical.
> The loop body won't be as heavily optimized as normal loop,
> but it's a good thing.

We have discussed how to verify *well-formed* loops back in 2018.
(BPF control flow, supporting loops and other patterns:
  https://www.linuxplumbersconf.org/event/2/contributions/116/)
Now probably the time to revisit it again!

I think Alexei's proposal is the right direction to have
compiler preserving the loop structure with some pseudo instructions
and verifier is able to range-based verification
instead of iterating all loop iterations.

LLVM already has some IR loop intrinsics like below:
   def int_set_loop_iterations :
   def int_start_loop_iterations :
   def int_test_set_loop_iterations :
   def int_test_start_loop_iterations :
   def int_loop_decrement :
   def int_loop_decrement_reg :
to facilitate hardware loop. BPF target can define its own
intrinsics to help define a well structured loop.

I will look into this.

  reply	other threads:[~2021-10-19 16:03 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-06 23:05 [PATCH bpf-next v2 0/3] Add XDP support for bpf_load_hdr_opt Joanne Koong
2021-10-06 23:05 ` [PATCH bpf-next v2 1/3] bpf/xdp: Add bpf_load_hdr_opt support for xdp Joanne Koong
2021-10-06 23:50   ` Song Liu
2021-10-06 23:05 ` [PATCH bpf-next v2 2/3] bpf/selftests: Rename test_tcp_hdr_options to test_sockops_tcp_hdr_options Joanne Koong
2021-10-06 23:47   ` Song Liu
2021-10-06 23:05 ` [PATCH bpf-next v2 3/3] bpf/selftests: Add xdp bpf_load_tcp_hdr_options tests Joanne Koong
2021-10-06 23:52   ` Song Liu
2021-10-07 14:41 ` [PATCH bpf-next v2 0/3] Add XDP support for bpf_load_hdr_opt Toke Høiland-Jørgensen
2021-10-07 20:57   ` Joanne Koong
2021-10-07 21:25     ` Daniel Borkmann
2021-10-07 23:52       ` Martin KaFai Lau
2021-10-08 22:20         ` Toke Høiland-Jørgensen
2021-10-11 18:43           ` Martin KaFai Lau
2021-10-12 14:11             ` Toke Høiland-Jørgensen
2021-10-12 20:51               ` Joanne Koong
2021-10-13 10:19                 ` Toke Høiland-Jørgensen
2021-10-19  0:00           ` Alexei Starovoitov
2021-10-19 16:02             ` Yonghong Song [this message]
2021-10-19 16:10             ` Toke Høiland-Jørgensen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a9314ae9-eab8-5761-a547-5bfb5022dac3@fb.com \
    --to=yhs@fb.com \
    --cc=Kernel-team@fb.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=joannekoong@fb.com \
    --cc=kafai@fb.com \
    --cc=netdev@vger.kernel.org \
    --cc=toke@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox