netdev.vger.kernel.org archive mirror
From: "Toke Høiland-Jørgensen" <toke@redhat.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>,
	Martin KaFai Lau <kafai@fb.com>, Song Liu <songliubraving@fb.com>,
	Yonghong Song <yhs@fb.com>,
	John Fastabend <john.fastabend@gmail.com>,
	KP Singh <kpsingh@kernel.org>,
	"David S. Miller" <davem@davemloft.net>,
	Jakub Kicinski <kuba@kernel.org>,
	Jesper Dangaard Brouer <hawk@kernel.org>,
	Network Development <netdev@vger.kernel.org>,
	bpf <bpf@vger.kernel.org>
Subject: Re: [PATCH bpf-next v3 6/8] bpf: Add XDP_REDIRECT support to XDP for bpf_prog_run()
Date: Tue, 14 Dec 2021 12:46:55 +0100
Message-ID: <874k7bz9w0.fsf@toke.dk>
In-Reply-To: <CAADnVQKRAFCqUj9J8B5cM4u=wS-0Kh9YZYR=QqT6GiiX3ZXXDQ@mail.gmail.com>

Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:

> On Mon, Dec 13, 2021 at 4:36 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>>
>> Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:
>>
>> > On Mon, Dec 13, 2021 at 8:26 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>> >>
>> >> Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:
>> >>
>> >> > On Sat, Dec 11, 2021 at 10:43 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>> >> >> +
>> >> >> +static void bpf_test_run_xdp_teardown(struct bpf_test_timer *t)
>> >> >> +{
>> >> >> +       struct xdp_mem_info mem = {
>> >> >> +               .id = t->xdp.pp->xdp_mem_id,
>> >> >> +               .type = MEM_TYPE_PAGE_POOL,
>> >> >> +       };
>> >> >
>> >> > pls add a new line.
>> >> >
>> >> >> +       xdp_unreg_mem_model(&mem);
>> >> >> +}
>> >> >> +
>> >> >> +static bool ctx_was_changed(struct xdp_page_head *head)
>> >> >> +{
>> >> >> +       return (head->orig_ctx.data != head->ctx.data ||
>> >> >> +               head->orig_ctx.data_meta != head->ctx.data_meta ||
>> >> >> +               head->orig_ctx.data_end != head->ctx.data_end);
>> >> >
>> >> > redundant ()
>> >> >
>> >> >>         bpf_test_timer_enter(&t);
>> >> >>         old_ctx = bpf_set_run_ctx(&run_ctx.run_ctx);
>> >> >>         do {
>> >> >>                 run_ctx.prog_item = &item;
>> >> >> -               if (xdp)
>> >> >> +               if (xdp && xdp_redirect) {
>> >> >> +                       ret = bpf_test_run_xdp_redirect(&t, prog, ctx);
>> >> >> +                       if (unlikely(ret < 0))
>> >> >> +                               break;
>> >> >> +                       *retval = ret;
>> >> >> +               } else if (xdp) {
>> >> >>                         *retval = bpf_prog_run_xdp(prog, ctx);
>> >> >
>> >> > Can we do this unconditionally without introducing a new uapi flag?
>> >> > I mean "return bpf_redirect()" was a nop under test_run.
>> >> > What kind of tests might break if it stops being a nop?
>> >>
>> >> Well, I view the existing mode of bpf_prog_test_run() with XDP as a way
>> >> to write XDP unit tests: it allows you to submit a packet, run your XDP
>> >> program on it, and check that it returned the right value and did the
>> >> right modifications. This means if your XDP program does 'return
>> >> bpf_redirect()', userspace will still get the XDP_REDIRECT value and so
>> >> it can check correctness of your XDP program.
>> >>
>> >> With this flag the behaviour changes quite drastically, in that it will
>> >> actually put packets on the wire instead of getting back the program
>> >> return. So I think it makes more sense to make it a separate opt-in
>> >> mode; the old behaviour can still be useful for checking XDP program
>> >> behaviour.
>> >
>> > Ok that all makes sense.
>>
>> Great!
>>
>> > How about using prog_run to feed the data into proper netdev?
>> > XDP prog may or may not attach to it (this detail is tbd) and
>> > prog_run would use prog_fd and ifindex to trigger RX (yes, receive)
>> > in that netdev. XDP prog will execute and will be able to perform
>> > all actions (not only XDP_REDIRECT).
>> > XDP_PASS would pass the packet to the stack, etc.
>>
>> Hmm, that's certainly an interesting idea! I don't think we can actually
>> run the XDP hook on the netdev itself (since that is deep in the
>> driver), but we can emulate it: we just need to do what this version of
>> the patch is doing, but add handling of the other return codes.
>>
>> XDP_PASS could be supported by basically copying what cpumap is doing
>> (turn the frames into skbs and call netif_receive_skb_list()), but
>> XDP_TX would have to be implemented via ndo_xdp_xmit(), so it becomes
>> equivalent to a REDIRECT back to the same interface. That's probably OK,
>> though, right?
>
> Yep. Something like this.
> imo the individual BPF_F_TEST_XDP_DO_REDIRECT knob doesn't look right.
> It's tweaking the prog run from no side effects execution model
> to partial side effects.
> If we want to run xdp prog with side effects it probably should
> behave like normal execution on the netdev when it receives the packet.
> We might not even need to create a new netdev for that.
> I can imagine a bpf_prog_run operating on eth0 with a packet prepared
> by the user space.
> Like injecting a packet right into the driver and xdp part of it.
> If prog says XDP_PASS the packet will go up the stack like normal.
> So this mechanism could be used to inject packets into the stack.
> Obviously buffer management is an issue in the traditional NIC
> when a packet doesn't come from the wire.
> Also doing this in every driver would be a pain.
> So we need some common infra to inject the user packet into a netdev
> like it was received by this netdev. It could be a change for tuntap
> or for veth or not related to netdev at all.

What you're describing is basically what the cpumap code does; except it
doesn't handle XDP_TX, and it doesn't do buffer management. But I
already implemented the latter, and the former is straightforward to do
as a special-case XDP_REDIRECT. So my plan is to try this out and see
what that looks like :)
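
To make the return-code handling discussed above a bit more concrete,
here is a rough userspace sketch of the decision only. It is not code
from the patch series: the enum and struct names and plan_frame() are
hypothetical, the verdict values just mirror enum xdp_action from
include/uapi/linux/bpf.h, and it assumes a redirect always targets
another netdev (real redirects can also go to cpumap, xskmap, etc.).

```c
#include <assert.h>

/* Same numeric values as enum xdp_action in include/uapi/linux/bpf.h,
 * redefined under other names so this builds without kernel headers. */
enum xdp_verdict {
	VERDICT_ABORTED = 0,
	VERDICT_DROP = 1,
	VERDICT_PASS = 2,
	VERDICT_TX = 3,
	VERDICT_REDIRECT = 4,
};

/* Hypothetical labels for what the test-run code would do with a frame. */
enum frame_fate {
	FATE_RECYCLE,	/* give the page back to the page_pool */
	FATE_TO_STACK,	/* build an skb and pass it up, like cpumap does */
	FATE_XMIT,	/* send it out of a netdev via ndo_xdp_xmit() */
};

struct frame_plan {
	enum frame_fate fate;
	int out_ifindex;	/* only meaningful for FATE_XMIT, otherwise 0 */
};

/* Decide what happens to one frame after the XDP program has run.
 * rx_ifindex is the netdev the frame pretends to have arrived on;
 * redirect_ifindex is the (assumed netdev) target of a redirect. */
static struct frame_plan plan_frame(unsigned int verdict, int rx_ifindex,
				    int redirect_ifindex)
{
	struct frame_plan plan = { .fate = FATE_RECYCLE, .out_ifindex = 0 };

	assert(rx_ifindex > 0);

	switch (verdict) {
	case VERDICT_PASS:
		plan.fate = FATE_TO_STACK;
		break;
	case VERDICT_TX:
		/* XDP_TX as a special-case redirect back to the same netdev */
		plan.fate = FATE_XMIT;
		plan.out_ifindex = rx_ifindex;
		break;
	case VERDICT_REDIRECT:
		plan.fate = FATE_XMIT;
		plan.out_ifindex = redirect_ifindex;
		break;
	case VERDICT_DROP:
	case VERDICT_ABORTED:
	default:
		/* dropped, aborted or unknown verdict: just recycle the page */
		break;
	}

	return plan;
}
```

The point of writing it this way is that XDP_TX and XDP_REDIRECT end up
on the same transmit path and differ only in the output ifindex.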

> After XDP_PASS it doesn't need to be fast. skb will get allocated
> and the stack might see it as it arrived from ifindex=N regardless
> of the HW of that netdev.
> XDP_TX would xmit right out of that ifindex=netdev.
> and XDP_REDIRECT would redirect to a different netdev.
> At the end there will be less special cases and page_pool tweaks.
> Though the patches 1-5 look fine, it still feels a bit custom
> just for this particular BPF_F_TEST_XDP_DO_REDIRECT use case.
> With more generic bpf_run_prog(xdp_prog_fd, ifindex_of_netdev)
> it might reduce custom handling.

Yup, totally makes sense!

-Toke


Thread overview: 20+ messages
2021-12-11 18:41 [PATCH bpf-next v3 0/8] Add support for transmitting packets using XDP in bpf_prog_run() Toke Høiland-Jørgensen
2021-12-11 18:41 ` [PATCH bpf-next v3 1/8] xdp: Allow registering memory model without rxq reference Toke Høiland-Jørgensen
2021-12-11 18:41 ` [PATCH bpf-next v3 2/8] page_pool: Add callback to init pages when they are allocated Toke Høiland-Jørgensen
2021-12-11 18:41 ` [PATCH bpf-next v3 3/8] page_pool: Store the XDP mem id Toke Høiland-Jørgensen
2021-12-11 18:41 ` [PATCH bpf-next v3 4/8] xdp: Move conversion to xdp_frame out of map functions Toke Høiland-Jørgensen
2021-12-11 18:41 ` [PATCH bpf-next v3 5/8] xdp: add xdp_do_redirect_frame() for pre-computed xdp_frames Toke Høiland-Jørgensen
2021-12-11 18:41 ` [PATCH bpf-next v3 6/8] bpf: Add XDP_REDIRECT support to XDP for bpf_prog_run() Toke Høiland-Jørgensen
2021-12-12  2:43   ` Alexei Starovoitov
2021-12-13 16:26     ` Toke Høiland-Jørgensen
2021-12-14  0:02       ` Alexei Starovoitov
2021-12-14  0:36         ` Toke Høiland-Jørgensen
2021-12-14  3:45           ` Alexei Starovoitov
2021-12-14 11:46             ` Toke Høiland-Jørgensen [this message]
2021-12-11 18:41 ` [PATCH bpf-next v3 7/8] selftests/bpf: Add selftest for XDP_REDIRECT in bpf_prog_run() Toke Høiland-Jørgensen
2021-12-11 18:41 ` [PATCH bpf-next v3 8/8] samples/bpf: Add xdp_trafficgen sample Toke Høiland-Jørgensen
2021-12-12  2:47   ` Alexei Starovoitov
2021-12-13 16:28     ` Toke Høiland-Jørgensen
2021-12-14  0:05       ` Alexei Starovoitov
2021-12-14  0:37         ` Toke Høiland-Jørgensen
2021-12-14  3:28           ` Alexei Starovoitov
