From: "Toke Høiland-Jørgensen" <toke@redhat.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>,
	Martin KaFai Lau <kafai@fb.com>, Song Liu <songliubraving@fb.com>,
	Yonghong Song <yhs@fb.com>,
	John Fastabend <john.fastabend@gmail.com>,
	KP Singh <kpsingh@kernel.org>,
	"David S. Miller" <davem@davemloft.net>,
	Jakub Kicinski <kuba@kernel.org>,
	Jesper Dangaard Brouer <hawk@kernel.org>,
	Network Development <netdev@vger.kernel.org>,
	bpf <bpf@vger.kernel.org>
Subject: Re: [PATCH bpf-next v3 6/8] bpf: Add XDP_REDIRECT support to XDP for bpf_prog_run()
Date: Tue, 14 Dec 2021 12:46:55 +0100
Message-ID: <874k7bz9w0.fsf@toke.dk>
In-Reply-To: <CAADnVQKRAFCqUj9J8B5cM4u=wS-0Kh9YZYR=QqT6GiiX3ZXXDQ@mail.gmail.com>

Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:

> On Mon, Dec 13, 2021 at 4:36 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>>
>> Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:
>>
>> > On Mon, Dec 13, 2021 at 8:26 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>> >>
>> >> Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:
>> >>
>> >> > On Sat, Dec 11, 2021 at 10:43 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>> >> >> +
>> >> >> +static void bpf_test_run_xdp_teardown(struct bpf_test_timer *t)
>> >> >> +{
>> >> >> +       struct xdp_mem_info mem = {
>> >> >> +               .id = t->xdp.pp->xdp_mem_id,
>> >> >> +               .type = MEM_TYPE_PAGE_POOL,
>> >> >> +       };
>> >> >
>> >> > pls add a new line.
>> >> >
>> >> >> +       xdp_unreg_mem_model(&mem);
>> >> >> +}
>> >> >> +
>> >> >> +static bool ctx_was_changed(struct xdp_page_head *head)
>> >> >> +{
>> >> >> +       return (head->orig_ctx.data != head->ctx.data ||
>> >> >> +               head->orig_ctx.data_meta != head->ctx.data_meta ||
>> >> >> +               head->orig_ctx.data_end != head->ctx.data_end);
>> >> >
>> >> > redundant ()
>> >> >
>> >> >>         bpf_test_timer_enter(&t);
>> >> >>         old_ctx = bpf_set_run_ctx(&run_ctx.run_ctx);
>> >> >>         do {
>> >> >>                 run_ctx.prog_item = &item;
>> >> >> -               if (xdp)
>> >> >> +               if (xdp && xdp_redirect) {
>> >> >> +                       ret = bpf_test_run_xdp_redirect(&t, prog, ctx);
>> >> >> +                       if (unlikely(ret < 0))
>> >> >> +                               break;
>> >> >> +                       *retval = ret;
>> >> >> +               } else if (xdp) {
>> >> >>                         *retval = bpf_prog_run_xdp(prog, ctx);
>> >> >
>> >> > Can we do this unconditionally without introducing a new uapi flag?
>> >> > I mean "return bpf_redirect()" was a nop under test_run.
>> >> > What kind of tests might break if it stops being a nop?
>> >>
>> >> Well, I view the existing mode of bpf_prog_test_run() with XDP as a way
>> >> to write XDP unit tests: it allows you to submit a packet, run your XDP
>> >> program on it, and check that it returned the right value and did the
>> >> right modifications. This means if your XDP program does 'return
>> >> bpf_redirect()', userspace will still get the XDP_REDIRECT value and so
>> >> it can check correctness of your XDP program.
>> >>
>> >> With this flag the behaviour changes quite drastically, in that it will
>> >> actually put packets on the wire instead of getting back the program
>> >> return. So I think it makes more sense to make it a separate opt-in
>> >> mode; the old behaviour can still be useful for checking XDP program
>> >> behaviour.
>> >
>> > Ok that all makes sense.
>>
>> Great!
>>
>> > How about using prog_run to feed the data into proper netdev?
>> > XDP prog may or may not attach to it (this detail is tbd) and
>> > prog_run would use prog_fd and ifindex to trigger RX (yes, receive)
>> > in that netdev. XDP prog will execute and will be able to perform
>> > all actions (not only XDP_REDIRECT).
>> > XDP_PASS would pass the packet to the stack, etc.
>>
>> Hmm, that's certainly an interesting idea! I don't think we can actually
>> run the XDP hook on the netdev itself (since that is deep in the
>> driver), but we can emulate it: we just need to do what this version of
>> the patch is doing, but add handling of the other return codes.
>>
>> XDP_PASS could be supported by basically copying what cpumap is doing
>> (turn the frames into skbs and call netif_receive_skb_list()), but
>> XDP_TX would have to be implemented via ndo_xdp_xmit(), so it becomes
>> equivalent to a REDIRECT back to the same interface. That's probably OK,
>> though, right?
>
> Yep. Something like this.
> imo the individual BPF_F_TEST_XDP_DO_REDIRECT knob doesn't look right.
> It's tweaking the prog run from no side effects execution model
> to partial side effects.
> If we want to run xdp prog with side effects it probably should
> behave like normal execution on the netdev when it receives the packet.
> We might not even need to create a new netdev for that.
> I can imagine a bpf_prog_run operating on eth0 with a packet prepared
> by the user space.
> Like injecting a packet right into the driver and xdp part of it.
> If prog says XDP_PASS the packet will go up the stack like normal.
> So this mechanism could be used to inject packets into the stack.
> Obviously buffer management is an issue in the traditional NIC
> when a packet doesn't come from the wire.
> Also doing this in every driver would be a pain.
> So we need some common infra to inject the user packet into a netdev
> like it was received by this netdev. It could be a change for tuntap
> or for veth or not related to netdev at all.

What you're describing is basically what the cpumap code does; except it
doesn't handle XDP_TX, and it doesn't do buffer management. But I
already implemented the latter, and the former is straightforward to do
as a special-case XDP_REDIRECT. So my plan is to try this out and see
what that looks like :)

> After XDP_PASS it doesn't need to be fast. skb will get allocated
> and the stack might see it as it arrived from ifindex=N regardless
> of the HW of that netdev.
> XDP_TX would xmit right out of that ifindex=netdev.
> and XDP_REDIRECT would redirect to a different netdev.
> At the end there will be less special cases and page_pool tweaks.
> Though the patches 1-5 look fine, it still feels a bit custom
> just for this particular BPF_F_TEST_XDP_DO_REDIRECT use case.
> With more generic bpf_prog_run(xdp_prog_fd, ifindex_of_netdev)
> it might reduce custom handling.

Yup, totally makes sense!

-Toke
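
The verdict handling discussed above can be sketched as a small userspace
model. This is only an illustration of the proposed semantics, not code from
the series: the sim_* names are hypothetical, and the kernel functions named
in the comments (netif_receive_skb_list(), ndo_xdp_xmit(),
xdp_do_redirect_frame() from patch 5) are mentioned only to show where each
verdict would presumably end up. The enum values match the kernel's uapi
<linux/bpf.h>.

```c
/* XDP verdicts, same numbering as the kernel's uapi <linux/bpf.h>. */
enum xdp_action {
	XDP_ABORTED = 0,
	XDP_DROP,
	XDP_PASS,
	XDP_TX,
	XDP_REDIRECT,
};

/* Hypothetical labels for what the emulation would do with a frame. */
enum sim_disposition {
	SIM_FREE_FRAME,       /* recycle the page back into the page pool */
	SIM_TO_STACK,         /* build an skb, then netif_receive_skb_list() */
	SIM_XMIT_INGRESS_DEV, /* ndo_xdp_xmit() on the ingress netdev */
	SIM_REDIRECT,         /* xdp_do_redirect_frame() to the chosen target */
};

struct sim_result {
	enum sim_disposition disp;
	int egress_ifindex; /* -1 when the frame does not leave via a netdev */
};

/*
 * Map a program's return value to a disposition. XDP_TX is modelled as a
 * special-case redirect back to the interface the frame "arrived" on.
 * Unknown verdicts are treated like XDP_ABORTED, i.e. the frame is freed.
 */
static struct sim_result sim_handle_verdict(unsigned int act,
					    int ingress_ifindex,
					    int redirect_ifindex)
{
	struct sim_result res = {
		.disp = SIM_FREE_FRAME,
		.egress_ifindex = -1,
	};

	switch (act) {
	case XDP_PASS:
		res.disp = SIM_TO_STACK;
		break;
	case XDP_TX:
		res.disp = SIM_XMIT_INGRESS_DEV;
		res.egress_ifindex = ingress_ifindex;
		break;
	case XDP_REDIRECT:
		res.disp = SIM_REDIRECT;
		res.egress_ifindex = redirect_ifindex;
		break;
	case XDP_DROP:
	case XDP_ABORTED:
	default:
		break;
	}

	return res;
}
```

In this model a call such as sim_handle_verdict(XDP_TX, 3, 7) reports
ifindex 3 as the egress device, which is the "REDIRECT back to the same
interface" equivalence mentioned earlier in the thread.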

