From: "Toke Høiland-Jørgensen" <toke@redhat.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Andrii Nakryiko <andrii@kernel.org>,
Martin KaFai Lau <kafai@fb.com>, Song Liu <songliubraving@fb.com>,
Yonghong Song <yhs@fb.com>,
John Fastabend <john.fastabend@gmail.com>,
KP Singh <kpsingh@kernel.org>,
"David S. Miller" <davem@davemloft.net>,
Jakub Kicinski <kuba@kernel.org>,
Jesper Dangaard Brouer <hawk@kernel.org>,
Network Development <netdev@vger.kernel.org>,
bpf <bpf@vger.kernel.org>
Subject: Re: [PATCH bpf-next v3 6/8] bpf: Add XDP_REDIRECT support to XDP for bpf_prog_run()
Date: Tue, 14 Dec 2021 12:46:55 +0100
Message-ID: <874k7bz9w0.fsf@toke.dk>
In-Reply-To: <CAADnVQKRAFCqUj9J8B5cM4u=wS-0Kh9YZYR=QqT6GiiX3ZXXDQ@mail.gmail.com>

Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:

> On Mon, Dec 13, 2021 at 4:36 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>>
>> Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:
>>
>> > On Mon, Dec 13, 2021 at 8:26 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>> >>
>> >> Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:
>> >>
>> >> > On Sat, Dec 11, 2021 at 10:43 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>> >> >> +
>> >> >> +static void bpf_test_run_xdp_teardown(struct bpf_test_timer *t)
>> >> >> +{
>> >> >> +	struct xdp_mem_info mem = {
>> >> >> +		.id = t->xdp.pp->xdp_mem_id,
>> >> >> +		.type = MEM_TYPE_PAGE_POOL,
>> >> >> +	};
>> >> >
>> >> > pls add a new line.
>> >> >
>> >> >> +	xdp_unreg_mem_model(&mem);
>> >> >> +}
>> >> >> +
>> >> >> +static bool ctx_was_changed(struct xdp_page_head *head)
>> >> >> +{
>> >> >> +	return (head->orig_ctx.data != head->ctx.data ||
>> >> >> +		head->orig_ctx.data_meta != head->ctx.data_meta ||
>> >> >> +		head->orig_ctx.data_end != head->ctx.data_end);
>> >> >
>> >> > redundant ()
>> >> >
>> >> >>  	bpf_test_timer_enter(&t);
>> >> >>  	old_ctx = bpf_set_run_ctx(&run_ctx.run_ctx);
>> >> >>  	do {
>> >> >>  		run_ctx.prog_item = &item;
>> >> >> -		if (xdp)
>> >> >> +		if (xdp && xdp_redirect) {
>> >> >> +			ret = bpf_test_run_xdp_redirect(&t, prog, ctx);
>> >> >> +			if (unlikely(ret < 0))
>> >> >> +				break;
>> >> >> +			*retval = ret;
>> >> >> +		} else if (xdp) {
>> >> >>  			*retval = bpf_prog_run_xdp(prog, ctx);
>> >> >
>> >> > Can we do this unconditionally without introducing a new uapi flag?
>> >> > I mean "return bpf_redirect()" was a nop under test_run.
>> >> > What kind of tests might break if it stops being a nop?
>> >>
>> >> Well, I view the existing mode of bpf_prog_test_run() with XDP as a way
>> >> to write XDP unit tests: it allows you to submit a packet, run your XDP
>> >> program on it, and check that it returned the right value and did the
>> >> right modifications. This means if your XDP program does 'return
>> >> bpf_redirect()', userspace will still get the XDP_REDIRECT value and so
>> >> it can check correctness of your XDP program.
>> >>
>> >> With this flag the behaviour changes quite drastically, in that it will
>> >> actually put packets on the wire instead of getting back the program
>> >> return. So I think it makes more sense to make it a separate opt-in
>> >> mode; the old behaviour can still be useful for checking XDP program
>> >> behaviour.
>> >
>> > Ok that all makes sense.
>>
>> Great!
>>
>> > How about using prog_run to feed the data into proper netdev?
>> > XDP prog may or may not attach to it (this detail is tbd) and
>> > prog_run would use prog_fd and ifindex to trigger RX (yes, receive)
>> > in that netdev. XDP prog will execute and will be able to perform
>> > all actions (not only XDP_REDIRECT).
>> > XDP_PASS would pass the packet to the stack, etc.
>>
>> Hmm, that's certainly an interesting idea! I don't think we can actually
>> run the XDP hook on the netdev itself (since that is deep in the
>> driver), but we can emulate it: we just need to do what this version of
>> the patch is doing, but add handling of the other return codes.
>>
>> XDP_PASS could be supported by basically copying what cpumap is doing
>> (turn the frames into skbs and call netif_receive_skb_list()), but
>> XDP_TX would have to be implemented via ndo_xdp_xmit(), so it becomes
>> equivalent to a REDIRECT back to the same interface. That's probably OK,
>> though, right?
>
> Yep. Something like this.
> imo the individual BPF_F_TEST_XDP_DO_REDIRECT knob doesn't look right.
> It's tweaking the prog run from no side effects execution model
> to partial side effects.
> If we want to run xdp prog with side effects it probably should
> behave like normal execution on the netdev when it receives the packet.
> We might not even need to create a new netdev for that.
> I can imagine a bpf_prog_run operating on eth0 with a packet prepared
> by the user space.
> Like injecting a packet right into the driver and xdp part of it.
> If prog says XDP_PASS the packet will go up the stack like normal.
> So this mechanism could be used to inject packets into the stack.
> Obviously buffer management is an issue in the traditional NIC
> when a packet doesn't come from the wire.
> Also doing this in every driver would be a pain.
> So we need some common infra to inject the user packet into a netdev
> like it was received by this netdev. It could be a change for tuntap
> or for veth or not related to netdev at all.

What you're describing is basically what the cpumap code does, except
that it doesn't handle XDP_TX and it doesn't do buffer management. But I
have already implemented the latter, and the former is straightforward
to do as a special-case XDP_REDIRECT. So my plan is to try this out and
see what that looks like :)

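To make the plan above concrete, here is a rough userspace model of the
verdict handling I have in mind. It is only a sketch, not kernel code:
emulate_verdict(), struct frame_result and the FATE_* names are made up
for illustration, and the enum is redeclared locally with the same values
as enum xdp_action in the UAPI header so that the snippet stands alone.
The point is just that XDP_TX falls through to the XDP_REDIRECT path with
the receiving ifindex as the target, and XDP_PASS takes the cpumap-style
path up the stack.

```c
#include <assert.h>

/* Same values as enum xdp_action in include/uapi/linux/bpf.h,
 * redeclared here so the sketch builds without kernel headers.
 */
enum xdp_action { XDP_ABORTED = 0, XDP_DROP, XDP_PASS, XDP_TX, XDP_REDIRECT };

/* Hypothetical outcomes of the emulated hook (illustrative names only). */
enum frame_fate {
	FATE_RECYCLE,	/* page goes back to the pool */
	FATE_TO_STACK,	/* in the kernel: build an skb, netif_receive_skb_list() */
	FATE_XMIT,	/* in the kernel: ndo_xdp_xmit() on out_ifindex */
};

struct frame_result {
	enum frame_fate fate;
	int out_ifindex;	/* only meaningful for FATE_XMIT */
};

/* Map a program verdict to what happens to the frame.
 * rx_ifindex is the device the packet is "received" on;
 * redirect_ifindex is the target a real redirect would resolve to.
 */
static struct frame_result emulate_verdict(enum xdp_action act,
					   int rx_ifindex,
					   int redirect_ifindex)
{
	struct frame_result res = { .fate = FATE_RECYCLE, .out_ifindex = 0 };

	assert(rx_ifindex > 0);

	switch (act) {
	case XDP_PASS:
		res.fate = FATE_TO_STACK;
		break;
	case XDP_TX:
		/* Special-case redirect: send back out the receiving device. */
		redirect_ifindex = rx_ifindex;
		/* fall through */
	case XDP_REDIRECT:
		res.fate = FATE_XMIT;
		res.out_ifindex = redirect_ifindex;
		break;
	case XDP_ABORTED:
	case XDP_DROP:
	default:
		/* Nothing to deliver; the frame is simply recycled. */
		break;
	}

	return res;
}
```

With rx_ifindex=3 and a redirect target of 7, XDP_TX ends up as a
transmit on ifindex 3, while XDP_REDIRECT transmits on ifindex 7.
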
> After XDP_PASS it doesn't need to be fast. skb will get allocated
> and the stack might see it as it arrived from ifindex=N regardless
> of the HW of that netdev.
> XDP_TX would xmit right out of that ifindex=netdev.
> and XDP_REDIRECT would redirect to a different netdev.
> At the end there will be less special cases and page_pool tweaks.
> Though patches 1-5 look fine, it still feels a bit custom
> just for this particular BPF_F_TEST_XDP_DO_REDIRECT use case.
> With more generic bpf_run_prog(xdp_prog_fd, ifindex_of_netdev)
> it might reduce custom handling.

Yup, totally makes sense!

-Toke