From: Jesper Dangaard Brouer <brouer@redhat.com>
To: John Fastabend <john.fastabend@gmail.com>
Cc: Daniel Borkmann <borkmann@iogearbox.net>,
Alexei Starovoitov <alexei.starovoitov@gmail.com>,
Alexei Starovoitov <ast@fb.com>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
"xdp-newbies@vger.kernel.org" <xdp-newbies@vger.kernel.org>,
brouer@redhat.com
Subject: Re: XDP question: best API for returning/setting egress port?
Date: Wed, 19 Apr 2017 14:00:19 +0200 [thread overview]
Message-ID: <20170419140019.366fb1fb@redhat.com> (raw)
In-Reply-To: <58F67D15.3050308@gmail.com>
On Tue, 18 Apr 2017 13:54:45 -0700
John Fastabend <john.fastabend@gmail.com> wrote:
> On 17-04-18 12:58 PM, Jesper Dangaard Brouer wrote:
> >
> > As I argued in NetConf presentation[1] (from slide #9) we need a port
> > mapping table (instead of using ifindex'es). Both for supporting
> > other "port" types than net_devices (think sockets), and for
> > sandboxing what XDP can bypass.
> >
> > I want to create a new XDP action called XDP_REDIRECT, that instruct
> > XDP to send the xdp_buff to another "port" (get translated into a
> > net_device, or something else depending on internal port type).
> >
> > Looking at the userspace/eBPF interface, I'm wondering what is the
> > best API for "returning" this port number from eBPF?
> >
> > The options I see is:
> >
> > 1) Split-up the u32 action code, and e.g let the high-16-bit be the
> > port number and lower-16bit the (existing) action verdict.
> >
> > Pros: Simple API
> > Cons: Number of ports limited to 64K
> >
> > 2) Extend both xdp_buff + xdp_md to contain a (u32) port number, allow
> > eBPF to update xdp_md->port.
> >
> > Pros: Larger number of ports.
> > Cons: This require some ebpf translation steps between xdp_buff <-> xdp_md.
> > (see xdp_convert_ctx_access)
> >
> > 3) Extend only xdp_buff and create bpf_helper that set port in xdp_buff.
> >
> > Pros: Hides impl details, and allows helper to give eBPF code feedback
> > (on e.g. if port doesn't exist any longer)
> > Cons: Helper function call likely slower?
> >
> >
>
> How about doing this the same way redirect is done in the tc case? I have this
> patch under test,
>
> https://github.com/jrfastab/linux/commit/e78f5425d5e3c305b4170ddd85c61c2e15359fee
I have been looking at this approach, which is close to option #3 above.
The problem with your implementation that you use a per-cpu store.
This creates the problem of storing state between packets. First packet
can call helper bpf_xdp_redirect() setting an ifindex, but program can
still return XDP_PASS. Next packet can call XDP_REDIRECT and use the
ifindex set from the first packet. IMHO this is a problematic API to
expose.
I do see that the TC interface that uses the same approach, via helper
bpf_redirect(). Maybe it have the same API problem? Looking at
sch_handle_ingress() I don't see this is handled (e.g. by always
clearing this_cpu_ptr(redirect_info)->ifindex = 0).
> that should give you some idea. It just needs a port mapping table in the
> bpf_tx_xdp() call.
I'll take a closer look. I don't think we need the per-cpu-store
approach for XDP, as we might as well store the port info in xdp_buff,
or return it directly option #1.
(TC redirect need the per-cpu-store to avoid extending the SKB).
> > (Cc'ed xdp-newbies as end-users might have an opinion on UAPI?)
I would still like people to comment on the above options?
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer
next prev parent reply other threads:[~2017-04-19 12:01 UTC|newest]
Thread overview: 34+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-04-18 19:58 XDP question: best API for returning/setting egress port? Jesper Dangaard Brouer
2017-04-18 20:54 ` John Fastabend
2017-04-19 12:00 ` Jesper Dangaard Brouer [this message]
2017-04-19 12:33 ` Daniel Borkmann
2017-04-19 15:24 ` Jesper Dangaard Brouer
2017-04-19 12:25 ` Hannes Frederic Sowa
2017-04-19 20:02 ` Andy Gospodarek
2017-04-19 21:42 ` Daniel Borkmann
2017-04-20 17:12 ` Andy Gospodarek
2017-04-19 22:51 ` Daniel Borkmann
2017-04-20 2:56 ` xdp_redirect ifindex vs port. Was: " Alexei Starovoitov
2017-04-20 4:38 ` John Fastabend
2017-04-20 4:58 ` Alexei Starovoitov
2017-04-20 5:14 ` John Fastabend
2017-04-20 6:10 ` Jesper Dangaard Brouer
2017-04-20 17:10 ` Alexei Starovoitov
2017-04-25 9:34 ` Jesper Dangaard Brouer
2017-04-26 0:26 ` Alexei Starovoitov
2017-04-26 3:07 ` John Fastabend
2017-04-26 9:11 ` Jesper Dangaard Brouer
2017-04-26 16:35 ` John Fastabend
2017-04-26 17:58 ` Alexei Starovoitov
2017-04-26 20:55 ` Andy Gospodarek
2017-04-27 8:41 ` Jesper Dangaard Brouer
2017-04-27 23:31 ` Alexei Starovoitov
2017-04-28 5:06 ` John Fastabend
2017-04-28 5:30 ` Alexei Starovoitov
2017-04-28 19:43 ` Hannes Frederic Sowa
2017-04-30 1:35 ` Alexei Starovoitov
2017-04-28 10:58 ` Jesper Dangaard Brouer
2017-04-30 1:04 ` Alexei Starovoitov
2017-04-30 22:55 ` John Fastabend
2017-04-20 6:39 ` XDP question: " Jesper Dangaard Brouer
2017-04-20 4:43 ` John Fastabend
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170419140019.366fb1fb@redhat.com \
--to=brouer@redhat.com \
--cc=alexei.starovoitov@gmail.com \
--cc=ast@fb.com \
--cc=borkmann@iogearbox.net \
--cc=john.fastabend@gmail.com \
--cc=netdev@vger.kernel.org \
--cc=xdp-newbies@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.