From: Jesper Dangaard Brouer <brouer@redhat.com>
To: "Björn Töpel" <bjorn.topel@intel.com>
Cc: "Björn Töpel" <bjorn.topel@gmail.com>,
magnus.karlsson@intel.com, magnus.karlsson@gmail.com,
ast@kernel.org, daniel@iogearbox.net, netdev@vger.kernel.org,
u9012063@gmail.com, qi.z.zhang@intel.com,
jakub.kicinski@netronome.com, andrew@lunn.ch, brouer@redhat.com
Subject: Re: [PATCH bpf-next v2 0/7] Add support for XDP_ATTACH
Date: Mon, 17 Dec 2018 15:10:17 +0100 [thread overview]
Message-ID: <20181217151017.17d708a6@redhat.com> (raw)
In-Reply-To: <0e9c13f2-190a-18c0-efa6-a7fe0096f98f@intel.com>
On Mon, 17 Dec 2018 14:39:57 +0100
Björn Töpel <bjorn.topel@intel.com> wrote:
> On 2018-12-17 13:50, Jesper Dangaard Brouer wrote:
> > On Mon, 17 Dec 2018 11:24:54 +0100
> > Björn Töpel <bjorn.topel@gmail.com> wrote:
> >
> >> XDP_ATTACH associates an XDP socket to a specific netdev Rx queue. To
> >> redirect a packet to an attached socket from XDP, the bpf_xsk_redirect
> >> helper is used. The bpf_xsk_redirect helper is also introduced in this
> >> series.
> >>
> >> Many XDP socket users just need a simple way of creating/binding a
> >> socket and receiving frames right away without a complicated XDP
> >> program. "Attached" XDP sockets removes the need for the XSKMAP, and
> >> allows for a trivial XDP program, e.g.:
> >>
> >> SEC("xdp")
> >> int xdp_prog(struct xdp_md *ctx)
> >> {
> >> return bpf_xsk_redirect(ctx);
> >> }
> >>
> >> An attached XDP socket also has better performance than the XSKMAP
> >> based sockets (performance numbers below).
> >
> > I still have a general problem with this approach.
> >
>
> And I appreciate that you have comments on the design/code! :-) Thank you!
>
>
> > The AF_XDP socket is build around (and gets its performance) from
> > being tied to a specific RX-queue. That design begs to have an XDP
> > program per RX-queue.
> >
> > Patchset-v1 moved towards this goal. But in this patchset-v2 you
> > steer away from this again, and work-around the issue with the current
> > limitations of 1-XDP program per netdev. (Which result in; if a
> > single AF_XDP socket is used in the system, which can ONLY be for a
> > single RX-queue by design, then ALL other XDP_PASS traffic also have
> > to take the overhead of indirect BPF call).
> >
>
> I agree that a per-queue-program would be a good fit, but I think it
> still makes sense to add XDP_ATTACH and the helper, to make it easier
> for the XDP program authors to use AF_XDP. Would you prefer that this
> functionality was help back, until per-queue programs are introduced?
Yes, for the reasons you yourself listed in next section:
> OTOH the implementation would probably look different if there was a
> per-q program, because this would open up for more optimizations. One
> could even imagine that the socket fd was part of the program (part of
> relocation process) when loading the program. That could get rid of yet
> another if-statement that check for socket existence. :-)
Yes, exactly. The implementation would probably look different, when
we look at it from a more generic angle, with per-q programs.
> > IMHO with this use-case, now is the time to introduce XDP programs per
> > RX-queue. Yes, it will be more work, but I will offer to helpout.
> > This should be generalized as XDP programs per RX-queue can be used by
> > other use-cases too:
> > In general terms: We can setup a NIC hardware filter to deliver
> > frame matching some criteria, then we can avoid rechecking these
> > criterias in on the (RX) CPU when/if we can attach an XDP prog to this
> > specific RX-queue directly. This *IS* exactly what AF_XDP does, but it
> > is in general useful for others, like CPUMAP redirect.
> >
>
> Fair enough, and thank you for offering to help out. And the fact that
> *other than AF_XDP* can benefit from that is good. Before we dive down
> this hole, what are the opinions of the BPF maintainers? Maybe it's
> better to hack an RFC, and then take the discussion?
As XDP grows, and more use-cases are added, then I fear that the single
XDP program per netdev is going to be a performance bottleneck. As the
single XDP program, will have to perform a lot of common checks before
it knows what use-case this packet match. With an XDP program per
RX-queue, we can instead leverage the hardware to pre-filter/sort
packets, and thus simplify the XDP programs.
And this patchset already do shows a performance advantage of
simplifying the XDP prog and allowing to store info per RX-queue (the
xsk-sock) that allows you do take a more direct action (avoiding exec
some of the redirect-core code).
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer
next prev parent reply other threads:[~2018-12-17 14:10 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-12-17 10:24 [PATCH bpf-next v2 0/7] Add support for XDP_ATTACH Björn Töpel
2018-12-17 10:24 ` [PATCH bpf-next v2 1/7] xsk: simplify AF_XDP socket teardown Björn Töpel
2018-12-17 10:24 ` [PATCH bpf-next v2 2/7] xsk: add XDP_ATTACH bind() flag Björn Töpel
2018-12-17 10:24 ` [PATCH bpf-next v2 3/7] bpf: add bpf_xsk_redirect function Björn Töpel
2018-12-17 10:24 ` [PATCH bpf-next v2 4/7] tools/bpf: sync kernel uapi bpf.h to tools directory Björn Töpel
2018-12-17 10:24 ` [PATCH bpf-next v2 5/7] libbpf: initial support for builtin BPF programs Björn Töpel
2018-12-17 10:25 ` [PATCH bpf-next v2 6/7] samples: bpf: simplify/cleanup xdpsock Björn Töpel
2018-12-17 10:25 ` [PATCH bpf-next v2 7/7] samples: bpf: add support for XDP_ATTACH to xdpsock Björn Töpel
2018-12-17 12:50 ` [PATCH bpf-next v2 0/7] Add support for XDP_ATTACH Jesper Dangaard Brouer
2018-12-17 13:39 ` Björn Töpel
2018-12-17 14:10 ` Jesper Dangaard Brouer [this message]
2018-12-17 14:55 ` Magnus Karlsson
2018-12-17 15:30 ` Björn Töpel
2018-12-18 23:04 ` Alexei Starovoitov
2018-12-19 9:34 ` Björn Töpel
2018-12-19 16:23 ` Jesper Dangaard Brouer
2018-12-20 0:48 ` Alexei Starovoitov
2018-12-18 2:36 ` Jakub Kicinski
2018-12-18 8:59 ` Björn Töpel
2018-12-18 18:18 ` Jakub Kicinski
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20181217151017.17d708a6@redhat.com \
--to=brouer@redhat.com \
--cc=andrew@lunn.ch \
--cc=ast@kernel.org \
--cc=bjorn.topel@gmail.com \
--cc=bjorn.topel@intel.com \
--cc=daniel@iogearbox.net \
--cc=jakub.kicinski@netronome.com \
--cc=magnus.karlsson@gmail.com \
--cc=magnus.karlsson@intel.com \
--cc=netdev@vger.kernel.org \
--cc=qi.z.zhang@intel.com \
--cc=u9012063@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).