From: Jesper Dangaard Brouer <brouer@redhat.com>
To: Magnus Karlsson <magnus.karlsson@gmail.com>
Cc: "Alexei Starovoitov" <alexei.starovoitov@gmail.com>,
"Daniel Borkmann" <daniel@iogearbox.net>,
"Björn Töpel" <bjorn.topel@gmail.com>,
"Karlsson, Magnus" <magnus.karlsson@intel.com>,
"Alexander Duyck" <alexander.h.duyck@intel.com>,
"Alexander Duyck" <alexander.duyck@gmail.com>,
"John Fastabend" <john.fastabend@gmail.com>,
"Alexei Starovoitov" <ast@fb.com>,
"Willem de Bruijn" <willemdebruijn.kernel@gmail.com>,
"Michael S. Tsirkin" <mst@redhat.com>,
"Network Development" <netdev@vger.kernel.org>,
"Björn Töpel" <bjorn.topel@intel.com>,
michael.lundkvist@ericsson.com, "Brandeburg,
Jesse" <jesse.brandeburg@intel.com>,
"Singhai, Anjali" <anjali.singhai@intel.com>,
"Zhang, Qi Z" <qi.z.zhang@intel.com>,
brouer@redhat.com
Subject: Re: [PATCH bpf-next v3 00/15] Introducing AF_XDP support
Date: Mon, 7 May 2018 15:09:40 +0200 [thread overview]
Message-ID: <20180507150940.2578d6e3@redhat.com> (raw)
In-Reply-To: <CAJ8uoz2L+3dw5OOHJJ0CyV-y5w1ooNAjKHzHAcAiKhTjAq-vzQ@mail.gmail.com>
On Mon, 7 May 2018 11:13:58 +0200
Magnus Karlsson <magnus.karlsson@gmail.com> wrote:
> On Sat, May 5, 2018 at 2:34 AM, Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> > On Fri, May 04, 2018 at 01:22:17PM +0200, Magnus Karlsson wrote:
> >> On Fri, May 4, 2018 at 1:38 AM, Alexei Starovoitov
> >> <alexei.starovoitov@gmail.com> wrote:
> >> > On Fri, May 04, 2018 at 12:49:09AM +0200, Daniel Borkmann wrote:
> >> >> On 05/02/2018 01:01 PM, Björn Töpel wrote:
> >> >> > From: Björn Töpel <bjorn.topel@intel.com>
> >> >> >
> >> >> > This patch set introduces a new address family called AF_XDP that is
> >> >> > optimized for high performance packet processing and, in upcoming
> >> >> > patch sets, zero-copy semantics. In this patch set, we have removed
> >> >> > all zero-copy related code in order to make it smaller, simpler and
> >> >> > hopefully more review friendly. This patch set only supports copy-mode
> >> >> > for the generic XDP path (XDP_SKB) for both RX and TX and copy-mode
> >> >> > for RX using the XDP_DRV path. Zero-copy support requires XDP and
> >> >> > driver changes that Jesper Dangaard Brouer is working on. Some of his
> >> >> > work has already been accepted. We will publish our zero-copy support
> >> >> > for RX and TX on top of his patch sets at a later point in time.
> >> >>
> >> >> +1, would be great to see it land this cycle. Saw few minor nits here
> >> >> and there but nothing to hold it up, for the series:
> >> >>
> >> >> Acked-by: Daniel Borkmann <daniel@iogearbox.net>
> >> >>
> >> >> Thanks everyone!
> >> >
> >> > Great stuff!
> >> >
> >> > Applied to bpf-next, with one condition.
> >> > Upcoming zero-copy patches for both RX and TX need to be posted
> >> > and reviewed within this release window.
> >> > If netdev community as a whole won't be able to agree on the zero-copy
> >> > bits we'd need to revert this feature before the next merge window.
> >>
> >> Thanks everyone for reviewing this. Highly appreciated.
> >>
> >> Just so we understand the purpose correctly:
> >>
> >> 1: Do you want to see the ZC patches in order to verify that the user
> >> space API holds? If so, we can produce an additional RFC patch set
> >> using a big chunk of code that we had in RFC V1. We are not proud of
> >> this code since it is clunky, but it hopefully proves the point with
> >> the uapi being the same.
> >>
> >> 2: And/Or are you worried about us all (the netdev community) not
> >> agreeing on a way to implement ZC internally in the drivers and the
> >> XDP infrastructure? This is not going to be possible to finish during
> >> this cycle since we do not like the implementation we had in RFC V1.
> >> Too intrusive and now we also have nicer abstractions from Jesper that
> >> we can use and extend to provide a (hopefully) much cleaner and less
> >> intrusive solution.
> >
> > short answer: both.
> >
> > Cleanliness and performance of the ZC code is not as important as
> > getting API right. The main concern that during ZC review process
> > we will find out that existing API has issues, so we have to
> > do this exercise before the merge window.
> > And RFC won't fly. Send the patches for real. They have to go
> > through the proper code review. The hackers of netdev community
> > can accept a partial, or a bit unclean, or slightly inefficient
> > implementation, since it can be and will be improved later,
> > but API we cannot change once it goes into official release.
> >
> > Here is the example of API concern:
> > this patch set added shared umem concept. It sounds good in theory,
> > but will it perform well with ZC ? Earlier RFCs didn't have that
> > feature. If it won't perform well than it shouldn't be in the tree.
> > The key reason to let AF_XDP into the tree is its performance promise.
> > If it doesn't perform we should rip it out and redesign.
>
> That is a fair point. We will try to produce patch sets for zero-copy
> RX and TX using the latest interfaces within this merge window. Just
> note that we will focus on this for the next week(s) instead of the
> review items that you and Daniel Borkmann submitted. If we get those
> patch sets out in time and we agree that they are a possible way
> forward, then we produce patches with your fixes. It was mainly small
> items, so should be quick.
I would like to see that you create a new xdp_mem_type for this new
zero-copy type. This will allow other XDP redirect methods/types (e.g.
devmap and cpumap) to react appropriately when receiving a zero-copy
frame.
For devmap, I'm hoping we can allow/support using the ndo_xdp_xmit call
without (first) copying (into a newly allocated page). By arguing that
if an xsk-userspace app modify a frame it's not allowed to, then it is
simply a bug in the program. (Note, this would also allow using
ndo_xdp_xmit call for TX from xsk-userspace).
For cpumap, it is hard to avoid a copy, but I'm hoping we could delay
the copy (and alloc of mem dest area) until on the remote CPU. This is
already the principle of cpumap; of moving the allocation of the SKB to
the remote CPU.
For ZC to interact with XDP redirect-core and return API, the zero-copy
memory type/allocator, need to provide an area for the xdp_frame data
to be stored in (as we cannot allow using top-of-frame like
non-zero-copy variants), and extend xdp_frame with an ZC umem-id.
I imagine we can avoid any dynamic allocations, as we upfront (at bind
and XDP_UMEM_REG time) know the number of frames. (e.g. pre-alloc in
xdp_umem_reg() call, and have xdp_umem_get_xdp_frame lookup func).
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer
next prev parent reply other threads:[~2018-05-07 13:09 UTC|newest]
Thread overview: 47+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-05-02 11:01 [PATCH bpf-next v3 00/15] Introducing AF_XDP support Björn Töpel
2018-05-02 11:01 ` [PATCH bpf-next v3 01/15] net: initial AF_XDP skeleton Björn Töpel
2018-05-23 22:50 ` Stephen Hemminger
2018-05-24 6:38 ` Björn Töpel
2018-05-24 17:57 ` Alexei Starovoitov
2018-05-02 11:01 ` [PATCH bpf-next v3 02/15] xsk: add user memory registration support sockopt Björn Töpel
2018-05-04 12:34 ` Daniel Borkmann
2018-05-02 11:01 ` [PATCH bpf-next v3 03/15] xsk: add umem fill queue support and mmap Björn Töpel
2018-05-04 12:49 ` Daniel Borkmann
2018-05-02 11:01 ` [PATCH bpf-next v3 04/15] xsk: add Rx queue setup and mmap support Björn Töpel
2018-05-02 11:01 ` [PATCH bpf-next v3 05/15] xsk: add support for bind for Rx Björn Töpel
2018-05-02 11:01 ` [PATCH bpf-next v3 06/15] xsk: add Rx receive functions and poll support Björn Töpel
2018-05-04 12:59 ` Daniel Borkmann
2018-05-22 7:42 ` Björn Töpel
2018-05-02 11:01 ` [PATCH bpf-next v3 07/15] bpf: introduce new bpf AF_XDP map type BPF_MAP_TYPE_XSKMAP Björn Töpel
2018-10-08 15:31 ` Eric Dumazet
2018-10-08 16:05 ` Björn Töpel
2018-10-08 16:52 ` Björn Töpel
2018-10-08 16:55 ` Eric Dumazet
2018-10-08 17:04 ` Björn Töpel
2018-10-08 17:40 ` [PATCH bpf] xsk: do not call synchronize_net() under RCU read lock Björn Töpel
2018-10-09 0:30 ` Song Liu
2018-10-11 8:22 ` Daniel Borkmann
2018-05-02 11:01 ` [PATCH bpf-next v3 08/15] xsk: wire up XDP_DRV side of AF_XDP Björn Töpel
2018-05-02 11:01 ` [PATCH bpf-next v3 09/15] xsk: wire up XDP_SKB " Björn Töpel
2018-05-02 11:01 ` [PATCH bpf-next v3 10/15] xsk: add umem completion queue support and mmap Björn Töpel
2018-05-02 11:01 ` [PATCH bpf-next v3 11/15] xsk: add Tx queue setup and mmap support Björn Töpel
2018-05-02 11:01 ` [PATCH bpf-next v3 12/15] dev: packet: make packet_direct_xmit a common function Björn Töpel
2018-05-02 11:01 ` [PATCH bpf-next v3 13/15] xsk: support for Tx Björn Töpel
2018-05-02 11:01 ` [PATCH bpf-next v3 14/15] xsk: statistics support Björn Töpel
2018-05-02 11:01 ` [PATCH bpf-next v3 15/15] samples/bpf: sample application and documentation for AF_XDP sockets Björn Töpel
2018-05-02 20:59 ` Jesper Dangaard Brouer
2018-05-03 13:55 ` [PATCH bpf-next v3 00/15] Introducing AF_XDP support Willem de Bruijn
2018-05-03 15:07 ` David Miller
2018-05-03 22:49 ` Daniel Borkmann
2018-05-03 23:38 ` Alexei Starovoitov
2018-05-04 11:22 ` Magnus Karlsson
2018-05-05 0:34 ` Alexei Starovoitov
2018-05-07 9:13 ` Magnus Karlsson
2018-05-07 13:09 ` Jesper Dangaard Brouer [this message]
2018-05-07 19:47 ` Björn Töpel
2018-05-17 6:46 ` Björn Töpel
2018-05-18 3:38 ` Alexei Starovoitov
2018-05-18 13:43 ` Daniel Borkmann
2018-05-18 15:18 ` Björn Töpel
2018-05-18 16:17 ` Daniel Borkmann
2018-05-18 16:32 ` Björn Töpel
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180507150940.2578d6e3@redhat.com \
--to=brouer@redhat.com \
--cc=alexander.duyck@gmail.com \
--cc=alexander.h.duyck@intel.com \
--cc=alexei.starovoitov@gmail.com \
--cc=anjali.singhai@intel.com \
--cc=ast@fb.com \
--cc=bjorn.topel@gmail.com \
--cc=bjorn.topel@intel.com \
--cc=daniel@iogearbox.net \
--cc=jesse.brandeburg@intel.com \
--cc=john.fastabend@gmail.com \
--cc=magnus.karlsson@gmail.com \
--cc=magnus.karlsson@intel.com \
--cc=michael.lundkvist@ericsson.com \
--cc=mst@redhat.com \
--cc=netdev@vger.kernel.org \
--cc=qi.z.zhang@intel.com \
--cc=willemdebruijn.kernel@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.