From: Lorenzo Bianconi <lorenzo@kernel.org>
To: Daniel Borkmann <daniel@iogearbox.net>
Cc: bpf@vger.kernel.org, netdev@vger.kernel.org,
lorenzo.bianconi@redhat.com, davem@davemloft.net,
kuba@kernel.org, ast@kernel.org, shayagr@amazon.com,
john.fastabend@gmail.com, dsahern@kernel.org, brouer@redhat.com,
echaudro@redhat.com, jasowang@redhat.com,
alexander.duyck@gmail.com, saeed@kernel.org,
maciej.fijalkowski@intel.com, sameehj@amazon.com
Subject: Re: [PATCH v6 bpf-next 0/8] mvneta: introduce XDP multi-buffer support
Date: Sun, 31 Jan 2021 18:23:41 +0100 [thread overview]
Message-ID: <20210131172341.GA6003@lore-desk> (raw)
In-Reply-To: <572556bb-845f-1b4a-8f0a-fb6a4fc286e3@iogearbox.net>
> Hi Lorenzo,
Hi Daniel,
sorry for the delay.
>
> On 1/19/21 9:20 PM, Lorenzo Bianconi wrote:
> > This series introduces XDP multi-buffer support. The mvneta driver is
> > the first to support these new "non-linear" xdp_{buff,frame}. Reviewers,
> > please focus on how these new types of xdp_{buff,frame} packets
> > traverse the different layers and on the layout design. It is on purpose
> > that the BPF helpers are kept simple, as we don't want to expose the
> > internal layout, so that it can still change later.
> >
> > For now, to keep the design simple and to maintain performance, the XDP
> > BPF program (still) only has access to the first buffer. It is left for
> > later (another patchset) to add payload access across multiple buffers.
>
> I think xmas break has mostly wiped my memory from 2020 ;) so it would be
> good to describe the sketched-out design for how this will look inside the
> cover letter in terms of planned uapi exposure. (Additionally, discussing the
> api design proposal could also be sth for BPF office hour to move things
> quicker + posting a summary to the list for transparency of course .. just
> a thought.)
The main goal of this series is to add multi-buffer support to the xdp core
(e.g. in xdp_frame/xdp_buff and in xdp_return_{buff/frame}) and to provide the
first driver with xdp multi-buff support. We tried to keep the changes
independent of the eBPF helpers, since we do not have well-defined use cases
for them yet and we don't want to expose the internal layout, so it can still
change later.
One possible example is the bpf_xdp_adjust_mb_header() helper we sent in v2
patch 6/9 [0] to try to address the use case explained by Eric @ NetDev 0x14 [1].
Anyway, I agree there are some missing bits we need to address (e.g. what is the
behaviour when we redirect a multi-buff xdp_frame to a driver that does not
support it?).
Ack, I agree we can discuss the multi-buff eBPF helper APIs in the BPF office
hours meeting in order to speed up the process.
>
> Glancing over the series, while you've addressed the bpf_xdp_adjust_tail()
> helper API, this series will be breaking one assumption of programs at least
> for the mvneta driver from one kernel to another if you then use the multi
> buff mode, and that is basically bpf_xdp_event_output() API: the assumption
> is that you can do full packet capture by passing in the xdp buff len that
> is data_end - data ptr. We use it this way for sampling & others might as well
> (e.g. xdpcap). But bpf_xdp_copy() would only copy the first buffer today which
> would break the full pkt visibility assumption. Just walking the frags if
> xdp->mb bit is set would still need some sort of struct xdp_md exposure so
> the prog can figure out the actual full size..
Ack, thanks for pointing this out; I will take a look at it.
Eelco added an xdp_len field to xdp_md in the previous series (he is still
working on it). Another possible approach would be to define a helper; what do
you think?
>
> > This patchset should still allow for these future extensions. The goal
> > is to lift the XDP MTU restriction that comes with XDP, but maintain
> > same performance as before.
> >
> > The main idea for the new multi-buffer layout is to reuse the same
> > layout used for non-linear SKB. We introduced a "xdp_shared_info" data
> > structure at the end of the first buffer to link together subsequent buffers.
> > xdp_shared_info will alias skb_shared_info, allowing us to keep most of the
> > frags in the same cache-line (while with skb_shared_info only the first
> > fragment is placed in the first "shared_info" cache-line). Moreover, we
> > introduced some xdp_shared_info helpers aligned to the skb_frag* ones.
> > Converting an xdp_frame to an SKB and delivering it to the network stack is
> > shown in the cpumap code (patch 7/8). When building the SKB, the
> > xdp_shared_info structure is converted into a skb_shared_info one.
> >
> > A multi-buffer bit (mb) has been introduced in the xdp_{buff,frame}
> > structures to notify the bpf/network layer whether this is an xdp
> > multi-buffer frame (mb = 1) or not (mb = 0).
> > The mb bit will be set by an xdp multi-buffer capable driver only for
> > non-linear frames, maintaining the capability to receive linear frames
> > without any extra cost, since the xdp_shared_info structure at the end
> > of the first buffer is initialized only if mb is set.
> >
> > Typical use cases for this series are:
> > - Jumbo-frames
> > - Packet header split (please see Google’s use-case @ NetDevConf 0x14, [0])
> > - TSO
> >
> > The bpf_xdp_adjust_tail helper has been modified to take into account xdp
> > multi-buff frames.
>
> Also in terms of logistics (I think mentioned earlier already), for the series to
> be merged - as with other networking features spanning core + driver (example
> af_xdp) - we also need a second driver (ideally mlx5, i40e or ice) implementing
> this and ideally be submitted together in the same series for review. For that
> it probably also makes sense to more cleanly split out the core pieces from the
> driver ones. Either way, how is progress on that side coming along?
I do not have any updated news about it so far, but afaik the Amazon folks were
working on adding multi-buff support to the ena driver, while Intel was
planning to add it to AF_XDP. Moreover, Jason was looking at adding it to
virtio-net.
>
> Thanks,
> Daniel
Regards,
Lorenzo
[0] https://patchwork.ozlabs.org/project/netdev/patch/b7475687bb09aac6ec051596a8ccbb311a54cb8a.1599165031.git.lorenzo@kernel.org/
[1] https://netdevconf.info/0x14/session.html?talk-the-path-to-tcp-4k-mtu-and-rx-zerocopy
2021-01-19 20:20 [PATCH v6 bpf-next 0/8] mvneta: introduce XDP multi-buffer support Lorenzo Bianconi
2021-01-19 20:20 ` [PATCH v6 bpf-next 1/8] xdp: introduce mb in xdp_buff/xdp_frame Lorenzo Bianconi
2021-01-19 20:20 ` [PATCH v6 bpf-next 2/8] xdp: add xdp_shared_info data structure Lorenzo Bianconi
2021-01-19 20:20 ` [PATCH v6 bpf-next 3/8] net: mvneta: update mb bit before passing the xdp buffer to eBPF layer Lorenzo Bianconi
2021-01-19 20:20 ` [PATCH v6 bpf-next 4/8] xdp: add multi-buff support to xdp_return_{buff/frame} Lorenzo Bianconi
2021-01-19 20:20 ` [PATCH v6 bpf-next 5/8] net: mvneta: add multi buffer support to XDP_TX Lorenzo Bianconi
2021-01-19 20:20 ` [PATCH v6 bpf-next 6/8] net: mvneta: enable jumbo frames for XDP Lorenzo Bianconi
2021-01-19 20:20 ` [PATCH v6 bpf-next 7/8] net: xdp: add multi-buff support to xdp_build_skb_from_frame Lorenzo Bianconi
2021-01-19 20:20 ` [PATCH v6 bpf-next 8/8] bpf: add multi-buff support to the bpf_xdp_adjust_tail() API Lorenzo Bianconi
2021-01-23 1:03 ` [PATCH v6 bpf-next 0/8] mvneta: introduce XDP multi-buffer support Daniel Borkmann
2021-01-31 17:23 ` Lorenzo Bianconi [this message]
2021-02-03 13:24 ` Jubran, Samih