From: Jesper Dangaard Brouer <brouer@redhat.com>
To: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>,
Alexei Starovoitov <ast@kernel.org>
Cc: "Daniel Borkmann" <daniel@iogearbox.net>,
netdev@vger.kernel.org,
"Jakub Kicinski" <jakub.kicinski@netronome.com>,
"John Fastabend" <john.fastabend@gmail.com>,
"Karlsson, Magnus" <magnus.karlsson@intel.com>,
"Björn Töpel" <bjorn.topel@intel.com>,
brouer@redhat.com
Subject: Re: [PATCH v6 bpf-next 4/9] veth: Handle xdp_frames in xdp napi ring
Date: Wed, 1 Aug 2018 17:09:49 +0200
Message-ID: <20180801170949.5bf6101e@redhat.com>
In-Reply-To: <90f355ef-1e56-5f12-ab78-a19c83fc9253@lab.ntt.co.jp>
On Wed, 1 Aug 2018 14:41:08 +0900
Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> wrote:
> On 2018/07/31 21:46, Jesper Dangaard Brouer wrote:
> > On Tue, 31 Jul 2018 19:40:08 +0900
> > Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> wrote:
> >
> >> On 2018/07/31 19:26, Jesper Dangaard Brouer wrote:
> >>>
> >>> Context needed from: [PATCH v6 bpf-next 2/9] veth: Add driver XDP
> >>>
> >>> On Mon, 30 Jul 2018 19:43:44 +0900
> >>> Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> wrote:
> >>>
[...]
> >>>
> >>> Here you are adding an assumption that struct xdp_frame is always
> >>> located in-the-top of the packet-data area. I tried hard not to add
> >>> such a dependency! You can calculate the beginning of the frame from
> >>> the xdp_frame->data pointer.
> >>>
> >>> Why not add such a dependency? Because for AF_XDP zero-copy, we cannot
> >>> make such an assumption.
> >>>
> >>> Currently, when an RX-queue is in AF-XDP-ZC mode (MEM_TYPE_ZERO_COPY)
> >>> the packet will get dropped when calling convert_to_xdp_frame(), but as
> >>> the TODO comment indicated in convert_to_xdp_frame() this is not the
> >>> end-goal.
> >>>
> >>> The comment in convert_to_xdp_frame(), indicate we need a full
> >>> alloc+copy, but that is actually not necessary, if we can just use
> >>> another memory area for struct xdp_frame, and a pointer to data. Thus,
> >>> allowing devmap-redir to work-ZC and allow cpumap-redir to do the copy
> >>> on the remote CPU.
> >>
> >> Thanks for pointing this out.
> >> Seems you are saying xdp_frame area is not reusable. That means we
> >> reduce usable headroom on every REDIRECT. I wanted to avoid this but
> >> actually it is impossible, right?
> >
> > I'm not sure I understand fully... has this something to do, with the
> > below memset?
>
> Sorry for not being so clear...
> It has something to do with the memset as well but mainly I was talking
> about XDP_TX and REDIRECT introduced in patch 8. On REDIRECT,
> dev_map_enqueue() calls convert_to_xdp_frame() so we use the headroom
> for struct xdp_frame on REDIRECT. If we don't reuse xdp_frame region of
> the original xdp packet, we reduce the headroom size each time on
> REDIRECT. When ZC is used, in the future xdp_frame can be non-contiguous
> to the buffer, so we cannot reuse the xdp_frame region in
> convert_to_xdp_frame()? But current convert_to_xdp_frame()
> implementation requires xdp_frame region in headroom so I think I cannot
> avoid this dependency now.
>
> SKB has a similar problem if we cannot reuse it. It can be passed to a
> bridge and redirected to another veth which has driver XDP. In that case
> we need to reallocate the page if we have reduced the headroom because
> sufficient headroom is required for XDP processing for now (can we
> remove this requirement actually?).

Okay, now I understand. Your changes allow multiple levels of
XDP_REDIRECT between/into other veth net_devices. This is very
interesting and exciting stuff, but also a bit scary when thinking
about whether we got the lifetime correct for the different memory
objects.

You have convinced me. We should not sacrifice/reduce the headroom
this way. I'll also fix up cpumap.

To avoid the performance penalty of the memset, I propose that we just
clear the xdp_frame->data pointer. But let's implement it via a common
sanitize/scrub function.
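
A minimal userspace sketch of the scrub idea above, under stated
assumptions: struct xdp_frame_mock and the mock_* helpers are
hypothetical stand-ins written for illustration, not the kernel's
definitions. The sketch reserves the frame area at the top of the
headroom (the arithmetic quoted further down in this thread), recovers
the buffer start from the data pointer, and then clears only that
pointer instead of memset()ing the whole area.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct xdp_frame; layout is illustrative. */
struct xdp_frame_mock {
	void *data;        /* start of packet payload */
	uint16_t len;      /* payload length */
	uint16_t headroom; /* headroom left after reserving the frame area */
};

/*
 * Place the frame at the start of the buffer and account for it:
 *   frame->headroom = headroom - sizeof(*frame)
 * Returns NULL when the headroom cannot hold the frame.
 */
static struct xdp_frame_mock *mock_convert(void *hard_start, void *data,
					   uint16_t len)
{
	size_t headroom = (size_t)((char *)data - (char *)hard_start);
	struct xdp_frame_mock *f = hard_start;

	if (headroom < sizeof(*f))
		return NULL;

	f->data = data;
	f->len = len;
	f->headroom = (uint16_t)(headroom - sizeof(*f));
	return f;
}

/*
 * Recover the buffer start from the data pointer, rather than from the
 * address of the frame itself. Only valid before the frame is scrubbed.
 */
static void *mock_hard_start(const struct xdp_frame_mock *f)
{
	return (char *)f->data - f->headroom - sizeof(*f);
}

/* Scrub: clear only the pointer we worry about exposing, no memset. */
static void mock_scrub_frame(struct xdp_frame_mock *f)
{
	f->data = NULL;
}

/* Small walk-through; returns 0 on success. */
static int mock_demo(void)
{
	unsigned char *buf = malloc(256); /* malloc() is suitably aligned */
	struct xdp_frame_mock *f;

	if (!buf)
		return -1;

	f = mock_convert(buf, buf + 64, 100);
	assert(f != NULL);
	assert(mock_hard_start(f) == (void *)buf);

	printf("headroom left: %u bytes\n", (unsigned)f->headroom);

	mock_scrub_frame(f);
	assert(f->data == NULL);

	free(buf);
	return 0;
}
```

Calling mock_demo() walks through convert, recover and scrub once. The
scrub touches a single field, so its cost does not grow with the size
of the frame area, which is the point of avoiding the memset.
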
> > When cpumap generate an SKB for the netstack, then we sacrifice/reduce
> > the SKB headroom available, by in convert_to_xdp_frame() reducing the
> > headroom by xdp_frame size.
> >
> > xdp_frame->headroom = headroom - sizeof(*xdp_frame)
> >
> > In-order to avoid doing such memset of this area. We are actually only
> > worried about exposing the 'data' pointer, thus we could just clear
> > that. (See commit 6dfb970d3dbd, this is because Alexei is planing to
> > move from CAP_SYS_ADMIN to lesser privileged mode CAP_NET_ADMIN)
> >
> > See commits:
> > 97e19cce05e5 ("bpf: reserve xdp_frame size in xdp headroom")
> > 6dfb970d3dbd ("xdp: avoid leaking info stored in frame data on page reuse")
>
> We have talked about that...
> https://patchwork.ozlabs.org/patch/903536/
>
> The memset is introduced as per your feedback, but I'm still not sure if
> we need this. In general the headroom is not cleared after allocation in
> drivers, so anyway unprivileged users should not see it no matter if it
> contains xdp_frame or not...

I actually got this request from Alexei. That is why I implemented it.
Personally, I don't think this clearing is really needed until someone
actually makes the TC/cls_act BPF hook CAP_NET_ADMIN.

--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer