netdev.vger.kernel.org archive mirror
From: Jesper Dangaard Brouer <brouer@redhat.com>
To: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>,
	Alexei Starovoitov <ast@kernel.org>
Cc: "Daniel Borkmann" <daniel@iogearbox.net>,
	netdev@vger.kernel.org,
	"Jakub Kicinski" <jakub.kicinski@netronome.com>,
	"John Fastabend" <john.fastabend@gmail.com>,
	"Karlsson, Magnus" <magnus.karlsson@intel.com>,
	"Björn Töpel" <bjorn.topel@intel.com>,
	brouer@redhat.com
Subject: Re: [PATCH v6 bpf-next 4/9] veth: Handle xdp_frames in xdp napi ring
Date: Wed, 1 Aug 2018 17:09:49 +0200
Message-ID: <20180801170949.5bf6101e@redhat.com>
In-Reply-To: <90f355ef-1e56-5f12-ab78-a19c83fc9253@lab.ntt.co.jp>

On Wed, 1 Aug 2018 14:41:08 +0900
Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> wrote:

> On 2018/07/31 21:46, Jesper Dangaard Brouer wrote:
> > On Tue, 31 Jul 2018 19:40:08 +0900
> > Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> wrote:
> >   
> >> On 2018/07/31 19:26, Jesper Dangaard Brouer wrote:  
> >>>
> >>> Context needed from: [PATCH v6 bpf-next 2/9] veth: Add driver XDP
> >>>
> >>> On Mon, 30 Jul 2018 19:43:44 +0900
> >>> Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> wrote:
> >>>     
[...]
> >>>
> >>> Here you are adding an assumption that struct xdp_frame is always
> >>> located in-the-top of the packet-data area.  I tried hard not to add
> >>> such a dependency!  You can calculate the beginning of the frame from
> >>> the xdp_frame->data pointer.
> >>>
> >>> Why not add such a dependency?  Because for AF_XDP zero-copy, we cannot
> >>> make such an assumption.  
> >>>
> >>> Currently, when an RX-queue is in AF-XDP-ZC mode (MEM_TYPE_ZERO_COPY)
> >>> the packet will get dropped when calling convert_to_xdp_frame(), but as
> >>> the TODO comment indicated in convert_to_xdp_frame() this is not the
> >>> end-goal. 
> >>>
> >>> The comment in convert_to_xdp_frame(), indicate we need a full
> >>> alloc+copy, but that is actually not necessary, if we can just use
> >>> another memory area for struct xdp_frame, and a pointer to data.  Thus,
> >>> allowing devmap-redir to work-ZC and allow cpumap-redir to do the copy
> >>> on the remote CPU.    
> >>
> >> Thanks for pointing this out.
> >> Seems you are saying xdp_frame area is not reusable. That means we
> >> reduce usable headroom on every REDIRECT. I wanted to avoid this but
> >> actually it is impossible, right?  
> > 
> > I'm not sure I understand fully...  has this something to do, with the
> > below memset?  
> 
> Sorry for not being so clear...
> It has something to do with the memset as well but mainly I was talking
> about XDP_TX and REDIRECT introduced in patch 8. On REDIRECT,
> dev_map_enqueue() calls convert_to_xdp_frame() so we use the headroom
> for struct xdp_frame on REDIRECT. If we don't reuse xdp_frame region of
> the original xdp packet, we reduce the headroom size each time on
> REDIRECT. When ZC is used, in the future xdp_frame can be non-contiguous
> to the buffer, so we cannot reuse the xdp_frame region in
> convert_to_xdp_frame()? But current convert_to_xdp_frame()
> implementation requires xdp_frame region in headroom so I think I cannot
> avoid this dependency now.
> 
> SKB has a similar problem if we cannot reuse it. It can be passed to a
> bridge and redirected to another veth which has driver XDP. In that case
> we need to reallocate the page if we have reduced the headroom because
> sufficient headroom is required for XDP processing for now (can we
> remove this requirement actually?).

Okay, now I understand.  Your changes allow multiple levels of
XDP_REDIRECT between/into other veth net_devices.  This is very
interesting and exciting stuff, but also a bit scary when I think
about whether we got the lifetime correct for the different memory
objects.

You have convinced me.  We should not sacrifice/reduce the headroom
this way.  I'll also fix up cpumap.
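
As a rough illustration of the headroom concern (a userspace model, not
kernel code; struct model_frame, headroom_after_redirects() and the
field layout are made up for this sketch), each conversion that cannot
reuse the previous frame area costs another sizeof(frame) of headroom:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct xdp_frame; the real layout differs. */
struct model_frame {
	void *data;
	unsigned short len;
	unsigned short headroom;
};

/*
 * Model the headroom left after 'hops' conversions, each of which
 * stores one frame struct in the headroom.  With reuse_frame_area set,
 * the region taken by the previous frame struct is handed back before
 * the next conversion.  Returns -1 if the headroom is too small.
 */
static int headroom_after_redirects(int headroom, int hops,
				    int reuse_frame_area)
{
	const int frame_sz = (int)sizeof(struct model_frame);
	int i;

	assert(hops >= 0);
	for (i = 0; i < hops; i++) {
		int avail = headroom;

		if (reuse_frame_area && i > 0)
			avail += frame_sz;	/* old frame area is reusable */
		if (avail < frame_sz)
			return -1;		/* would have to drop or realloc */
		headroom = avail - frame_sz;
	}
	return headroom;
}
```

With 256 bytes of headroom and three hops, the non-reusing variant ends
at 256 - 3 * sizeof(struct model_frame), while the reusing variant
stays at 256 - sizeof(struct model_frame).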

To avoid the performance penalty of the memset, I propose that we just
clear the xdp_frame->data pointer.  But let's implement it via a common
sanitize/scrub function.
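
A minimal sketch of what such a helper could look like, written as a
userspace model rather than kernel code.  struct model_frame and both
function names are hypothetical, and the assumption is that the 'data'
pointer is the only field that must not be exposed:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct xdp_frame; the real layout differs. */
struct model_frame {
	void *data;
	unsigned short len;
	unsigned short headroom;
};

/* Expensive variant: wipe the whole frame struct area. */
static void model_frame_clear_all(struct model_frame *frame)
{
	assert(frame != NULL);
	memset(frame, 0, sizeof(*frame));
}

/*
 * Cheap variant: clear only the pointer, so that no address is left
 * behind in the headroom when the area becomes visible again.  The
 * other fields are left untouched.
 */
static void model_frame_scrub(struct model_frame *frame)
{
	assert(frame != NULL);
	frame->data = NULL;
}
```

Callers would invoke the scrub variant at the point where the frame
area is handed back as headroom; keeping it in one common function
means any further fields that turn out to be sensitive can be cleared
in a single place.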


> > When cpumap generate an SKB for the netstack, then we sacrifice/reduce
> > the SKB headroom available, by in convert_to_xdp_frame() reducing the
> > headroom by xdp_frame size.
> > 
> >  xdp_frame->headroom = headroom - sizeof(*xdp_frame)
> > 
> > In-order to avoid doing such memset of this area.  We are actually only
> > worried about exposing the 'data' pointer, thus we could just clear
> > that.  (See commit 6dfb970d3dbd, this is because Alexei is planing to
> > move from CAP_SYS_ADMIN to lesser privileged mode CAP_NET_ADMIN)
> > 
> > See commits:
> >  97e19cce05e5 ("bpf: reserve xdp_frame size in xdp headroom")
> >  6dfb970d3dbd ("xdp: avoid leaking info stored in frame data on page reuse")  
> 
> We have talked about that...
> https://patchwork.ozlabs.org/patch/903536/
> 
> The memset is introduced as per your feedback, but I'm still not sure if
> we need this. In general the headroom is not cleared after allocation in
> drivers, so anyway unprivileged users should not see it no matter if it
> contains xdp_frame or not...

I actually got this request from Alexei. That is why I implemented it.
Personally I don't think this clearing is really needed, until someone
actually makes the TC/cls_act BPF hook CAP_NET_ADMIN.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

Thread overview: 15+ messages
2018-07-30 10:43 [PATCH v6 bpf-next 0/9] veth: Driver XDP Toshiaki Makita
2018-07-30 10:43 ` [PATCH v6 bpf-next 1/9] net: Export skb_headers_offset_update Toshiaki Makita
2018-07-30 10:43 ` [PATCH v6 bpf-next 2/9] veth: Add driver XDP Toshiaki Makita
2018-07-30 10:43 ` [PATCH v6 bpf-next 3/9] veth: Avoid drops by oversized packets when XDP is enabled Toshiaki Makita
2018-07-30 10:43 ` [PATCH v6 bpf-next 4/9] veth: Handle xdp_frames in xdp napi ring Toshiaki Makita
2018-07-31 10:26   ` Jesper Dangaard Brouer
2018-07-31 10:40     ` Toshiaki Makita
2018-07-31 12:46       ` Jesper Dangaard Brouer
2018-08-01  5:41         ` Toshiaki Makita
2018-08-01 15:09           ` Jesper Dangaard Brouer [this message]
2018-07-30 10:43 ` [PATCH v6 bpf-next 5/9] veth: Add ndo_xdp_xmit Toshiaki Makita
2018-07-30 10:43 ` [PATCH v6 bpf-next 6/9] bpf: Make redirect_info accessible from modules Toshiaki Makita
2018-07-30 10:43 ` [PATCH v6 bpf-next 7/9] xdp: Helpers for disabling napi_direct of xdp_return_frame Toshiaki Makita
2018-07-30 10:43 ` [PATCH v6 bpf-next 8/9] veth: Add XDP TX and REDIRECT Toshiaki Makita
2018-07-30 10:43 ` [PATCH v6 bpf-next 9/9] veth: Support per queue XDP ring Toshiaki Makita
