All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jesper Dangaard Brouer <brouer@redhat.com>
To: John Fastabend <john.fastabend@gmail.com>
Cc: "Lorenzo Bianconi" <lorenzo@kernel.org>,
	"Lorenzo Bianconi" <lorenzo.bianconi@redhat.com>,
	bpf@vger.kernel.org, netdev@vger.kernel.org, davem@davemloft.net,
	kuba@kernel.org, ast@kernel.org, daniel@iogearbox.net,
	shayagr@amazon.com, sameehj@amazon.com, dsahern@kernel.org,
	"Eelco Chaudron" <echaudro@redhat.com>,
	brouer@redhat.com, "Tirthendu Sarkar" <tirtha@gmail.com>,
	"Toke Høiland-Jørgensen" <toke@redhat.com>
Subject: Re: [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer support
Date: Tue, 6 Oct 2020 09:30:11 +0200	[thread overview]
Message-ID: <20201006093011.36375745@carbon> (raw)
In-Reply-To: <5f7bf2b0bf899_4f19a2083f@john-XPS-13-9370.notmuch>

On Mon, 05 Oct 2020 21:29:36 -0700
John Fastabend <john.fastabend@gmail.com> wrote:

> Lorenzo Bianconi wrote:
> > [...]
> >   
> > > 
> > > In general I see no reason to populate these fields before the XDP
> > > program runs. Someone needs to convince me why having frags info before
> > > program runs is useful. In general headers should be preserved and first
> > > frag already included in the data pointers. If users start parsing further
> > > they might need it, but this series doesn't provide a way to do that
> > > so IMO without those helpers its a bit difficult to debate.  
> > 
> > We need to populate the skb_shared_info before running the xdp program in order to
> > allow the ebpf sanbox to access this data. If we restrict the access to the first
> > buffer only I guess we can avoid to do that but I think there is a value allowing
> > the xdp program to access this data.  
> 
> I agree. We could also only populate the fields if the program accesses
> the fields.

Notice, a driver will not initialize/use the shared_info area unless
there are more segments.  And (we have already established) the xdp->mb
bit is guarding BPF-prog from accessing shared_info area. 

> > A possible optimization can be access the shared_info only once before running
> > the ebpf program constructing the shared_info using a struct allocated on the
> > stack.  
> 
> Seems interesting, might be a good idea.

It *might* be a good idea ("alloc" shared_info on stack), but we should
benchmark this.  The prefetch trick might be fast enough.  But also
keep in mind the performance target, as with large size frames the
packet-per-sec we need to handle dramatically drop.


The TSO statement, I meant LRO (Large Receive Offload), but I want the
ability to XDP-redirect this frame out another netdev as TSO.  This
does means that we need more than 3 pages (2 frags slots) to store LRO
frames.  Thus, if we store this shared_info on the stack it might need
to be larger than we like.



> > Moreover we can define a "xdp_shared_info" struct to alias the skb_shared_info
> > one in order to have most on frags elements in the first "shared_info" cache line.
> >   
> > > 
> > > Specifically for XDP_TX case we can just flip the descriptors from RX
> > > ring to TX ring and keep moving along. This is going to be ideal on
> > > 40/100Gbps nics.

I think both approaches will still allow to do these page-flips.

> > > I'm not arguing that its likely possible to put some prefetch logic
> > > in there and keep the pipe full, but I would need to see that on
> > > a 100gbps nic to be convinced the details here are going to work. Or
> > > at minimum a 40gbps nic.

I'm looking forward to see how this performs on faster NICs.  Once we
have a high-speed NIC driver with this I can also start doing testing
in my testlab.


> > [...]
> >   
> > > Not against it, but these things are a bit tricky. Couple things I still
> > > want to see/understand
> > > 
> > >  - Lets see a 40gbps use a prefetch and verify it works in practice
> > >  - Explain why we can't just do this after XDP program runs  
> > 
> > how can we allow the ebpf program to access paged data if we do not do that?  
> 
> I don't see an easy way, but also this series doesn't have the data
> access support.

Eelco (Cc'ed) are working on patches that allow access to data in these
fragments, so far internal patches, which (sorry to mention) got
shutdown in internal review.


> Its hard to tell until we get at least a 40gbps nic if my concern about
> performance is real or not. Prefetching smartly could resolve some of the
> issue I guess.
> 
> If the Intel folks are working on it I think waiting would be great. Otherwise
> at minimum drop the helpers and be prepared to revert things if needed.

I do think it makes sense to drop the helpers for now, and focus on how
this new multi-buffer frame type is handled in the existing code, and do
some benchmarking on higher speed NIC, before the BPF-helper start to
lockdown/restrict what we can change/revert as they define UAPI.

E.g. existing code that need to handle this is existing helper
bpf_xdp_adjust_tail, which is something I have broad up before and even
described in[1].  Lets make sure existing code works with proposed
design, before introducing new helpers (and this makes it easier to
revert).

[1] https://github.com/xdp-project/xdp-project/blob/master/areas/core/xdp-multi-buffer01-design.org#xdp-tail-adjust
-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


  reply	other threads:[~2020-10-06  7:30 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-02 14:41 [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer support Lorenzo Bianconi
2020-10-02 14:41 ` [PATCH v4 bpf-next 01/13] xdp: introduce mb in xdp_buff/xdp_frame Lorenzo Bianconi
2020-10-02 14:42 ` [PATCH v4 bpf-next 02/13] xdp: initialize xdp_buff mb bit to 0 in all XDP drivers Lorenzo Bianconi
2020-10-02 14:42 ` [PATCH v4 bpf-next 03/13] net: mvneta: update mb bit before passing the xdp buffer to eBPF layer Lorenzo Bianconi
2020-10-02 14:42 ` [PATCH v4 bpf-next 04/13] xdp: add multi-buff support to xdp_return_{buff/frame} Lorenzo Bianconi
2020-10-02 14:42 ` [PATCH v4 bpf-next 05/13] net: mvneta: add multi buffer support to XDP_TX Lorenzo Bianconi
2020-10-02 14:42 ` [PATCH v4 bpf-next 06/13] bpf: introduce bpf_xdp_get_frags_{count, total_size} helpers Lorenzo Bianconi
2020-10-02 15:36   ` John Fastabend
2020-10-02 16:25     ` Lorenzo Bianconi
2020-10-02 14:42 ` [PATCH v4 bpf-next 07/13] samples/bpf: add bpf program that uses xdp mb helpers Lorenzo Bianconi
2020-10-02 14:42 ` [PATCH v4 bpf-next 08/13] bpf: move user_size out of bpf_test_init Lorenzo Bianconi
2020-10-02 14:42 ` [PATCH v4 bpf-next 09/13] bpf: introduce multibuff support to bpf_prog_test_run_xdp() Lorenzo Bianconi
2020-10-08  8:06   ` Shay Agroskin
2020-10-08 10:46     ` Jesper Dangaard Brouer
2020-10-02 14:42 ` [PATCH v4 bpf-next 10/13] bpf: test_run: add skb_shared_info pointer in bpf_test_finish signature Lorenzo Bianconi
2020-10-02 14:42 ` [PATCH v4 bpf-next 11/13] bpf: add xdp multi-buffer selftest Lorenzo Bianconi
2020-10-02 14:42 ` [PATCH v4 bpf-next 12/13] net: mvneta: enable jumbo frames for XDP Lorenzo Bianconi
2020-10-02 14:42 ` [PATCH v4 bpf-next 13/13] bpf: cpumap: introduce xdp multi-buff support Lorenzo Bianconi
2020-10-02 15:25 ` [PATCH v4 bpf-next 00/13] mvneta: introduce XDP multi-buffer support John Fastabend
2020-10-02 16:06   ` Lorenzo Bianconi
2020-10-02 18:06     ` John Fastabend
2020-10-05  9:52       ` Jesper Dangaard Brouer
2020-10-05 21:22         ` John Fastabend
2020-10-05 22:24           ` Lorenzo Bianconi
2020-10-06  4:29             ` John Fastabend
2020-10-06  7:30               ` Jesper Dangaard Brouer [this message]
2020-10-06 15:28                 ` Lorenzo Bianconi
2020-10-08 14:38                   ` John Fastabend
2020-10-02 19:53   ` Daniel Borkmann
2020-10-05 15:50     ` Tirthendu Sarkar
2020-10-06 12:39     ` Jubran, Samih

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201006093011.36375745@carbon \
    --to=brouer@redhat.com \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=dsahern@kernel.org \
    --cc=echaudro@redhat.com \
    --cc=john.fastabend@gmail.com \
    --cc=kuba@kernel.org \
    --cc=lorenzo.bianconi@redhat.com \
    --cc=lorenzo@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=sameehj@amazon.com \
    --cc=shayagr@amazon.com \
    --cc=tirtha@gmail.com \
    --cc=toke@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.