From: Jesper Dangaard Brouer <jbrouer@redhat.com>
To: Jason Wang <jasowang@redhat.com>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	ast@kernel.org, daniel@iogearbox.net, mst@redhat.com
Subject: Re: [RFC PATCH net-next V2 0/6] XDP rx handler
Date: Tue, 14 Aug 2018 12:17:34 +0200	[thread overview]
Message-ID: <20180814121734.105769fa@redhat.com> (raw)
In-Reply-To: <5de3d14f-f21a-c806-51f4-b5efd7d809b7@redhat.com>

On Tue, 14 Aug 2018 15:59:01 +0800
Jason Wang <jasowang@redhat.com> wrote:

> > On 2018-08-14 08:32, Alexei Starovoitov wrote:
> > On Mon, Aug 13, 2018 at 11:17:24AM +0800, Jason Wang wrote:  
> >> Hi:
> >>
> >> This series tries to implement XDP support for the rx handler. This would
> >> be useful for doing native XDP on stacked devices like macvlan, bridge,
> >> or even bond.
> >>
> >> The idea is simple: let the stacked device register an XDP rx handler.
> >> When the driver returns XDP_PASS, it calls a new helper, xdp_do_pass(),
> >> which tries to pass the XDP buff directly to the XDP rx handler. The
> >> XDP rx handler may then decide how to proceed: it can consume the buff,
> >> ask the driver to drop the packet, or ask the driver to fall back to
> >> the normal skb path.
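
(For concreteness, the flow described above presumably looks roughly
like the sketch below.  The return-code names and the exact
xdp_do_pass() signature are guesses from this description, not the
actual patch code.)

/* Assumed return codes from an XDP rx handler (hypothetical names). */
enum xdp_rx_handler_result {
	XDP_RX_HANDLER_CONSUMED,	/* handler took ownership of the buff */
	XDP_RX_HANDLER_DROP,		/* driver should drop the packet */
	XDP_RX_HANDLER_FALLBACK,	/* continue on the normal skb path */
};

/* In the driver's rx path, after its own XDP prog returned XDP_PASS: */
switch (xdp_do_pass(dev, &xdp)) {
case XDP_RX_HANDLER_CONSUMED:
	return;			/* e.g. macvlan ran its own XDP prog on it */
case XDP_RX_HANDLER_DROP:
	goto drop;
case XDP_RX_HANDLER_FALLBACK:
	break;			/* build an skb and pass it up as usual */
}
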
> >>
> >> A sample XDP rx handler was implemented for macvlan, and virtio-net
> >> (mergeable buffer case) was converted to call xdp_do_pass() as an
> >> example. For ease of comparison, generic XDP support for the rx handler
> >> was also implemented.
> >>
> >> Compared to skb-mode XDP on macvlan, native XDP on macvlan (XDP_DROP)
> >> shows about an 83% improvement.
> > I'm missing the motivation for this.
> > It seems the performance of such a solution is ~1M packets per second.
> 
> Notice it was measured with virtio-net, which is kind of slow.
> 
> > What would be a real-life use case for such a feature?
> 
> I had another run on top of 10G mlx4 and macvlan:
> 
> XDP_DROP on mlx4: 14.0Mpps
> XDP_DROP on macvlan: 10.05Mpps
> 
> Perf shows macvlan_hash_lookup() and the indirect call to 
> macvlan_handle_xdp() are the reasons for the drop in numbers. I think 
> the numbers are acceptable, and we could try more optimizations on top.
> 
> So the real-life use case here is to have a fast XDP path for rx-handler 
> based devices:
> 
> - For containers, we can run XDP for macvlan (~70% of wire speed). This 
> allows a container-specific policy.
> - For VMs, we can implement a macvtap XDP rx handler on top. This allows 
> us to forward packets to the VM without building an skb in the macvtap 
> path.
> - The idea could be used by other rx-handler-based devices like bridge; 
> we may get an XDP fast-forwarding path for bridge.
> 
> >
> > Another concern is that XDP users expect to get line-rate performance
> > and native XDP delivers it. 'Generic XDP' is a fallback-only
> > mechanism to operate on NICs that don't have native XDP yet.
> 
> So I can replace the generic XDP TX routine with a native one for macvlan.

If you simply implement ndo_xdp_xmit() for macvlan, and instead use
XDP_REDIRECT, then we are basically done.
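
Something like this minimal sketch might be all it takes (untested; it
assumes macvlan can simply hand redirected frames to its lower device,
and uses the current batched ndo_xdp_xmit() signature):

/* Hypothetical macvlan ndo_xdp_xmit(): punt the frames to the
 * lower device, which owns the real HW TX queues. */
static int macvlan_xdp_xmit(struct net_device *dev, int n,
			    struct xdp_frame **frames, u32 flags)
{
	struct macvlan_dev *vlan = netdev_priv(dev);
	struct net_device *lowerdev = vlan->lowerdev;

	if (unlikely(!lowerdev->netdev_ops->ndo_xdp_xmit))
		return -EOPNOTSUPP;

	return lowerdev->netdev_ops->ndo_xdp_xmit(lowerdev, n, frames, flags);
}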


> > Toshiaki's veth XDP work fits the XDP philosophy and allows
> > high-speed networking to be done inside containers behind veth.
> > It's trying to get to line rate inside the container.
> 
> This is one of the goals of this series as well. I agree the veth XDP 
> work looks pretty fine, but I believe it only works for a specific setup, 
> since it depends on XDP_REDIRECT, which is supported by only a few 
> drivers (and there's no VF driver support). 

The XDP_REDIRECT (RX-side) is trivial to add to drivers.  It is a weak
argument that only a few drivers implement this, especially since all
drivers would also need to be extended with your proposed xdp_do_pass()
call.

(rant) The thing that is delaying XDP_REDIRECT adoption in drivers is
that the TX-side is harder to implement, as the ndo_xdp_xmit() call has
to allocate HW TX-queue resources.  If we disconnect the RX and TX sides
of redirect, then we can implement the RX-side in an afternoon.
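
For reference, the RX-side is just the standard pattern every native
XDP driver already has (error handling trimmed):

	act = bpf_prog_run_xdp(xdp_prog, &xdp);
	switch (act) {
	case XDP_REDIRECT:
		/* Core resolves the target the prog chose via
		 * bpf_redirect()/bpf_redirect_map(). */
		if (xdp_do_redirect(dev, &xdp, xdp_prog))
			goto drop;
		break;
	/* ... XDP_PASS, XDP_TX, XDP_DROP/XDP_ABORTED as usual ... */
	}
	...
	/* Once per NAPI poll, after the rx loop: */
	xdp_do_flush_map();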


> And in order to make it work for an end 
> user, the XDP program still needs logic like a hash(map) lookup to 
> determine the destination veth.

That _is_ the general idea behind XDP and eBPF: someone needs to add the
logic that determines the destination.  The kernel provides the basic
mechanisms for moving/redirecting packets fast, and someone else builds
an orchestration tool like Cilium that adds the needed logic.
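
As a minimal sketch of what "the needed logic" can look like on the
BPF side (the map layout and key choice are made up for illustration):

#include <linux/bpf.h>
#include "bpf_helpers.h"	/* SEC(), bpf_redirect_map(), bpf_map_def */

/* Destination devices, e.g. one veth/macvlan per container. */
struct bpf_map_def SEC("maps") tx_port = {
	.type		= BPF_MAP_TYPE_DEVMAP,
	.key_size	= sizeof(__u32),
	.value_size	= sizeof(__u32),
	.max_entries	= 64,
};

SEC("xdp")
int xdp_pick_dest(struct xdp_md *ctx)
{
	/* Stand-in for real logic: parse headers, hash, apply policy. */
	__u32 key = ctx->rx_queue_index;

	return bpf_redirect_map(&tx_port, key, 0);
}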

Did you notice that we (Ahern) added bpf_fib_lookup, a FIB route lookup
accessible from XDP?
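
It makes self-contained forwarding logic quite short.  A sketch, close
in spirit to the xdp_fwd sample (includes and error handling trimmed):

SEC("xdp")
int xdp_fwd(struct xdp_md *ctx)
{
	void *data_end = (void *)(long)ctx->data_end;
	void *data = (void *)(long)ctx->data;
	struct ethhdr *eth = data;
	struct iphdr *iph = data + sizeof(*eth);
	struct bpf_fib_lookup fib = {};

	if ((void *)(iph + 1) > data_end ||
	    eth->h_proto != __constant_htons(ETH_P_IP))
		return XDP_PASS;

	fib.family   = AF_INET;
	fib.ipv4_src = iph->saddr;
	fib.ipv4_dst = iph->daddr;
	fib.ifindex  = ctx->ingress_ifindex;

	if (bpf_fib_lookup(ctx, &fib, sizeof(fib), 0) !=
	    BPF_FIB_LKUP_RET_SUCCESS)
		return XDP_PASS;

	/* On success the helper fills in next-hop MACs + egress ifindex. */
	__builtin_memcpy(eth->h_dest, fib.dmac, ETH_ALEN);
	__builtin_memcpy(eth->h_source, fib.smac, ETH_ALEN);
	return bpf_redirect(fib.ifindex, 0);
}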

For macvlan, I imagine that we could add a BPF helper that allows you
to lookup/call macvlan_hash_lookup().
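
Usage from the XDP program could then be as simple as this (the helper
name below is purely hypothetical; nothing like it exists today):

	/* Hypothetical: resolve dest MAC to a macvlan ifindex via
	 * macvlan_hash_lookup(), then redirect to that device. */
	int ifindex = bpf_macvlan_lookup(ctx, eth->h_dest);

	if (ifindex > 0)
		return bpf_redirect(ifindex, 0);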

 
> > This XDP rx handler stuff is destined to stay at 1Mpps speeds forever,
> > and the users will get confused by forever-slow modes of XDP.
> >
> > Please explain the problem you're trying to solve.
> > "Look, here I can do XDP on top of macvlan" is not an explanation of the problem.
> >  


-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

Thread overview: 27+ messages
2018-08-13  3:17 [RFC PATCH net-next V2 0/6] XDP rx handler Jason Wang
2018-08-13  3:17 ` [RFC PATCH net-next V2 1/6] net: core: factor out generic XDP check and process routine Jason Wang
2018-08-13  3:17 ` [RFC PATCH net-next V2 2/6] net: core: generic XDP support for stacked device Jason Wang
2018-08-13  3:17 ` [RFC PATCH net-next V2 3/6] net: core: introduce XDP rx handler Jason Wang
2018-08-13  3:17 ` [RFC PATCH net-next V2 4/6] macvlan: count the number of vlan in source mode Jason Wang
2018-08-13  3:17 ` [RFC PATCH net-next V2 5/6] macvlan: basic XDP support Jason Wang
2018-08-13  3:17 ` [RFC PATCH net-next V2 6/6] virtio-net: support XDP rx handler Jason Wang
2018-08-14  9:22   ` Jesper Dangaard Brouer
2018-08-14 13:01     ` Jason Wang
2018-08-14  0:32 ` [RFC PATCH net-next V2 0/6] " Alexei Starovoitov
2018-08-14  7:59   ` Jason Wang
2018-08-14 10:17     ` Jesper Dangaard Brouer [this message]
2018-08-14 13:20       ` Jason Wang
2018-08-14 14:03         ` David Ahern
2018-08-15  0:29           ` Jason Wang
2018-08-15  5:35             ` Alexei Starovoitov
2018-08-15  7:04               ` Jason Wang
2018-08-16  2:49                 ` Alexei Starovoitov
2018-08-16  4:21                   ` Jason Wang
2018-08-15 17:17             ` David Ahern
2018-08-16  3:34               ` Jason Wang
2018-08-16  4:05                 ` Alexei Starovoitov
2018-08-16  4:24                   ` Jason Wang
2018-08-17 21:15                 ` David Ahern
2018-08-20  6:34                   ` Jason Wang
2018-09-05 17:20                     ` David Ahern
2018-09-06  5:12                       ` Jason Wang
