From: John Fastabend <john.fastabend@gmail.com>
To: "Md. Islam" <mislam4@kent.edu>,
netdev@vger.kernel.org, David Miller <davem@davemloft.net>,
David Ahern <dsahern@gmail.com>,
stephen@networkplumber.org, agaceph@gmail.com,
Pavel Emelyanov <xemul@openvz.org>,
Eric Dumazet <edumazet@google.com>,
alexei.starovoitov@gmail.com, brouer@redhat.com
Subject: Re: [PATCH v15 ] net/veth/XDP: Line-rate packet forwarding in kernel
Date: Mon, 2 Apr 2018 11:03:37 -0700
Message-ID: <7cfca503-3e17-6287-8888-92d43ce7a2e7@gmail.com>
In-Reply-To: <CAFgPn1DX9cOpDRGj=wFwvZq_bpq6VFnEOzR1YbMuC0+=DFEWxA@mail.gmail.com>
On 04/01/2018 05:47 PM, Md. Islam wrote:
> This patch implements IPv4 forwarding on xdp_buff. I added a new
> config option XDP_ROUTER. Kernel would forward packets through fast
> path when this option is enabled. But it would require driver support.
> Currently it only works with veth. Here I have modified veth such that
> it outputs xdp_buff. I created a testbed in Mininet. The Mininet
> script (topology.py) is attached. Here the topology is:
>
> h1 -----r1-----h2 (r1 acts as a router)
>
> This patch improves the throughput from 53.8Gb/s to 60Gb/s on my
> machine. Median RTT also improved from around .055 ms to around .035
> ms.
>
> Then I disabled hyperthreading and cpu frequency scaling in order to
> utilize CPU cache (DPDK also utilizes CPU cache to improve
> forwarding). This further improves per-packet forwarding latency from
> around 400ns to 200 ns. More specifically, header parsing and fib
> lookup only takes around 82 ns. This shows that this could be used to
> implement linerate packet forwarding in kernel.
>
> The patch has been generated on 4.15.0+. Please let me know your
> feedback and suggestions. Please feel free to let me know if this
> approach make sense.

Makes sense, although let's try to avoid hard-coding routing into the
XDP xmit routines. See details below.

> +#ifdef CONFIG_XDP_ROUTER
> +int veth_xdp_xmit(struct net_device *dev, struct xdp_buff *xdp)
> +{

This is nice, but instead of adding a new CONFIG_XDP_ROUTER option,
just enable standard XDP for veth and add a new helper call to
do the routing. Then it will be immediately usable from any
XDP-enabled device.

> + struct veth_priv *priv = netdev_priv(dev);
> + struct net_device *rcv;
> + struct ethhdr *ethh;
> + struct sk_buff *skb;
> + int length = xdp->data_end - xdp->data;
> +
> + rcu_read_lock();
> + rcv = rcu_dereference(priv->peer);
> + if (unlikely(!rcv)) {
> + kfree(xdp);
> + goto drop;
> + }
> +
> + /* Update MAC address and checksum */
> + ethh = eth_hdr_xdp(xdp);
> + ether_addr_copy(ethh->h_source, dev->dev_addr);
> + ether_addr_copy(ethh->h_dest, rcv->dev_addr);
> +
> + /* if IP forwarding is enabled on the receiver,
> + * call xdp_router_forward()
> + */
> + if (is_forwarding_enabled(rcv)) {
> + prefetch_xdp(xdp);
> + if (likely(xdp_router_forward(rcv, xdp) == NET_RX_SUCCESS)) {
> + struct pcpu_vstats *stats = this_cpu_ptr(dev->vstats);
> +
> + u64_stats_update_begin(&stats->syncp);
> + stats->bytes += length;
> + stats->packets++;
> + u64_stats_update_end(&stats->syncp);
> + goto success;
> + }
> + }
> +
> + /* Local deliver */
> + skb = (struct sk_buff *)xdp->data_meta;
> + if (likely(dev_forward_skb(rcv, skb) == NET_RX_SUCCESS)) {
> + struct pcpu_vstats *stats = this_cpu_ptr(dev->vstats);
> +
> + u64_stats_update_begin(&stats->syncp);
> + stats->bytes += length;
> + stats->packets++;
> + u64_stats_update_end(&stats->syncp);
> + } else {
> +drop:
> + atomic64_inc(&priv->dropped);
> + }
> +success:
> + rcu_read_unlock();
> + return NETDEV_TX_OK;
> +}
> +#endif
> +
> static const struct net_device_ops veth_netdev_ops = {
> .ndo_init = veth_dev_init,
> .ndo_open = veth_open,
> @@ -290,6 +370,9 @@ static const struct net_device_ops veth_netdev_ops = {
> .ndo_get_iflink = veth_get_iflink,
> .ndo_features_check = passthru_features_check,
> .ndo_set_rx_headroom = veth_set_rx_headroom,
> +#ifdef CONFIG_XDP_ROUTER
> + .ndo_xdp_xmit = veth_xdp_xmit,
> +#endif
> };
>
[...]
> +#ifdef CONFIG_XDP_ROUTER
> +int ip_route_lookup(__be32 daddr, __be32 saddr,
> + u8 tos, struct net_device *dev,
> + struct fib_result *res);
> +#endif
> +

Can the above be a normal BPF helper that returns an
ifindex? Then something roughly like this pattern would
work for all drivers with redirect support:

    route_ifindex = ip_route_lookup(__daddr, ....);
    if (!route_ifindex)
        return do_foo();
    return xdp_redirect(route_ifindex);

So my suggestion is:

 1. enable veth XDP (including redirect support)
 2. add a helper to look up a route in the routing table

Alternatively, you can skip step (2) and encode the routing
table in BPF directly. We may need a more efficient data
structure, but that should also work.
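
To make the lookup-then-redirect idea concrete, here is a rough userspace sketch in plain C. It is not kernel or BPF code, and every name in it (struct route, route_lookup, forward_decision, the VERDICT_* values) is made up for illustration. It assumes host-byte-order IPv4 addresses and uses a linear longest-prefix-match scan as a stand-in for whatever the routing table or a BPF map would really provide. In an actual BPF program an LPM trie map (BPF_MAP_TYPE_LPM_TRIE) would probably be the first thing to try.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical verdicts standing in for XDP_PASS / XDP_REDIRECT. */
enum verdict { VERDICT_PASS = 0, VERDICT_REDIRECT = 1 };

/* One IPv4 route; addresses are in host byte order (an assumption
 * made to keep the sketch short).
 */
struct route {
	uint32_t prefix;
	uint8_t prefix_len;	/* 0..32 */
	int ifindex;		/* egress ifindex, must be > 0 */
};

/* Build the netmask for a prefix length; len == 0 is special-cased
 * because shifting a 32-bit value by 32 is undefined in C.
 */
uint32_t prefix_mask(uint8_t len)
{
	return len == 0 ? 0 : 0xffffffffu << (32 - len);
}

/* Longest-prefix match over a flat table.
 * Returns the egress ifindex, or 0 when no route matches.
 */
int route_lookup(const struct route *table, size_t n, uint32_t daddr)
{
	int best_ifindex = 0;
	int best_len = -1;
	size_t i;

	assert(table != NULL || n == 0);

	for (i = 0; i < n; i++) {
		uint32_t mask;

		if (table[i].prefix_len > 32)
			continue;	/* skip malformed entries */

		mask = prefix_mask(table[i].prefix_len);
		if ((daddr & mask) == (table[i].prefix & mask) &&
		    table[i].prefix_len > best_len) {
			best_len = table[i].prefix_len;
			best_ifindex = table[i].ifindex;
		}
	}
	return best_ifindex;
}

/* Mirrors the pattern above: no route means fall back (the do_foo()
 * branch, modelled here as VERDICT_PASS), otherwise redirect to the
 * ifindex the lookup returned.
 */
enum verdict forward_decision(const struct route *table, size_t n,
			      uint32_t daddr, int *out_ifindex)
{
	int ifindex = route_lookup(table, n, daddr);

	if (!ifindex)
		return VERDICT_PASS;

	*out_ifindex = ifindex;
	return VERDICT_REDIRECT;
}
```

With a table holding 10.0.0.0/8 via ifindex 2 and 10.1.0.0/16 via ifindex 3, a packet for 10.1.2.3 picks the /16 and is redirected to ifindex 3, while a packet for 192.168.0.1 finds no route and takes the fallback path. The point is only that the driver never needs to know about routing; it just has to support redirect.
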
Thanks,
John