netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: John Fastabend <john.fastabend@gmail.com>
To: Saeed Mahameed <saeedm@mellanox.com>,
	netdev@vger.kernel.org, davem@davemloft.net
Cc: brouer@redhat.com, andy@greyhouse.net, daniel@iogearbox.net, ast@fb.com
Subject: Re: [RFC PATCH 03/12] xdp: add bpf_redirect helper function
Date: Mon, 10 Jul 2017 10:23:38 -0700	[thread overview]
Message-ID: <5963B81A.5070008@gmail.com> (raw)
In-Reply-To: <8cf998a4-dadb-0a1a-0f3d-9a4f8ca39287@mellanox.com>

On 07/09/2017 06:37 AM, Saeed Mahameed wrote:
> 
> 
> On 7/7/2017 8:35 PM, John Fastabend wrote:
>> This adds support for a bpf_redirect helper function to the XDP
>> infrastructure. For now this only supports redirecting to the egress
>> path of a port.
>>
>> In order to support drivers handling a xdp_buff natively this patches
>> uses a new ndo operation ndo_xdp_xmit() that takes pushes a xdp_buff
>> to the specified device.
>>
>> If the program specifies either (a) an unknown device or (b) a device
>> that does not support the operation a BPF warning is thrown and the
>> XDP_ABORTED error code is returned.
>>
>> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
>> Acked-by: Daniel Borkmann <daniel@iogearbox.net>
>> ---

[...]

>>
>> +static int __bpf_tx_xdp(struct net_device *dev, struct xdp_buff *xdp)
>> +{
>> +    if (dev->netdev_ops->ndo_xdp_xmit) {
>> +        dev->netdev_ops->ndo_xdp_xmit(dev, xdp);
> 
> Hi John,
> 
> I have some concern here regarding synchronizing between the
> redirecting device and the target device:
> 
> if the target device's NAPI is also doing XDP_TX on the same XDP TX
> ring which this NDO might be redirecting xdp packets into the same
> ring, there would be a race accessing this ring resources (buffers
> and descriptors). Maybe you addressed this issue in the device driver
> implementation of this ndo or with some NAPI tricks/assumptions, I
> guess we have the same issue for if you run the same program to
> redirect traffic from multiple netdevices into one netdevice, how do
> you synchronize accessing this TX ring ?

The implementation uses a per cpu TX ring to resolve these races. And
the pair of driver interface API calls, xdp_do_redirect() and xdp_do_flush_map()
must be completed in a single poll() handler.

This comment was included in the header file to document this,

/* The pair of xdp_do_redirect and xdp_do_flush_map MUST be called in the
 * same cpu context. Further for best results no more than a single map
 * for the do_redirect/do_flush pair should be used. This limitation is
 * because we only track one map and force a flush when the map changes.
 * This does not appear to be a real limitation for existing software.
 */

In general some documentation about implementing XDP would probably be
useful to add in Documentation/networking but this IMO goes beyond just
this patch series.

> 
> Maybe we need some clear guidelines in this ndo documentation stating
> how to implement this ndo and what are the assumptions on those XDP
> TX redirect rings or from which context this ndo can run.
> 
> can you please elaborate.

I think the best implementation is to use a per cpu TX ring as I did in
this series. If your device is limited by the number of queues for some
reason some other scheme would need to be devised. Unfortunately, the only
thing I've come up for this case (using only this series) would both impact
performance and make the code complex.

A nice solution might be to constrain networking "tasks" to only a subset
of cores. For 64+ core systems this might be a good idea. It would allow
avoiding locking using per_cpu logic but also avoid networking consuming
slices of every core in the system. As core count goes up I think we will
eventually need to address this.I believe Eric was thinking along these
lines with his netconf talk iirc. Obviously this work is way outside the
scope of this series though.


> Thanks,
> Saeed.
> 

  reply	other threads:[~2017-07-10 17:23 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-07-07 17:34 [RFC PATCH 00/12] Implement XDP bpf_redirect vairants John Fastabend
2017-07-07 17:34 ` [RFC PATCH 01/12] ixgbe: NULL xdp_tx rings on resource cleanup John Fastabend
2017-07-07 17:35 ` [RFC PATCH 02/12] net: xdp: support xdp generic on virtual devices John Fastabend
2017-07-07 17:35 ` [RFC PATCH 03/12] xdp: add bpf_redirect helper function John Fastabend
2017-07-09 13:37   ` Saeed Mahameed
2017-07-10 17:23     ` John Fastabend [this message]
2017-07-11 14:09       ` Andy Gospodarek
2017-07-11 18:38         ` John Fastabend
2017-07-11 19:38           ` Jesper Dangaard Brouer
2017-07-12 11:00             ` Saeed Mahameed
2017-07-07 17:35 ` [RFC PATCH 04/12] xdp: sample program for new bpf_redirect helper John Fastabend
2017-07-07 17:36 ` [RFC PATCH 05/12] net: implement XDP_REDIRECT for xdp generic John Fastabend
2017-07-07 17:36 ` [RFC PATCH 06/12] ixgbe: add initial support for xdp redirect John Fastabend
2017-07-07 17:36 ` [RFC PATCH 07/12] xdp: add trace event " John Fastabend
2017-07-07 17:37 ` [RFC PATCH 08/12] bpf: add devmap, a map for storing net device references John Fastabend
2017-07-08 18:57   ` Jesper Dangaard Brouer
2017-07-07 17:37 ` [RFC PATCH 09/12] bpf: add bpf_redirect_map helper routine John Fastabend
2017-07-07 17:37 ` [RFC PATCH 10/12] xdp: Add batching support to redirect map John Fastabend
2017-07-10 17:53   ` Jesper Dangaard Brouer
2017-07-10 17:56     ` John Fastabend
2017-07-07 17:38 ` [RFC PATCH 11/12] net: add notifier hooks for devmap bpf map John Fastabend
2017-07-07 17:38 ` [RFC PATCH 12/12] xdp: bpf redirect with map sample program John Fastabend
2017-07-07 17:48 ` [RFC PATCH 00/12] Implement XDP bpf_redirect vairants John Fastabend
2017-07-08  9:46   ` David Miller
2017-07-08 19:06     ` Jesper Dangaard Brouer
2017-07-10 18:30       ` Jesper Dangaard Brouer
2017-07-11  0:59         ` John Fastabend
2017-07-11 14:23           ` Jesper Dangaard Brouer
2017-07-11 18:26             ` John Fastabend
2017-07-13 11:14               ` Jesper Dangaard Brouer
2017-07-13 16:16                 ` Jesper Dangaard Brouer
2017-07-13 17:00                   ` John Fastabend
2017-07-13 18:21                     ` David Miller
2017-07-11 15:36       ` Jesper Dangaard Brouer
2017-07-11 17:48         ` John Fastabend
2017-07-11 18:01           ` Jesper Dangaard Brouer
2017-07-11 18:29             ` John Fastabend
2017-07-11 18:44             ` Jesper Dangaard Brouer
2017-07-11 18:56               ` John Fastabend
2017-07-11 19:19                 ` Jesper Dangaard Brouer
2017-07-11 19:37                   ` John Fastabend
2017-07-16  8:23                     ` Jesper Dangaard Brouer
2017-07-17 17:04                       ` Jesse Brandeburg

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5963B81A.5070008@gmail.com \
    --to=john.fastabend@gmail.com \
    --cc=andy@greyhouse.net \
    --cc=ast@fb.com \
    --cc=brouer@redhat.com \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=netdev@vger.kernel.org \
    --cc=saeedm@mellanox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).