From: Andy Gospodarek <andy@greyhouse.net>
To: John Fastabend <john.fastabend@gmail.com>
Cc: Saeed Mahameed <saeedm@mellanox.com>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
David Miller <davem@davemloft.net>,
Jesper Dangaard Brouer <brouer@redhat.com>,
Daniel Borkmann <daniel@iogearbox.net>,
Alexei Starovoitov <ast@fb.com>
Subject: Re: [RFC PATCH 03/12] xdp: add bpf_redirect helper function
Date: Tue, 11 Jul 2017 10:09:19 -0400
Message-ID: <CAHashqA8UYWAX7TDVrnGiTCA3zN5RGyiOQ3CcJpFDNnMemRnoQ@mail.gmail.com>
In-Reply-To: <5963B81A.5070008@gmail.com>

On Mon, Jul 10, 2017 at 1:23 PM, John Fastabend
<john.fastabend@gmail.com> wrote:
> On 07/09/2017 06:37 AM, Saeed Mahameed wrote:
>>
>>
>> On 7/7/2017 8:35 PM, John Fastabend wrote:
>>> This adds support for a bpf_redirect helper function to the XDP
>>> infrastructure. For now this only supports redirecting to the egress
>>> path of a port.
>>>
>>> In order to support drivers handling an xdp_buff natively, this patch
>>> uses a new ndo operation, ndo_xdp_xmit(), that pushes an xdp_buff
>>> to the specified device.
>>>
>>> If the program specifies either (a) an unknown device or (b) a device
>>> that does not support the operation, a BPF warning is thrown and the
>>> XDP_ABORTED error code is returned.
>>>
>>> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
>>> Acked-by: Daniel Borkmann <daniel@iogearbox.net>
>>> ---
>
> [...]
>
>>>
>>> +static int __bpf_tx_xdp(struct net_device *dev, struct xdp_buff *xdp)
>>> +{
>>> +	if (dev->netdev_ops->ndo_xdp_xmit) {
>>> +		dev->netdev_ops->ndo_xdp_xmit(dev, xdp);
>>
>> Hi John,
>>
>> I have some concerns here regarding synchronization between the
>> redirecting device and the target device:
>>
>> If the target device's NAPI is also doing XDP_TX on the same XDP TX
>> ring that this NDO is redirecting xdp packets into, there would be a
>> race accessing that ring's resources (buffers and descriptors). Maybe
>> you addressed this issue in the device driver implementation of this
>> ndo, or with some NAPI tricks/assumptions. I guess we have the same
>> issue if you run the same program to redirect traffic from multiple
>> netdevices into one netdevice: how do you synchronize access to this
>> TX ring?
>
> The implementation uses a per cpu TX ring to resolve these races. And
> the pair of driver interface API calls, xdp_do_redirect() and xdp_do_flush_map()
> must be completed in a single poll() handler.
>
> This comment was included in the header file to document this,
>
> /* The pair of xdp_do_redirect and xdp_do_flush_map MUST be called in the
> * same cpu context. Further for best results no more than a single map
> * for the do_redirect/do_flush pair should be used. This limitation is
> * because we only track one map and force a flush when the map changes.
> * This does not appear to be a real limitation for existing software.
> */
>
> In general some documentation about implementing XDP would probably be
> useful to add in Documentation/networking but this IMO goes beyond just
> this patch series.
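
To make the pairing rule in that header comment concrete, here is a small
userspace model, not kernel code. All of the names in it (struct
redirect_state, model_redirect(), model_flush(), model_poll()) are made up
for illustration. It only mimics the bookkeeping the comment describes, as
I read it: one tracked map per cpu, a forced flush when the map changes,
and a final flush before poll() returns.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-cpu bookkeeping; not the kernel's data structure. */
struct redirect_state {
	const void *map;	/* the single map currently tracked */
	unsigned int pending;	/* frames queued since the last flush */
	unsigned int flushes;	/* number of flushes performed so far */
};

/* Stand-in for xdp_do_flush_map(): push out queued frames, forget the map. */
static void model_flush(struct redirect_state *s)
{
	if (s->pending) {
		s->flushes++;
		s->pending = 0;
	}
	s->map = NULL;
}

/* Stand-in for xdp_do_redirect(): only one map is tracked, so a change of
 * map forces a flush before the new frame is queued. */
static void model_redirect(struct redirect_state *s, const void *map)
{
	if (s->map && s->map != map)
		model_flush(s);
	s->map = map;
	s->pending++;
}

/* One poll() pass: redirect every frame, then flush before returning, so
 * the redirect/flush pair completes in the same cpu context. Returns the
 * total number of flushes recorded in s so far. */
static unsigned int model_poll(struct redirect_state *s,
			       const void *const *maps, size_t n)
{
	size_t i;

	assert(maps != NULL || n == 0);
	for (i = 0; i < n; i++)
		model_redirect(s, maps[i]);
	model_flush(s);
	return s->flushes;
}
```

In this model, four frames redirected through one map cost a single flush,
while four frames alternating between two maps cost four flushes. That is
the cost behind the "no more than a single map" advice in the comment.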
>
>>
>> Maybe we need some clear guidelines in this ndo's documentation stating
>> how to implement it, what the assumptions are for those XDP TX redirect
>> rings, and from which context this ndo can run.
>>
>> Can you please elaborate?
>
> I think the best implementation is to use a per cpu TX ring as I did in
> this series. If your device is limited by the number of queues for some
> reason, some other scheme would need to be devised. Unfortunately, the only
> thing I've come up with for this case (using only this series) would both
> impact performance and make the code complex.
>
> A nice solution might be to constrain networking "tasks" to only a subset
> of cores. For 64+ core systems this might be a good idea. It would allow
> avoiding locking using per_cpu logic but also avoid networking consuming
> slices of every core in the system. As core count goes up I think we will
> eventually need to address this. I believe Eric was thinking along these
> lines with his netconf talk iirc. Obviously this work is way outside the
> scope of this series though.

I agree that it is outside the scope of this series, but I think it is
important to consider how output queue selection behaves in both
heterogeneous and homogeneous driver setups, and how tx could be
optimized or made more reliable. I think that was part of Saeed's
point.

I got base redirect support for bnxt_en working yesterday, but for it
and other drivers that, unlike ixgbe, do not necessarily create a
ring/queue per core, there is probably a bit more work to do in each
driver to properly track output tx rings/queues than what you have done
with ixgbe.
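
As a rough illustration of the kind of tracking I mean, here is a
userspace sketch of picking an output ring when there may be fewer rings
than cores. The helper names (example_tx_ring_index() and
example_tx_ring_needs_lock()) and the modulo mapping are assumptions made
for the example. They are not taken from bnxt_en, ixgbe, or this series.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: map the current cpu to an XDP TX ring.
 * With nr_rings >= nr_cpus every cpu gets a private ring; otherwise
 * cpus wrap around and share rings. */
static unsigned int example_tx_ring_index(unsigned int cpu,
					  unsigned int nr_rings)
{
	assert(nr_rings > 0);
	return cpu % nr_rings;
}

/* Hypothetical helper: a ring needs a lock only when more than one cpu
 * can land on it, i.e. when there are fewer rings than cpus. */
static bool example_tx_ring_needs_lock(unsigned int nr_cpus,
				       unsigned int nr_rings)
{
	assert(nr_rings > 0);
	return nr_rings < nr_cpus;
}
```

With a ring per core the lookup is just the cpu id and no lock is needed,
which matches the per cpu scheme described above. With fewer rings than
cores, two cpus can land on the same ring, so the driver would have to
serialize access to it, and that is where the extra per-driver work
comes in.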
>
>
>> Thanks,
>> Saeed.
>>