From: Florian Fainelli <f.fainelli@gmail.com>
To: sfeldma@gmail.com, netdev@vger.kernel.org
Cc: jiri@resnulli.us, roopa@cumulusnetworks.com,
simon.horman@netronome.com, nicolas.dichtel@6wind.com,
andrew@lunn.ch, vivien.didelot@savoirfairelinux.com,
linux@roeck-us.net
Subject: Re: [PATCH net-next v3 0/5] switchdev: avoid duplicate packet forwarding
Date: Mon, 20 Jul 2015 10:57:27 -0700 [thread overview]
Message-ID: <55AD3687.9010002@gmail.com> (raw)
In-Reply-To: <1437269092-17669-1-git-send-email-sfeldma@gmail.com>
Hi Scott,
On 18/07/15 18:24, sfeldma@gmail.com wrote:
> From: Scott Feldman <sfeldma@gmail.com>
>
> v3:
>
> - Per Nicolas Dichtel review: remove errant empty union.
>
> v2:
>
> - Per davem review: in sk_buff, union fwd_mark with secmark to save space
> since features appear to be mutually exclusive.
> - Per Simon Horman review:
> - fix grammar in switchdev.txt wrt fwd_mark
> - remove some unrelated changes that snuck in
>
> v1:
>
> This patchset was previously submitted as RFC. No changes from the last
> version (v2) sent under RFC. Including RFC version history here for reference.
This looks good to me, and it looks like this is also relevant for
DSA-driven switches, however I am not exactly clear on how we could
potentially use this on such switches.
For Broadcom tags, there is a reason code forwarded along with the tag
making it to the CPU, and the "mirror" bit sounds like something that
should dictate whether we should be setting the fwd_mark or not.
What does it look like on Marvell switches?
>
> RFC v2:
>
> - s/fwd_mark/offload_fwd_mark
> - use consume_skb rather than kfree_skb when dropping pkt on egress.
> - Use Jiri's suggestion to use ifindex of one of the ports in a group
> as the mark for all the ports in the group. This can be done with
> no additional storage (no hashtable from v1). To pull it off, we
> need some simple recursive routines to walk the netdev tree ensuring
> all leaves in the tree (ports) in the same group (e.g. bridge)
> belonging to the same switch device will have the same offload fwd mark.
> Maybe someone sees a better design for the recusive routines? They're
> not too bad, and should cover the stacked driver cases.
>
> RFC v1:
>
> With switchdev support for offloading L2/L3 forwarding data path to a
> switch device, we have a general problem where both the device and the
> kernel may forward the packet, resulting in duplicate packets on the wire.
> Anytime a packet is forwarded by the device and a copy is sent to the CPU,
> there is potential for duplicate forwarding, as the kernel may also do a
> forwarding lookup and send the packet on the wire.
>
> The specific problem this patch series is interested in solving is avoiding
> duplicate packets on bridged ports. There was a previous RFC from Roopa
> (http://marc.info/?l=linux-netdev&m=142687073314252&w=2) to address this
> problem, but didn't solve the problem of mixed ports in the bridge from
> different devices; there was no way to exclude some ports from forwarding
> and include others. This RFC solves that problem by tagging the ingressing
> packet with a unique mark, and then comparing the packet mark with the
> egress port mark, and skip forwarding when there is a match. For the mixed
> ports bridge case, only those ports with matching marks are skipped.
>
> The switchdev port driver must do two things:
>
> 1) Generate a fwd_mark for each switch port, using some unique key of the
> switch device (and optionally port). This is done when the port netdev
> is registered or if the port's group membership changes (joins/leaves
> a bridge, for example).
>
> 2) On packet ingress from port, mark the skb with the ingress port's
> fwd_mark. If the device supports it, it's useful to only mark skbs
> which were already forwarded by the device. If the device does not
> support such indication, all skbs can be marked, even if they're
> local dst.
>
> Two new 32-bit fields are added to struct sk_buff and struct netdevice to
> hold the fwd_mark. I've wrapped these with CONFIG_NET_SWITCHDEV for now. I
> tried using skb->mark for this purpose, but ebtables can overwrite the
> skb->mark before the bridge gets it, so that will not work.
>
> In general, this fwd_mark can be used for any case where a packet is
> forwarded by the device and a copy is sent to the CPU, to avoid the kernel
> re-forwarding the packet. sFlow is another use-case that comes to mind,
> but I haven't explored the details.
> Scott Feldman (5):
> net: don't reforward packets already forwarded by offload device
> net: add phys ID compare helper to test if two IDs are the same
> switchdev: add offload_fwd_mark generator helper
> rocker: add offload_fwd_mark support
> switchdev: update documentation for offload_fwd_mark
>
> Documentation/networking/switchdev.txt | 14 +++-
> drivers/net/ethernet/rocker/rocker.c | 11 ++++
> drivers/net/ethernet/rocker/rocker.h | 1 +
> include/linux/netdevice.h | 13 ++++
> include/linux/skbuff.h | 9 ++-
> include/net/switchdev.h | 9 +++
> net/core/dev.c | 10 +++
> net/switchdev/switchdev.c | 111 ++++++++++++++++++++++++++++++--
> 8 files changed, 169 insertions(+), 9 deletions(-)
>
--
Florian
next prev parent reply other threads:[~2015-07-20 17:59 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-07-19 1:24 [PATCH net-next v3 0/5] switchdev: avoid duplicate packet forwarding sfeldma
2015-07-19 1:24 ` [PATCH net-next v3 1/5] net: don't reforward packets already forwarded by offload device sfeldma
2015-07-20 7:21 ` Nicolas Dichtel
2015-07-19 1:24 ` [PATCH net-next v3 2/5] net: add phys ID compare helper to test if two IDs are the same sfeldma
2015-07-19 1:24 ` [PATCH net-next v3 3/5] switchdev: add offload_fwd_mark generator helper sfeldma
2015-07-19 1:24 ` [PATCH net-next v3 4/5] rocker: add offload_fwd_mark support sfeldma
2015-07-19 1:24 ` [PATCH net-next v3 5/5] switchdev: update documentation for offload_fwd_mark sfeldma
2015-07-20 17:57 ` Florian Fainelli [this message]
2015-07-20 18:08 ` [PATCH net-next v3 0/5] switchdev: avoid duplicate packet forwarding Andrew Lunn
2015-07-21 1:33 ` David Miller
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=55AD3687.9010002@gmail.com \
--to=f.fainelli@gmail.com \
--cc=andrew@lunn.ch \
--cc=jiri@resnulli.us \
--cc=linux@roeck-us.net \
--cc=netdev@vger.kernel.org \
--cc=nicolas.dichtel@6wind.com \
--cc=roopa@cumulusnetworks.com \
--cc=sfeldma@gmail.com \
--cc=simon.horman@netronome.com \
--cc=vivien.didelot@savoirfairelinux.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.