From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jiri Pirko
Subject: Re: [RFC PATCH net-next 0/4] switchdev: avoid duplicate packet forwarding
Date: Tue, 16 Jun 2015 23:11:47 +0200
Message-ID: <20150616211147.GB2135@nanopsycho.orion>
References: <1434218670-43821-1-git-send-email-sfeldma@gmail.com> <20150615.162551.805611215439524288.davem@davemloft.net> <20150616060427.GA2135@nanopsycho.orion>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: David Miller, Netdev, "simon.horman@netronome.com", Roopa Prabhu, "Arad, Ronen", "Fastabend, John R", "andrew@lunn.ch", Florian Fainelli, Guenter Roeck, davidch, "stephen@networkplumber.org"
To: Scott Feldman
Return-path:
Received: from mail-wi0-f176.google.com ([209.85.212.176]:35931 "EHLO mail-wi0-f176.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752489AbbFPVLu (ORCPT ); Tue, 16 Jun 2015 17:11:50 -0400
Received: by wicnd19 with SMTP id nd19so9783925wic.1 for ; Tue, 16 Jun 2015 14:11:49 -0700 (PDT)
Content-Disposition: inline
In-Reply-To:
Sender: netdev-owner@vger.kernel.org
List-ID:

Tue, Jun 16, 2015 at 06:47:47PM CEST, sfeldma@gmail.com wrote:
>On Mon, Jun 15, 2015 at 11:04 PM, Jiri Pirko wrote:
>> Tue, Jun 16, 2015 at 01:25:51AM CEST, davem@davemloft.net wrote:
>>>From: sfeldma@gmail.com
>>>Date: Sat, 13 Jun 2015 11:04:26 -0700
>>>
>>>> The switchdev port driver must do two things:
>>>>
>>>> 1) Generate a fwd_mark for each switch port, using some unique key of the
>>>>    switch device (and optionally port). This is a one-time operation done
>>>>    when port's netdev is setup.
>>>>
>>>> 2) On packet ingress from port, mark the skb with the ingress port's
>>>>    fwd_mark. If the device supports it, it's useful to only mark skbs
>>>>    which were already forwarded by the device. If the device does not
>>>>    support such indication, all skbs can be marked, even if they're
>>>>    local dst.
>>>>
>>>> Two new 32-bit fields are added to struct sk_buff and struct net_device to
>>>> hold the fwd_mark. I've wrapped these with CONFIG_NET_SWITCHDEV for now. I
>>>> tried using skb->mark for this purpose, but ebtables can overwrite the
>>>> skb->mark before the bridge gets it, so that will not work.
>>>>
>>>> In general, this fwd_mark can be used for any case where a packet is
>>>> forwarded by the device and a copy is sent to the CPU, to avoid the kernel
>>>> re-forwarding the packet. sFlow is another use-case that comes to mind,
>>>> but I haven't explored the details.
>>>
>>>Generally I'm against adding new fields to sk_buff but I'm trying to be
>>>open minded. :-)
>>>
>>>About the per-device fwd_mark, if the key attribute is uniqueness,
>>>let's just do it right and use something like lib/idr.c to generate
>>>truly unique indices at probe time for all devices using this
>>>facility. I like that better than having them be unique by a happy
>>>accident.
>>
>> We already have a per-device unique key: dev->ifindex.
>> That should be good for fwd_mark purposes I believe.
>
>It would be great if we could use dev->ifindex, but fwd_mark is really
>to mark device ports that belong to a group. In the case of a bridge,
>the device ports in the bridge should all have the same mark. And
>another device's ports in the same bridge would have a different mark
>(so we can't use the bridge's dev->ifindex). On ingress, the skb is
>marked with the ingress port's mark. If the skb is to be forwarded
>out an egress port, the skb mark is compared with egress port's mark.
>If the marks match, then the device has already forwarded the pkt so the
>kernel can consume_skb to avoid duplicate pkts on the wire.
>
>So what we need is a unique mark for device ports within a fwding
>group, such as a bridge.

Yep, have a group of netdevs, pick one of them and use its ifindex for
the whole group.

>
>I'm investigating Dave's suggestion to use IDR. I think this will work...
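
To make the group-mark idea concrete, here is a minimal userspace sketch in
Python. It is not kernel code: Port and Packet are hypothetical stand-ins for
struct net_device and struct sk_buff, and the helper names are made up for
illustration. It assumes the mark for a (switch device, bridge) group is
simply the lowest ifindex among that group's ports, and that a mark of 0
means "not marked".

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Port:
    """Hypothetical stand-in for a switch port netdev."""
    name: str
    ifindex: int                   # unique and non-zero, as with dev->ifindex
    switch_id: str                 # identifies the physical switch device
    bridge: Optional[str] = None   # forwarding group the port belongs to
    fwd_mark: int = 0              # 0 means no mark assigned


@dataclass
class Packet:
    """Hypothetical stand-in for an skb."""
    payload: bytes
    fwd_mark: int = 0


def assign_fwd_marks(ports: List[Port]) -> None:
    """Give each (switch device, bridge) group one shared mark.

    The mark is the lowest ifindex in the group. Because every ifindex is
    unique, two different groups can never end up with the same mark.
    """
    groups: Dict[Tuple[str, str], List[Port]] = {}
    for port in ports:
        if port.bridge is None:
            continue  # ports outside any bridge stay unmarked
        groups.setdefault((port.switch_id, port.bridge), []).append(port)
    for members in groups.values():
        mark = min(p.ifindex for p in members)
        for p in members:
            p.fwd_mark = mark


def mark_on_ingress(pkt: Packet, ingress: Port) -> None:
    """Stamp the packet with the ingress port's group mark."""
    pkt.fwd_mark = ingress.fwd_mark


def should_forward(pkt: Packet, egress: Port) -> bool:
    """Return False when the marks match, i.e. the device already forwarded."""
    return pkt.fwd_mark == 0 or pkt.fwd_mark != egress.fwd_mark
```

With two ports of one switch and one port of another switch in the same
bridge, a packet arriving on the first switch is dropped for egress on its
sibling port but still forwarded to the other switch's port. One thing the
sketch leaves out: if the port whose ifindex was picked leaves the group, the
group would need a new mark.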