From: roopa <roopa@cumulusnetworks.com>
To: Scott Feldman <sfeldma@gmail.com>
Cc: "John Fastabend" <john.r.fastabend@intel.com>,
"David S. Miller" <davem@davemloft.net>,
"Jiří Pírko" <jiri@resnulli.us>,
"Arad, Ronen" <ronen.arad@intel.com>,
Netdev <netdev@vger.kernel.org>
Subject: Re: [PATCH net-next RFC v2] switchdev: bridge: drop hardware forwarded packets
Date: Fri, 20 Mar 2015 16:30:31 -0700 [thread overview]
Message-ID: <550CAD97.3000404@cumulusnetworks.com> (raw)
In-Reply-To: <CAE4R7bBd8TeNCiXdOL9en-FfuTqqn==UYvAxHHD8BYHuFCiWBw@mail.gmail.com>
On 3/20/15, 3:37 PM, Scott Feldman wrote:
> On Fri, Mar 20, 2015 at 3:06 PM, roopa <roopa@cumulusnetworks.com> wrote:
>> On 3/20/15, 11:13 AM, Scott Feldman wrote:
>>> On Fri, Mar 20, 2015 at 10:11 AM, John Fastabend
>>> <john.r.fastabend@intel.com> wrote:
>>>> On 03/20/2015 09:58 AM, roopa@cumulusnetworks.com wrote:
>>>>> From: Roopa Prabhu <roopa@cumulusnetworks.com>
>>>>>
>>>>> On a Linux bridge with bridge forwarding offloaded to switch ASIC,
>>>>> there is a need to not re-forward frames that have already been
>>>>> forwarded in hardware.
>>>>>
>>>>> Typically these are broadcast or multicast frames forwarded by the
>>>>> hardware to multiple destination ports including sending a copy of
>>>>> the packet to the cpu (kernel e.g. an arp broadcast).
>>>>> The bridge driver will try to forward the packet again, resulting in
>>>>> two copies of the same packet.
>>>>>
>>>>> These packets can also come up to the kernel for logging when they hit
>>>>> a LOG acl rule in hardware. In such cases, you do want the packet
>>>>> to go through the bridge netfilter hooks. Hence, this patch adds the
>>>>> required checks just before the packet is being xmited.
>>>>>
>>>>> v2:
>>>>> - Add a new hw_fwded flag in skbuff to indicate that the packet
>>>>> is already hardware forwarded. Switch driver will set this flag.
>>>>> I have been trying to avoid having this flag in the skb
>>>>> and thats why this patch has been in my tree for long. Cant think
>>>>> of other better alternatives. Suggestions are welcome. I have put
>>>>> this under CONFIG_NET_SWITCHDEV to minimize the impact.
>>>>>
>>>>> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
>>>>> Signed-off-by: Wilson Kok <wkok@cumulusnetworks.com>
>>>>> ---
>>>> Interesting. I completely avoid this problem by not instantiating a
>>>> software bridge ;) When these pkts come up the stack I either use a
>>>> raw socket to capture them, put a 'tc' ingress rule to do something,
>>>> or have OVS handle them in some special way. It seems to me that this
>>>> is where the sw/hw model starts to break when you have these magic
>>>> bits to handle the packets differently.
>>>>
>>>> How do you know to set the skb bit? Do you have some indicator in the
>>>> descriptor? I don't have any good way to learn this on my hardware. But
>>>> I can assume if it reached the CPU it was because of some explicit rule.
>>> I was wondering that also, since there was no example.
>>>
>>> This features seems like it belongs in the bridge.
>> yes, it does, the check today is really in the bridge.
>>> We already have
>>> BR_FLOOD to indicate whether unknown unicast traffic is flooded to a
>>> bridge port. Can we add another BR_FLOOD_BCAST (or some name) for
>>> this new feature? You would set/clear this flag on the bridge
>>> (master) port. The default is set. And now:
>>>
>>> - #define BR_AUTO_MASK (BR_FLOOD | BR_LEARNING)
>>> + #define BR_AUTO_MASK (BR_FLOOD | BR_FLOOD_BCAST | BR_LEARNING)
>>>
>>> Does this work for your use-case, Roopa?
>> Note my first RFC patch, sort of did this:
>> https://marc.info/?l=linux-netdev&m=142147999420017&w=2
>>
>> But there are open things there as listed in the comment and also in the
>> subsequent
>> discussion on the thread.
>>
>> We discussed this flag before and i think it does not allow the case where
>> hw switch ports are bridged with non-hw ports.
> I went back and read the thread just to remind me what the pros/cons
> where. I think the mixed case isn't a concern since this
> BR_FLOOD_BCAST check is made on egress to the bridge port. So only
> clear BR_FLOOD_BCAST on hw switch ports (if hw did the flood already
> amongst its ports), and leave it set for non-hw-ports. It seems the
> patch should mostly be a clone of how BR_FLOOD is handled. Is there
> more to it?
That may work. But, we have recently moved igmp handling to software in
kernel
and i was trying to make this work for that case. I am going to try what
you suggest
by finding a work around for the igmp case.
I will get back to you.
Thanks!
-Roopa
next prev parent reply other threads:[~2015-03-20 23:30 UTC|newest]
Thread overview: 51+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-03-20 16:58 [PATCH net-next RFC v2] switchdev: bridge: drop hardware forwarded packets roopa
2015-03-20 17:11 ` John Fastabend
2015-03-20 18:13 ` Scott Feldman
2015-03-20 18:30 ` John Fastabend
2015-03-20 22:06 ` roopa
2015-03-20 22:37 ` Scott Feldman
2015-03-20 23:30 ` roopa [this message]
2015-03-21 0:26 ` Scott Feldman
2015-03-21 5:53 ` roopa
2015-03-20 21:03 ` roopa
2015-03-20 21:23 ` John Fastabend
2015-03-20 22:04 ` Andrew Lunn
2015-03-20 23:12 ` roopa
2015-03-20 18:03 ` Scott Feldman
2015-03-20 21:20 ` roopa
2015-03-20 20:36 ` David Miller
2015-03-20 21:36 ` roopa
2015-03-20 22:09 ` Andrew Lunn
2015-03-20 23:43 ` Florian Fainelli
2015-03-23 0:22 ` Guenter Roeck
2015-03-23 1:33 ` John Fastabend
2015-03-23 2:57 ` Guenter Roeck
2015-03-23 3:18 ` John Fastabend
2015-03-23 3:33 ` Guenter Roeck
2015-03-23 17:12 ` roopa
2015-03-24 5:59 ` Scott Feldman
2015-03-24 13:13 ` Guenter Roeck
2015-03-24 18:08 ` Scott Feldman
2015-03-24 14:29 ` Jiri Pirko
2015-03-24 16:01 ` Guenter Roeck
2015-03-24 17:45 ` roopa
2015-03-24 17:58 ` Guenter Roeck
2015-03-24 18:14 ` Scott Feldman
2015-03-25 3:10 ` Guenter Roeck
2015-03-25 3:46 ` Florian Fainelli
2015-03-25 5:06 ` Scott Feldman
2015-03-25 17:01 ` roopa
2015-03-26 7:44 ` Scott Feldman
2015-03-26 8:20 ` Jiri Pirko
2015-03-26 14:28 ` Scott Feldman
2015-03-26 14:49 ` Jiri Pirko
2015-03-27 1:08 ` Simon Horman
2015-03-27 6:02 ` Jiri Pirko
2015-03-27 6:43 ` Scott Feldman
2015-03-27 7:01 ` Jiri Pirko
2015-03-27 23:19 ` Scott Feldman
2015-03-30 14:06 ` roopa
2015-03-24 18:48 ` David Christensen
2015-03-24 17:58 ` Scott Feldman
2015-03-23 17:10 ` roopa
2015-03-23 14:00 ` roopa
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=550CAD97.3000404@cumulusnetworks.com \
--to=roopa@cumulusnetworks.com \
--cc=davem@davemloft.net \
--cc=jiri@resnulli.us \
--cc=john.r.fastabend@intel.com \
--cc=netdev@vger.kernel.org \
--cc=ronen.arad@intel.com \
--cc=sfeldma@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.