From mboxrd@z Thu Jan 1 00:00:00 1970 From: roopa Subject: Re: [RFC PATCH net-next] bridge: ability to disable forwarding on a port Date: Sun, 18 Jan 2015 01:10:12 -0800 Message-ID: <54BB7874.90201@cumulusnetworks.com> References: <1421479975-62049-1-git-send-email-roopa@cumulusnetworks.com> Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Cc: "stephen@networkplumber.org" , "David S. Miller" , Jamal Hadi Salim , =?UTF-8?B?SmnFmcOtIFDDrXJrbw==?= , "Arad, Ronen" , Thomas Graf , john fastabend , "vyasevic@redhat.com" , Netdev , Wilson Kok , Andy Gospodarek To: Scott Feldman Return-path: Received: from mail-pa0-f42.google.com ([209.85.220.42]:56230 "EHLO mail-pa0-f42.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751319AbbARJKO (ORCPT ); Sun, 18 Jan 2015 04:10:14 -0500 Received: by mail-pa0-f42.google.com with SMTP id et14so32727738pad.1 for ; Sun, 18 Jan 2015 01:10:13 -0800 (PST) In-Reply-To: Sender: netdev-owner@vger.kernel.org List-ID: On 1/17/15, 5:05 PM, Scott Feldman wrote: > On Fri, Jan 16, 2015 at 11:32 PM, wrote: >> From: Roopa Prabhu >> >> On a Linux bridge with bridge forwarding offloaded to a switch ASIC, >> there is a need to not re-forward the frames that come up to the >> kernel in software. >> >> Typically these are broadcast or multicast packets forwarded by the >> hardware to multiple destination ports including sending a copy of >> the packet to the kernel (e.g. an arp broadcast). >> The bridge driver will try to forward the packet again, resulting in >> two copies of the same packet. >> >> These packets can also come up to the kernel for logging when they hit >> a LOG acl in hardware. >> >> This patch makes forwarding a flag on the port similar to >> learn and flood and drops the packet just before forwarding. >> (The forwarding disable on a bridge is tested to work on our boxes. >> The bridge port flag addition is only compile tested. >> This will need to be further refined to cover cases where a non-switch port >> is bridged to a switch port etc. We will submit more patches to cover >> all cases if we agree on this approach). > Good topic to bring up, thanks for proposing a patch. There is indeed > duplicate pkts sent out in the case where both the bridge and the > offloaded device are flooding these non-unicast pkts, such as ARP > requests. We do have per-port control today over unicast flooding > using BR_FLOOD (IFLA_BRPORT_UNICAST_FLOOD). > > As you point out, this doesn't solve the case for non-offloaded ports > bridged with switch ports. If this port setting is enabled on an > offloaded switch port, for example, the non-offloaded port can't get > an ARP request resolved, if the MAC is behind the offloaded switch > port. But do we care? Is there a use-case for this one, mixing > offloaded and non-offloaded ports in a bridge? Not sure. I don't know the use case, but I think I might have heard that there could be a case where a switch port could be bridged with a vm's port running on the switch. (?) > >> Other ways to solve the same problem could be to: >> - use the offload feature flag on these switch ports to avoid the >> re-forward: >> https://www.marc.info/?l=linux-netdev&m=141820235010603&w=2 >> >> - Or the switch driver can mark or set a flag in the skb, which the bridge >> driver can use to avoid a re-forward. >> >> Signed-off-by: Wilson Kok >> Signed-off-by: Roopa Prabhu >> --- >> include/linux/if_bridge.h | 3 ++- >> include/uapi/linux/if_link.h | 1 + >> net/bridge/br_forward.c | 13 +++++++++++++ >> net/bridge/br_if.c | 2 +- >> net/bridge/br_netlink.c | 4 +++- >> net/bridge/br_sysfs_if.c | 1 + >> net/core/rtnetlink.c | 4 +++- >> 7 files changed, 24 insertions(+), 4 deletions(-) >> >> diff --git a/include/linux/if_bridge.h b/include/linux/if_bridge.h >> index 0a8ce76..c79f4eb 100644 >> --- a/include/linux/if_bridge.h >> +++ b/include/linux/if_bridge.h >> @@ -40,10 +40,11 @@ struct br_ip_list { >> #define BR_ADMIN_COST BIT(4) >> #define BR_LEARNING BIT(5) >> #define BR_FLOOD BIT(6) >> -#define BR_AUTO_MASK (BR_FLOOD | BR_LEARNING) >> #define BR_PROMISC BIT(7) >> #define BR_PROXYARP BIT(8) >> #define BR_LEARNING_SYNC BIT(9) >> +#define BR_FORWARD BIT(10) > The name BR_FORWARD might confuse people thinking this is related to > STP FORWARDING state. yes, that thought crossed my mind too. I had BR_FORWARDING initially and to make it sound less like STP changed it to BR_FORWARD. :) > We have BR_FLOOD for unknown unicast flooding. > How about renaming BR_FLOOD to BR_FLOOD_UNICAST and add > BR_FLOOD_BROADCAST? So you would have: > > IFLA_BRPORT_UNICAST_FLOOD BR_FLOOD_UNICAST /* flood > unknown unicast traffic to port */ > IFLA_BRPORT_BROADCAST_FLOOD BR_FLOOD_BROADCAST /* flood > bcast/mcast traffic to port */ That's a good idea. So, unknown unicast and broadcast will be covered with that. Am afraid there might be other packets, like the acl LOG packet hitting the CPU/kernel (?) I will check if there are others. > >> +#define BR_AUTO_MASK (BR_FLOOD | BR_LEARNING | BR_FORWARD) >> >> extern void brioctl_set(int (*ioctl_hook)(struct net *, unsigned int, void __user *)); >> >> diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h >> index f7d0d2d..d394625 100644 >> --- a/include/uapi/linux/if_link.h >> +++ b/include/uapi/linux/if_link.h >> @@ -246,6 +246,7 @@ enum { >> IFLA_BRPORT_UNICAST_FLOOD, /* flood unicast traffic */ >> IFLA_BRPORT_PROXYARP, /* proxy ARP */ >> IFLA_BRPORT_LEARNING_SYNC, /* mac learning sync from device */ >> + IFLA_BRPORT_FORWARD, /* enable forwarding on a device */ >> __IFLA_BRPORT_MAX >> }; >> #define IFLA_BRPORT_MAX (__IFLA_BRPORT_MAX - 1) >> diff --git a/net/bridge/br_forward.c b/net/bridge/br_forward.c >> index f96933a..98c41c8 100644 >> --- a/net/bridge/br_forward.c >> +++ b/net/bridge/br_forward.c >> @@ -81,10 +81,23 @@ static void __br_deliver(const struct net_bridge_port *to, struct sk_buff *skb) >> br_forward_finish); >> } >> >> +int br_hw_forward_finish(struct sk_buff *skb) >> +{ >> + kfree_skb(skb); >> + >> + return 0; >> +} >> + >> static void __br_forward(const struct net_bridge_port *to, struct sk_buff *skb) >> { >> struct net_device *indev; >> >> + if (!(to->flags & BR_FORWARD)) { >> + NF_HOOK(NFPROTO_BRIDGE, NF_BR_FORWARD, skb, skb->dev, to->dev, >> + br_hw_forward_finish); >> + return; >> + } >> + > Seems you should make the (flags & BR_FORWARD) check earlier, before > skb cloning, in br_flood(), alongside the (flags & BR_FLOOD) check. This patch strategically places it in __br_forward to catch all forwards (due to floods or no floods ..with direct call to br_foward) with minimal code changes in mind. Will see if the clone can be avoided. > > Also, the above code is skipping some vlan checks (br_handle_vlan). The br_handle_vlan checks seemed not necessary for a packet getting dropped. But, will check and fix if its needed while the packet traverses the netfilter hook and before it gets dropped. Thanks, Roopa