* [RFC PATCH net-next] bridge: Add bridge port learn filter and priority
@ 2014-06-18 7:16 roopa
2014-06-18 20:34 ` Stephen Hemminger
0 siblings, 1 reply; 4+ messages in thread
From: roopa @ 2014-06-18 7:16 UTC (permalink / raw)
To: davem, stephen, netdev, roopa
Cc: wkok, scotte, ddutt, shm, nolan, sfeldma, jhs
From: Roopa Prabhu <roopa@cumulusnetworks.com>
This RFC patch introduces bridge port learning priority.
In certain network topologies, there is a need to give priority to certain
bridge port(s) over other bridge port(s) such that an fdb learned on the port
with higher priority does not move to the port(s) with lower priority.
One such topology is when there are multiple paths to a dual-homed host. In
this situation the path used by a bridge to reach the host is the port which
last received a packet from the host.
But, instead of having the path to a dual-connected host flip-flop back and
forth between the ports to the host based on which port last received a packet
from that host, it is desirable to be able to define the preferred path to that
host. This is accomplished by defining an address movement priority for each
port, and restricting fdb address movement based on these priorities. The
rules are as follows:
- all ports of a bridge are assigned a priority (LEARN_PRIO),
default is 0 (lowest priority)
- enforcement of the learning rules is activated on a port if
LEARN_FILTER is set, default is not set
- if a bridge port, say p1, is enabled with the LEARN_FILTER, and an fdb
shows up on that port, and if the fdb is already learned on another
bridge port, say p0:
- if the priority of p0 is higher than the priority of p1, then the
fdb remains learned on p0 and is not allowed to move to p1
- if the priority of p0 is equal to or lower than the priority of p1,
then the fdb moves to p1
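
The rules above can be sketched as a single predicate. This is only an
illustrative, userspace-compilable sketch: `struct port` and
`fdb_move_allowed()` are hypothetical names standing in for the two fields
the patch adds to struct net_bridge_port and for the check it open-codes in
br_fdb.c; they are not part of the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two fields the patch adds to
 * struct net_bridge_port; this is not the kernel structure. */
struct port {
	unsigned char learn_priority;	/* LEARN_PRIO, default 0 (lowest) */
	unsigned char learn_filter;	/* LEARN_FILTER, default 0 (not set) */
};

/* Return true if an fdb entry currently learned on 'cur' may move to
 * 'src', the port on which the address has just been seen. */
static bool fdb_move_allowed(const struct port *cur, const struct port *src)
{
	if (cur == NULL || src == NULL || cur == src)
		return true;	/* nothing to arbitrate */
	if (!src->learn_filter)
		return true;	/* enforcement is off on the receiving port */
	/* Blocked only when the current port has strictly higher priority. */
	return cur->learn_priority <= src->learn_priority;
}
```

Note that, as in the patch, only the LEARN_FILTER setting of the port that
received the packet is consulted; the filter setting of the port that
currently holds the entry plays no role.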
Signed-off-by: Wilson Kok <wkok@cumulusnetworks.com>
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
---
include/uapi/linux/if_link.h | 2 ++
net/bridge/br_fdb.c | 12 ++++++++++++
net/bridge/br_netlink.c | 33 ++++++++++++++++++++++++++++++++-
net/bridge/br_private.h | 2 ++
4 files changed, 48 insertions(+), 1 deletion(-)
diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index b385348..ade556f 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h
@@ -225,6 +225,8 @@ enum {
IFLA_BRPORT_FAST_LEAVE, /* multicast fast leave */
IFLA_BRPORT_LEARNING, /* mac learning */
IFLA_BRPORT_UNICAST_FLOOD, /* flood unicast traffic */
+ IFLA_BRPORT_LEARN_PRIO, /* learn priority */
+ IFLA_BRPORT_LEARN_FILTER, /* learn filter */
__IFLA_BRPORT_MAX
};
#define IFLA_BRPORT_MAX (__IFLA_BRPORT_MAX - 1)
diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index b524c36..38e246e 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -505,6 +505,10 @@ static int fdb_insert(struct net_bridge *br, struct net_bridge_port *source,
*/
if (fdb->is_local)
return 0;
+ if ((fdb->dst != source) && source &&
+ source->learn_filter &&
+ (fdb->dst->learn_priority > source->learn_priority))
+ return 0;
br_warn(br, "adding interface %s with same address "
"as a received packet\n",
source ? source->dev->name : br->dev->name);
@@ -559,6 +563,11 @@ void br_fdb_update(struct net_bridge *br, struct net_bridge_port *source,
source->dev->name);
} else {
/* fastpath: update of existing entry */
+ if (fdb->dst && source && (fdb->dst != source) &&
+ source->learn_filter &&
+ (fdb->dst->learn_priority > source->learn_priority))
+ return;
+
if (unlikely(source != fdb->dst)) {
fdb->dst = source;
fdb_modified = true;
@@ -732,6 +741,9 @@ static int fdb_add_entry(struct net_bridge_port *source, const __u8 *addr,
return -EEXIST;
if (fdb->dst != source) {
+ if (source->learn_filter &&
+ (fdb->dst->learn_priority > source->learn_priority))
+ return 0;
fdb->dst = source;
modified = true;
}
diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
index 26edb51..2b5f770 100644
--- a/net/bridge/br_netlink.c
+++ b/net/bridge/br_netlink.c
@@ -32,6 +32,8 @@ static inline size_t br_port_info_size(void)
+ nla_total_size(1) /* IFLA_BRPORT_FAST_LEAVE */
+ nla_total_size(1) /* IFLA_BRPORT_LEARNING */
+ nla_total_size(1) /* IFLA_BRPORT_UNICAST_FLOOD */
+ + nla_total_size(1) /* IFLA_BRPORT_LEARN_PRIO */
+ + nla_total_size(1) /* IFLA_BRPORT_LEARN_FILTER */
+ 0;
}
@@ -60,7 +62,9 @@ static int br_port_fill_attrs(struct sk_buff *skb,
nla_put_u8(skb, IFLA_BRPORT_PROTECT, !!(p->flags & BR_ROOT_BLOCK)) ||
nla_put_u8(skb, IFLA_BRPORT_FAST_LEAVE, !!(p->flags & BR_MULTICAST_FAST_LEAVE)) ||
nla_put_u8(skb, IFLA_BRPORT_LEARNING, !!(p->flags & BR_LEARNING)) ||
- nla_put_u8(skb, IFLA_BRPORT_UNICAST_FLOOD, !!(p->flags & BR_FLOOD)))
+ nla_put_u8(skb, IFLA_BRPORT_UNICAST_FLOOD, !!(p->flags & BR_FLOOD)) ||
+ nla_put_u8(skb, IFLA_BRPORT_LEARN_FILTER, p->learn_filter) ||
+ nla_put_u8(skb, IFLA_BRPORT_LEARN_PRIO, p->learn_priority))
return -EMSGSIZE;
return 0;
@@ -286,6 +290,8 @@ static const struct nla_policy ifla_brport_policy[IFLA_BRPORT_MAX + 1] = {
[IFLA_BRPORT_PROTECT] = { .type = NLA_U8 },
[IFLA_BRPORT_LEARNING] = { .type = NLA_U8 },
[IFLA_BRPORT_UNICAST_FLOOD] = { .type = NLA_U8 },
+ [IFLA_BRPORT_LEARN_PRIO] = { .type = NLA_U8 },
+ [IFLA_BRPORT_LEARN_FILTER] = { .type = NLA_U8 },
};
/* Change the state of the port and notify spanning tree */
@@ -324,6 +330,18 @@ static void br_set_port_flag(struct net_bridge_port *p, struct nlattr *tb[],
}
}
+static int br_set_port_learn_priority(struct net_bridge_port *p, u8 prio)
+{
+ p->learn_priority = prio;
+ return 0;
+}
+
+static int br_set_port_learn_filter(struct net_bridge_port *p, u8 enable)
+{
+ p->learn_filter = enable;
+ return 0;
+}
+
/* Process bridge protocol info on port */
static int br_setport(struct net_bridge_port *p, struct nlattr *tb[])
{
@@ -355,6 +373,19 @@ static int br_setport(struct net_bridge_port *p, struct nlattr *tb[])
return err;
}
+ if (tb[IFLA_BRPORT_LEARN_PRIO]) {
+ err = br_set_port_learn_priority(p, nla_get_u8(tb[IFLA_BRPORT_LEARN_PRIO]));
+ if (err)
+ return err;
+ }
+
+ if (tb[IFLA_BRPORT_LEARN_FILTER]) {
+ err = br_set_port_learn_filter(p,
+ nla_get_u8(tb[IFLA_BRPORT_LEARN_FILTER]));
+ if (err)
+ return err;
+ }
+
br_port_flags_change(p, old_flags ^ p->flags);
return 0;
}
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index 23caf5b..35f032a 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -194,6 +194,8 @@ struct net_bridge_port
#ifdef CONFIG_BRIDGE_VLAN_FILTERING
struct net_port_vlans __rcu *vlan_info;
#endif
+ u8 learn_priority;
+ u8 learn_filter;
};
#define br_auto_port(p) ((p)->flags & BR_AUTO_MASK)
--
1.7.10.4
* Re: [RFC PATCH net-next] bridge: Add bridge port learn filter and priority
2014-06-18 7:16 [RFC PATCH net-next] bridge: Add bridge port learn filter and priority roopa
@ 2014-06-18 20:34 ` Stephen Hemminger
[not found] ` <53A896A0.2030509@cumulusnetworks.com>
0 siblings, 1 reply; 4+ messages in thread
From: Stephen Hemminger @ 2014-06-18 20:34 UTC (permalink / raw)
To: roopa; +Cc: davem, netdev, wkok, scotte, ddutt, shm, nolan, sfeldma, jhs
On Wed, 18 Jun 2014 00:16:07 -0700
roopa@cumulusnetworks.com wrote:
> One such topology is when there are multiple paths to a dual-homed host.
That is called a loop.
Loops are not allowed on bridge networks.
Sorry, this is just the kind of thing spanning tree and TRILL were supposed
to address. Solving it in the local bridge is not a good idea.
* Re: [RFC PATCH net-next] bridge: Add bridge port learn filter and priority
[not found] ` <53A896A0.2030509@cumulusnetworks.com>
@ 2014-06-23 21:19 ` Stephen Hemminger
2014-06-23 23:03 ` Wilson Kok
0 siblings, 1 reply; 4+ messages in thread
From: Stephen Hemminger @ 2014-06-23 21:19 UTC (permalink / raw)
To: Wilson Kok; +Cc: roopa, davem, netdev, scotte, ddutt, shm, nolan, sfeldma, jhs
On Mon, 23 Jun 2014 14:05:36 -0700
Wilson Kok <wkok@cumulusnetworks.com> wrote:
> On 6/18/14, 1:34 PM, Stephen Hemminger wrote:
> > On Wed, 18 Jun 2014 00:16:07 -0700
> > roopa@cumulusnetworks.com wrote:
> >
> >> One such topology is when there are multiple paths to a dual-homed host.
> > That is called a loop.
> > Loops are not allowed on bridge networks.
> > Sorry, this is just the kind of thing spanning tree and TRILL were supposed
> > to address. Solving it in the local bridge is not a good idea.
>
> This is not a loop situation. The topology we're referring to is a
> multi-chassis LAG topology (a widely supported feature by networking
> vendors, e.g. Arista's MLAG and Cisco's VPC), where some hosts are
> dual-homed on a pair of switches, and some hosts are single-homed on one
> of the two switches, and there is an inter-chassis link between the
> switches. A dual-homed host can and will source packets from either or
> both its uplinks with the same source MAC address. The packets may be
> locally forwarded on the first hop switch, and they may also cross the
> inter-chassis link from one switch to the other switch, e.g. in a
> multicast or flood situation or just unicast to a single-homed host
> connected to the other switch. As such, the host's source MAC can be
> learned on the inter-chassis link or the host link, and can move between
> the two. A dual-homed host may receive packets from either or both its
> uplinks, and in some cases it may receive duplicate copies of the same
> packet.
>
Isn't this bridging over a bond (or team) interface?
The point is this really shouldn't be part of bridge code.
* Re: [RFC PATCH net-next] bridge: Add bridge port learn filter and priority
2014-06-23 21:19 ` Stephen Hemminger
@ 2014-06-23 23:03 ` Wilson Kok
0 siblings, 0 replies; 4+ messages in thread
From: Wilson Kok @ 2014-06-23 23:03 UTC (permalink / raw)
To: Stephen Hemminger
Cc: roopa, davem, netdev, scotte, ddutt, shm, nolan, sfeldma, jhs
Resending the original message (in quotes) without HTML as the full
message did not appear on the mailing list. Please find response at the
end of the email.
"This is not a loop situation. The topology we're referring to is a
multi-chassis LAG topology (a widely supported feature by networking
vendors, e.g. Arista's MLAG and Cisco's VPC), where some hosts are
dual-homed on a pair of switches, and some hosts are single-homed on one
of the two switches, and there is an inter-chassis link between the
switches. A dual-homed host can and will source packets from either or
both its uplinks with the same source MAC address. The packets may be
locally forwarded on the first hop switch, and they may also cross the
inter-chassis link from one switch to the other switch, e.g. in a
multicast or flood situation or just unicast to a single-homed host
connected to the other switch. As such, the host's source MAC can be
learned on the inter-chassis link or the host link, and can move between
the two. A dual-homed host may receive packets from either or both its
uplinks, and in some cases it may receive duplicate copies of the same
packet.
There are three issues regarding data packet forwarding in such a topology
that need to be addressed:
1. reduce the usage of the inter-chassis link to reduce the
possibility of congestion and the need for over-provisioning
2. reduce the possibility of a dual-homed host receiving duplicate
copies of the same packet
3. prevent out of order delivery of packets to a dual-homed host when
there are multiple paths to reach it
The proposed patch addresses #1: as soon as a dual-homed host MAC
address is learned on a host link, the local path becomes and remains
the preferred path to the host for unicast packets. It also addresses
#3, as it prevents packets destined to the dual-homed host from flipping
between the local path and the inter-chassis path following the MAC
address move.
To address #2 and also to reduce flooding, simple filtering logic
(to be submitted in a later patch) can be used such that packets
crossing the inter-chassis link do not get forwarded to dual-homed
hosts, the assumption being that the packet should have already been
forwarded on the originating switch to the same host. With this
filtering in place, the proposed patch becomes necessary to ensure that
packets to a dual-homed host be forwarded locally on the originating
switch and not cross the inter-chassis link, or else they will get
dropped."
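
To make the effect on #1 and #3 concrete, here is a minimal sketch of how
one fdb entry behaves when the same source MAC arrives alternately on the
host link and on the inter-chassis link. All names (`struct port`,
`struct fdb_entry`, `learn()`, `count_moves()`) are hypothetical and for
illustration only. The sketch assumes the host link is given the higher
LEARN_PRIO and that LEARN_FILTER is set on the inter-chassis link, since
the patch consults the filter of the port that received the packet.

```c
#include <stddef.h>

/* Hypothetical model of a bridge port with the two proposed settings. */
struct port {
	unsigned char learn_priority;	/* LEARN_PRIO */
	unsigned char learn_filter;	/* LEARN_FILTER */
};

/* Hypothetical model of one fdb entry for the dual-homed host's MAC. */
struct fdb_entry {
	const struct port *dst;	/* port the MAC is currently learned on */
};

/* Apply the proposed learning rule; returns 1 if the entry was
 * (re)pointed at 'src', 0 if it stayed where it was. */
static int learn(struct fdb_entry *e, const struct port *src)
{
	if (e->dst == src)
		return 0;
	if (e->dst && src->learn_filter &&
	    e->dst->learn_priority > src->learn_priority)
		return 0;	/* move to a lower-priority port is refused */
	e->dst = src;
	return 1;
}

/* Feed 'n' packets with the same source MAC, alternating between the
 * host link (even packets) and the inter-chassis link (odd packets).
 * Returns how often the entry changed port, counting the initial learn. */
static int count_moves(const struct port *host, const struct port *icl, int n)
{
	struct fdb_entry e = { NULL };
	int moves = 0;

	for (int i = 0; i < n; i++)
		moves += learn(&e, (i % 2) ? icl : host);
	return moves;
}
```

With the filter set on the inter-chassis link, the entry is learned once on
the host link and stays there; with the filter cleared, it changes port on
every packet, which is the flip-flop described above.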
On 6/23/14, 2:19 PM, Stephen Hemminger wrote:
> On Mon, 23 Jun 2014 14:05:36 -0700
> Wilson Kok <wkok@cumulusnetworks.com> wrote:
>
>> On 6/18/14, 1:34 PM, Stephen Hemminger wrote:
>>> On Wed, 18 Jun 2014 00:16:07 -0700
>>> roopa@cumulusnetworks.com wrote:
>>>
>>>> One such topology is when there are multiple paths to a dual-homed host.
>>> That is called a loop.
>>> Loops are not allowed on bridge networks.
>>> Sorry, this is just the kind of thing spanning tree and TRILL were supposed
>>> to address. Solving it in the local bridge is not a good idea.
>>
>> This is not a loop situation. The topology we're referring to is a
>> multi-chassis LAG topology (a widely supported feature by networking
>> vendors, e.g. Arista's MLAG and Cisco's VPC), where some hosts are
>> dual-homed on a pair of switches, and some hosts are single-homed on one
>> of the two switches, and there is an inter-chassis link between the
>> switches. A dual-homed host can and will source packets from either or
>> both its uplinks with the same source MAC address. The packets may be
>> locally forwarded on the first hop switch, and they may also cross the
>> inter-chassis link from one switch to the other switch, e.g. in a
>> multicast or flood situation or just unicast to a single-homed host
>> connected to the other switch. As such, the host's source MAC can be
>> learned on the inter-chassis link or the host link, and can move between
>> the two. A dual-homed host may receive packets from either or both its
>> uplinks, and in some cases it may receive duplicate copies of the same
>> packet.
>>
>
> Isn't this bridging over a bond (or team) interface?
> The point is this really shouldn't be part of bridge code.
>
Yes, this is bridging over a host bond interface that terminates on two
different switches. Normal bridging and MAC learning operations over
such a bond interface will give rise to the three issues listed above,
hence the need for a solution such as the proposed additional logic in
MAC learning in the bridge code. I guess I'm not clear about your
concern and wondering if you have an alternative suggestion in mind.
Thanks,
Wilson.