* [PATCH RFC WIP 0/5] IGMP snooping for local traffic
@ 2017-08-26 20:56 Andrew Lunn
2017-08-26 20:56 ` [PATCH RFC WIP 1/5] net: rtnetlink: Handle bridge port without upper device Andrew Lunn
` (7 more replies)
0 siblings, 8 replies; 12+ messages in thread
From: Andrew Lunn @ 2017-08-26 20:56 UTC (permalink / raw)
To: netdev
Cc: Vivien Didelot, Florian Fainelli, nikolay, jiri, roopa, stephen,
bridge, Andrew Lunn
This is a WIP patchset i would like comments on from bridge, switchdev
and hardware offload people.
The linux bridge supports IGMP snooping. It will listen to IGMP
reports on bridge ports and keep track of which groups have been
joined on an interface. It will then forward multicast based on this
group membership.
When the bridge adds or removed groups from an interface, it uses
switchdev to request the hardware add an mdb to a port, so the
hardware can perform the selective forwarding between ports.
What is not covered by the current bridge code, is IGMP joins/leaves
from the host on the brX interface. No such monitoring is
performed. With a pure software bridge, it is not required. All
mulitcast frames are passed to the brX interface, and the network
stack filters them, as it does for any interface. However, when
hardware offload is involved, things change. We should program the
hardware to only send multcast packets to the host when the host has
in interest in them.
Thus we need to perform IGMP snooping on the brX interface, just like
any other interface of the bridge. However, currently the brX
interface is missing all the needed data structures to do this. There
is no net_bridge_port structure for the brX interface. This strucuture
is created when an interface is added to the bridge. But the brX
interface is not a member of the bridge. So this patchset makes the
brX interface a first class member of the bridge. When the brX
interface is opened, the interface is added to the bridge. A
net_bridge_port is allocated for it, and IGMP snooping is performed as
usual.
There are some complexities here. Some assumptions are broken, like
the master interface of a port interface is the bridge interface. The
brX interface cannot be its own master. The use of
netdev_master_upper_dev_get() within the bridge code has been changed
to reflecit this. The bridge receive handler needs to not process
frames for the brX interface, etc.
The interface downward to the hardware is also an issue. The code
presented here is a hack and needs to change. But that is secondary
and can be solved once it is agreed how the bridge needs to change to
support this use case.
Comment welcome and wanted.
Andrew
Andrew Lunn (5):
net: rtnetlink: Handle bridge port without upper device
net: bridge: Skip receive handler on brX interface
net: bridge: Make the brX interface a member of the bridge
net: dsa: HACK: Handle MDB add/remove for none-switch ports
net: dsa: Don't include CPU port when adding MDB to a port
include/linux/if_bridge.h | 1 +
net/bridge/br_device.c | 12 ++++++++++--
net/bridge/br_if.c | 37 ++++++++++++++++++++++++-------------
net/bridge/br_input.c | 4 ++++
net/bridge/br_mdb.c | 2 --
net/bridge/br_multicast.c | 7 ++++---
net/bridge/br_private.h | 1 +
net/core/rtnetlink.c | 23 +++++++++++++++++++++--
net/dsa/port.c | 19 +++++++++++++++++--
net/dsa/switch.c | 2 +-
10 files changed, 83 insertions(+), 25 deletions(-)
--
2.14.1
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH RFC WIP 1/5] net: rtnetlink: Handle bridge port without upper device
2017-08-26 20:56 [PATCH RFC WIP 0/5] IGMP snooping for local traffic Andrew Lunn
@ 2017-08-26 20:56 ` Andrew Lunn
2017-08-26 20:56 ` [PATCH RFC WIP 2/5] net: bridge: Skip receive handler on brX interface Andrew Lunn
` (6 subsequent siblings)
7 siblings, 0 replies; 12+ messages in thread
From: Andrew Lunn @ 2017-08-26 20:56 UTC (permalink / raw)
To: netdev
Cc: Vivien Didelot, Florian Fainelli, nikolay, jiri, roopa, stephen,
bridge, Andrew Lunn
The brX interface will with a following patch becomes a member of the
bridge. It however cannot be a slave interface, since it would have to
be a slave of itself. netdev_master_upper_dev_get() returns NULL as a
result. Handle this NULL, by knowing this bridge slave must also be
the master, i.e. what we are looking for.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
---
net/core/rtnetlink.c | 23 +++++++++++++++++++++--
1 file changed, 21 insertions(+), 2 deletions(-)
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 9201e3621351..2673eb430b6f 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -3093,8 +3093,12 @@ static int rtnl_fdb_add(struct sk_buff *skb, struct nlmsghdr *nlh,
if ((!ndm->ndm_flags || ndm->ndm_flags & NTF_MASTER) &&
(dev->priv_flags & IFF_BRIDGE_PORT)) {
struct net_device *br_dev = netdev_master_upper_dev_get(dev);
- const struct net_device_ops *ops = br_dev->netdev_ops;
+ const struct net_device_ops *ops;
+ if (!br_dev)
+ br_dev = dev;
+
+ ops = br_dev->netdev_ops;
err = ops->ndo_fdb_add(ndm, tb, dev, addr, vid,
nlh->nlmsg_flags);
if (err)
@@ -3197,7 +3201,12 @@ static int rtnl_fdb_del(struct sk_buff *skb, struct nlmsghdr *nlh,
if ((!ndm->ndm_flags || ndm->ndm_flags & NTF_MASTER) &&
(dev->priv_flags & IFF_BRIDGE_PORT)) {
struct net_device *br_dev = netdev_master_upper_dev_get(dev);
- const struct net_device_ops *ops = br_dev->netdev_ops;
+ const struct net_device_ops *ops;
+
+ if (!br_dev)
+ br_dev = dev;
+
+ ops = br_dev->netdev_ops;
if (ops->ndo_fdb_del)
err = ops->ndo_fdb_del(ndm, tb, dev, addr, vid);
@@ -3332,6 +3341,8 @@ static int rtnl_fdb_dump(struct sk_buff *skb, struct netlink_callback *cb)
if (!br_idx) { /* user did not specify a specific bridge */
if (dev->priv_flags & IFF_BRIDGE_PORT) {
br_dev = netdev_master_upper_dev_get(dev);
+ if (!br_dev)
+ br_dev = dev;
cops = br_dev->netdev_ops;
}
} else {
@@ -3410,6 +3421,9 @@ int ndo_dflt_bridge_getlink(struct sk_buff *skb, u32 pid, u32 seq,
struct net_device *br_dev = netdev_master_upper_dev_get(dev);
int err = 0;
+ if (!br_dev)
+ br_dev = dev;
+
nlh = nlmsg_put(skb, pid, seq, RTM_NEWLINK, sizeof(*ifm), nlflags);
if (nlh == NULL)
return -EMSGSIZE;
@@ -3647,6 +3661,8 @@ static int rtnl_bridge_setlink(struct sk_buff *skb, struct nlmsghdr *nlh,
if (!flags || (flags & BRIDGE_FLAGS_MASTER)) {
struct net_device *br_dev = netdev_master_upper_dev_get(dev);
+ if (!br_dev)
+ br_dev = dev;
if (!br_dev || !br_dev->netdev_ops->ndo_bridge_setlink) {
err = -EOPNOTSUPP;
@@ -3723,6 +3739,9 @@ static int rtnl_bridge_dellink(struct sk_buff *skb, struct nlmsghdr *nlh,
if (!flags || (flags & BRIDGE_FLAGS_MASTER)) {
struct net_device *br_dev = netdev_master_upper_dev_get(dev);
+ if (!br_dev)
+ br_dev = dev;
+
if (!br_dev || !br_dev->netdev_ops->ndo_bridge_dellink) {
err = -EOPNOTSUPP;
goto out;
--
2.14.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH RFC WIP 2/5] net: bridge: Skip receive handler on brX interface
2017-08-26 20:56 [PATCH RFC WIP 0/5] IGMP snooping for local traffic Andrew Lunn
2017-08-26 20:56 ` [PATCH RFC WIP 1/5] net: rtnetlink: Handle bridge port without upper device Andrew Lunn
@ 2017-08-26 20:56 ` Andrew Lunn
2017-08-26 20:56 ` [PATCH RFC WIP 3/5] net: bridge: Make the brX interface a member of the bridge Andrew Lunn
` (5 subsequent siblings)
7 siblings, 0 replies; 12+ messages in thread
From: Andrew Lunn @ 2017-08-26 20:56 UTC (permalink / raw)
To: netdev
Cc: Vivien Didelot, Florian Fainelli, nikolay, jiri, roopa, stephen,
bridge, Andrew Lunn
The brX interface will soon become a member of the bridge. As such, it
will get a receiver handler assigned. However, we don't want to handle
packets received on this soft interfaces. So detect the condition and
say all the packets pass.
---
net/bridge/br_input.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
index 7637f58c1226..38c2a41968f2 100644
--- a/net/bridge/br_input.c
+++ b/net/bridge/br_input.c
@@ -267,6 +267,10 @@ rx_handler_result_t br_handle_frame(struct sk_buff **pskb)
return RX_HANDLER_CONSUMED;
p = br_port_get_rcu(skb->dev);
+
+ if (p->dev == p->br->dev)
+ return RX_HANDLER_PASS;
+
if (p->flags & BR_VLAN_TUNNEL) {
if (br_handle_ingress_vlan_tunnel(skb, p,
nbp_vlan_group_rcu(p)))
--
2.14.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH RFC WIP 3/5] net: bridge: Make the brX interface a member of the bridge
2017-08-26 20:56 [PATCH RFC WIP 0/5] IGMP snooping for local traffic Andrew Lunn
2017-08-26 20:56 ` [PATCH RFC WIP 1/5] net: rtnetlink: Handle bridge port without upper device Andrew Lunn
2017-08-26 20:56 ` [PATCH RFC WIP 2/5] net: bridge: Skip receive handler on brX interface Andrew Lunn
@ 2017-08-26 20:56 ` Andrew Lunn
2017-08-26 20:56 ` [PATCH RFC WIP 4/5] net: dsa: HACK: Handle MDB add/remove for none-switch ports Andrew Lunn
` (4 subsequent siblings)
7 siblings, 0 replies; 12+ messages in thread
From: Andrew Lunn @ 2017-08-26 20:56 UTC (permalink / raw)
To: netdev
Cc: Vivien Didelot, Florian Fainelli, nikolay, jiri, roopa, stephen,
bridge, Andrew Lunn
In order to perform IGMP snooping on the brX interface, it has to be
part of the bridge, so that the code snooping on normal bridge ports
keeps track of IGMP joins and leaves.
When the brX interface is opened, add the interface to the bridge.
When the brX interface is closed, remove it from the bridge.
This port does however need some special handling. So add a bridge
port flag, BR_SOFT_INTERFACE, indicating a port is the sort interface
of the bridge.
When the port is added to the bridge, the netdev for this port cannot
be linked to the master device, since it is the master device.
Similarly when removing the port, it cannot be unlinked from the
master device.
With the brX interface now being a member of the bridge, and having
all associated structures, we can process IGMP messages sent by the
interface. This is done by the br_multicast_rcv() function, which
takes the bridge_port structure as a parameter. This cannot be easily
found, so keep track of it in the net_bridge structure.
---
include/linux/if_bridge.h | 1 +
net/bridge/br_device.c | 12 ++++++++++--
net/bridge/br_if.c | 37 ++++++++++++++++++++++++-------------
net/bridge/br_mdb.c | 2 --
net/bridge/br_multicast.c | 7 ++++---
net/bridge/br_private.h | 1 +
6 files changed, 40 insertions(+), 20 deletions(-)
diff --git a/include/linux/if_bridge.h b/include/linux/if_bridge.h
index 3cd18ac0697f..8a03821d1827 100644
--- a/include/linux/if_bridge.h
+++ b/include/linux/if_bridge.h
@@ -49,6 +49,7 @@ struct br_ip_list {
#define BR_MULTICAST_TO_UNICAST BIT(12)
#define BR_VLAN_TUNNEL BIT(13)
#define BR_BCAST_FLOOD BIT(14)
+#define BR_SOFT_INTERFACE BIT(15)
#define BR_DEFAULT_AGEING_TIME (300 * HZ)
diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
index 861ae2a165f4..f27ca62fd4a5 100644
--- a/net/bridge/br_device.c
+++ b/net/bridge/br_device.c
@@ -69,7 +69,7 @@ netdev_tx_t br_dev_xmit(struct sk_buff *skb, struct net_device *dev)
br_flood(br, skb, BR_PKT_MULTICAST, false, true);
goto out;
}
- if (br_multicast_rcv(br, NULL, skb, vid)) {
+ if (br_multicast_rcv(br, br->local_port, skb, vid)) {
kfree_skb(skb);
goto out;
}
@@ -133,6 +133,14 @@ static void br_dev_uninit(struct net_device *dev)
static int br_dev_open(struct net_device *dev)
{
struct net_bridge *br = netdev_priv(dev);
+ int err;
+
+ err = br_add_if(br, br->dev);
+ if (err)
+ return err;
+
+ br->local_port = list_first_or_null_rcu(&br->port_list,
+ struct net_bridge_port, list);
netdev_update_features(dev);
netif_start_queue(dev);
@@ -161,7 +169,7 @@ static int br_dev_stop(struct net_device *dev)
netif_stop_queue(dev);
- return 0;
+ return br_del_if(br, br->dev);
}
static void br_get_stats64(struct net_device *dev,
diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
index f3aef22931ab..49208e774191 100644
--- a/net/bridge/br_if.c
+++ b/net/bridge/br_if.c
@@ -284,7 +284,8 @@ static void del_nbp(struct net_bridge_port *p)
nbp_update_port_count(br);
- netdev_upper_dev_unlink(dev, br->dev);
+ if (!(p->flags & BR_SOFT_INTERFACE))
+ netdev_upper_dev_unlink(dev, br->dev);
dev->priv_flags &= ~IFF_BRIDGE_PORT;
@@ -362,6 +363,8 @@ static struct net_bridge_port *new_nbp(struct net_bridge *br,
p->priority = 0x8000 >> BR_PORT_BITS;
p->port_no = index;
p->flags = BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD | BR_BCAST_FLOOD;
+ if (br->dev == dev)
+ p->flags |= BR_SOFT_INTERFACE;
br_init_port(p);
br_set_state(p, BR_STATE_DISABLED);
br_stp_port_timer_init(p);
@@ -500,8 +503,11 @@ int br_add_if(struct net_bridge *br, struct net_device *dev)
return -EINVAL;
/* No bridging of bridges */
- if (dev->netdev_ops->ndo_start_xmit == br_dev_xmit)
- return -ELOOP;
+ if (dev->netdev_ops->ndo_start_xmit == br_dev_xmit) {
+ /* Unless it is our own soft interface */
+ if (br->dev != dev)
+ return -ELOOP;
+ }
/* Device is already being bridged */
if (br_port_exists(dev))
@@ -540,9 +546,11 @@ int br_add_if(struct net_bridge *br, struct net_device *dev)
dev->priv_flags |= IFF_BRIDGE_PORT;
- err = netdev_master_upper_dev_link(dev, br->dev, NULL, NULL);
- if (err)
- goto err5;
+ if (!(p->flags & BR_SOFT_INTERFACE)) {
+ err = netdev_master_upper_dev_link(dev, br->dev, NULL, NULL);
+ if (err)
+ goto err5;
+ }
err = nbp_switchdev_mark_set(p);
if (err)
@@ -563,13 +571,15 @@ int br_add_if(struct net_bridge *br, struct net_device *dev)
else
netdev_set_rx_headroom(dev, br_hr);
- if (br_fdb_insert(br, p, dev->dev_addr, 0))
- netdev_err(dev, "failed insert local address bridge forwarding table\n");
+ if (!(p->flags & BR_SOFT_INTERFACE)) {
+ if (br_fdb_insert(br, p, dev->dev_addr, 0))
+ netdev_err(dev, "failed insert local address bridge forwarding table\n");
- err = nbp_vlan_init(p);
- if (err) {
- netdev_err(dev, "failed to initialize vlan filtering on this port\n");
- goto err7;
+ err = nbp_vlan_init(p);
+ if (err) {
+ netdev_err(dev, "failed to initialize vlan filtering on this port\n");
+ goto err7;
+ }
}
spin_lock_bh(&br->lock);
@@ -597,7 +607,8 @@ int br_add_if(struct net_bridge *br, struct net_device *dev)
br_fdb_delete_by_port(br, p, 0, 1);
nbp_update_port_count(br);
err6:
- netdev_upper_dev_unlink(dev, br->dev);
+ if (!(p->flags & BR_SOFT_INTERFACE))
+ netdev_upper_dev_unlink(dev, br->dev);
err5:
dev->priv_flags &= ~IFF_BRIDGE_PORT;
netdev_rx_handler_unregister(dev);
diff --git a/net/bridge/br_mdb.c b/net/bridge/br_mdb.c
index a0b11e7d67d9..47f0d9b4221d 100644
--- a/net/bridge/br_mdb.c
+++ b/net/bridge/br_mdb.c
@@ -117,8 +117,6 @@ static int br_mdb_fill_info(struct sk_buff *skb, struct netlink_callback *cb,
struct br_mdb_entry e;
port = p->port;
- if (!port)
- continue;
memset(&e, 0, sizeof(e));
e.ifindex = port->dev->ifindex;
diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c
index dae3af1f531a..f1bf9ec15de8 100644
--- a/net/bridge/br_multicast.c
+++ b/net/bridge/br_multicast.c
@@ -915,7 +915,7 @@ static void __br_multicast_send_query(struct net_bridge *br,
if (!skb)
return;
- if (port) {
+ if (port && !(port->flags & BR_SOFT_INTERFACE)) {
skb->dev = port->dev;
br_multicast_count(br, port, skb, igmp_type,
BR_MCAST_DIR_TX);
@@ -944,8 +944,9 @@ static void br_multicast_send_query(struct net_bridge *br,
memset(&br_group.u, 0, sizeof(br_group.u));
- if (port ? (own_query == &port->ip4_own_query) :
- (own_query == &br->ip4_own_query)) {
+ if (port && !(port->flags & BR_SOFT_INTERFACE) ?
+ (own_query == &port->ip4_own_query) :
+ (own_query == &br->ip4_own_query)) {
other_query = &br->ip4_other_query;
br_group.proto = htons(ETH_P_IP);
#if IS_ENABLED(CONFIG_IPV6)
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index fd9ee73e0a6d..c4b99a35abb0 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -296,6 +296,7 @@ struct net_bridge {
spinlock_t lock;
spinlock_t hash_lock;
struct list_head port_list;
+ struct net_bridge_port *local_port;
struct net_device *dev;
struct pcpu_sw_netstats __percpu *stats;
/* These fields are accessed on each packet */
--
2.14.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH RFC WIP 4/5] net: dsa: HACK: Handle MDB add/remove for none-switch ports
2017-08-26 20:56 [PATCH RFC WIP 0/5] IGMP snooping for local traffic Andrew Lunn
` (2 preceding siblings ...)
2017-08-26 20:56 ` [PATCH RFC WIP 3/5] net: bridge: Make the brX interface a member of the bridge Andrew Lunn
@ 2017-08-26 20:56 ` Andrew Lunn
2017-08-26 20:56 ` [PATCH RFC WIP 5/5] net: dsa: Don't include CPU port when adding MDB to a port Andrew Lunn
` (3 subsequent siblings)
7 siblings, 0 replies; 12+ messages in thread
From: Andrew Lunn @ 2017-08-26 20:56 UTC (permalink / raw)
To: netdev
Cc: Vivien Didelot, Florian Fainelli, nikolay, jiri, roopa, stephen,
bridge, Andrew Lunn
When there is a mdb added to a port which is not in the switch, we
need the switch to forward traffic for the group to the software
bridge, so it can forward it out the none-switch port.
The current implementation is a hack and will be replaced. Currently
only the bridge soft interface is supported. When there is a
join/leave on the soft interface, switchdev calls are made on the soft
interface device, brX. This does not have a switchdev ops structure
registered, so all lower interfaces of brX get there switchdev
function called. These are switch ports, and do have switchdev ops. By
comparing the original interface to the called interface, we can
determine this is not for a switch port, and add/remove the mdb to the
CPU port.
---
net/dsa/port.c | 19 +++++++++++++++++--
1 file changed, 17 insertions(+), 2 deletions(-)
diff --git a/net/dsa/port.c b/net/dsa/port.c
index d6e07176df3f..d8e4bfefd97d 100644
--- a/net/dsa/port.c
+++ b/net/dsa/port.c
@@ -194,8 +194,15 @@ int dsa_port_mdb_add(struct dsa_port *dp,
.mdb = mdb,
};
- pr_info("dsa_port_mdb_add: %d %d", info.sw_index, info.port);
-
+ if (dp->netdev != mdb->obj.orig_dev) {
+ /* Not a port for this switch, so forward
+ * multicast out the CPU port to the bridge.
+ */
+ struct dsa_switch_tree *dst = dp->ds->dst;
+ struct dsa_port *cpu_dp = dsa_get_cpu_port(dst);
+ info.port = cpu_dp->index;
+ return dsa_port_notify(cpu_dp, DSA_NOTIFIER_MDB_ADD, &info);
+ }
return dsa_port_notify(dp, DSA_NOTIFIER_MDB_ADD, &info);
}
@@ -208,6 +215,14 @@ int dsa_port_mdb_del(struct dsa_port *dp,
.mdb = mdb,
};
+ if (dp->netdev != mdb->obj.orig_dev) {
+ struct dsa_switch_tree *dst = dp->ds->dst;
+ struct dsa_port *cpu_dp = dsa_get_cpu_port(dst);
+
+ info.port = cpu_dp->index;
+ return dsa_port_notify(cpu_dp, DSA_NOTIFIER_MDB_DEL, &info);
+ }
+
return dsa_port_notify(dp, DSA_NOTIFIER_MDB_DEL, &info);
}
--
2.14.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH RFC WIP 5/5] net: dsa: Don't include CPU port when adding MDB to a port
2017-08-26 20:56 [PATCH RFC WIP 0/5] IGMP snooping for local traffic Andrew Lunn
` (3 preceding siblings ...)
2017-08-26 20:56 ` [PATCH RFC WIP 4/5] net: dsa: HACK: Handle MDB add/remove for none-switch ports Andrew Lunn
@ 2017-08-26 20:56 ` Andrew Lunn
2017-08-26 22:17 ` [PATCH RFC WIP 0/5] IGMP snooping for local traffic Nikolay Aleksandrov
` (2 subsequent siblings)
7 siblings, 0 replies; 12+ messages in thread
From: Andrew Lunn @ 2017-08-26 20:56 UTC (permalink / raw)
To: netdev
Cc: Andrew Lunn, Florian Fainelli, Vivien Didelot, nikolay, roopa,
bridge, jiri
Now that the MDB are explicitly added to the CPU port when required,
don't add the CPU port adding an MDB to a switch port.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
---
net/dsa/switch.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/dsa/switch.c b/net/dsa/switch.c
index 97e2e9c8cf3f..c178e2b86a9a 100644
--- a/net/dsa/switch.c
+++ b/net/dsa/switch.c
@@ -130,7 +130,7 @@ static int dsa_switch_mdb_add(struct dsa_switch *ds,
if (ds->index == info->sw_index)
set_bit(info->port, group);
for (port = 0; port < ds->num_ports; port++)
- if (dsa_is_cpu_port(ds, port) || dsa_is_dsa_port(ds, port))
+ if (dsa_is_dsa_port(ds, port))
set_bit(port, group);
if (switchdev_trans_ph_prepare(trans)) {
--
2.14.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH RFC WIP 0/5] IGMP snooping for local traffic
2017-08-26 20:56 [PATCH RFC WIP 0/5] IGMP snooping for local traffic Andrew Lunn
` (4 preceding siblings ...)
2017-08-26 20:56 ` [PATCH RFC WIP 5/5] net: dsa: Don't include CPU port when adding MDB to a port Andrew Lunn
@ 2017-08-26 22:17 ` Nikolay Aleksandrov
2017-08-26 22:40 ` Nikolay Aleksandrov
2017-08-26 23:02 ` Andrew Lunn
2017-08-28 2:44 ` Florian Fainelli
2017-08-28 15:11 ` Stephen Hemminger
7 siblings, 2 replies; 12+ messages in thread
From: Nikolay Aleksandrov @ 2017-08-26 22:17 UTC (permalink / raw)
To: Andrew Lunn, netdev
Cc: Vivien Didelot, Florian Fainelli, jiri, roopa, stephen, bridge
On 26/08/17 23:56, Andrew Lunn wrote:
> This is a WIP patchset i would like comments on from bridge, switchdev
> and hardware offload people.
>
> The linux bridge supports IGMP snooping. It will listen to IGMP
> reports on bridge ports and keep track of which groups have been
> joined on an interface. It will then forward multicast based on this
> group membership.
>
> When the bridge adds or removed groups from an interface, it uses
> switchdev to request the hardware add an mdb to a port, so the
> hardware can perform the selective forwarding between ports.
>
> What is not covered by the current bridge code, is IGMP joins/leaves
> from the host on the brX interface. No such monitoring is
Hi Andrew,
Have you taken a look at mglist (the boolean, probably needs a rename) ? It is for
exactly that purpose, to track which groups the bridge is interested in.
I assume I'm forgetting or missing something here.
> performed. With a pure software bridge, it is not required. All
> mulitcast frames are passed to the brX interface, and the network
If mglist (again the boolean) is false then they won't be passed up.
> stack filters them, as it does for any interface. However, when
> hardware offload is involved, things change. We should program the
> hardware to only send multcast packets to the host when the host has
> in interest in them.
Granted the boolean mglist might need some changes (esp. with host group leave)
but I think it can be used to program switchdev for host join/leave, can't
we adjust its behaviour instead of introducing this complexity and avoid many
headaches ?
>
> Thus we need to perform IGMP snooping on the brX interface, just like
> any other interface of the bridge. However, currently the brX
> interface is missing all the needed data structures to do this. There
> is no net_bridge_port structure for the brX interface. This strucuture
> is created when an interface is added to the bridge. But the brX
> interface is not a member of the bridge. So this patchset makes the
> brX interface a first class member of the bridge. When the brX
> interface is opened, the interface is added to the bridge. A
> net_bridge_port is allocated for it, and IGMP snooping is performed as
> usual.
I have actually discussed this idea long time ago with Vlad and it has very nice
upsides (most important one removing br/port checks everywhere) but it blows up
fast with special cases for the bridge and things look very similar. You'll need
to rework the whole bridge and turn every bridge special case into either a port
generic one or again bridge-specific special case but with a check for the new flag.
I will not point out every bug that comes out of this, but registering the bridge
rx handler to itself is simply wrong on many levels and breaks many setups.
>
> There are some complexities here. Some assumptions are broken, like
> the master interface of a port interface is the bridge interface. The
> brX interface cannot be its own master. The use of
> netdev_master_upper_dev_get() within the bridge code has been changed
> to reflecit this. The bridge receive handler needs to not process
> frames for the brX interface, etc.
>
> The interface downward to the hardware is also an issue. The code
> presented here is a hack and needs to change. But that is secondary
> and can be solved once it is agreed how the bridge needs to change to
> support this use case.
Definitely agree with this statement. :-)
>
> Comment welcome and wanted.
>
> Andrew
>
> Andrew Lunn (5):
> net: rtnetlink: Handle bridge port without upper device
> net: bridge: Skip receive handler on brX interface
> net: bridge: Make the brX interface a member of the bridge
> net: dsa: HACK: Handle MDB add/remove for none-switch ports
> net: dsa: Don't include CPU port when adding MDB to a port
>
> include/linux/if_bridge.h | 1 +
> net/bridge/br_device.c | 12 ++++++++++--
> net/bridge/br_if.c | 37 ++++++++++++++++++++++++-------------
> net/bridge/br_input.c | 4 ++++
> net/bridge/br_mdb.c | 2 --
> net/bridge/br_multicast.c | 7 ++++---
> net/bridge/br_private.h | 1 +
> net/core/rtnetlink.c | 23 +++++++++++++++++++++--
> net/dsa/port.c | 19 +++++++++++++++++--
> net/dsa/switch.c | 2 +-
> 10 files changed, 83 insertions(+), 25 deletions(-)
>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH RFC WIP 0/5] IGMP snooping for local traffic
2017-08-26 22:17 ` [PATCH RFC WIP 0/5] IGMP snooping for local traffic Nikolay Aleksandrov
@ 2017-08-26 22:40 ` Nikolay Aleksandrov
2017-08-26 23:02 ` Andrew Lunn
1 sibling, 0 replies; 12+ messages in thread
From: Nikolay Aleksandrov @ 2017-08-26 22:40 UTC (permalink / raw)
To: Andrew Lunn, netdev; +Cc: Florian Fainelli, Vivien Didelot, roopa, bridge, jiri
On 27.08.2017 01:17, Nikolay Aleksandrov wrote:
> On 26/08/17 23:56, Andrew Lunn wrote:
>> This is a WIP patchset i would like comments on from bridge, switchdev
>> and hardware offload people.
>>
>> The linux bridge supports IGMP snooping. It will listen to IGMP
>> reports on bridge ports and keep track of which groups have been
>> joined on an interface. It will then forward multicast based on this
>> group membership.
>>
>> When the bridge adds or removed groups from an interface, it uses
>> switchdev to request the hardware add an mdb to a port, so the
>> hardware can perform the selective forwarding between ports.
>>
>> What is not covered by the current bridge code, is IGMP joins/leaves
>> from the host on the brX interface. No such monitoring is
>
> Hi Andrew,
>
> Have you taken a look at mglist (the boolean, probably needs a rename) ? It is for
> exactly that purpose, to track which groups the bridge is interested in.
> I assume I'm forgetting or missing something here.
>
>> performed. With a pure software bridge, it is not required. All
>> mulitcast frames are passed to the brX interface, and the network
>
> If mglist (again the boolean) is false then they won't be passed up.
>
>> stack filters them, as it does for any interface. However, when
>> hardware offload is involved, things change. We should program the
>> hardware to only send multcast packets to the host when the host has
>> in interest in them.
>
> Granted the boolean mglist might need some changes (esp. with host group leave)
> but I think it can be used to program switchdev for host join/leave, can't
> we adjust its behaviour instead of introducing this complexity and avoid many
> headaches ?
>
>>
>> Thus we need to perform IGMP snooping on the brX interface, just like
>> any other interface of the bridge. However, currently the brX
>> interface is missing all the needed data structures to do this. There
>> is no net_bridge_port structure for the brX interface. This strucuture
>> is created when an interface is added to the bridge. But the brX
>> interface is not a member of the bridge. So this patchset makes the
>> brX interface a first class member of the bridge. When the brX
>> interface is opened, the interface is added to the bridge. A
>> net_bridge_port is allocated for it, and IGMP snooping is performed as
>> usual.
>
> I have actually discussed this idea long time ago with Vlad and it has very nice
> upsides (most important one removing br/port checks everywhere) but it blows up
> fast with special cases for the bridge and things look very similar. You'll need
> to rework the whole bridge and turn every bridge special case into either a port
> generic one or again bridge-specific special case but with a check for the new flag.
> I will not point out every bug that comes out of this, but registering the bridge
> rx handler to itself is simply wrong on many levels and breaks many setups.
This was a digression about making the bridge a proper port of itself
(e.g. port 0, linked and all), it is only tangential to this
implementation as it doesn't link the new port.
>
>>
>> There are some complexities here. Some assumptions are broken, like
>> the master interface of a port interface is the bridge interface. The
>> brX interface cannot be its own master. The use of
>> netdev_master_upper_dev_get() within the bridge code has been changed
>> to reflecit this. The bridge receive handler needs to not process
>> frames for the brX interface, etc.
>>
>> The interface downward to the hardware is also an issue. The code
>> presented here is a hack and needs to change. But that is secondary
>> and can be solved once it is agreed how the bridge needs to change to
>> support this use case.
>
> Definitely agree with this statement. :-)
>
>>
>> Comment welcome and wanted.
>>
>> Andrew
>>
>> Andrew Lunn (5):
>> net: rtnetlink: Handle bridge port without upper device
>> net: bridge: Skip receive handler on brX interface
>> net: bridge: Make the brX interface a member of the bridge
>> net: dsa: HACK: Handle MDB add/remove for none-switch ports
>> net: dsa: Don't include CPU port when adding MDB to a port
>>
>> include/linux/if_bridge.h | 1 +
>> net/bridge/br_device.c | 12 ++++++++++--
>> net/bridge/br_if.c | 37 ++++++++++++++++++++++++-------------
>> net/bridge/br_input.c | 4 ++++
>> net/bridge/br_mdb.c | 2 --
>> net/bridge/br_multicast.c | 7 ++++---
>> net/bridge/br_private.h | 1 +
>> net/core/rtnetlink.c | 23 +++++++++++++++++++++--
>> net/dsa/port.c | 19 +++++++++++++++++--
>> net/dsa/switch.c | 2 +-
>> 10 files changed, 83 insertions(+), 25 deletions(-)
>>
>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH RFC WIP 0/5] IGMP snooping for local traffic
2017-08-26 22:17 ` [PATCH RFC WIP 0/5] IGMP snooping for local traffic Nikolay Aleksandrov
2017-08-26 22:40 ` Nikolay Aleksandrov
@ 2017-08-26 23:02 ` Andrew Lunn
1 sibling, 0 replies; 12+ messages in thread
From: Andrew Lunn @ 2017-08-26 23:02 UTC (permalink / raw)
To: Nikolay Aleksandrov
Cc: netdev, Vivien Didelot, Florian Fainelli, jiri, roopa, stephen,
bridge
> Hi Andrew,
>
> Have you taken a look at mglist (the boolean, probably needs a rename) ? It is for
> exactly that purpose, to track which groups the bridge is interested in.
> I assume I'm forgetting or missing something here.
>
> > performed. With a pure software bridge, it is not required. All
> > mulitcast frames are passed to the brX interface, and the network
>
> If mglist (again the boolean) is false then they won't be passed up.
>
> > stack filters them, as it does for any interface. However, when
> > hardware offload is involved, things change. We should program the
> > hardware to only send multcast packets to the host when the host has
> > in interest in them.
>
> Granted the boolean mglist might need some changes (esp. with host group leave)
> but I think it can be used to program switchdev for host join/leave, can't
> we adjust its behaviour instead of introducing this complexity and avoid many
> headaches ?
I would like to avoid this complexity as well. I will take a look at
mglist. Thanks for the hint.
Andrew
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH RFC WIP 0/5] IGMP snooping for local traffic
2017-08-26 20:56 [PATCH RFC WIP 0/5] IGMP snooping for local traffic Andrew Lunn
` (5 preceding siblings ...)
2017-08-26 22:17 ` [PATCH RFC WIP 0/5] IGMP snooping for local traffic Nikolay Aleksandrov
@ 2017-08-28 2:44 ` Florian Fainelli
2017-08-28 13:45 ` Andrew Lunn
2017-08-28 15:11 ` Stephen Hemminger
7 siblings, 1 reply; 12+ messages in thread
From: Florian Fainelli @ 2017-08-28 2:44 UTC (permalink / raw)
To: Andrew Lunn, netdev; +Cc: Vivien Didelot, nikolay, roopa, bridge, jiri
Hi Andrew,
On 08/26/2017 01:56 PM, Andrew Lunn wrote:
> This is a WIP patchset i would like comments on from bridge,
> switchdev and hardware offload people.
>
> The linux bridge supports IGMP snooping. It will listen to IGMP
> reports on bridge ports and keep track of which groups have been
> joined on an interface. It will then forward multicast based on this
> group membership.
>
> When the bridge adds or removed groups from an interface, it uses
> switchdev to request the hardware add an mdb to a port, so the
> hardware can perform the selective forwarding between ports.
>
> What is not covered by the current bridge code, is IGMP joins/leaves
> from the host on the brX interface. No such monitoring is performed.
> With a pure software bridge, it is not required. All mulitcast frames
> are passed to the brX interface, and the network stack filters them,
> as it does for any interface. However, when hardware offload is
> involved, things change. We should program the hardware to only send
> multcast packets to the host when the host has in interest in them.
OK, so if I understand this right, without a bridge, we have the
following happen today: with a DSA-enabled setup using any kind of
switch tagging protocol, if a host is interested in receiving particular
multicast traffic, we would receive IGMP joins/leaves through sw0p0, and
the stack should call ndo_set_rx_mode for sw0p0, which would be
dsa_slave_set_rx_mode() and which would synchronize the DSA master
network device with the slave network device, everything works fine
provided that the CPU port is configured to accept multicast traffic.
Note here that we don't really add a MDB entry for sw0p0 when that
happens, but it seems like we should for switches that lack IGMP
snooping and/or multicast filtering.
With the current bridge and DSA code, are not we actually always going
to get the CPU port to be added with the multicast address and therefore
no filtering is occurring and snooping is pretty much useless?
>
> Thus we need to perform IGMP snooping on the brX interface, just
> like any other interface of the bridge. However, currently the brX
> interface is missing all the needed data structures to do this.
> There is no net_bridge_port structure for the brX interface. This
> strucuture is created when an interface is added to the bridge. But
> the brX interface is not a member of the bridge. So this patchset
> makes the brX interface a first class member of the bridge. When the
> brX interface is opened, the interface is added to the bridge. A
> net_bridge_port is allocated for it, and IGMP snooping is performed
> as usual.
Would not making brX be part of the bridge have a huge negative
performance impact on locally generated traffic either? Even though we
do an early return in br_handle_frame() this may become noticeable.
>
> There are some complexities here. Some assumptions are broken, like
> the master interface of a port interface is the bridge interface.
> The brX interface cannot be its own master. The use of
> netdev_master_upper_dev_get() within the bridge code has been
> changed to reflecit this. The bridge receive handler needs to not
> process frames for the brX interface, etc.
>
> The interface downward to the hardware is also an issue. The code
> presented here is a hack and needs to change. But that is secondary
> and can be solved once it is agreed how the bridge needs to change
> to support this use case.
>
> Comment welcome and wanted.
While I understand the reasons why you did it that way, I think this is
going to break a lot of code in bridge that does not expect brX to be a
bridge port member.
Maybe we can just generate switch MDB events targeting the bridge
network device and let switch drivers resolve that to whatever their
CPU/master port is?
It does sound like we are moving more and more to a model where brX
becomes one (if not the only one) net_device representor of what the
CPU/master port of a switch is (at least with DSA) which sort of makes
us go back to the multi-CPU port discussion we had a while ago.
Thanks!
--
Florian
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH RFC WIP 0/5] IGMP snooping for local traffic
2017-08-28 2:44 ` Florian Fainelli
@ 2017-08-28 13:45 ` Andrew Lunn
0 siblings, 0 replies; 12+ messages in thread
From: Andrew Lunn @ 2017-08-28 13:45 UTC (permalink / raw)
To: Florian Fainelli
Cc: netdev, Vivien Didelot, nikolay, jiri, roopa, stephen, bridge
On Sun, Aug 27, 2017 at 07:44:15PM -0700, Florian Fainelli wrote:
> Hi Andrew,
>
> On 08/26/2017 01:56 PM, Andrew Lunn wrote:
> > This is a WIP patchset i would like comments on from bridge,
> > switchdev and hardware offload people.
> >
> > The linux bridge supports IGMP snooping. It will listen to IGMP
> > reports on bridge ports and keep track of which groups have been
> > joined on an interface. It will then forward multicast based on this
> > group membership.
> >
> > When the bridge adds or removed groups from an interface, it uses
> > switchdev to request the hardware add an mdb to a port, so the
> > hardware can perform the selective forwarding between ports.
> >
> > What is not covered by the current bridge code, is IGMP joins/leaves
> > from the host on the brX interface. No such monitoring is performed.
> > With a pure software bridge, it is not required. All mulitcast frames
> > are passed to the brX interface, and the network stack filters them,
> > as it does for any interface. However, when hardware offload is
> > involved, things change. We should program the hardware to only send
> > multcast packets to the host when the host has in interest in them.
>
> OK, so if I understand this right, without a bridge, we have the
> following happen today: with a DSA-enabled setup using any kind of
> switch tagging protocol, if a host is interested in receiving particular
> multicast traffic, we would receive IGMP joins/leaves through sw0p0, and
> the stack should call ndo_set_rx_mode for sw0p0, which would be
> dsa_slave_set_rx_mode() and which would synchronize the DSA master
> network device with the slave network device, everything works fine
> provided that the CPU port is configured to accept multicast traffic.
Hi Florian
That sounds about right, but it is not a use case i've looked at yet.
I do need to look at it though, because there is a chance i break it
with IGMP snooping support.
> Note here that we don't really add a MDB entry for sw0p0 when that
> happens, but it seems like we should for switches that lack IGMP
> snooping and/or multicast filtering.
Yes. But it is not an MDB in the normal sense, at least from the
switches perspective. An MDB passed to a switch using switchdev says
that traffic for the specified group should go out that port. However,
when the host does a join on sw0p0, we want the exact opposite,
multicast traffic coming in that port of the switch which matches the
group should be forwarded to the host. So my second attempt at this
code adds a new switchdev object, SWITCHDEV_OBJ_ID_PORT_HOST_MDB.
> With the current bridge and DSA code, are not we actually always going
> to get the CPU port to be added with the multicast address and therefore
> no filtering is occurring and snooping is pretty much useless?
Correct. At the moment, all multicast traffic gets delivered to the
host.
> While I understand the reasons why you did it that way, I think this is
> going to break a lot of code in bridge that does not expect brX to be a
> bridge port member.
It does.
Nikolay suggestion works a lot better, and i'm following that now. It
looks good and only requires some minor tweaks to the bridge code.
Andrew
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH RFC WIP 0/5] IGMP snooping for local traffic
2017-08-26 20:56 [PATCH RFC WIP 0/5] IGMP snooping for local traffic Andrew Lunn
` (6 preceding siblings ...)
2017-08-28 2:44 ` Florian Fainelli
@ 2017-08-28 15:11 ` Stephen Hemminger
7 siblings, 0 replies; 12+ messages in thread
From: Stephen Hemminger @ 2017-08-28 15:11 UTC (permalink / raw)
To: Andrew Lunn
Cc: netdev, Vivien Didelot, Florian Fainelli, nikolay, jiri, roopa,
bridge
On Sat, 26 Aug 2017 22:56:05 +0200
Andrew Lunn <andrew@lunn.ch> wrote:
> This is a WIP patchset i would like comments on from bridge, switchdev
> and hardware offload people.
>
> The linux bridge supports IGMP snooping. It will listen to IGMP
> reports on bridge ports and keep track of which groups have been
> joined on an interface. It will then forward multicast based on this
> group membership.
>
> When the bridge adds or removed groups from an interface, it uses
> switchdev to request the hardware add an mdb to a port, so the
> hardware can perform the selective forwarding between ports.
>
> What is not covered by the current bridge code, is IGMP joins/leaves
> from the host on the brX interface. No such monitoring is
> performed. With a pure software bridge, it is not required. All
> mulitcast frames are passed to the brX interface, and the network
> stack filters them, as it does for any interface. However, when
> hardware offload is involved, things change. We should program the
> hardware to only send multcast packets to the host when the host has
> in interest in them.
>
> Thus we need to perform IGMP snooping on the brX interface, just like
> any other interface of the bridge. However, currently the brX
> interface is missing all the needed data structures to do this. There
> is no net_bridge_port structure for the brX interface. This strucuture
> is created when an interface is added to the bridge. But the brX
> interface is not a member of the bridge. So this patchset makes the
> brX interface a first class member of the bridge. When the brX
> interface is opened, the interface is added to the bridge. A
> net_bridge_port is allocated for it, and IGMP snooping is performed as
> usual.
>
> There are some complexities here. Some assumptions are broken, like
> the master interface of a port interface is the bridge interface. The
> brX interface cannot be its own master. The use of
> netdev_master_upper_dev_get() within the bridge code has been changed
> to reflecit this. The bridge receive handler needs to not process
> frames for the brX interface, etc.
>
> The interface downward to the hardware is also an issue. The code
> presented here is a hack and needs to change. But that is secondary
> and can be solved once it is agreed how the bridge needs to change to
> support this use case.
>
> Comment welcome and wanted.
>
> Andrew
>
> Andrew Lunn (5):
> net: rtnetlink: Handle bridge port without upper device
> net: bridge: Skip receive handler on brX interface
> net: bridge: Make the brX interface a member of the bridge
> net: dsa: HACK: Handle MDB add/remove for none-switch ports
> net: dsa: Don't include CPU port when adding MDB to a port
>
> include/linux/if_bridge.h | 1 +
> net/bridge/br_device.c | 12 ++++++++++--
> net/bridge/br_if.c | 37 ++++++++++++++++++++++++-------------
> net/bridge/br_input.c | 4 ++++
> net/bridge/br_mdb.c | 2 --
> net/bridge/br_multicast.c | 7 ++++---
> net/bridge/br_private.h | 1 +
> net/core/rtnetlink.c | 23 +++++++++++++++++++++--
> net/dsa/port.c | 19 +++++++++++++++++--
> net/dsa/switch.c | 2 +-
> 10 files changed, 83 insertions(+), 25 deletions(-)
>
Sorry you can't change the semantics of the bridge like this.
There are likely to be scripts and management utilities that won't work after this.
Figure out another way. Such as adding IGMP updates in the local packet send/receive path.
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2017-08-28 15:11 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-08-26 20:56 [PATCH RFC WIP 0/5] IGMP snooping for local traffic Andrew Lunn
2017-08-26 20:56 ` [PATCH RFC WIP 1/5] net: rtnetlink: Handle bridge port without upper device Andrew Lunn
2017-08-26 20:56 ` [PATCH RFC WIP 2/5] net: bridge: Skip receive handler on brX interface Andrew Lunn
2017-08-26 20:56 ` [PATCH RFC WIP 3/5] net: bridge: Make the brX interface a member of the bridge Andrew Lunn
2017-08-26 20:56 ` [PATCH RFC WIP 4/5] net: dsa: HACK: Handle MDB add/remove for none-switch ports Andrew Lunn
2017-08-26 20:56 ` [PATCH RFC WIP 5/5] net: dsa: Don't include CPU port when adding MDB to a port Andrew Lunn
2017-08-26 22:17 ` [PATCH RFC WIP 0/5] IGMP snooping for local traffic Nikolay Aleksandrov
2017-08-26 22:40 ` Nikolay Aleksandrov
2017-08-26 23:02 ` Andrew Lunn
2017-08-28 2:44 ` Florian Fainelli
2017-08-28 13:45 ` Andrew Lunn
2017-08-28 15:11 ` Stephen Hemminger
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).