* [patch net 0/3] bridge: mdb: Couple of fixes
@ 2016-04-21 10:52 Jiri Pirko
2016-04-21 10:52 ` [patch net 1/3] switchdev: Adding complete operation to deferred switchdev ops Jiri Pirko
` (3 more replies)
0 siblings, 4 replies; 5+ messages in thread
From: Jiri Pirko @ 2016-04-21 10:52 UTC (permalink / raw)
To: netdev; +Cc: davem, idosch, eladr, yotamg, ogerlitz, stephen, nikolay
From: Jiri Pirko <jiri@mellanox.com>
Elad says:
This patchset fixes two problems reported by Nikolay Aleksandrov. The first
problem is that the MDB offload flag might be accesed without helding the
multicast_lock.
The second problem is that the switchdev mdb offload is deferred and
the offload bit was marked regardless if the operation succeeded or not.
Elad Raz (3):
switchdev: Adding complete operation to deferred switchdev ops
bridge: mdb: Common function for mdb entry translation
bridge: mdb: Marking port-group as offloaded
include/net/switchdev.h | 4 ++
net/bridge/br_mdb.c | 124 +++++++++++++++++++++++++++++-----------------
net/bridge/br_multicast.c | 8 +--
net/bridge/br_private.h | 4 +-
net/switchdev/switchdev.c | 6 +++
5 files changed, 96 insertions(+), 50 deletions(-)
--
2.5.5
^ permalink raw reply [flat|nested] 5+ messages in thread
* [patch net 1/3] switchdev: Adding complete operation to deferred switchdev ops
2016-04-21 10:52 [patch net 0/3] bridge: mdb: Couple of fixes Jiri Pirko
@ 2016-04-21 10:52 ` Jiri Pirko
2016-04-21 10:52 ` [patch net 2/3] bridge: mdb: Common function for mdb entry translation Jiri Pirko
` (2 subsequent siblings)
3 siblings, 0 replies; 5+ messages in thread
From: Jiri Pirko @ 2016-04-21 10:52 UTC (permalink / raw)
To: netdev; +Cc: davem, idosch, eladr, yotamg, ogerlitz, stephen, nikolay
From: Elad Raz <eladr@mellanox.com>
When using switchdev deferred operation (SWITCHDEV_F_DEFER), the operation
is executed in different context and the application doesn't have any way
to get the operation real status.
Adding a completion callback fixes that. This patch adds fields to
switchdev_attr and switchdev_obj "complete_priv" field which is used by
the "complete" callback.
Application can set a complete function which will be called once the
operation executed.
Signed-off-by: Elad Raz <eladr@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
---
include/net/switchdev.h | 4 ++++
net/switchdev/switchdev.c | 6 ++++++
2 files changed, 10 insertions(+)
diff --git a/include/net/switchdev.h b/include/net/switchdev.h
index d451122..51d77b2 100644
--- a/include/net/switchdev.h
+++ b/include/net/switchdev.h
@@ -54,6 +54,8 @@ struct switchdev_attr {
struct net_device *orig_dev;
enum switchdev_attr_id id;
u32 flags;
+ void *complete_priv;
+ void (*complete)(struct net_device *dev, int err, void *priv);
union {
struct netdev_phys_item_id ppid; /* PORT_PARENT_ID */
u8 stp_state; /* PORT_STP_STATE */
@@ -75,6 +77,8 @@ struct switchdev_obj {
struct net_device *orig_dev;
enum switchdev_obj_id id;
u32 flags;
+ void *complete_priv;
+ void (*complete)(struct net_device *dev, int err, void *priv);
};
/* SWITCHDEV_OBJ_ID_PORT_VLAN */
diff --git a/net/switchdev/switchdev.c b/net/switchdev/switchdev.c
index 2b9b98f..b7e01d8 100644
--- a/net/switchdev/switchdev.c
+++ b/net/switchdev/switchdev.c
@@ -305,6 +305,8 @@ static void switchdev_port_attr_set_deferred(struct net_device *dev,
if (err && err != -EOPNOTSUPP)
netdev_err(dev, "failed (err=%d) to set attribute (id=%d)\n",
err, attr->id);
+ if (attr->complete)
+ attr->complete(dev, err, attr->complete_priv);
}
static int switchdev_port_attr_set_defer(struct net_device *dev,
@@ -434,6 +436,8 @@ static void switchdev_port_obj_add_deferred(struct net_device *dev,
if (err && err != -EOPNOTSUPP)
netdev_err(dev, "failed (err=%d) to add object (id=%d)\n",
err, obj->id);
+ if (obj->complete)
+ obj->complete(dev, err, obj->complete_priv);
}
static int switchdev_port_obj_add_defer(struct net_device *dev,
@@ -502,6 +506,8 @@ static void switchdev_port_obj_del_deferred(struct net_device *dev,
if (err && err != -EOPNOTSUPP)
netdev_err(dev, "failed (err=%d) to del object (id=%d)\n",
err, obj->id);
+ if (obj->complete)
+ obj->complete(dev, err, obj->complete_priv);
}
static int switchdev_port_obj_del_defer(struct net_device *dev,
--
2.5.5
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [patch net 2/3] bridge: mdb: Common function for mdb entry translation
2016-04-21 10:52 [patch net 0/3] bridge: mdb: Couple of fixes Jiri Pirko
2016-04-21 10:52 ` [patch net 1/3] switchdev: Adding complete operation to deferred switchdev ops Jiri Pirko
@ 2016-04-21 10:52 ` Jiri Pirko
2016-04-21 10:52 ` [patch net 3/3] bridge: mdb: Marking port-group as offloaded Jiri Pirko
2016-04-24 18:23 ` [patch net 0/3] bridge: mdb: Couple of fixes David Miller
3 siblings, 0 replies; 5+ messages in thread
From: Jiri Pirko @ 2016-04-21 10:52 UTC (permalink / raw)
To: netdev; +Cc: davem, idosch, eladr, yotamg, ogerlitz, stephen, nikolay
From: Elad Raz <eladr@mellanox.com>
There is duplicate code that translates br_mdb_entry to br_ip let's wrap it
in a common function.
Signed-off-by: Elad Raz <eladr@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
---
net/bridge/br_mdb.c | 33 +++++++++++++++------------------
1 file changed, 15 insertions(+), 18 deletions(-)
diff --git a/net/bridge/br_mdb.c b/net/bridge/br_mdb.c
index 253bc77..b258120 100644
--- a/net/bridge/br_mdb.c
+++ b/net/bridge/br_mdb.c
@@ -61,6 +61,19 @@ static void __mdb_entry_fill_flags(struct br_mdb_entry *e, unsigned char flags)
e->flags |= MDB_FLAGS_OFFLOAD;
}
+static void __mdb_entry_to_br_ip(struct br_mdb_entry *entry, struct br_ip *ip)
+{
+ memset(ip, 0, sizeof(struct br_ip));
+ ip->vid = entry->vid;
+ ip->proto = entry->addr.proto;
+ if (ip->proto == htons(ETH_P_IP))
+ ip->u.ip4 = entry->addr.u.ip4;
+#if IS_ENABLED(CONFIG_IPV6)
+ else
+ ip->u.ip6 = entry->addr.u.ip6;
+#endif
+}
+
static int br_mdb_fill_info(struct sk_buff *skb, struct netlink_callback *cb,
struct net_device *dev)
{
@@ -509,15 +522,7 @@ static int __br_mdb_add(struct net *net, struct net_bridge *br,
if (!p || p->br != br || p->state == BR_STATE_DISABLED)
return -EINVAL;
- memset(&ip, 0, sizeof(ip));
- ip.vid = entry->vid;
- ip.proto = entry->addr.proto;
- if (ip.proto == htons(ETH_P_IP))
- ip.u.ip4 = entry->addr.u.ip4;
-#if IS_ENABLED(CONFIG_IPV6)
- else
- ip.u.ip6 = entry->addr.u.ip6;
-#endif
+ __mdb_entry_to_br_ip(entry, &ip);
spin_lock_bh(&br->multicast_lock);
ret = br_mdb_add_group(br, p, &ip, entry->state, pg);
@@ -584,15 +589,7 @@ static int __br_mdb_del(struct net_bridge *br, struct br_mdb_entry *entry)
if (!netif_running(br->dev) || br->multicast_disabled)
return -EINVAL;
- memset(&ip, 0, sizeof(ip));
- ip.vid = entry->vid;
- ip.proto = entry->addr.proto;
- if (ip.proto == htons(ETH_P_IP))
- ip.u.ip4 = entry->addr.u.ip4;
-#if IS_ENABLED(CONFIG_IPV6)
- else
- ip.u.ip6 = entry->addr.u.ip6;
-#endif
+ __mdb_entry_to_br_ip(entry, &ip);
spin_lock_bh(&br->multicast_lock);
mdb = mlock_dereference(br->mdb, br);
--
2.5.5
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [patch net 3/3] bridge: mdb: Marking port-group as offloaded
2016-04-21 10:52 [patch net 0/3] bridge: mdb: Couple of fixes Jiri Pirko
2016-04-21 10:52 ` [patch net 1/3] switchdev: Adding complete operation to deferred switchdev ops Jiri Pirko
2016-04-21 10:52 ` [patch net 2/3] bridge: mdb: Common function for mdb entry translation Jiri Pirko
@ 2016-04-21 10:52 ` Jiri Pirko
2016-04-24 18:23 ` [patch net 0/3] bridge: mdb: Couple of fixes David Miller
3 siblings, 0 replies; 5+ messages in thread
From: Jiri Pirko @ 2016-04-21 10:52 UTC (permalink / raw)
To: netdev; +Cc: davem, idosch, eladr, yotamg, ogerlitz, stephen, nikolay
From: Elad Raz <eladr@mellanox.com>
There is a race-condition when updating the mdb offload flag without using
the mulicast_lock. This reverts commit 9e8430f8d60d98 ("bridge: mdb:
Passing the port-group pointer to br_mdb module").
This patch marks offloaded MDB entry as "offload" by changing the port-
group flags and marks it as MDB_PG_FLAGS_OFFLOAD.
When switchdev PORT_MDB succeeded and adds a multicast group, a completion
callback is been invoked "br_mdb_complete". The completion function
locks the multicast_lock and finds the right net_bridge_port_group and
marks it as offloaded.
Fixes: 9e8430f8d60d98 ("bridge: mdb: Passing the port-group pointer to br_mdb module")
Reported-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Elad Raz <eladr@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
---
net/bridge/br_mdb.c | 91 +++++++++++++++++++++++++++++++++--------------
net/bridge/br_multicast.c | 8 +++--
net/bridge/br_private.h | 4 +--
3 files changed, 71 insertions(+), 32 deletions(-)
diff --git a/net/bridge/br_mdb.c b/net/bridge/br_mdb.c
index b258120..7dbc80d 100644
--- a/net/bridge/br_mdb.c
+++ b/net/bridge/br_mdb.c
@@ -256,9 +256,45 @@ static inline size_t rtnl_mdb_nlmsg_size(void)
+ nla_total_size(sizeof(struct br_mdb_entry));
}
-static void __br_mdb_notify(struct net_device *dev, struct br_mdb_entry *entry,
- int type, struct net_bridge_port_group *pg)
+struct br_mdb_complete_info {
+ struct net_bridge_port *port;
+ struct br_ip ip;
+};
+
+static void br_mdb_complete(struct net_device *dev, int err, void *priv)
{
+ struct br_mdb_complete_info *data = priv;
+ struct net_bridge_port_group __rcu **pp;
+ struct net_bridge_port_group *p;
+ struct net_bridge_mdb_htable *mdb;
+ struct net_bridge_mdb_entry *mp;
+ struct net_bridge_port *port = data->port;
+ struct net_bridge *br = port->br;
+
+ if (err)
+ goto err;
+
+ spin_lock_bh(&br->multicast_lock);
+ mdb = mlock_dereference(br->mdb, br);
+ mp = br_mdb_ip_get(mdb, &data->ip);
+ if (!mp)
+ goto out;
+ for (pp = &mp->ports; (p = mlock_dereference(*pp, br)) != NULL;
+ pp = &p->next) {
+ if (p->port != port)
+ continue;
+ p->flags |= MDB_PG_FLAGS_OFFLOAD;
+ }
+out:
+ spin_unlock_bh(&br->multicast_lock);
+err:
+ kfree(priv);
+}
+
+static void __br_mdb_notify(struct net_device *dev, struct net_bridge_port *p,
+ struct br_mdb_entry *entry, int type)
+{
+ struct br_mdb_complete_info *complete_info;
struct switchdev_obj_port_mdb mdb = {
.obj = {
.id = SWITCHDEV_OBJ_ID_PORT_MDB,
@@ -281,9 +317,14 @@ static void __br_mdb_notify(struct net_device *dev, struct br_mdb_entry *entry,
mdb.obj.orig_dev = port_dev;
if (port_dev && type == RTM_NEWMDB) {
- err = switchdev_port_obj_add(port_dev, &mdb.obj);
- if (!err && pg)
- pg->flags |= MDB_PG_FLAGS_OFFLOAD;
+ complete_info = kmalloc(sizeof(*complete_info), GFP_ATOMIC);
+ if (complete_info) {
+ complete_info->port = p;
+ __mdb_entry_to_br_ip(entry, &complete_info->ip);
+ mdb.obj.complete_priv = complete_info;
+ mdb.obj.complete = br_mdb_complete;
+ switchdev_port_obj_add(port_dev, &mdb.obj);
+ }
} else if (port_dev && type == RTM_DELMDB) {
switchdev_port_obj_del(port_dev, &mdb.obj);
}
@@ -304,21 +345,21 @@ errout:
rtnl_set_sk_err(net, RTNLGRP_MDB, err);
}
-void br_mdb_notify(struct net_device *dev, struct net_bridge_port_group *pg,
- int type)
+void br_mdb_notify(struct net_device *dev, struct net_bridge_port *port,
+ struct br_ip *group, int type, u8 flags)
{
struct br_mdb_entry entry;
memset(&entry, 0, sizeof(entry));
- entry.ifindex = pg->port->dev->ifindex;
- entry.addr.proto = pg->addr.proto;
- entry.addr.u.ip4 = pg->addr.u.ip4;
+ entry.ifindex = port->dev->ifindex;
+ entry.addr.proto = group->proto;
+ entry.addr.u.ip4 = group->u.ip4;
#if IS_ENABLED(CONFIG_IPV6)
- entry.addr.u.ip6 = pg->addr.u.ip6;
+ entry.addr.u.ip6 = group->u.ip6;
#endif
- entry.vid = pg->addr.vid;
- __mdb_entry_fill_flags(&entry, pg->flags);
- __br_mdb_notify(dev, &entry, type, pg);
+ entry.vid = group->vid;
+ __mdb_entry_fill_flags(&entry, flags);
+ __br_mdb_notify(dev, port, &entry, type);
}
static int nlmsg_populate_rtr_fill(struct sk_buff *skb,
@@ -463,8 +504,7 @@ static int br_mdb_parse(struct sk_buff *skb, struct nlmsghdr *nlh,
}
static int br_mdb_add_group(struct net_bridge *br, struct net_bridge_port *port,
- struct br_ip *group, unsigned char state,
- struct net_bridge_port_group **pg)
+ struct br_ip *group, unsigned char state)
{
struct net_bridge_mdb_entry *mp;
struct net_bridge_port_group *p;
@@ -495,7 +535,6 @@ static int br_mdb_add_group(struct net_bridge *br, struct net_bridge_port *port,
if (unlikely(!p))
return -ENOMEM;
rcu_assign_pointer(*pp, p);
- *pg = p;
if (state == MDB_TEMPORARY)
mod_timer(&p->timer, now + br->multicast_membership_interval);
@@ -503,8 +542,7 @@ static int br_mdb_add_group(struct net_bridge *br, struct net_bridge_port *port,
}
static int __br_mdb_add(struct net *net, struct net_bridge *br,
- struct br_mdb_entry *entry,
- struct net_bridge_port_group **pg)
+ struct br_mdb_entry *entry)
{
struct br_ip ip;
struct net_device *dev;
@@ -525,7 +563,7 @@ static int __br_mdb_add(struct net *net, struct net_bridge *br,
__mdb_entry_to_br_ip(entry, &ip);
spin_lock_bh(&br->multicast_lock);
- ret = br_mdb_add_group(br, p, &ip, entry->state, pg);
+ ret = br_mdb_add_group(br, p, &ip, entry->state);
spin_unlock_bh(&br->multicast_lock);
return ret;
}
@@ -533,7 +571,6 @@ static int __br_mdb_add(struct net *net, struct net_bridge *br,
static int br_mdb_add(struct sk_buff *skb, struct nlmsghdr *nlh)
{
struct net *net = sock_net(skb->sk);
- struct net_bridge_port_group *pg;
struct net_bridge_vlan_group *vg;
struct net_device *dev, *pdev;
struct br_mdb_entry *entry;
@@ -563,15 +600,15 @@ static int br_mdb_add(struct sk_buff *skb, struct nlmsghdr *nlh)
if (br_vlan_enabled(br) && vg && entry->vid == 0) {
list_for_each_entry(v, &vg->vlan_list, vlist) {
entry->vid = v->vid;
- err = __br_mdb_add(net, br, entry, &pg);
+ err = __br_mdb_add(net, br, entry);
if (err)
break;
- __br_mdb_notify(dev, entry, RTM_NEWMDB, pg);
+ __br_mdb_notify(dev, p, entry, RTM_NEWMDB);
}
} else {
- err = __br_mdb_add(net, br, entry, &pg);
+ err = __br_mdb_add(net, br, entry);
if (!err)
- __br_mdb_notify(dev, entry, RTM_NEWMDB, pg);
+ __br_mdb_notify(dev, p, entry, RTM_NEWMDB);
}
return err;
@@ -659,12 +696,12 @@ static int br_mdb_del(struct sk_buff *skb, struct nlmsghdr *nlh)
entry->vid = v->vid;
err = __br_mdb_del(br, entry);
if (!err)
- __br_mdb_notify(dev, entry, RTM_DELMDB, NULL);
+ __br_mdb_notify(dev, p, entry, RTM_DELMDB);
}
} else {
err = __br_mdb_del(br, entry);
if (!err)
- __br_mdb_notify(dev, entry, RTM_DELMDB, NULL);
+ __br_mdb_notify(dev, p, entry, RTM_DELMDB);
}
return err;
diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c
index a4c15df..191ea66 100644
--- a/net/bridge/br_multicast.c
+++ b/net/bridge/br_multicast.c
@@ -283,7 +283,8 @@ static void br_multicast_del_pg(struct net_bridge *br,
rcu_assign_pointer(*pp, p->next);
hlist_del_init(&p->mglist);
del_timer(&p->timer);
- br_mdb_notify(br->dev, p, RTM_DELMDB);
+ br_mdb_notify(br->dev, p->port, &pg->addr, RTM_DELMDB,
+ p->flags);
call_rcu_bh(&p->rcu, br_multicast_free_pg);
if (!mp->ports && !mp->mglist &&
@@ -705,7 +706,7 @@ static int br_multicast_add_group(struct net_bridge *br,
if (unlikely(!p))
goto err;
rcu_assign_pointer(*pp, p);
- br_mdb_notify(br->dev, p, RTM_NEWMDB);
+ br_mdb_notify(br->dev, port, group, RTM_NEWMDB, 0);
found:
mod_timer(&p->timer, now + br->multicast_membership_interval);
@@ -1461,7 +1462,8 @@ br_multicast_leave_group(struct net_bridge *br,
hlist_del_init(&p->mglist);
del_timer(&p->timer);
call_rcu_bh(&p->rcu, br_multicast_free_pg);
- br_mdb_notify(br->dev, p, RTM_DELMDB);
+ br_mdb_notify(br->dev, port, group, RTM_DELMDB,
+ p->flags);
if (!mp->ports && !mp->mglist &&
netif_running(br->dev))
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index 1b5d145..d9da857 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -560,8 +560,8 @@ br_multicast_new_port_group(struct net_bridge_port *port, struct br_ip *group,
unsigned char flags);
void br_mdb_init(void);
void br_mdb_uninit(void);
-void br_mdb_notify(struct net_device *dev, struct net_bridge_port_group *pg,
- int type);
+void br_mdb_notify(struct net_device *dev, struct net_bridge_port *port,
+ struct br_ip *group, int type, u8 flags);
void br_rtr_notify(struct net_device *dev, struct net_bridge_port *port,
int type);
--
2.5.5
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [patch net 0/3] bridge: mdb: Couple of fixes
2016-04-21 10:52 [patch net 0/3] bridge: mdb: Couple of fixes Jiri Pirko
` (2 preceding siblings ...)
2016-04-21 10:52 ` [patch net 3/3] bridge: mdb: Marking port-group as offloaded Jiri Pirko
@ 2016-04-24 18:23 ` David Miller
3 siblings, 0 replies; 5+ messages in thread
From: David Miller @ 2016-04-24 18:23 UTC (permalink / raw)
To: jiri; +Cc: netdev, idosch, eladr, yotamg, ogerlitz, stephen, nikolay
From: Jiri Pirko <jiri@resnulli.us>
Date: Thu, 21 Apr 2016 12:52:42 +0200
> From: Jiri Pirko <jiri@mellanox.com>
>
> Elad says:
>
> This patchset fixes two problems reported by Nikolay Aleksandrov. The first
> problem is that the MDB offload flag might be accesed without helding the
> multicast_lock.
> The second problem is that the switchdev mdb offload is deferred and
> the offload bit was marked regardless if the operation succeeded or not.
Series applied, thanks.
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2016-04-24 18:24 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-04-21 10:52 [patch net 0/3] bridge: mdb: Couple of fixes Jiri Pirko
2016-04-21 10:52 ` [patch net 1/3] switchdev: Adding complete operation to deferred switchdev ops Jiri Pirko
2016-04-21 10:52 ` [patch net 2/3] bridge: mdb: Common function for mdb entry translation Jiri Pirko
2016-04-21 10:52 ` [patch net 3/3] bridge: mdb: Marking port-group as offloaded Jiri Pirko
2016-04-24 18:23 ` [patch net 0/3] bridge: mdb: Couple of fixes David Miller
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).