Netdev List
 help / color / mirror / Atom feed
* [patch net-next v5 10/21] bridge: add API to notify bridge driver of learned FBD on offloaded device
From: Jiri Pirko @ 2014-11-28 13:34 UTC (permalink / raw)
  To: netdev
  Cc: davem, nhorman, andy, tgraf, dborkman, ogerlitz, jesse, pshelar,
	azhou, ben, stephen, jeffrey.t.kirsher, vyasevic, xiyou.wangcong,
	john.r.fastabend, edumazet, jhs, sfeldma, f.fainelli, roopa,
	linville, jasowang, ebiederm, nicolas.dichtel, ryazanov.s.a,
	buytenh, aviadr, nbd, alexei.starovoitov, Neil.Jerram, ronye,
	simon.horman, alexander.h.duyck, john.ronciak, mleitner, shrijeet,
	gospo, bcrl, hemal
In-Reply-To: <1417181672-11531-1-git-send-email-jiri@resnulli.us>

From: Scott Feldman <sfeldma@gmail.com>

When the swdev device learns a new mac/vlan on a port, it sends some async
notification to the driver and the driver installs an FDB in the device.
To give a holistic system view, the learned mac/vlan should be reflected
in the bridge's FBD table, so the user, using normal iproute2 cmds, can view
what is currently learned by the device.  This API on the bridge driver gives
a way for the swdev driver to install an FBD entry in the bridge FBD table.
(And remove one).

This is equivalent to the device running these cmds:

  bridge fdb [add|del] <mac> dev <dev> vid <vlan id> master

This patch needs some extra eyeballs for review, in paricular around the
locking and contexts.

Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
---
v4->v5:
-no change
v3->v4:
-cosmetic changes to resolve conflicts
-call bh version of spinlock for hash_lock to avoid deadlock
v2->v3:
-added "external" word into function names to emphasize fdbs are learned
 externally
-added "added_by_external_learn" to fbd entry struct indicate the entry
 was learned externaly and build some logic around that
-expose the fact that fdb entry was learned externally to userspace
v1->v2:
-no change
---
 include/linux/if_bridge.h      | 18 +++++++++
 include/uapi/linux/neighbour.h |  1 +
 net/bridge/br_fdb.c            | 91 +++++++++++++++++++++++++++++++++++++++++-
 net/bridge/br_private.h        |  3 +-
 4 files changed, 111 insertions(+), 2 deletions(-)

diff --git a/include/linux/if_bridge.h b/include/linux/if_bridge.h
index 808dcb8..fa2eca6 100644
--- a/include/linux/if_bridge.h
+++ b/include/linux/if_bridge.h
@@ -37,6 +37,24 @@ extern void brioctl_set(int (*ioctl_hook)(struct net *, unsigned int, void __use
 typedef int br_should_route_hook_t(struct sk_buff *skb);
 extern br_should_route_hook_t __rcu *br_should_route_hook;
 
+#if IS_ENABLED(CONFIG_BRIDGE)
+int br_fdb_external_learn_add(struct net_device *dev,
+			      const unsigned char *addr, u16 vid);
+int br_fdb_external_learn_del(struct net_device *dev,
+			      const unsigned char *addr, u16 vid);
+#else
+static inline int br_fdb_external_learn_add(struct net_device *dev,
+					    const unsigned char *addr, u16 vid)
+{
+	return 0;
+}
+static inline int br_fdb_external_learn_del(struct net_device *dev,
+					    const unsigned char *addr, u16 vid)
+{
+	return 0;
+}
+#endif
+
 #if IS_ENABLED(CONFIG_BRIDGE) && IS_ENABLED(CONFIG_BRIDGE_IGMP_SNOOPING)
 int br_multicast_list_adjacent(struct net_device *dev,
 			       struct list_head *br_ip_list);
diff --git a/include/uapi/linux/neighbour.h b/include/uapi/linux/neighbour.h
index 2f043af..f3d77f9 100644
--- a/include/uapi/linux/neighbour.h
+++ b/include/uapi/linux/neighbour.h
@@ -38,6 +38,7 @@ enum {
 #define NTF_SELF	0x02
 #define NTF_MASTER	0x04
 #define NTF_PROXY	0x08	/* == ATF_PUBL */
+#define NTF_EXT_LEARNED	0x10
 #define NTF_ROUTER	0x80
 
 /*
diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index b1be971..cc36e59 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -481,6 +481,7 @@ static struct net_bridge_fdb_entry *fdb_create(struct hlist_head *head,
 		fdb->is_local = 0;
 		fdb->is_static = 0;
 		fdb->added_by_user = 0;
+		fdb->added_by_external_learn = 0;
 		fdb->updated = fdb->used = jiffies;
 		hlist_add_head_rcu(&fdb->hlist, head);
 	}
@@ -613,7 +614,7 @@ static int fdb_fill_info(struct sk_buff *skb, const struct net_bridge *br,
 	ndm->ndm_family	 = AF_BRIDGE;
 	ndm->ndm_pad1    = 0;
 	ndm->ndm_pad2    = 0;
-	ndm->ndm_flags	 = 0;
+	ndm->ndm_flags	 = fdb->added_by_external_learn ? NTF_EXT_LEARNED : 0;
 	ndm->ndm_type	 = 0;
 	ndm->ndm_ifindex = fdb->dst ? fdb->dst->dev->ifindex : br->dev->ifindex;
 	ndm->ndm_state   = fdb_to_nud(fdb);
@@ -983,3 +984,91 @@ void br_fdb_unsync_static(struct net_bridge *br, struct net_bridge_port *p)
 		}
 	}
 }
+
+int br_fdb_external_learn_add(struct net_device *dev,
+			      const unsigned char *addr, u16 vid)
+{
+	struct net_bridge_port *p;
+	struct net_bridge *br;
+	struct hlist_head *head;
+	struct net_bridge_fdb_entry *fdb;
+	int err = 0;
+
+	rtnl_lock();
+
+	p = br_port_get_rtnl(dev);
+	if (!p) {
+		pr_info("bridge: %s not a bridge port\n", dev->name);
+		err = -EINVAL;
+		goto err_rtnl_unlock;
+	}
+
+	br = p->br;
+
+	spin_lock_bh(&br->hash_lock);
+
+	head = &br->hash[br_mac_hash(addr, vid)];
+	fdb = fdb_find(head, addr, vid);
+	if (!fdb) {
+		fdb = fdb_create(head, p, addr, vid);
+		if (!fdb) {
+			err = -ENOMEM;
+			goto err_unlock;
+		}
+		fdb->added_by_external_learn = 1;
+		fdb_notify(br, fdb, RTM_NEWNEIGH);
+	} else if (fdb->added_by_external_learn) {
+		/* Refresh entry */
+		fdb->updated = fdb->used = jiffies;
+	} else if (!fdb->added_by_user) {
+		/* Take over SW learned entry */
+		fdb->added_by_external_learn = 1;
+		fdb->updated = jiffies;
+		fdb_notify(br, fdb, RTM_NEWNEIGH);
+	}
+
+err_unlock:
+	spin_unlock_bh(&br->hash_lock);
+err_rtnl_unlock:
+	rtnl_unlock();
+
+	return err;
+}
+EXPORT_SYMBOL(br_fdb_external_learn_add);
+
+int br_fdb_external_learn_del(struct net_device *dev,
+			      const unsigned char *addr, u16 vid)
+{
+	struct net_bridge_port *p;
+	struct net_bridge *br;
+	struct hlist_head *head;
+	struct net_bridge_fdb_entry *fdb;
+	int err = 0;
+
+	rtnl_lock();
+
+	p = br_port_get_rtnl(dev);
+	if (!p) {
+		pr_info("bridge: %s not a bridge port\n", dev->name);
+		err = -EINVAL;
+		goto err_rtnl_unlock;
+	}
+
+	br = p->br;
+
+	spin_lock_bh(&br->hash_lock);
+
+	head = &br->hash[br_mac_hash(addr, vid)];
+	fdb = fdb_find(head, addr, vid);
+	if (fdb && fdb->added_by_external_learn)
+		fdb_delete(br, fdb);
+	else
+		err = -ENOENT;
+
+	spin_unlock_bh(&br->hash_lock);
+err_rtnl_unlock:
+	rtnl_unlock();
+
+	return err;
+}
+EXPORT_SYMBOL(br_fdb_external_learn_del);
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index 1b529da..cc36fb3 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -100,7 +100,8 @@ struct net_bridge_fdb_entry
 	mac_addr			addr;
 	unsigned char			is_local:1,
 					is_static:1,
-					added_by_user:1;
+					added_by_user:1,
+					added_by_external_learn:1;
 	__u16				vlan_id;
 };
 
-- 
1.9.3

^ permalink raw reply related

* [patch net-next v5 09/21] bridge: call netdev_sw_port_stp_update when bridge port STP status changes
From: Jiri Pirko @ 2014-11-28 13:34 UTC (permalink / raw)
  To: netdev
  Cc: davem, nhorman, andy, tgraf, dborkman, ogerlitz, jesse, pshelar,
	azhou, ben, stephen, jeffrey.t.kirsher, vyasevic, xiyou.wangcong,
	john.r.fastabend, edumazet, jhs, sfeldma, f.fainelli, roopa,
	linville, jasowang, ebiederm, nicolas.dichtel, ryazanov.s.a,
	buytenh, aviadr, nbd, alexei.starovoitov, Neil.Jerram, ronye,
	simon.horman, alexander.h.duyck, john.ronciak, mleitner, shrijeet,
	gospo, bcrl, hemal
In-Reply-To: <1417181672-11531-1-git-send-email-jiri@resnulli.us>

From: Scott Feldman <sfeldma@gmail.com>

To notify switch driver of change in STP state of bridge port, add new
.ndo op and provide switchdev wrapper func to call ndo op. Use it in bridge
code then.

Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Andy Gospodarek <gospo@cumulusnetworks.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
---
v4->v5:
-no change
v3->v4:
-added warning when call to netdev_sw_port_stp_update was not successful
 suggested by Andy
v2->v3:
-changed "sw" string to "switch" to avoid confusion
v1->v2:
-no change
---
 include/linux/netdevice.h |  5 +++++
 include/net/switchdev.h   |  7 +++++++
 net/bridge/br_stp.c       |  7 +++++++
 net/switchdev/switchdev.c | 19 +++++++++++++++++++
 4 files changed, 38 insertions(+)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 3603f31..29c92ee 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1024,6 +1024,9 @@ typedef u16 (*select_queue_fallback_t)(struct net_device *dev,
  *	Called to get an ID of the switch chip this port is part of.
  *	If driver implements this, it indicates that it represents a port
  *	of a switch chip.
+ * int (*ndo_switch_port_stp_update)(struct net_device *dev, u8 state);
+ *	Called to notify switch device port of bridge port STP
+ *	state change.
  */
 struct net_device_ops {
 	int			(*ndo_init)(struct net_device *dev);
@@ -1180,6 +1183,8 @@ struct net_device_ops {
 #ifdef CONFIG_NET_SWITCHDEV
 	int			(*ndo_switch_parent_id_get)(struct net_device *dev,
 							    struct netdev_phys_item_id *psid);
+	int			(*ndo_switch_port_stp_update)(struct net_device *dev,
+							      u8 state);
 #endif
 };
 
diff --git a/include/net/switchdev.h b/include/net/switchdev.h
index 7a52360..8a6d164 100644
--- a/include/net/switchdev.h
+++ b/include/net/switchdev.h
@@ -16,6 +16,7 @@
 
 int netdev_switch_parent_id_get(struct net_device *dev,
 				struct netdev_phys_item_id *psid);
+int netdev_switch_port_stp_update(struct net_device *dev, u8 state);
 
 #else
 
@@ -25,6 +26,12 @@ static inline int netdev_switch_parent_id_get(struct net_device *dev,
 	return -EOPNOTSUPP;
 }
 
+static inline int netdev_switch_port_stp_update(struct net_device *dev,
+						u8 state)
+{
+	return -EOPNOTSUPP;
+}
+
 #endif
 
 #endif /* _LINUX_SWITCHDEV_H_ */
diff --git a/net/bridge/br_stp.c b/net/bridge/br_stp.c
index 2b047bc..fb3ebe6 100644
--- a/net/bridge/br_stp.c
+++ b/net/bridge/br_stp.c
@@ -12,6 +12,7 @@
  */
 #include <linux/kernel.h>
 #include <linux/rculist.h>
+#include <net/switchdev.h>
 
 #include "br_private.h"
 #include "br_private_stp.h"
@@ -38,7 +39,13 @@ void br_log_state(const struct net_bridge_port *p)
 
 void br_set_state(struct net_bridge_port *p, unsigned int state)
 {
+	int err;
+
 	p->state = state;
+	err = netdev_switch_port_stp_update(p->dev, state);
+	if (err && err != -EOPNOTSUPP)
+		br_warn(p->br, "error setting offload STP state on port %u(%s)\n",
+				(unsigned int) p->port_no, p->dev->name);
 }
 
 /* called under bridge lock */
diff --git a/net/switchdev/switchdev.c b/net/switchdev/switchdev.c
index 66973de..d162b21 100644
--- a/net/switchdev/switchdev.c
+++ b/net/switchdev/switchdev.c
@@ -31,3 +31,22 @@ int netdev_switch_parent_id_get(struct net_device *dev,
 	return ops->ndo_switch_parent_id_get(dev, psid);
 }
 EXPORT_SYMBOL(netdev_switch_parent_id_get);
+
+/**
+ *	netdev_switch_port_stp_update - Notify switch device port of STP
+ *					state change
+ *	@dev: port device
+ *	@state: port STP state
+ *
+ *	Notify switch device port of bridge port STP state change.
+ */
+int netdev_switch_port_stp_update(struct net_device *dev, u8 state)
+{
+	const struct net_device_ops *ops = dev->netdev_ops;
+
+	if (!ops->ndo_switch_port_stp_update)
+		return -EOPNOTSUPP;
+	WARN_ON(!ops->ndo_switch_parent_id_get);
+	return ops->ndo_switch_port_stp_update(dev, state);
+}
+EXPORT_SYMBOL(netdev_switch_port_stp_update);
-- 
1.9.3

^ permalink raw reply related

* [patch net-next v5 08/21] net-sysfs: expose physical switch id for particular device
From: Jiri Pirko @ 2014-11-28 13:34 UTC (permalink / raw)
  To: netdev
  Cc: davem, nhorman, andy, tgraf, dborkman, ogerlitz, jesse, pshelar,
	azhou, ben, stephen, jeffrey.t.kirsher, vyasevic, xiyou.wangcong,
	john.r.fastabend, edumazet, jhs, sfeldma, f.fainelli, roopa,
	linville, jasowang, ebiederm, nicolas.dichtel, ryazanov.s.a,
	buytenh, aviadr, nbd, alexei.starovoitov, Neil.Jerram, ronye,
	simon.horman, alexander.h.duyck, john.ronciak, mleitner, shrijeet,
	gospo, bcrl, hemal
In-Reply-To: <1417181672-11531-1-git-send-email-jiri@resnulli.us>

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Reviewed-by: Thomas Graf <tgraf@suug.ch>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Andy Gospodarek <gospo@cumulusnetworks.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
---
v4->v5:
-no change
v3->v4:
-added entry to Documentation/ABI/testing/sysfs-class-net as suggested
 by Florian
v2->v3:
-changed "sw" string to "switch" to avoid confusion
v1->v2:
-no change
---
 Documentation/ABI/testing/sysfs-class-net |  8 ++++++++
 net/core/net-sysfs.c                      | 24 ++++++++++++++++++++++++
 2 files changed, 32 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-class-net b/Documentation/ABI/testing/sysfs-class-net
index e1b2e78..beb8ec4 100644
--- a/Documentation/ABI/testing/sysfs-class-net
+++ b/Documentation/ABI/testing/sysfs-class-net
@@ -216,3 +216,11 @@ Contact:	netdev@vger.kernel.org
 Description:
 		Indicates the interface protocol type as a decimal value. See
 		include/uapi/linux/if_arp.h for all possible values.
+
+What:		/sys/class/net/<iface>/phys_switch_id
+Date:		November 2014
+KernelVersion:	3.19
+Contact:	netdev@vger.kernel.org
+Description:
+		Indicates the unique physical switch identifier of a switch this
+		port belongs to, as a string.
diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index 26c46f4..9993412 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -12,6 +12,7 @@
 #include <linux/capability.h>
 #include <linux/kernel.h>
 #include <linux/netdevice.h>
+#include <net/switchdev.h>
 #include <linux/if_arp.h>
 #include <linux/slab.h>
 #include <linux/nsproxy.h>
@@ -416,6 +417,28 @@ static ssize_t phys_port_id_show(struct device *dev,
 }
 static DEVICE_ATTR_RO(phys_port_id);
 
+static ssize_t phys_switch_id_show(struct device *dev,
+				   struct device_attribute *attr, char *buf)
+{
+	struct net_device *netdev = to_net_dev(dev);
+	ssize_t ret = -EINVAL;
+
+	if (!rtnl_trylock())
+		return restart_syscall();
+
+	if (dev_isalive(netdev)) {
+		struct netdev_phys_item_id ppid;
+
+		ret = netdev_switch_parent_id_get(netdev, &ppid);
+		if (!ret)
+			ret = sprintf(buf, "%*phN\n", ppid.id_len, ppid.id);
+	}
+	rtnl_unlock();
+
+	return ret;
+}
+static DEVICE_ATTR_RO(phys_switch_id);
+
 static struct attribute *net_class_attrs[] = {
 	&dev_attr_netdev_group.attr,
 	&dev_attr_type.attr,
@@ -441,6 +464,7 @@ static struct attribute *net_class_attrs[] = {
 	&dev_attr_tx_queue_len.attr,
 	&dev_attr_gro_flush_timeout.attr,
 	&dev_attr_phys_port_id.attr,
+	&dev_attr_phys_switch_id.attr,
 	NULL,
 };
 ATTRIBUTE_GROUPS(net_class);
-- 
1.9.3

^ permalink raw reply related

* [patch net-next v5 07/21] rtnl: expose physical switch id for particular device
From: Jiri Pirko @ 2014-11-28 13:34 UTC (permalink / raw)
  To: netdev
  Cc: davem, nhorman, andy, tgraf, dborkman, ogerlitz, jesse, pshelar,
	azhou, ben, stephen, jeffrey.t.kirsher, vyasevic, xiyou.wangcong,
	john.r.fastabend, edumazet, jhs, sfeldma, f.fainelli, roopa,
	linville, jasowang, ebiederm, nicolas.dichtel, ryazanov.s.a,
	buytenh, aviadr, nbd, alexei.starovoitov, Neil.Jerram, ronye,
	simon.horman, alexander.h.duyck, john.ronciak, mleitner, shrijeet,
	gospo, bcrl, hemal
In-Reply-To: <1417181672-11531-1-git-send-email-jiri@resnulli.us>

The netdevice represents a port in a switch, it will expose
IFLA_PHYS_SWITCH_ID value via rtnl. Two netdevices with the same value
belong to one physical switch.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Reviewed-by: Thomas Graf <tgraf@suug.ch>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Andy Gospodarek <gospo@cumulusnetworks.com>
---
v3->v4->v5:
-no change
v2->v3:
-changed "sw" string to "switch" to avoid confusion
v1->v2:
-no change
---
 include/uapi/linux/if_link.h |  1 +
 net/core/rtnetlink.c         | 26 +++++++++++++++++++++++++-
 2 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index 36bddc2..623f1a7 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h
@@ -145,6 +145,7 @@ enum {
 	IFLA_CARRIER,
 	IFLA_PHYS_PORT_ID,
 	IFLA_CARRIER_CHANGES,
+	IFLA_PHYS_SWITCH_ID,
 	__IFLA_MAX
 };
 
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 7c59c02..b6541f2 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -44,6 +44,7 @@
 
 #include <linux/inet.h>
 #include <linux/netdevice.h>
+#include <net/switchdev.h>
 #include <net/ip.h>
 #include <net/protocol.h>
 #include <net/arp.h>
@@ -869,7 +870,8 @@ static noinline size_t if_nlmsg_size(const struct net_device *dev,
 	       + rtnl_port_size(dev, ext_filter_mask) /* IFLA_VF_PORTS + IFLA_PORT_SELF */
 	       + rtnl_link_get_size(dev) /* IFLA_LINKINFO */
 	       + rtnl_link_get_af_size(dev) /* IFLA_AF_SPEC */
-	       + nla_total_size(MAX_PHYS_ITEM_ID_LEN); /* IFLA_PHYS_PORT_ID */
+	       + nla_total_size(MAX_PHYS_ITEM_ID_LEN) /* IFLA_PHYS_PORT_ID */
+	       + nla_total_size(MAX_PHYS_ITEM_ID_LEN); /* IFLA_PHYS_SWITCH_ID */
 }
 
 static int rtnl_vf_ports_fill(struct sk_buff *skb, struct net_device *dev)
@@ -968,6 +970,24 @@ static int rtnl_phys_port_id_fill(struct sk_buff *skb, struct net_device *dev)
 	return 0;
 }
 
+static int rtnl_phys_switch_id_fill(struct sk_buff *skb, struct net_device *dev)
+{
+	int err;
+	struct netdev_phys_item_id psid;
+
+	err = netdev_switch_parent_id_get(dev, &psid);
+	if (err) {
+		if (err == -EOPNOTSUPP)
+			return 0;
+		return err;
+	}
+
+	if (nla_put(skb, IFLA_PHYS_SWITCH_ID, psid.id_len, psid.id))
+		return -EMSGSIZE;
+
+	return 0;
+}
+
 static int rtnl_fill_ifinfo(struct sk_buff *skb, struct net_device *dev,
 			    int type, u32 pid, u32 seq, u32 change,
 			    unsigned int flags, u32 ext_filter_mask)
@@ -1040,6 +1060,9 @@ static int rtnl_fill_ifinfo(struct sk_buff *skb, struct net_device *dev,
 	if (rtnl_phys_port_id_fill(skb, dev))
 		goto nla_put_failure;
 
+	if (rtnl_phys_switch_id_fill(skb, dev))
+		goto nla_put_failure;
+
 	attr = nla_reserve(skb, IFLA_STATS,
 			sizeof(struct rtnl_link_stats));
 	if (attr == NULL)
@@ -1199,6 +1222,7 @@ static const struct nla_policy ifla_policy[IFLA_MAX+1] = {
 	[IFLA_NUM_RX_QUEUES]	= { .type = NLA_U32 },
 	[IFLA_PHYS_PORT_ID]	= { .type = NLA_BINARY, .len = MAX_PHYS_ITEM_ID_LEN },
 	[IFLA_CARRIER_CHANGES]	= { .type = NLA_U32 },  /* ignored */
+	[IFLA_PHYS_SWITCH_ID]	= { .type = NLA_BINARY, .len = MAX_PHYS_ITEM_ID_LEN },
 };
 
 static const struct nla_policy ifla_info_policy[IFLA_INFO_MAX+1] = {
-- 
1.9.3

^ permalink raw reply related

* [patch net-next v5 06/21] net: introduce generic switch devices support
From: Jiri Pirko @ 2014-11-28 13:34 UTC (permalink / raw)
  To: netdev
  Cc: davem, nhorman, andy, tgraf, dborkman, ogerlitz, jesse, pshelar,
	azhou, ben, stephen, jeffrey.t.kirsher, vyasevic, xiyou.wangcong,
	john.r.fastabend, edumazet, jhs, sfeldma, f.fainelli, roopa,
	linville, jasowang, ebiederm, nicolas.dichtel, ryazanov.s.a,
	buytenh, aviadr, nbd, alexei.starovoitov, Neil.Jerram, ronye,
	simon.horman, alexander.h.duyck, john.ronciak, mleitner, shrijeet,
	gospo, bcrl, hemal
In-Reply-To: <1417181672-11531-1-git-send-email-jiri@resnulli.us>

The goal of this is to provide a possibility to support various switch
chips. Drivers should implement relevant ndos to do so. Now there is
only one ndo defined:
- for getting physical switch id is in place.

Note that user can use random port netdevice to access the switch.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Reviewed-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Andy Gospodarek <gospo@cumulusnetworks.com>
---
v3->v4->v5:
-no change
v2->v3:
-fixed documentation typo pointed out by M. Braun
-changed "sw" string to "switch" to avoid confusion
v1->v2:
-no change
---
 Documentation/networking/switchdev.txt | 59 ++++++++++++++++++++++++++++++++++
 MAINTAINERS                            |  7 ++++
 include/linux/netdevice.h              | 10 ++++++
 include/net/switchdev.h                | 30 +++++++++++++++++
 net/Kconfig                            |  1 +
 net/Makefile                           |  3 ++
 net/switchdev/Kconfig                  | 13 ++++++++
 net/switchdev/Makefile                 |  5 +++
 net/switchdev/switchdev.c              | 33 +++++++++++++++++++
 9 files changed, 161 insertions(+)
 create mode 100644 Documentation/networking/switchdev.txt
 create mode 100644 include/net/switchdev.h
 create mode 100644 net/switchdev/Kconfig
 create mode 100644 net/switchdev/Makefile
 create mode 100644 net/switchdev/switchdev.c

diff --git a/Documentation/networking/switchdev.txt b/Documentation/networking/switchdev.txt
new file mode 100644
index 0000000..f981a92
--- /dev/null
+++ b/Documentation/networking/switchdev.txt
@@ -0,0 +1,59 @@
+Switch (and switch-ish) device drivers HOWTO
+===========================
+
+Please note that the word "switch" is here used in very generic meaning.
+This include devices supporting L2/L3 but also various flow offloading chips,
+including switches embedded into SR-IOV NICs.
+
+Lets describe a topology a bit. Imagine the following example:
+
+       +----------------------------+    +---------------+
+       |     SOME switch chip       |    |      CPU      |
+       +----------------------------+    +---------------+
+       port1 port2 port3 port4 MNGMNT    |     PCI-E     |
+         |     |     |     |     |       +---------------+
+        PHY   PHY    |     |     |         |  NIC0 NIC1
+                     |     |     |         |   |    |
+                     |     |     +- PCI-E -+   |    |
+                     |     +------- MII -------+    |
+                     +------------- MII ------------+
+
+In this example, there are two independent lines between the switch silicon
+and CPU. NIC0 and NIC1 drivers are not aware of a switch presence. They are
+separate from the switch driver. SOME switch chip is by managed by a driver
+via PCI-E device MNGMNT. Note that MNGMNT device, NIC0 and NIC1 may be
+connected to some other type of bus.
+
+Now, for the previous example show the representation in kernel:
+
+       +----------------------------+    +---------------+
+       |     SOME switch chip       |    |      CPU      |
+       +----------------------------+    +---------------+
+       sw0p0 sw0p1 sw0p2 sw0p3 MNGMNT    |     PCI-E     |
+         |     |     |     |     |       +---------------+
+        PHY   PHY    |     |     |         |  eth0 eth1
+                     |     |     |         |   |    |
+                     |     |     +- PCI-E -+   |    |
+                     |     +------- MII -------+    |
+                     +------------- MII ------------+
+
+Lets call the example switch driver for SOME switch chip "SOMEswitch". This
+driver takes care of PCI-E device MNGMNT. There is a netdevice instance sw0pX
+created for each port of a switch. These netdevices are instances
+of "SOMEswitch" driver. sw0pX netdevices serve as a "representation"
+of the switch chip. eth0 and eth1 are instances of some other existing driver.
+
+The only difference of the switch-port netdevice from the ordinary netdevice
+is that is implements couple more NDOs:
+
+  ndo_switch_parent_id_get - This returns the same ID for two port netdevices
+			     of the same physical switch chip. This is
+			     mandatory to be implemented by all switch drivers
+			     and serves the caller for recognition of a port
+			     netdevice.
+  ndo_switch_parent_* - Functions that serve for a manipulation of the switch
+			chip itself (it can be though of as a "parent" of the
+			port, therefore the name). They are not port-specific.
+			Caller might use arbitrary port netdevice of the same
+			switch and it will make no difference.
+  ndo_switch_port_* - Functions that serve for a port-specific manipulation.
diff --git a/MAINTAINERS b/MAINTAINERS
index a545d68..05addb6 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -9058,6 +9058,13 @@ F:	lib/swiotlb.c
 F:	arch/*/kernel/pci-swiotlb.c
 F:	include/linux/swiotlb.h
 
+SWITCHDEV
+M:	Jiri Pirko <jiri@resnulli.us>
+L:	netdev@vger.kernel.org
+S:	Supported
+F:	net/switchdev/
+F:	include/net/switchdev.h
+
 SYNOPSYS ARC ARCHITECTURE
 M:	Vineet Gupta <vgupta@synopsys.com>
 S:	Supported
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 4bd41d7..3603f31 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1018,6 +1018,12 @@ typedef u16 (*select_queue_fallback_t)(struct net_device *dev,
  *	performing GSO on a packet. The device returns true if it is
  *	able to GSO the packet, false otherwise. If the return value is
  *	false the stack will do software GSO.
+ *
+ * int (*ndo_switch_parent_id_get)(struct net_device *dev,
+ *				   struct netdev_phys_item_id *psid);
+ *	Called to get an ID of the switch chip this port is part of.
+ *	If driver implements this, it indicates that it represents a port
+ *	of a switch chip.
  */
 struct net_device_ops {
 	int			(*ndo_init)(struct net_device *dev);
@@ -1171,6 +1177,10 @@ struct net_device_ops {
 	int			(*ndo_get_lock_subclass)(struct net_device *dev);
 	bool			(*ndo_gso_check) (struct sk_buff *skb,
 						  struct net_device *dev);
+#ifdef CONFIG_NET_SWITCHDEV
+	int			(*ndo_switch_parent_id_get)(struct net_device *dev,
+							    struct netdev_phys_item_id *psid);
+#endif
 };
 
 /**
diff --git a/include/net/switchdev.h b/include/net/switchdev.h
new file mode 100644
index 0000000..7a52360
--- /dev/null
+++ b/include/net/switchdev.h
@@ -0,0 +1,30 @@
+/*
+ * include/net/switchdev.h - Switch device API
+ * Copyright (c) 2014 Jiri Pirko <jiri@resnulli.us>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+#ifndef _LINUX_SWITCHDEV_H_
+#define _LINUX_SWITCHDEV_H_
+
+#include <linux/netdevice.h>
+
+#ifdef CONFIG_NET_SWITCHDEV
+
+int netdev_switch_parent_id_get(struct net_device *dev,
+				struct netdev_phys_item_id *psid);
+
+#else
+
+static inline int netdev_switch_parent_id_get(struct net_device *dev,
+					      struct netdev_phys_item_id *psid)
+{
+	return -EOPNOTSUPP;
+}
+
+#endif
+
+#endif /* _LINUX_SWITCHDEV_H_ */
diff --git a/net/Kconfig b/net/Kconfig
index 99815b5..ff9ffc1 100644
--- a/net/Kconfig
+++ b/net/Kconfig
@@ -228,6 +228,7 @@ source "net/vmw_vsock/Kconfig"
 source "net/netlink/Kconfig"
 source "net/mpls/Kconfig"
 source "net/hsr/Kconfig"
+source "net/switchdev/Kconfig"
 
 config RPS
 	boolean
diff --git a/net/Makefile b/net/Makefile
index 7ed1970..95fc694 100644
--- a/net/Makefile
+++ b/net/Makefile
@@ -73,3 +73,6 @@ obj-$(CONFIG_OPENVSWITCH)	+= openvswitch/
 obj-$(CONFIG_VSOCKETS)	+= vmw_vsock/
 obj-$(CONFIG_NET_MPLS_GSO)	+= mpls/
 obj-$(CONFIG_HSR)		+= hsr/
+ifneq ($(CONFIG_NET_SWITCHDEV),)
+obj-y				+= switchdev/
+endif
diff --git a/net/switchdev/Kconfig b/net/switchdev/Kconfig
new file mode 100644
index 0000000..1557545
--- /dev/null
+++ b/net/switchdev/Kconfig
@@ -0,0 +1,13 @@
+#
+# Configuration for Switch device support
+#
+
+config NET_SWITCHDEV
+	boolean "Switch (and switch-ish) device support (EXPERIMENTAL)"
+	depends on INET
+	---help---
+	  This module provides glue between core networking code and device
+	  drivers in order to support hardware switch chips in very generic
+	  meaning of the word "switch". This include devices supporting L2/L3 but
+	  also various flow offloading chips, including switches embedded into
+	  SR-IOV NICs.
diff --git a/net/switchdev/Makefile b/net/switchdev/Makefile
new file mode 100644
index 0000000..5ed63ed
--- /dev/null
+++ b/net/switchdev/Makefile
@@ -0,0 +1,5 @@
+#
+# Makefile for the Switch device API
+#
+
+obj-$(CONFIG_NET_SWITCHDEV) += switchdev.o
diff --git a/net/switchdev/switchdev.c b/net/switchdev/switchdev.c
new file mode 100644
index 0000000..66973de
--- /dev/null
+++ b/net/switchdev/switchdev.c
@@ -0,0 +1,33 @@
+/*
+ * net/switchdev/switchdev.c - Switch device API
+ * Copyright (c) 2014 Jiri Pirko <jiri@resnulli.us>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include <linux/kernel.h>
+#include <linux/types.h>
+#include <linux/init.h>
+#include <linux/netdevice.h>
+#include <net/switchdev.h>
+
+/**
+ *	netdev_switch_parent_id_get - Get ID of a switch
+ *	@dev: port device
+ *	@psid: switch ID
+ *
+ *	Get ID of a switch this port is part of.
+ */
+int netdev_switch_parent_id_get(struct net_device *dev,
+				struct netdev_phys_item_id *psid)
+{
+	const struct net_device_ops *ops = dev->netdev_ops;
+
+	if (!ops->ndo_switch_parent_id_get)
+		return -EOPNOTSUPP;
+	return ops->ndo_switch_parent_id_get(dev, psid);
+}
+EXPORT_SYMBOL(netdev_switch_parent_id_get);
-- 
1.9.3

^ permalink raw reply related

* [patch net-next v5 05/21] net: rename netdev_phys_port_id to more generic name
From: Jiri Pirko @ 2014-11-28 13:34 UTC (permalink / raw)
  To: netdev
  Cc: davem, nhorman, andy, tgraf, dborkman, ogerlitz, jesse, pshelar,
	azhou, ben, stephen, jeffrey.t.kirsher, vyasevic, xiyou.wangcong,
	john.r.fastabend, edumazet, jhs, sfeldma, f.fainelli, roopa,
	linville, jasowang, ebiederm, nicolas.dichtel, ryazanov.s.a,
	buytenh, aviadr, nbd, alexei.starovoitov, Neil.Jerram, ronye,
	simon.horman, alexander.h.duyck, john.ronciak, mleitner, shrijeet,
	gospo, bcrl, hemal
In-Reply-To: <1417181672-11531-1-git-send-email-jiri@resnulli.us>

So this can be reused for identification of other "items" as well.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Reviewed-by: Thomas Graf <tgraf@suug.ch>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Andy Gospodarek <gospo@cumulusnetworks.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
---
v1->v2->v3->v4->v5:
-no change
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c |  2 +-
 drivers/net/ethernet/intel/i40e/i40e_main.c      |  2 +-
 drivers/net/ethernet/mellanox/mlx4/en_netdev.c   |  2 +-
 drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c |  2 +-
 include/linux/netdevice.h                        | 16 ++++++++--------
 net/core/dev.c                                   |  2 +-
 net/core/net-sysfs.c                             |  2 +-
 net/core/rtnetlink.c                             |  6 +++---
 8 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
index c4bd025..336ef3c 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
@@ -12537,7 +12537,7 @@ static int bnx2x_validate_addr(struct net_device *dev)
 }
 
 static int bnx2x_get_phys_port_id(struct net_device *netdev,
-				  struct netdev_phys_port_id *ppid)
+				  struct netdev_phys_item_id *ppid)
 {
 	struct bnx2x *bp = netdev_priv(netdev);
 
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 5ed5e40..9ae4270 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -7511,7 +7511,7 @@ static void i40e_del_vxlan_port(struct net_device *netdev,
 
 #endif
 static int i40e_get_phys_port_id(struct net_device *netdev,
-				 struct netdev_phys_port_id *ppid)
+				 struct netdev_phys_item_id *ppid)
 {
 	struct i40e_netdev_priv *np = netdev_priv(netdev);
 	struct i40e_pf *pf = np->vsi->back;
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
index b7c9978..1597fb0 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
@@ -2259,7 +2259,7 @@ static int mlx4_en_set_vf_link_state(struct net_device *dev, int vf, int link_st
 
 #define PORT_ID_BYTE_LEN 8
 static int mlx4_en_get_phys_port_id(struct net_device *dev,
-				    struct netdev_phys_port_id *ppid)
+				    struct netdev_phys_item_id *ppid)
 {
 	struct mlx4_en_priv *priv = netdev_priv(dev);
 	struct mlx4_dev *mdev = priv->mdev->dev;
diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c
index 3227c80..1aa25b1 100644
--- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c
+++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c
@@ -461,7 +461,7 @@ static void qlcnic_82xx_cancel_idc_work(struct qlcnic_adapter *adapter)
 }
 
 static int qlcnic_get_phys_port_id(struct net_device *netdev,
-				   struct netdev_phys_port_id *ppid)
+				   struct netdev_phys_item_id *ppid)
 {
 	struct qlcnic_adapter *adapter = netdev_priv(netdev);
 	struct qlcnic_hardware_context *ahw = adapter->ahw;
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 589929c..4bd41d7 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -754,13 +754,13 @@ struct netdev_fcoe_hbainfo {
 };
 #endif
 
-#define MAX_PHYS_PORT_ID_LEN 32
+#define MAX_PHYS_ITEM_ID_LEN 32
 
-/* This structure holds a unique identifier to identify the
- * physical port used by a netdevice.
+/* This structure holds a unique identifier to identify some
+ * physical item (port for example) used by a netdevice.
  */
-struct netdev_phys_port_id {
-	unsigned char id[MAX_PHYS_PORT_ID_LEN];
+struct netdev_phys_item_id {
+	unsigned char id[MAX_PHYS_ITEM_ID_LEN];
 	unsigned char id_len;
 };
 
@@ -976,7 +976,7 @@ typedef u16 (*select_queue_fallback_t)(struct net_device *dev,
  *	USB_CDC_NOTIFY_NETWORK_CONNECTION) should NOT implement this function.
  *
  * int (*ndo_get_phys_port_id)(struct net_device *dev,
- *			       struct netdev_phys_port_id *ppid);
+ *			       struct netdev_phys_item_id *ppid);
  *	Called to get ID of physical port of this device. If driver does
  *	not implement this, it is assumed that the hw is not able to have
  *	multiple net devices on single physical port.
@@ -1152,7 +1152,7 @@ struct net_device_ops {
 	int			(*ndo_change_carrier)(struct net_device *dev,
 						      bool new_carrier);
 	int			(*ndo_get_phys_port_id)(struct net_device *dev,
-							struct netdev_phys_port_id *ppid);
+							struct netdev_phys_item_id *ppid);
 	void			(*ndo_add_vxlan_port)(struct  net_device *dev,
 						      sa_family_t sa_family,
 						      __be16 port);
@@ -2870,7 +2870,7 @@ void dev_set_group(struct net_device *, int);
 int dev_set_mac_address(struct net_device *, struct sockaddr *);
 int dev_change_carrier(struct net_device *, bool new_carrier);
 int dev_get_phys_port_id(struct net_device *dev,
-			 struct netdev_phys_port_id *ppid);
+			 struct netdev_phys_item_id *ppid);
 struct sk_buff *validate_xmit_skb_list(struct sk_buff *skb, struct net_device *dev);
 struct sk_buff *dev_hard_start_xmit(struct sk_buff *skb, struct net_device *dev,
 				    struct netdev_queue *txq, int *ret);
diff --git a/net/core/dev.c b/net/core/dev.c
index ac48362..0814a56 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5846,7 +5846,7 @@ EXPORT_SYMBOL(dev_change_carrier);
  *	Get device physical port ID
  */
 int dev_get_phys_port_id(struct net_device *dev,
-			 struct netdev_phys_port_id *ppid)
+			 struct netdev_phys_item_id *ppid)
 {
 	const struct net_device_ops *ops = dev->netdev_ops;
 
diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index 1a24602..26c46f4 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -404,7 +404,7 @@ static ssize_t phys_port_id_show(struct device *dev,
 		return restart_syscall();
 
 	if (dev_isalive(netdev)) {
-		struct netdev_phys_port_id ppid;
+		struct netdev_phys_item_id ppid;
 
 		ret = dev_get_phys_port_id(netdev, &ppid);
 		if (!ret)
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 85d1d76..7c59c02 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -869,7 +869,7 @@ static noinline size_t if_nlmsg_size(const struct net_device *dev,
 	       + rtnl_port_size(dev, ext_filter_mask) /* IFLA_VF_PORTS + IFLA_PORT_SELF */
 	       + rtnl_link_get_size(dev) /* IFLA_LINKINFO */
 	       + rtnl_link_get_af_size(dev) /* IFLA_AF_SPEC */
-	       + nla_total_size(MAX_PHYS_PORT_ID_LEN); /* IFLA_PHYS_PORT_ID */
+	       + nla_total_size(MAX_PHYS_ITEM_ID_LEN); /* IFLA_PHYS_PORT_ID */
 }
 
 static int rtnl_vf_ports_fill(struct sk_buff *skb, struct net_device *dev)
@@ -953,7 +953,7 @@ static int rtnl_port_fill(struct sk_buff *skb, struct net_device *dev,
 static int rtnl_phys_port_id_fill(struct sk_buff *skb, struct net_device *dev)
 {
 	int err;
-	struct netdev_phys_port_id ppid;
+	struct netdev_phys_item_id ppid;
 
 	err = dev_get_phys_port_id(dev, &ppid);
 	if (err) {
@@ -1197,7 +1197,7 @@ static const struct nla_policy ifla_policy[IFLA_MAX+1] = {
 	[IFLA_PROMISCUITY]	= { .type = NLA_U32 },
 	[IFLA_NUM_TX_QUEUES]	= { .type = NLA_U32 },
 	[IFLA_NUM_RX_QUEUES]	= { .type = NLA_U32 },
-	[IFLA_PHYS_PORT_ID]	= { .type = NLA_BINARY, .len = MAX_PHYS_PORT_ID_LEN },
+	[IFLA_PHYS_PORT_ID]	= { .type = NLA_BINARY, .len = MAX_PHYS_ITEM_ID_LEN },
 	[IFLA_CARRIER_CHANGES]	= { .type = NLA_U32 },  /* ignored */
 };
 
-- 
1.9.3

^ permalink raw reply related

* [patch net-next v5 04/21] net: make vid as a parameter for ndo_fdb_add/ndo_fdb_del
From: Jiri Pirko @ 2014-11-28 13:34 UTC (permalink / raw)
  To: netdev
  Cc: davem, nhorman, andy, tgraf, dborkman, ogerlitz, jesse, pshelar,
	azhou, ben, stephen, jeffrey.t.kirsher, vyasevic, xiyou.wangcong,
	john.r.fastabend, edumazet, jhs, sfeldma, f.fainelli, roopa,
	linville, jasowang, ebiederm, nicolas.dichtel, ryazanov.s.a,
	buytenh, aviadr, nbd, alexei.starovoitov, Neil.Jerram, ronye,
	simon.horman, alexander.h.duyck, john.ronciak, mleitner, shrijeet,
	gospo, bcrl, hemal
In-Reply-To: <1417181672-11531-1-git-send-email-jiri@resnulli.us>

Do the work of parsing NDA_VLAN directly in rtnetlink code, pass simple
u16 vid to drivers from there.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Andy Gospodarek <gospo@cumulusnetworks.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
---
v4->v5:
-no change
v3->v4:
-fixed spelling of "fdb" pointed out by Andy
new in v3
---
 drivers/net/ethernet/intel/i40e/i40e_main.c      |  2 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c    |  4 +-
 drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c |  9 +++--
 drivers/net/macvlan.c                            |  4 +-
 drivers/net/vxlan.c                              |  4 +-
 include/linux/netdevice.h                        |  8 ++--
 include/linux/rtnetlink.h                        |  6 ++-
 net/bridge/br_fdb.c                              | 39 ++----------------
 net/bridge/br_private.h                          |  4 +-
 net/core/rtnetlink.c                             | 50 ++++++++++++++++++++----
 10 files changed, 70 insertions(+), 60 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 7262077..5ed5e40 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -7536,7 +7536,7 @@ static int i40e_get_phys_port_id(struct net_device *netdev,
  */
 static int i40e_ndo_fdb_add(struct ndmsg *ndm, struct nlattr *tb[],
 			    struct net_device *dev,
-			    const unsigned char *addr,
+			    const unsigned char *addr, u16 vid,
 			    u16 flags)
 {
 	struct i40e_netdev_priv *np = netdev_priv(dev);
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 932f779..1bad9f4 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -7708,7 +7708,7 @@ static int ixgbe_set_features(struct net_device *netdev,
 
 static int ixgbe_ndo_fdb_add(struct ndmsg *ndm, struct nlattr *tb[],
 			     struct net_device *dev,
-			     const unsigned char *addr,
+			     const unsigned char *addr, u16 vid,
 			     u16 flags)
 {
 	/* guarantee we can provide a unique filter for the unicast address */
@@ -7717,7 +7717,7 @@ static int ixgbe_ndo_fdb_add(struct ndmsg *ndm, struct nlattr *tb[],
 			return -ENOMEM;
 	}
 
-	return ndo_dflt_fdb_add(ndm, tb, dev, addr, flags);
+	return ndo_dflt_fdb_add(ndm, tb, dev, addr, vid, flags);
 }
 
 static int ixgbe_ndo_bridge_setlink(struct net_device *dev,
diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c
index a913b3a..3227c80 100644
--- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c
+++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c
@@ -376,13 +376,14 @@ static int qlcnic_set_mac(struct net_device *netdev, void *p)
 }
 
 static int qlcnic_fdb_del(struct ndmsg *ndm, struct nlattr *tb[],
-			struct net_device *netdev, const unsigned char *addr)
+			struct net_device *netdev,
+			const unsigned char *addr, u16 vid)
 {
 	struct qlcnic_adapter *adapter = netdev_priv(netdev);
 	int err = -EOPNOTSUPP;
 
 	if (!adapter->fdb_mac_learn)
-		return ndo_dflt_fdb_del(ndm, tb, netdev, addr);
+		return ndo_dflt_fdb_del(ndm, tb, netdev, addr, vid);
 
 	if ((adapter->flags & QLCNIC_ESWITCH_ENABLED) ||
 	    qlcnic_sriov_check(adapter)) {
@@ -401,13 +402,13 @@ static int qlcnic_fdb_del(struct ndmsg *ndm, struct nlattr *tb[],
 
 static int qlcnic_fdb_add(struct ndmsg *ndm, struct nlattr *tb[],
 			struct net_device *netdev,
-			const unsigned char *addr, u16 flags)
+			const unsigned char *addr, u16 vid, u16 flags)
 {
 	struct qlcnic_adapter *adapter = netdev_priv(netdev);
 	int err = 0;
 
 	if (!adapter->fdb_mac_learn)
-		return ndo_dflt_fdb_add(ndm, tb, netdev, addr, flags);
+		return ndo_dflt_fdb_add(ndm, tb, netdev, addr, vid, flags);
 
 	if (!(adapter->flags & QLCNIC_ESWITCH_ENABLED) &&
 	    !qlcnic_sriov_check(adapter)) {
diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index bfb0b6e..a1a3e3e 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -872,7 +872,7 @@ static int macvlan_vlan_rx_kill_vid(struct net_device *dev,
 
 static int macvlan_fdb_add(struct ndmsg *ndm, struct nlattr *tb[],
 			   struct net_device *dev,
-			   const unsigned char *addr,
+			   const unsigned char *addr, u16 vid,
 			   u16 flags)
 {
 	struct macvlan_dev *vlan = netdev_priv(dev);
@@ -897,7 +897,7 @@ static int macvlan_fdb_add(struct ndmsg *ndm, struct nlattr *tb[],
 
 static int macvlan_fdb_del(struct ndmsg *ndm, struct nlattr *tb[],
 			   struct net_device *dev,
-			   const unsigned char *addr)
+			   const unsigned char *addr, u16 vid)
 {
 	struct macvlan_dev *vlan = netdev_priv(dev);
 	int err = -EINVAL;
diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index e9f81d4..7d8013d 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -849,7 +849,7 @@ static int vxlan_fdb_parse(struct nlattr *tb[], struct vxlan_dev *vxlan,
 /* Add static entry (via netlink) */
 static int vxlan_fdb_add(struct ndmsg *ndm, struct nlattr *tb[],
 			 struct net_device *dev,
-			 const unsigned char *addr, u16 flags)
+			 const unsigned char *addr, u16 vid, u16 flags)
 {
 	struct vxlan_dev *vxlan = netdev_priv(dev);
 	/* struct net *net = dev_net(vxlan->dev); */
@@ -885,7 +885,7 @@ static int vxlan_fdb_add(struct ndmsg *ndm, struct nlattr *tb[],
 /* Delete entry (via netlink) */
 static int vxlan_fdb_delete(struct ndmsg *ndm, struct nlattr *tb[],
 			    struct net_device *dev,
-			    const unsigned char *addr)
+			    const unsigned char *addr, u16 vid)
 {
 	struct vxlan_dev *vxlan = netdev_priv(dev);
 	struct vxlan_fdb *f;
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 2cb7724..589929c 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -951,11 +951,11 @@ typedef u16 (*select_queue_fallback_t)(struct net_device *dev,
  *
  * int (*ndo_fdb_add)(struct ndmsg *ndm, struct nlattr *tb[],
  *		      struct net_device *dev,
- *		      const unsigned char *addr, u16 flags)
+ *		      const unsigned char *addr, u16 vid, u16 flags)
  *	Adds an FDB entry to dev for addr.
  * int (*ndo_fdb_del)(struct ndmsg *ndm, struct nlattr *tb[],
  *		      struct net_device *dev,
- *		      const unsigned char *addr)
+ *		      const unsigned char *addr, u16 vid)
  *	Deletes the FDB entry from dev coresponding to addr.
  * int (*ndo_fdb_dump)(struct sk_buff *skb, struct netlink_callback *cb,
  *		       struct net_device *dev, struct net_device *filter_dev,
@@ -1128,11 +1128,13 @@ struct net_device_ops {
 					       struct nlattr *tb[],
 					       struct net_device *dev,
 					       const unsigned char *addr,
+					       u16 vid,
 					       u16 flags);
 	int			(*ndo_fdb_del)(struct ndmsg *ndm,
 					       struct nlattr *tb[],
 					       struct net_device *dev,
-					       const unsigned char *addr);
+					       const unsigned char *addr,
+					       u16 vid);
 	int			(*ndo_fdb_dump)(struct sk_buff *skb,
 						struct netlink_callback *cb,
 						struct net_device *dev,
diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h
index 6cacbce..063f0f5 100644
--- a/include/linux/rtnetlink.h
+++ b/include/linux/rtnetlink.h
@@ -94,11 +94,13 @@ extern int ndo_dflt_fdb_add(struct ndmsg *ndm,
 			    struct nlattr *tb[],
 			    struct net_device *dev,
 			    const unsigned char *addr,
-			     u16 flags);
+			    u16 vid,
+			    u16 flags);
 extern int ndo_dflt_fdb_del(struct ndmsg *ndm,
 			    struct nlattr *tb[],
 			    struct net_device *dev,
-			    const unsigned char *addr);
+			    const unsigned char *addr,
+			    u16 vid);
 
 extern int ndo_dflt_bridge_getlink(struct sk_buff *skb, u32 pid, u32 seq,
 				   struct net_device *dev, u16 mode);
diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index 08ef4e7..b1be971 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -805,33 +805,17 @@ static int __br_fdb_add(struct ndmsg *ndm, struct net_bridge_port *p,
 /* Add new permanent fdb entry with RTM_NEWNEIGH */
 int br_fdb_add(struct ndmsg *ndm, struct nlattr *tb[],
 	       struct net_device *dev,
-	       const unsigned char *addr, u16 nlh_flags)
+	       const unsigned char *addr, u16 vid, u16 nlh_flags)
 {
 	struct net_bridge_port *p;
 	int err = 0;
 	struct net_port_vlans *pv;
-	unsigned short vid = VLAN_N_VID;
 
 	if (!(ndm->ndm_state & (NUD_PERMANENT|NUD_NOARP|NUD_REACHABLE))) {
 		pr_info("bridge: RTM_NEWNEIGH with invalid state %#x\n", ndm->ndm_state);
 		return -EINVAL;
 	}
 
-	if (tb[NDA_VLAN]) {
-		if (nla_len(tb[NDA_VLAN]) != sizeof(unsigned short)) {
-			pr_info("bridge: RTM_NEWNEIGH with invalid vlan\n");
-			return -EINVAL;
-		}
-
-		vid = nla_get_u16(tb[NDA_VLAN]);
-
-		if (!vid || vid >= VLAN_VID_MASK) {
-			pr_info("bridge: RTM_NEWNEIGH with invalid vlan id %d\n",
-				vid);
-			return -EINVAL;
-		}
-	}
-
 	if (is_zero_ether_addr(addr)) {
 		pr_info("bridge: RTM_NEWNEIGH with invalid ether address\n");
 		return -EINVAL;
@@ -845,7 +829,7 @@ int br_fdb_add(struct ndmsg *ndm, struct nlattr *tb[],
 	}
 
 	pv = nbp_get_vlan_info(p);
-	if (vid != VLAN_N_VID) {
+	if (vid) {
 		if (!pv || !test_bit(vid, pv->vlan_bitmap)) {
 			pr_info("bridge: RTM_NEWNEIGH with unconfigured "
 				"vlan %d on port %s\n", vid, dev->name);
@@ -903,27 +887,12 @@ static int __br_fdb_delete(struct net_bridge_port *p,
 /* Remove neighbor entry with RTM_DELNEIGH */
 int br_fdb_delete(struct ndmsg *ndm, struct nlattr *tb[],
 		  struct net_device *dev,
-		  const unsigned char *addr)
+		  const unsigned char *addr, u16 vid)
 {
 	struct net_bridge_port *p;
 	int err;
 	struct net_port_vlans *pv;
-	unsigned short vid = VLAN_N_VID;
-
-	if (tb[NDA_VLAN]) {
-		if (nla_len(tb[NDA_VLAN]) != sizeof(unsigned short)) {
-			pr_info("bridge: RTM_NEWNEIGH with invalid vlan\n");
-			return -EINVAL;
-		}
 
-		vid = nla_get_u16(tb[NDA_VLAN]);
-
-		if (!vid || vid >= VLAN_VID_MASK) {
-			pr_info("bridge: RTM_NEWNEIGH with invalid vlan id %d\n",
-				vid);
-			return -EINVAL;
-		}
-	}
 	p = br_port_get_rtnl(dev);
 	if (p == NULL) {
 		pr_info("bridge: RTM_DELNEIGH %s not a bridge port\n",
@@ -932,7 +901,7 @@ int br_fdb_delete(struct ndmsg *ndm, struct nlattr *tb[],
 	}
 
 	pv = nbp_get_vlan_info(p);
-	if (vid != VLAN_N_VID) {
+	if (vid) {
 		if (!pv || !test_bit(vid, pv->vlan_bitmap)) {
 			pr_info("bridge: RTM_DELNEIGH with unconfigured "
 				"vlan %d on port %s\n", vid, dev->name);
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index 5b37077..1b529da 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -404,9 +404,9 @@ void br_fdb_update(struct net_bridge *br, struct net_bridge_port *source,
 		   const unsigned char *addr, u16 vid, bool added_by_user);
 
 int br_fdb_delete(struct ndmsg *ndm, struct nlattr *tb[],
-		  struct net_device *dev, const unsigned char *addr);
+		  struct net_device *dev, const unsigned char *addr, u16 vid);
 int br_fdb_add(struct ndmsg *nlh, struct nlattr *tb[], struct net_device *dev,
-	       const unsigned char *addr, u16 nlh_flags);
+	       const unsigned char *addr, u16 vid, u16 nlh_flags);
 int br_fdb_dump(struct sk_buff *skb, struct netlink_callback *cb,
 		struct net_device *dev, struct net_device *fdev, int idx);
 int br_fdb_sync_static(struct net_bridge *br, struct net_bridge_port *p);
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index a688268..85d1d76 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -36,6 +36,7 @@
 #include <linux/mutex.h>
 #include <linux/if_addr.h>
 #include <linux/if_bridge.h>
+#include <linux/if_vlan.h>
 #include <linux/pci.h>
 #include <linux/etherdevice.h>
 
@@ -2312,7 +2313,7 @@ errout:
 int ndo_dflt_fdb_add(struct ndmsg *ndm,
 		     struct nlattr *tb[],
 		     struct net_device *dev,
-		     const unsigned char *addr,
+		     const unsigned char *addr, u16 vid,
 		     u16 flags)
 {
 	int err = -EINVAL;
@@ -2338,6 +2339,28 @@ int ndo_dflt_fdb_add(struct ndmsg *ndm,
 }
 EXPORT_SYMBOL(ndo_dflt_fdb_add);
 
+static int fdb_vid_parse(struct nlattr *vlan_attr, u16 *p_vid)
+{
+	u16 vid = 0;
+
+	if (vlan_attr) {
+		if (nla_len(vlan_attr) != sizeof(u16)) {
+			pr_info("PF_BRIDGE: RTM_NEWNEIGH with invalid vlan\n");
+			return -EINVAL;
+		}
+
+		vid = nla_get_u16(vlan_attr);
+
+		if (!vid || vid >= VLAN_VID_MASK) {
+			pr_info("PF_BRIDGE: RTM_NEWNEIGH with invalid vlan id %d\n",
+				vid);
+			return -EINVAL;
+		}
+	}
+	*p_vid = vid;
+	return 0;
+}
+
 static int rtnl_fdb_add(struct sk_buff *skb, struct nlmsghdr *nlh)
 {
 	struct net *net = sock_net(skb->sk);
@@ -2345,6 +2368,7 @@ static int rtnl_fdb_add(struct sk_buff *skb, struct nlmsghdr *nlh)
 	struct nlattr *tb[NDA_MAX+1];
 	struct net_device *dev;
 	u8 *addr;
+	u16 vid;
 	int err;
 
 	err = nlmsg_parse(nlh, sizeof(*ndm), tb, NDA_MAX, NULL);
@@ -2370,6 +2394,10 @@ static int rtnl_fdb_add(struct sk_buff *skb, struct nlmsghdr *nlh)
 
 	addr = nla_data(tb[NDA_LLADDR]);
 
+	err = fdb_vid_parse(tb[NDA_VLAN], &vid);
+	if (err)
+		return err;
+
 	err = -EOPNOTSUPP;
 
 	/* Support fdb on master device the net/bridge default case */
@@ -2378,7 +2406,8 @@ static int rtnl_fdb_add(struct sk_buff *skb, struct nlmsghdr *nlh)
 		struct net_device *br_dev = netdev_master_upper_dev_get(dev);
 		const struct net_device_ops *ops = br_dev->netdev_ops;
 
-		err = ops->ndo_fdb_add(ndm, tb, dev, addr, nlh->nlmsg_flags);
+		err = ops->ndo_fdb_add(ndm, tb, dev, addr, vid,
+				       nlh->nlmsg_flags);
 		if (err)
 			goto out;
 		else
@@ -2389,9 +2418,10 @@ static int rtnl_fdb_add(struct sk_buff *skb, struct nlmsghdr *nlh)
 	if ((ndm->ndm_flags & NTF_SELF)) {
 		if (dev->netdev_ops->ndo_fdb_add)
 			err = dev->netdev_ops->ndo_fdb_add(ndm, tb, dev, addr,
+							   vid,
 							   nlh->nlmsg_flags);
 		else
-			err = ndo_dflt_fdb_add(ndm, tb, dev, addr,
+			err = ndo_dflt_fdb_add(ndm, tb, dev, addr, vid,
 					       nlh->nlmsg_flags);
 
 		if (!err) {
@@ -2409,7 +2439,7 @@ out:
 int ndo_dflt_fdb_del(struct ndmsg *ndm,
 		     struct nlattr *tb[],
 		     struct net_device *dev,
-		     const unsigned char *addr)
+		     const unsigned char *addr, u16 vid)
 {
 	int err = -EINVAL;
 
@@ -2438,6 +2468,7 @@ static int rtnl_fdb_del(struct sk_buff *skb, struct nlmsghdr *nlh)
 	struct net_device *dev;
 	int err = -EINVAL;
 	__u8 *addr;
+	u16 vid;
 
 	if (!netlink_capable(skb, CAP_NET_ADMIN))
 		return -EPERM;
@@ -2465,6 +2496,10 @@ static int rtnl_fdb_del(struct sk_buff *skb, struct nlmsghdr *nlh)
 
 	addr = nla_data(tb[NDA_LLADDR]);
 
+	err = fdb_vid_parse(tb[NDA_VLAN], &vid);
+	if (err)
+		return err;
+
 	err = -EOPNOTSUPP;
 
 	/* Support fdb on master device the net/bridge default case */
@@ -2474,7 +2509,7 @@ static int rtnl_fdb_del(struct sk_buff *skb, struct nlmsghdr *nlh)
 		const struct net_device_ops *ops = br_dev->netdev_ops;
 
 		if (ops->ndo_fdb_del)
-			err = ops->ndo_fdb_del(ndm, tb, dev, addr);
+			err = ops->ndo_fdb_del(ndm, tb, dev, addr, vid);
 
 		if (err)
 			goto out;
@@ -2485,9 +2520,10 @@ static int rtnl_fdb_del(struct sk_buff *skb, struct nlmsghdr *nlh)
 	/* Embedded bridge, macvlan, and any other device support */
 	if (ndm->ndm_flags & NTF_SELF) {
 		if (dev->netdev_ops->ndo_fdb_del)
-			err = dev->netdev_ops->ndo_fdb_del(ndm, tb, dev, addr);
+			err = dev->netdev_ops->ndo_fdb_del(ndm, tb, dev, addr,
+							   vid);
 		else
-			err = ndo_dflt_fdb_del(ndm, tb, dev, addr);
+			err = ndo_dflt_fdb_del(ndm, tb, dev, addr, vid);
 
 		if (!err) {
 			rtnl_fdb_notify(dev, addr, RTM_DELNEIGH);
-- 
1.9.3

^ permalink raw reply related

* [patch net-next v5 03/21] bridge: convert flags in fbd entry into bitfields
From: Jiri Pirko @ 2014-11-28 13:34 UTC (permalink / raw)
  To: netdev
  Cc: davem, nhorman, andy, tgraf, dborkman, ogerlitz, jesse, pshelar,
	azhou, ben, stephen, jeffrey.t.kirsher, vyasevic, xiyou.wangcong,
	john.r.fastabend, edumazet, jhs, sfeldma, f.fainelli, roopa,
	linville, jasowang, ebiederm, nicolas.dichtel, ryazanov.s.a,
	buytenh, aviadr, nbd, alexei.starovoitov, Neil.Jerram, ronye,
	simon.horman, alexander.h.duyck, john.ronciak, mleitner, shrijeet,
	gospo, bcrl, hemal
In-Reply-To: <1417181672-11531-1-git-send-email-jiri@resnulli.us>

Suggested-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
---
v4->v5:
-no change
new in v4
---
 net/bridge/br_private.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index 8f3f081..5b37077 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -98,9 +98,9 @@ struct net_bridge_fdb_entry
 	unsigned long			updated;
 	unsigned long			used;
 	mac_addr			addr;
-	unsigned char			is_local;
-	unsigned char			is_static;
-	unsigned char			added_by_user;
+	unsigned char			is_local:1,
+					is_static:1,
+					added_by_user:1;
 	__u16				vlan_id;
 };
 
-- 
1.9.3

^ permalink raw reply related

* [patch net-next v5 02/21] neigh: sort Neighbor Cache Entry Flags
From: Jiri Pirko @ 2014-11-28 13:34 UTC (permalink / raw)
  To: netdev
  Cc: davem, nhorman, andy, tgraf, dborkman, ogerlitz, jesse, pshelar,
	azhou, ben, stephen, jeffrey.t.kirsher, vyasevic, xiyou.wangcong,
	john.r.fastabend, edumazet, jhs, sfeldma, f.fainelli, roopa,
	linville, jasowang, ebiederm, nicolas.dichtel, ryazanov.s.a,
	buytenh, aviadr, nbd, alexei.starovoitov, Neil.Jerram, ronye,
	simon.horman, alexander.h.duyck, john.ronciak, mleitner, shrijeet,
	gospo, bcrl, hemal
In-Reply-To: <1417181672-11531-1-git-send-email-jiri@resnulli.us>

Suggested-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
---
v4->v5:
-no change
new in v4
---
 include/uapi/linux/neighbour.h | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/include/uapi/linux/neighbour.h b/include/uapi/linux/neighbour.h
index 4a1d7e9..2f043af 100644
--- a/include/uapi/linux/neighbour.h
+++ b/include/uapi/linux/neighbour.h
@@ -35,11 +35,10 @@ enum {
  */
 
 #define NTF_USE		0x01
-#define NTF_PROXY	0x08	/* == ATF_PUBL */
-#define NTF_ROUTER	0x80
-
 #define NTF_SELF	0x02
 #define NTF_MASTER	0x04
+#define NTF_PROXY	0x08	/* == ATF_PUBL */
+#define NTF_ROUTER	0x80
 
 /*
  *	Neighbor Cache Entry States.
-- 
1.9.3

^ permalink raw reply related

* [patch net-next v5 01/21] bridge: rename fdb_*_hw to fdb_*_hw_addr to avoid confusion
From: Jiri Pirko @ 2014-11-28 13:34 UTC (permalink / raw)
  To: netdev
  Cc: davem, nhorman, andy, tgraf, dborkman, ogerlitz, jesse, pshelar,
	azhou, ben, stephen, jeffrey.t.kirsher, vyasevic, xiyou.wangcong,
	john.r.fastabend, edumazet, jhs, sfeldma, f.fainelli, roopa,
	linville, jasowang, ebiederm, nicolas.dichtel, ryazanov.s.a,
	buytenh, aviadr, nbd, alexei.starovoitov, Neil.Jerram, ronye,
	simon.horman, alexander.h.duyck, john.ronciak, mleitner, shrijeet,
	gospo, bcrl, hemal
In-Reply-To: <1417181672-11531-1-git-send-email-jiri@resnulli.us>

The current name might seem that this actually offloads the fdb entry to
hw. So rename it to clearly present that this for hardware address
addition/removal.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Andy Gospodarek <gospo@cumulusnetworks.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
---
v3->v4->v5:
-no change
v2->v3:
-moved the patch to the patchset head
new in v2 as suggested by DaveM
---
 net/bridge/br_fdb.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index 6f6c95c..08ef4e7 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -90,7 +90,7 @@ static void fdb_rcu_free(struct rcu_head *head)
  * are then updated with the new information.
  * Called under RTNL.
  */
-static void fdb_add_hw(struct net_bridge *br, const unsigned char *addr)
+static void fdb_add_hw_addr(struct net_bridge *br, const unsigned char *addr)
 {
 	int err;
 	struct net_bridge_port *p;
@@ -118,7 +118,7 @@ undo:
  * the ports with needed information.
  * Called under RTNL.
  */
-static void fdb_del_hw(struct net_bridge *br, const unsigned char *addr)
+static void fdb_del_hw_addr(struct net_bridge *br, const unsigned char *addr)
 {
 	struct net_bridge_port *p;
 
@@ -133,7 +133,7 @@ static void fdb_del_hw(struct net_bridge *br, const unsigned char *addr)
 static void fdb_delete(struct net_bridge *br, struct net_bridge_fdb_entry *f)
 {
 	if (f->is_static)
-		fdb_del_hw(br, f->addr.addr);
+		fdb_del_hw_addr(br, f->addr.addr);
 
 	hlist_del_rcu(&f->hlist);
 	fdb_notify(br, f, RTM_DELNEIGH);
@@ -514,7 +514,7 @@ static int fdb_insert(struct net_bridge *br, struct net_bridge_port *source,
 		return -ENOMEM;
 
 	fdb->is_local = fdb->is_static = 1;
-	fdb_add_hw(br, addr);
+	fdb_add_hw_addr(br, addr);
 	fdb_notify(br, fdb, RTM_NEWNEIGH);
 	return 0;
 }
@@ -754,19 +754,19 @@ static int fdb_add_entry(struct net_bridge_port *source, const __u8 *addr,
 			fdb->is_local = 1;
 			if (!fdb->is_static) {
 				fdb->is_static = 1;
-				fdb_add_hw(br, addr);
+				fdb_add_hw_addr(br, addr);
 			}
 		} else if (state & NUD_NOARP) {
 			fdb->is_local = 0;
 			if (!fdb->is_static) {
 				fdb->is_static = 1;
-				fdb_add_hw(br, addr);
+				fdb_add_hw_addr(br, addr);
 			}
 		} else {
 			fdb->is_local = 0;
 			if (fdb->is_static) {
 				fdb->is_static = 0;
-				fdb_del_hw(br, addr);
+				fdb_del_hw_addr(br, addr);
 			}
 		}
 
-- 
1.9.3

^ permalink raw reply related

* [patch net-next v5 00/21] introduce rocker switch driver with hardware accelerated datapath api - phase 1: bridge fdb offload
From: Jiri Pirko @ 2014-11-28 13:34 UTC (permalink / raw)
  To: netdev
  Cc: davem, nhorman, andy, tgraf, dborkman, ogerlitz, jesse, pshelar,
	azhou, ben, stephen, jeffrey.t.kirsher, vyasevic, xiyou.wangcong,
	john.r.fastabend, edumazet, jhs, sfeldma, f.fainelli, roopa,
	linville, jasowang, ebiederm, nicolas.dichtel, ryazanov.s.a,
	buytenh, aviadr, nbd, alexei.starovoitov, Neil.Jerram, ronye,
	simon.horman, alexander.h.duyck, john.ronciak, mleitner, shrijeet,
	gospo, bcrl, hemal

Hi all.

This patchset is just the first phase of switch and switch-ish device
support api in kernel. Note that the api will extend.

So what this patchset includes:
- introduce switchdev api skeleton for implementing switch drivers
- introduce rocker switch driver which implements switchdev api fdb and
  bridge set/get link ndos

As to the discussion if there is need to have specific class of device
representing the switch itself, so far we found no need to introduce that.
But we are generally ok with the idea and when the time comes and it will
be needed, it can be easily introduced without any disturbance.

This patchset introduces switch id export through rtnetlink and sysfs,
which is similar to what we have for port id in SR-IOV. I will send iproute2
patchset for showing the switch id for port netdevs once this is applied.
This applies also for the PF_BRIDGE and fdb iproute2 patches.

iproute2 patches are now available here:
https://github.com/jpirko/iproute2-rocker

For detailed description and version history, please see individual patches.

In v4 I reordered the patches leaving rocker patches on the end of the patchset.

In v5 I only fixed whitespace issues of patch #13

We have a TODO for related items we want to work on in near future:
https://etherpad.wikimedia.org/p/netdev-swdev-todo

Jiri Pirko (10):
  bridge: rename fdb_*_hw to fdb_*_hw_addr to avoid confusion
  neigh: sort Neighbor Cache Entry Flags
  bridge: convert flags in fbd entry into bitfields
  net: make vid as a parameter for ndo_fdb_add/ndo_fdb_del
  net: rename netdev_phys_port_id to more generic name
  net: introduce generic switch devices support
  rtnl: expose physical switch id for particular device
  net-sysfs: expose physical switch id for particular device
  rocker: introduce rocker switch driver
  rocker: implement ndo_fdb_dump

Scott Feldman (9):
  bridge: call netdev_sw_port_stp_update when bridge port STP status
    changes
  bridge: add API to notify bridge driver of learned FBD on offloaded
    device
  bridge: move private brport flags to if_bridge.h so port drivers can
    use flags
  bridge: add new brport flag LEARNING_SYNC
  bridge: add new hwmode swdev
  bridge: add brport flags to dflt bridge_getlink
  rocker: implement rocker ofdpa flow table manipulation
  rocker: implement L2 bridge offloading
  rocker: add ndo_bridge_setlink/getlink support for learning policy

Thomas Graf (2):
  rocker: Add proper validation of Netlink attributes
  rocker: Use logical operators on booleans

 Documentation/ABI/testing/sysfs-class-net        |    8 +
 Documentation/networking/switchdev.txt           |   59 +
 MAINTAINERS                                      |   14 +
 drivers/net/ethernet/Kconfig                     |    1 +
 drivers/net/ethernet/Makefile                    |    1 +
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c |    2 +-
 drivers/net/ethernet/emulex/benet/be_main.c      |    3 +-
 drivers/net/ethernet/intel/i40e/i40e_main.c      |    4 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c    |    6 +-
 drivers/net/ethernet/mellanox/mlx4/en_netdev.c   |    2 +-
 drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c |   11 +-
 drivers/net/ethernet/rocker/Kconfig              |   27 +
 drivers/net/ethernet/rocker/Makefile             |    5 +
 drivers/net/ethernet/rocker/rocker.c             | 4374 ++++++++++++++++++++++
 drivers/net/ethernet/rocker/rocker.h             |  428 +++
 drivers/net/macvlan.c                            |    4 +-
 drivers/net/vxlan.c                              |    4 +-
 include/linux/if_bridge.h                        |   31 +
 include/linux/netdevice.h                        |   39 +-
 include/linux/rtnetlink.h                        |    9 +-
 include/net/switchdev.h                          |   37 +
 include/uapi/linux/if_bridge.h                   |    1 +
 include/uapi/linux/if_link.h                     |    2 +
 include/uapi/linux/neighbour.h                   |    6 +-
 net/Kconfig                                      |    1 +
 net/Makefile                                     |    3 +
 net/bridge/br_fdb.c                              |  144 +-
 net/bridge/br_private.h                          |   21 +-
 net/bridge/br_stp.c                              |    7 +
 net/core/dev.c                                   |    2 +-
 net/core/net-sysfs.c                             |   26 +-
 net/core/rtnetlink.c                             |  119 +-
 net/switchdev/Kconfig                            |   13 +
 net/switchdev/Makefile                           |    5 +
 net/switchdev/switchdev.c                        |   52 +
 35 files changed, 5366 insertions(+), 105 deletions(-)
 create mode 100644 Documentation/networking/switchdev.txt
 create mode 100644 drivers/net/ethernet/rocker/Kconfig
 create mode 100644 drivers/net/ethernet/rocker/Makefile
 create mode 100644 drivers/net/ethernet/rocker/rocker.c
 create mode 100644 drivers/net/ethernet/rocker/rocker.h
 create mode 100644 include/net/switchdev.h
 create mode 100644 net/switchdev/Kconfig
 create mode 100644 net/switchdev/Makefile
 create mode 100644 net/switchdev/switchdev.c

-- 
1.9.3

^ permalink raw reply

* Re: [patch net-next v3 08/17] bridge: call netdev_sw_port_stp_update when bridge port STP status changes
From: Jiri Pirko @ 2014-11-28 13:27 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: Scott Feldman, Roopa Prabhu, Netdev, David S. Miller,
	nhorman@tuxdriver.com, Andy Gospodarek, Thomas Graf,
	dborkman@redhat.com, ogerlitz@mellanox.com, jesse@nicira.com,
	pshelar@nicira.com, azhou@nicira.com, ben@decadent.org.uk,
	stephen@networkplumber.org, Kirsher, Jeffrey T,
	vyasevic@redhat.com, Cong Wang, Fastabend, John R, Eric Dumazet,
	Florian Fainelli, John Linville
In-Reply-To: <547875F1.9080204@mojatatu.com>

Fri, Nov 28, 2014 at 02:17:37PM CET, jhs@mojatatu.com wrote:
>On 11/28/14 05:51, Scott Feldman wrote:
>>On Fri, Nov 28, 2014 at 2:05 AM, Roopa Prabhu <roopa@cumulusnetworks.com> wrote:
>>>On 11/25/14, 5:35 PM, Scott Feldman wrote:
>>>>
>>>>   The
>>>>bridge driver or external STP process (msptd) is still controlling STP
>>>>state for the port and processing the BPDUs.  When the state changes
>>>>on the port, the bridge driver is letting HW know, that's it.
>>>
>>>
>>>I understand that. In which case, we should not call it stp state.
>>>It is just port state.
>>
>>Sure, call it port state but it takes on BR_STATE_xxx values which
>>just so happen to correspond exactly to STP states.
>>
>>>And since it is yet another port attribute like port
>>>priority,
>>>we should be able to use the same api to offload it to hw just like the
>>>other port attributes.
>>
>>Well it does...see ndo_bridge_setlink in bridge driver, br_setport
>>where IFLA_BRPORT_STATE is handled...it calls br_set_port_state(),
>>which calls into the swdev port driver.  That's for the case where
>>user or external processing is setting STP state.  For the case where
>>the bridge itself is managing the STP state, the bridge will make the
>>same br_set_port_state() call to adjust the port state.
>>
>
>What Roopa is requesting for if i am not mistaken is the same issue i
>raised earlier as well. We need an opaque way to set and get these
>attributes. We cant afford an ndo ops per bridge or the next thing.
>Its a port level issue - what it is depends on what the underlying
>hardware does.

I agree. This will be addressed (it's in the etherpad todo).

>
>cheers,
>jamal
>

^ permalink raw reply

* Re: [patch net-next v3 08/17] bridge: call netdev_sw_port_stp_update when bridge port STP status changes
From: Jamal Hadi Salim @ 2014-11-28 13:17 UTC (permalink / raw)
  To: Scott Feldman, Roopa Prabhu
  Cc: Jiri Pirko, Netdev, David S. Miller, nhorman@tuxdriver.com,
	Andy Gospodarek, Thomas Graf, dborkman@redhat.com,
	ogerlitz@mellanox.com, jesse@nicira.com, pshelar@nicira.com,
	azhou@nicira.com, ben@decadent.org.uk, stephen@networkplumber.org,
	Kirsher, Jeffrey T, vyasevic@redhat.com, Cong Wang,
	Fastabend, John R, Eric Dumazet, Florian Fainelli, John Linville,
	jasowang@redhat.com
In-Reply-To: <CAE4R7bCfi-90pccDJN350bdeQi2R4H5Xbu3-BADZ29vWhfkJiA@mail.gmail.com>

On 11/28/14 05:51, Scott Feldman wrote:
> On Fri, Nov 28, 2014 at 2:05 AM, Roopa Prabhu <roopa@cumulusnetworks.com> wrote:
>> On 11/25/14, 5:35 PM, Scott Feldman wrote:
>>>
>>>    The
>>> bridge driver or external STP process (msptd) is still controlling STP
>>> state for the port and processing the BPDUs.  When the state changes
>>> on the port, the bridge driver is letting HW know, that's it.
>>
>>
>> I understand that. In which case, we should not call it stp state.
>> It is just port state.
>
> Sure, call it port state but it takes on BR_STATE_xxx values which
> just so happen to correspond exactly to STP states.
>
>> And since it is yet another port attribute like port
>> priority,
>> we should be able to use the same api to offload it to hw just like the
>> other port attributes.
>
> Well it does...see ndo_bridge_setlink in bridge driver, br_setport
> where IFLA_BRPORT_STATE is handled...it calls br_set_port_state(),
> which calls into the swdev port driver.  That's for the case where
> user or external processing is setting STP state.  For the case where
> the bridge itself is managing the STP state, the bridge will make the
> same br_set_port_state() call to adjust the port state.
>

What Roopa is requesting for if i am not mistaken is the same issue i
raised earlier as well. We need an opaque way to set and get these
attributes. We cant afford an ndo ops per bridge or the next thing.
Its a port level issue - what it is depends on what the underlying
hardware does.

cheers,
jamal

^ permalink raw reply

* Re: [patch net-next v3 04/17] net: introduce generic switch devices support
From: Jamal Hadi Salim @ 2014-11-28 13:13 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: Thomas Graf, Scott Feldman, Netdev, David S. Miller,
	nhorman@tuxdriver.com, Andy Gospodarek, dborkman@redhat.com,
	ogerlitz@mellanox.com, jesse@nicira.com, pshelar@nicira.com,
	azhou@nicira.com, ben@decadent.org.uk, stephen@networkplumber.org,
	Kirsher, Jeffrey T, vyasevic@redhat.com, Cong Wang,
	Fastabend, John R, Eric Dumazet, Florian Fainelli, Roopa Prabhu,
	John Linville
In-Reply-To: <20141127135050.GD1877@nanopsycho.orion>

On 11/27/14 08:50, Jiri Pirko wrote:

> $ git grep offload net
> Wouldn't it be confusing to add this another different "offload". That's
> just confusing.
>
> I still like "switch" the best. If it passes packets around, it's a
> "switch", +-. Everybody understand what's going on if you use "switch".
> If you use "offload", everybody is confused...
>

Those are all *legitimate offloads* ;->
The macvlan one looks a little creepy. Perhaps we could eventually
merge all that stuff together with this effort.

cheers,
jamal

^ permalink raw reply

* Re: [patch net-next v4 14/21] bridge: add brport flags to dflt bridge_getlink
From: Jamal Hadi Salim @ 2014-11-28 13:07 UTC (permalink / raw)
  To: Scott Feldman
  Cc: Jiri Pirko, Netdev, David S. Miller, nhorman@tuxdriver.com,
	Andy Gospodarek, Thomas Graf, dborkman@redhat.com,
	ogerlitz@mellanox.com, jesse@nicira.com, pshelar@nicira.com,
	azhou@nicira.com, ben@decadent.org.uk, stephen@networkplumber.org,
	Kirsher, Jeffrey T, vyasevic@redhat.com, Cong Wang,
	Fastabend, John R, Eric Dumazet, Florian Fainelli, Roopa Prabhu,
	John Linville
In-Reply-To: <CAE4R7bBqAg0CoEcz0s6MKnCH60YEk3EMDD875H_p4GtRhGMGUQ@mail.gmail.com>

On 11/27/14 15:46, Scott Feldman wrote:
> On Thu, Nov 27, 2014 at 3:17 AM, Jamal Hadi Salim <jhs@mojatatu.com> wrote:

>
> For RTM_GETLINK, rtnl_bridge_getlink() calls ndo_bridge_getlink twice
> for each dev, once on bridge and second time on dev.  Each call adds
> an RTM_NEWLINK to skb.  For the ndo_bridge_getlink() call to bridge,
> the MASTER port flags are filled in using br_port_fill_attr().  For
> the second ndo_bridge_getlink() call to dev, the port driver calls
> ndo_dflt_bridge_getlink() which fills in the SELF port flags.  Before
> this patch, ndo_dflt_bridge_getlink() was only filling in hwmode.
>
> Whew, in any case, I think you'll agree this code needs a refactoring
> down the road.  This change is just the bare minimum building on
> what's there to get SELF port flags up to user-space.  A refactoring
> effort should get the port drivers out of parsing/filling netlink msg
> and leave that to the core code in rtnetlink.c.  That way we can have
> one place for policy checks and one place for fill.  I think this
> refactoring effort should be left out in this patch series, otherwise
> this is going to drag on into the next year.
>

I am fine with that. At minimal  br_port_fill_attr() is reusable.

There's a lot of stuff i wish would be "fixed" - one is clearly
not abusing a u8 just to send one bit to the kernel. You just
added one more horn of that sort with the sync learning.
I wish i had time to clean it up. In any case:

Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>

cheers,
jamal

^ permalink raw reply

* [PATCH] cxgb4: Fill in supported link mode for SFP modules
From: Hariprasad Shenai @ 2014-11-28 13:05 UTC (permalink / raw)
  To: netdev; +Cc: davem, leedom, nirranjan, Hariprasad Shenai

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
---
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c |    8 ++++++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
index 605eaab..279275d 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
@@ -2362,9 +2362,13 @@ static unsigned int from_fw_linkcaps(unsigned int type, unsigned int caps)
 		     SUPPORTED_10000baseKR_Full | SUPPORTED_1000baseKX_Full |
 		     SUPPORTED_10000baseKX4_Full;
 	else if (type == FW_PORT_TYPE_FIBER_XFI ||
-		 type == FW_PORT_TYPE_FIBER_XAUI || type == FW_PORT_TYPE_SFP)
+		 type == FW_PORT_TYPE_FIBER_XAUI || type == FW_PORT_TYPE_SFP) {
 		v |= SUPPORTED_FIBRE;
-	else if (type == FW_PORT_TYPE_BP40_BA)
+		if (caps & FW_PORT_CAP_SPEED_1G)
+			v |= SUPPORTED_1000baseT_Full;
+		if (caps & FW_PORT_CAP_SPEED_10G)
+			v |= SUPPORTED_10000baseT_Full;
+	} else if (type == FW_PORT_TYPE_BP40_BA)
 		v |= SUPPORTED_40000baseSR4_Full;
 
 	if (caps & FW_PORT_CAP_ANEG)
-- 
1.7.1

^ permalink raw reply related

* Re: [patch net-next v3 02/17] net: make vid as a parameter for ndo_fdb_add/ndo_fdb_del
From: Jamal Hadi Salim @ 2014-11-28 12:57 UTC (permalink / raw)
  To: Jiri Pirko, Scott Feldman
  Cc: John Fastabend, Netdev, David S. Miller, nhorman@tuxdriver.com,
	Andy Gospodarek, Thomas Graf, dborkman@redhat.com,
	ogerlitz@mellanox.com, jesse@nicira.com, pshelar@nicira.com,
	azhou@nicira.com, ben@decadent.org.uk, stephen@networkplumber.org,
	Kirsher, Jeffrey T, vyasevic@redhat.com, Cong Wang, Eric Dumazet,
	Florian Fainelli, Roopa Prabhu, John Linville,
	"jasowang@redhat.com" <jasow
In-Reply-To: <20141127215529.GA17784@nanopsycho.orion>

On 11/27/14 16:55, Jiri Pirko wrote:
> Thu, Nov 27, 2014 at 09:59:37PM CET, sfeldma@gmail.com wrote:
>> Ya right now the driver just doesn't call br_fdb_external_learn_add()
>> if LEARNING_SYNC is not set.  It's a port driver setting so it seems
>> fine to handle it in the port driver.  You could move the check up to
>> br_fdb_external_learn_add(), but then you have an extra call every 1s
>> for each fdb entry being refreshed.  (1s or whatever the refresh
>> frequency is).  Easier to avoid this overhead and make the decision at
>> the source.
>
> I have been thinking about moving the check into bridge code, it to make
> it there as well as in drivers. This is easily changeable on demenad
> later though, so I left this for now.
>

It seems more comfortable to move to the core if you are doing polling
with timers... i.e you kick it per-offload via some timer. The arming
being done by the setting of LEARNING_SYNC
There are cases where all this is done via interrupts. i.e the hardware
will issue an interrupt only then do you poll..

Maybe Scott's approach is the correct one. I am indifferent to be
honest, these are some of those things that can be easily refactored
as more hardware shows up.

cheers,
jamal

^ permalink raw reply

* Re: [patch net-next v4 00/21] introduce rocker switch driver with hardware accelerated datapath api - phase 1: bridge fdb offload
From: Jiri Pirko @ 2014-11-28 12:07 UTC (permalink / raw)
  To: Scott Feldman
  Cc: Netdev, David S. Miller, nhorman@tuxdriver.com, Andy Gospodarek,
	Thomas Graf, dborkman@redhat.com, ogerlitz@mellanox.com,
	jesse@nicira.com, pshelar@nicira.com, azhou@nicira.com,
	ben@decadent.org.uk, stephen@networkplumber.org,
	Kirsher, Jeffrey T, vyasevic@redhat.com, Cong Wang,
	Fastabend, John R, Eric Dumazet, Jamal Hadi Salim,
	Florian Fainelli, Roopa Prabhu, John Linville
In-Reply-To: <CAE4R7bDNbyfxx+XExmej8-p=E3X4cHc+Sr75S=aO67XMYcmU8g@mail.gmail.com>

Fri, Nov 28, 2014 at 12:59:17PM CET, sfeldma@gmail.com wrote:
>Thanks Jiri for coordinating this effort.  It looks like we're close
>to convergence on this first patch set.  I sifted thru review comments
>and compiled a list of items for follow-on work.  I hope it's OK to
>use etherpad for this:
>
>https://etherpad.wikimedia.org/p/netdev-swdev-todo
>
>Feel free to edit as needed and claim an item if you intend to work on
>it.  Don't turn the etherpad into a discussion board...try to keep it
>as list of work items.  I don't think any of these items should hold
>up current patch set.

Agree. Good idea putting the todo list on etherpad.

So I will post v5 just with the whitespace issue fixed (pointed out by
Sergei Shtylyov).

Thanks.

>
>-scott
>
>On Thu, Nov 27, 2014 at 2:40 AM, Jiri Pirko <jiri@resnulli.us> wrote:
>> Hi all.
>>
>> This patchset is just the first phase of switch and switch-ish device
>> support api in kernel. Note that the api will extend.
>>
>> So what this patchset includes:
>> - introduce switchdev api skeleton for implementing switch drivers
>> - introduce rocker switch driver which implements switchdev api fdb and
>>   bridge set/get link ndos
>>
>> As to the discussion if there is need to have specific class of device
>> representing the switch itself, so far we found no need to introduce that.
>> But we are generally ok with the idea and when the time comes and it will
>> be needed, it can be easily introduced without any disturbance.
>>
>> This patchset introduces switch id export through rtnetlink and sysfs,
>> which is similar to what we have for port id in SR-IOV. I will send iproute2
>> patchset for showing the switch id for port netdevs once this is applied.
>> This applies also for the PF_BRIDGE and fdb iproute2 patches.
>>
>> iproute2 patches are now available here:
>> https://github.com/jpirko/iproute2-rocker
>>
>> For detailed description and version history, please see individual patches.
>>
>> In v4 I reordered the patches leaving rocker patches on the end of the patchset.
>>
>> Jiri Pirko (10):
>>   bridge: rename fdb_*_hw to fdb_*_hw_addr to avoid confusion
>>   neigh: sort Neighbor Cache Entry Flags
>>   bridge: convert flags in fbd entry into bitfields
>>   net: make vid as a parameter for ndo_fdb_add/ndo_fdb_del
>>   net: rename netdev_phys_port_id to more generic name
>>   net: introduce generic switch devices support
>>   rtnl: expose physical switch id for particular device
>>   net-sysfs: expose physical switch id for particular device
>>   rocker: introduce rocker switch driver
>>   rocker: implement ndo_fdb_dump
>>
>> Scott Feldman (9):
>>   bridge: call netdev_sw_port_stp_update when bridge port STP status
>>     changes
>>   bridge: add API to notify bridge driver of learned FBD on offloaded
>>     device
>>   bridge: move private brport flags to if_bridge.h so port drivers can
>>     use flags
>>   bridge: add new brport flag LEARNING_SYNC
>>   bridge: add new hwmode swdev
>>   bridge: add brport flags to dflt bridge_getlink
>>   rocker: implement rocker ofdpa flow table manipulation
>>   rocker: implement L2 bridge offloading
>>   rocker: add ndo_bridge_setlink/getlink support for learning policy
>>
>> Thomas Graf (2):
>>   rocker: Add proper validation of Netlink attributes
>>   rocker: Use logical operators on booleans
>>
>>  Documentation/ABI/testing/sysfs-class-net        |    8 +
>>  Documentation/networking/switchdev.txt           |   59 +
>>  MAINTAINERS                                      |   14 +
>>  drivers/net/ethernet/Kconfig                     |    1 +
>>  drivers/net/ethernet/Makefile                    |    1 +
>>  drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c |    2 +-
>>  drivers/net/ethernet/emulex/benet/be_main.c      |    3 +-
>>  drivers/net/ethernet/intel/i40e/i40e_main.c      |    4 +-
>>  drivers/net/ethernet/intel/ixgbe/ixgbe_main.c    |    6 +-
>>  drivers/net/ethernet/mellanox/mlx4/en_netdev.c   |    2 +-
>>  drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c |   11 +-
>>  drivers/net/ethernet/rocker/Kconfig              |   27 +
>>  drivers/net/ethernet/rocker/Makefile             |    5 +
>>  drivers/net/ethernet/rocker/rocker.c             | 4374 ++++++++++++++++++++++
>>  drivers/net/ethernet/rocker/rocker.h             |  428 +++
>>  drivers/net/macvlan.c                            |    4 +-
>>  drivers/net/vxlan.c                              |    4 +-
>>  include/linux/if_bridge.h                        |   31 +
>>  include/linux/netdevice.h                        |   39 +-
>>  include/linux/rtnetlink.h                        |    9 +-
>>  include/net/switchdev.h                          |   37 +
>>  include/uapi/linux/if_bridge.h                   |    1 +
>>  include/uapi/linux/if_link.h                     |    2 +
>>  include/uapi/linux/neighbour.h                   |    6 +-
>>  net/Kconfig                                      |    1 +
>>  net/Makefile                                     |    3 +
>>  net/bridge/br_fdb.c                              |  144 +-
>>  net/bridge/br_private.h                          |   21 +-
>>  net/bridge/br_stp.c                              |    7 +
>>  net/core/dev.c                                   |    2 +-
>>  net/core/net-sysfs.c                             |   26 +-
>>  net/core/rtnetlink.c                             |  119 +-
>>  net/switchdev/Kconfig                            |   13 +
>>  net/switchdev/Makefile                           |    5 +
>>  net/switchdev/switchdev.c                        |   52 +
>>  35 files changed, 5366 insertions(+), 105 deletions(-)
>>  create mode 100644 Documentation/networking/switchdev.txt
>>  create mode 100644 drivers/net/ethernet/rocker/Kconfig
>>  create mode 100644 drivers/net/ethernet/rocker/Makefile
>>  create mode 100644 drivers/net/ethernet/rocker/rocker.c
>>  create mode 100644 drivers/net/ethernet/rocker/rocker.h
>>  create mode 100644 include/net/switchdev.h
>>  create mode 100644 net/switchdev/Kconfig
>>  create mode 100644 net/switchdev/Makefile
>>  create mode 100644 net/switchdev/switchdev.c
>>
>> --
>> 1.9.3
>>

^ permalink raw reply

* Re: [patch net-next v4 00/21] introduce rocker switch driver with hardware accelerated datapath api - phase 1: bridge fdb offload
From: Scott Feldman @ 2014-11-28 11:59 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: Netdev, David S. Miller, nhorman@tuxdriver.com, Andy Gospodarek,
	Thomas Graf, dborkman@redhat.com, ogerlitz@mellanox.com,
	jesse@nicira.com, pshelar@nicira.com, azhou@nicira.com,
	ben@decadent.org.uk, stephen@networkplumber.org,
	Kirsher, Jeffrey T, vyasevic@redhat.com, Cong Wang,
	Fastabend, John R, Eric Dumazet, Jamal Hadi Salim,
	Florian Fainelli, Roopa Prabhu, John Linville
In-Reply-To: <1417084826-9875-1-git-send-email-jiri@resnulli.us>

Thanks Jiri for coordinating this effort.  It looks like we're close
to convergence on this first patch set.  I sifted thru review comments
and compiled a list of items for follow-on work.  I hope it's OK to
use etherpad for this:

https://etherpad.wikimedia.org/p/netdev-swdev-todo

Feel free to edit as needed and claim an item if you intend to work on
it.  Don't turn the etherpad into a discussion board...try to keep it
as list of work items.  I don't think any of these items should hold
up current patch set.

-scott

On Thu, Nov 27, 2014 at 2:40 AM, Jiri Pirko <jiri@resnulli.us> wrote:
> Hi all.
>
> This patchset is just the first phase of switch and switch-ish device
> support api in kernel. Note that the api will extend.
>
> So what this patchset includes:
> - introduce switchdev api skeleton for implementing switch drivers
> - introduce rocker switch driver which implements switchdev api fdb and
>   bridge set/get link ndos
>
> As to the discussion if there is need to have specific class of device
> representing the switch itself, so far we found no need to introduce that.
> But we are generally ok with the idea and when the time comes and it will
> be needed, it can be easily introduced without any disturbance.
>
> This patchset introduces switch id export through rtnetlink and sysfs,
> which is similar to what we have for port id in SR-IOV. I will send iproute2
> patchset for showing the switch id for port netdevs once this is applied.
> This applies also for the PF_BRIDGE and fdb iproute2 patches.
>
> iproute2 patches are now available here:
> https://github.com/jpirko/iproute2-rocker
>
> For detailed description and version history, please see individual patches.
>
> In v4 I reordered the patches leaving rocker patches on the end of the patchset.
>
> Jiri Pirko (10):
>   bridge: rename fdb_*_hw to fdb_*_hw_addr to avoid confusion
>   neigh: sort Neighbor Cache Entry Flags
>   bridge: convert flags in fbd entry into bitfields
>   net: make vid as a parameter for ndo_fdb_add/ndo_fdb_del
>   net: rename netdev_phys_port_id to more generic name
>   net: introduce generic switch devices support
>   rtnl: expose physical switch id for particular device
>   net-sysfs: expose physical switch id for particular device
>   rocker: introduce rocker switch driver
>   rocker: implement ndo_fdb_dump
>
> Scott Feldman (9):
>   bridge: call netdev_sw_port_stp_update when bridge port STP status
>     changes
>   bridge: add API to notify bridge driver of learned FBD on offloaded
>     device
>   bridge: move private brport flags to if_bridge.h so port drivers can
>     use flags
>   bridge: add new brport flag LEARNING_SYNC
>   bridge: add new hwmode swdev
>   bridge: add brport flags to dflt bridge_getlink
>   rocker: implement rocker ofdpa flow table manipulation
>   rocker: implement L2 bridge offloading
>   rocker: add ndo_bridge_setlink/getlink support for learning policy
>
> Thomas Graf (2):
>   rocker: Add proper validation of Netlink attributes
>   rocker: Use logical operators on booleans
>
>  Documentation/ABI/testing/sysfs-class-net        |    8 +
>  Documentation/networking/switchdev.txt           |   59 +
>  MAINTAINERS                                      |   14 +
>  drivers/net/ethernet/Kconfig                     |    1 +
>  drivers/net/ethernet/Makefile                    |    1 +
>  drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c |    2 +-
>  drivers/net/ethernet/emulex/benet/be_main.c      |    3 +-
>  drivers/net/ethernet/intel/i40e/i40e_main.c      |    4 +-
>  drivers/net/ethernet/intel/ixgbe/ixgbe_main.c    |    6 +-
>  drivers/net/ethernet/mellanox/mlx4/en_netdev.c   |    2 +-
>  drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c |   11 +-
>  drivers/net/ethernet/rocker/Kconfig              |   27 +
>  drivers/net/ethernet/rocker/Makefile             |    5 +
>  drivers/net/ethernet/rocker/rocker.c             | 4374 ++++++++++++++++++++++
>  drivers/net/ethernet/rocker/rocker.h             |  428 +++
>  drivers/net/macvlan.c                            |    4 +-
>  drivers/net/vxlan.c                              |    4 +-
>  include/linux/if_bridge.h                        |   31 +
>  include/linux/netdevice.h                        |   39 +-
>  include/linux/rtnetlink.h                        |    9 +-
>  include/net/switchdev.h                          |   37 +
>  include/uapi/linux/if_bridge.h                   |    1 +
>  include/uapi/linux/if_link.h                     |    2 +
>  include/uapi/linux/neighbour.h                   |    6 +-
>  net/Kconfig                                      |    1 +
>  net/Makefile                                     |    3 +
>  net/bridge/br_fdb.c                              |  144 +-
>  net/bridge/br_private.h                          |   21 +-
>  net/bridge/br_stp.c                              |    7 +
>  net/core/dev.c                                   |    2 +-
>  net/core/net-sysfs.c                             |   26 +-
>  net/core/rtnetlink.c                             |  119 +-
>  net/switchdev/Kconfig                            |   13 +
>  net/switchdev/Makefile                           |    5 +
>  net/switchdev/switchdev.c                        |   52 +
>  35 files changed, 5366 insertions(+), 105 deletions(-)
>  create mode 100644 Documentation/networking/switchdev.txt
>  create mode 100644 drivers/net/ethernet/rocker/Kconfig
>  create mode 100644 drivers/net/ethernet/rocker/Makefile
>  create mode 100644 drivers/net/ethernet/rocker/rocker.c
>  create mode 100644 drivers/net/ethernet/rocker/rocker.h
>  create mode 100644 include/net/switchdev.h
>  create mode 100644 net/switchdev/Kconfig
>  create mode 100644 net/switchdev/Makefile
>  create mode 100644 net/switchdev/switchdev.c
>
> --
> 1.9.3
>

^ permalink raw reply

* Re: [patch net-next v3 08/17] bridge: call netdev_sw_port_stp_update when bridge port STP status changes
From: Scott Feldman @ 2014-11-28 10:51 UTC (permalink / raw)
  To: Roopa Prabhu
  Cc: Jiri Pirko, Netdev, David S. Miller, nhorman@tuxdriver.com,
	Andy Gospodarek, Thomas Graf, dborkman@redhat.com,
	ogerlitz@mellanox.com, jesse@nicira.com, pshelar@nicira.com,
	azhou@nicira.com, ben@decadent.org.uk, stephen@networkplumber.org,
	Kirsher, Jeffrey T, vyasevic@redhat.com, Cong Wang,
	Fastabend, John R, Eric Dumazet, Jamal Hadi Salim,
	Florian Fainelli, John Linville
In-Reply-To: <547848D8.1060203@cumulusnetworks.com>

On Fri, Nov 28, 2014 at 2:05 AM, Roopa Prabhu <roopa@cumulusnetworks.com> wrote:
> On 11/25/14, 5:35 PM, Scott Feldman wrote:
>>
>>   The
>> bridge driver or external STP process (msptd) is still controlling STP
>> state for the port and processing the BPDUs.  When the state changes
>> on the port, the bridge driver is letting HW know, that's it.
>
>
> I understand that. In which case, we should not call it stp state.
> It is just port state.

Sure, call it port state but it takes on BR_STATE_xxx values which
just so happen to correspond exactly to STP states.

> And since it is yet another port attribute like port
> priority,
> we should be able to use the same api to offload it to hw just like the
> other port attributes.

Well it does...see ndo_bridge_setlink in bridge driver, br_setport
where IFLA_BRPORT_STATE is handled...it calls br_set_port_state(),
which calls into the swdev port driver.  That's for the case where
user or external processing is setting STP state.  For the case where
the bridge itself is managing the STP state, the bridge will make the
same br_set_port_state() call to adjust the port state.

> And, Thats why i tried to generalize all bridge port attribute set by
> introducing one generic netdev_sw_port_set_attr api.
> https://marc.info/?l=linux-netdev&m=141661018619712&w=2
>
>
> And coming back to my original comment in this thread,
> the port state should be offloaded to hw only when the swdev mode (or hw
> offload mode ;) is set.
>
>
>>   If the
>> port driver can't do anything with that notification, then it should
>> not implement ndo_switch_port_stp_update.  If it does implement
>> ndo_switch_port_stp_update, then it can adjust its HW (e.g. disable
>> port if BR_DISABLED, etc), and return err code if somehow it failed
>> while adjusting HW.
>>
>> This is not offloading STP state ctrl plane to HW.  The ctrl plane is
>> kept in bridge driver (or mstpd) in SW.  HW stays dumb in this model.
>> The bridge currently has policy control to turn on/off STP per bridge
>> and a netlink hook for external processes to change STP state.
>>
>> On Tue, Nov 25, 2014 at 12:48 PM, Roopa Prabhu
>> <roopa@cumulusnetworks.com> wrote:
>>>
>>> On 11/25/14, 2:28 AM, Jiri Pirko wrote:
>>>>
>>>> From: Scott Feldman <sfeldma@gmail.com>
>>>>
>>>> To notify switch driver of change in STP state of bridge port, add new
>>>> .ndo op and provide switchdev wrapper func to call ndo op. Use it in
>>>> bridge
>>>> code then.
>>>>
>>>> Signed-off-by: Scott Feldman <sfeldma@gmail.com>
>>>> Signed-off-by: Jiri Pirko <jiri@resnulli.us>
>>>> ---
>>>> v2->v3:
>>>> -changed "sw" string to "switch" to avoid confusion
>>>> v1->v2:
>>>> -no change
>>>> ---
>>>>    include/linux/netdevice.h |  5 +++++
>>>>    include/net/switchdev.h   |  7 +++++++
>>>>    net/bridge/br_stp.c       |  2 ++
>>>>    net/switchdev/switchdev.c | 19 +++++++++++++++++++
>>>>    4 files changed, 33 insertions(+)
>>>>
>>>> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>>>> index ce096dc..66cb64e 100644
>>>> --- a/include/linux/netdevice.h
>>>> +++ b/include/linux/netdevice.h
>>>> @@ -1024,6 +1024,9 @@ typedef u16 (*select_queue_fallback_t)(struct
>>>> net_device *dev,
>>>>     *    Called to get an ID of the switch chip this port is part of.
>>>>     *    If driver implements this, it indicates that it represents a
>>>> port
>>>>     *    of a switch chip.
>>>> + * int (*ndo_switch_port_stp_update)(struct net_device *dev, u8 state);
>>>> + *     Called to notify switch device port of bridge port STP
>>>> + *     state change.
>>>>     */
>>>>    struct net_device_ops {
>>>>          int                     (*ndo_init)(struct net_device *dev);
>>>> @@ -1180,6 +1183,8 @@ struct net_device_ops {
>>>>    #ifdef CONFIG_NET_SWITCHDEV
>>>>          int                     (*ndo_switch_parent_id_get)(struct
>>>> net_device *dev,
>>>>                                                              struct
>>>> netdev_phys_item_id *psid);
>>>> +       int                     (*ndo_switch_port_stp_update)(struct
>>>> net_device *dev,
>>>> +                                                             u8 state);
>>>>    #endif
>>>>    };
>>>>    diff --git a/include/net/switchdev.h b/include/net/switchdev.h
>>>> index 7a52360..8a6d164 100644
>>>> --- a/include/net/switchdev.h
>>>> +++ b/include/net/switchdev.h
>>>> @@ -16,6 +16,7 @@
>>>>      int netdev_switch_parent_id_get(struct net_device *dev,
>>>>                                  struct netdev_phys_item_id *psid);
>>>> +int netdev_switch_port_stp_update(struct net_device *dev, u8 state);
>>>>      #else
>>>>    @@ -25,6 +26,12 @@ static inline int
>>>> netdev_switch_parent_id_get(struct
>>>> net_device *dev,
>>>>          return -EOPNOTSUPP;
>>>>    }
>>>>    +static inline int netdev_switch_port_stp_update(struct net_device
>>>> *dev,
>>>> +                                               u8 state)
>>>> +{
>>>> +       return -EOPNOTSUPP;
>>>> +}
>>>> +
>>>>    #endif
>>>>      #endif /* _LINUX_SWITCHDEV_H_ */
>>>> diff --git a/net/bridge/br_stp.c b/net/bridge/br_stp.c
>>>> index 2b047bc..35e016c 100644
>>>> --- a/net/bridge/br_stp.c
>>>> +++ b/net/bridge/br_stp.c
>>>> @@ -12,6 +12,7 @@
>>>>     */
>>>>    #include <linux/kernel.h>
>>>>    #include <linux/rculist.h>
>>>> +#include <net/switchdev.h>
>>>>      #include "br_private.h"
>>>>    #include "br_private_stp.h"
>>>> @@ -39,6 +40,7 @@ void br_log_state(const struct net_bridge_port *p)
>>>>    void br_set_state(struct net_bridge_port *p, unsigned int state)
>>>>    {
>>>>          p->state = state;
>>>> +       netdev_switch_port_stp_update(p->dev, state);
>>>>    }
>>>>      /* called under bridge lock */
>>>> diff --git a/net/switchdev/switchdev.c b/net/switchdev/switchdev.c
>>>> index 66973de..d162b21 100644
>>>> --- a/net/switchdev/switchdev.c
>>>> +++ b/net/switchdev/switchdev.c
>>>> @@ -31,3 +31,22 @@ int netdev_switch_parent_id_get(struct net_device
>>>> *dev,
>>>>          return ops->ndo_switch_parent_id_get(dev, psid);
>>>>    }
>>>>    EXPORT_SYMBOL(netdev_switch_parent_id_get);
>>>> +
>>>> +/**
>>>> + *     netdev_switch_port_stp_update - Notify switch device port of STP
>>>> + *                                     state change
>>>> + *     @dev: port device
>>>> + *     @state: port STP state
>>>> + *
>>>> + *     Notify switch device port of bridge port STP state change.
>>>> + */
>>>> +int netdev_switch_port_stp_update(struct net_device *dev, u8 state)
>>>> +{
>>>> +       const struct net_device_ops *ops = dev->netdev_ops;
>>>> +
>>>> +       if (!ops->ndo_switch_port_stp_update)
>>>> +               return -EOPNOTSUPP;
>>>> +       WARN_ON(!ops->ndo_switch_parent_id_get);
>>>> +       return ops->ndo_switch_port_stp_update(dev, state);
>>>> +}
>>>> +EXPORT_SYMBOL(netdev_switch_port_stp_update);
>>>
>>>
>>> This should also check  if offload is enabled on the bridge/port ?
>>>
>

^ permalink raw reply

* RE: [PATCH iproute2] ip xfrm: support 64bit kernel and 32bit userspace
From: David Laight @ 2014-11-28 10:47 UTC (permalink / raw)
  To: 'roy.qing.li@gmail.com', netdev@vger.kernel.org
In-Reply-To: <1417157936-5522-1-git-send-email-roy.qing.li@gmail.com>

From: roy.qing.li@gmail.com
> From: Li RongQing <roy.qing.li@gmail.com>
> 
> The size of struct xfrm_userpolicy_info is 168 bytes for 64bit kernel, and
> 164 bytes for 32bit userspace because of the different alignment. and lead
> to "ip xfrm" be unable to work.
> 
> add a pad in struct xfrm_userpolicy_info, and enable it by set
> KERNEL_64_USERSPACE_32 to y
...
> diff --git a/include/linux/xfrm.h b/include/linux/xfrm.h
> index fa2ecb2..009510c 100644
> --- a/include/linux/xfrm.h
> +++ b/include/linux/xfrm.h
> @@ -407,6 +407,9 @@ struct xfrm_userpolicy_info {
>  	/* Automatically expand selector to include matching ICMP payloads. */
>  #define XFRM_POLICY_ICMP	2
>  	__u8				share;
> +#ifdef KNL_64_US_32
> +	int				pad;
> +#endif
>  };

You could add __attribute__((aligned(8))) to the structure to pad it on
all systems.

Or find the 64bit member and add __attribute__((aligned(4))) to that member
so that the 64bit version doesn't get padded.
That will make the field more expensive to access on systems that can't
do misaligned memory transfers.

	David

^ permalink raw reply

* Re: [patch net-next v3 02/17] net: make vid as a parameter for ndo_fdb_add/ndo_fdb_del
From: Scott Feldman @ 2014-11-28 10:33 UTC (permalink / raw)
  To: Roopa Prabhu
  Cc: Jamal Hadi Salim, John Fastabend, Jiri Pirko, Netdev,
	David S. Miller, nhorman@tuxdriver.com, Andy Gospodarek,
	Thomas Graf, dborkman@redhat.com, ogerlitz@mellanox.com,
	jesse@nicira.com, pshelar@nicira.com, azhou@nicira.com,
	ben@decadent.org.uk, stephen@networkplumber.org,
	Kirsher, Jeffrey T, vyasevic@redhat.com, Cong Wang, Eric Dumazet,
	Florian Fainelli, John Linville, "j
In-Reply-To: <54784B1D.8090508@cumulusnetworks.com>

On Fri, Nov 28, 2014 at 2:14 AM, Roopa Prabhu <roopa@cumulusnetworks.com> wrote:
> On 11/25/14, 6:36 PM, Scott Feldman wrote:
>>
>> On Tue, Nov 25, 2014 at 6:50 AM, Jamal Hadi Salim <jhs@mojatatu.com>
>> wrote:
>>>
>>> On 11/25/14 11:30, John Fastabend wrote:
>>>>
>>>> On 11/25/2014 08:18 AM, Jamal Hadi Salim wrote:
>>>>>
>>>>> On 11/25/14 11:01, John Fastabend wrote:
>>>>>>
>>>>>> On 11/25/2014 07:38 AM, Jamal Hadi Salim wrote:
>>>>>>>
>>>>>>> On 11/25/14 05:28, Jiri Pirko wrote:
>>>>>>>>
>>>>>>>> Do the work of parsing NDA_VLAN directly in rtnetlink code, pass
>>>>>>>> simple
>>>>>>>> u16 vid to drivers from there.
>>>>>>>>
>>>
>>>> Actually (after having some coffee) this becomes much more useful
>>>> if you return which items failed. Then you can slam the hardware
>>>> with your 100 entries, probably a lot more then that, and come back
>>>> later and clean it up.
>>>>
>>> Yes, that is the general use case.
>>> Unfortunately at the moment we only return codes on a netlink set
>>> direction - but would be a beauty if we could return what succeeded
>>> and didnt in some form of vector.
>>> Note: all is not lost because you can always do a get afterwards and
>>> find what is missing if you got a return code of "partial success".
>>> Just a little less efficient..
>>>
>>>
>>>> We return a bitmask of which operations were successful. So if SW fails
>>>> we have both bits cleared and we abort. When SW is successful we set the
>>>> SW bit and try to program the HW. If its sucessful we set the HW bit if
>>>> its not we abort with an err. Converting this to (1) is not much work
>>>> just skip the abort.
>>>>
>>> Ok, guess i am gonna have to go stare at the code some more.
>>> I thought we returned one of the error codes?
>>> A bitmask would work for a single entry - because you have two
>>> options add to h/ware and/or s/ware. So response is easy to encode.
>>> But if i have 1000 and they are sparsely populated (think an indexed
>>> table and i have indices 1, 23, 45, etc), then a bitmask would be
>>> hard to use.
>>
>> I'm confused by this discussion.  Do I have this right: You want to
>> send 1000 RTM_NEWNEIGHs to PF_BRIDGE with both NTF_MASTER and NTF_SELF
>> set such that 1000 new FBD entries are installed in both (SW) the
>> bridge's FDB and (HW) the port driver's FDB.  My first confusion is
>> why do you want these FBD entries in bridge's FDB?  We're offloading
>> the switching to HW so HW should be handling fwd plane.  If ctrl pkt
>> make it to SW, it can learn those FDB entries; no need for manual
>> install of FDB entry in SW.  It seems to me you only want to use
>> NTF_SELF to install the FDB entry in HW using the port driver.  And an
>> error code is returned for that install.  Since there is only one
>> target (NTF_SELF) there is no need for bitmask return.
>>
> scott, we do have such usecase today. ie , a fdb entry with both NTF_MASTER
> and NTF_SELF set.
> And these fdb entries can come from an external controller. The path to get
> them to the hw is via the kernel.
> The controller can use `bridge fdb add` to add the fdb entries to the kernel
> (with NTF_MASTER) and also indicate in the same message to add the fdb entry
> to hw (with NTF_SELF). And in this model it is assumed that the kernel fdb
> and hw fdb are in sync.

Ya, I understood that from Jamal's explanation.

^ permalink raw reply

* Re: Is this 32-bit NCM?
From: Enrico Mioso @ 2014-11-28 10:27 UTC (permalink / raw)
  To: Bjørn Mork
  Cc: Alex Strizhevsky, ShaojunMidge.Tan, Mingying.Zhu, youtux,
	linux-usb, netdev, Eli.Britstein
In-Reply-To: <87ppc71xne.fsf@nemi.mork.no>

... and what do you think regarding the last arp packet Kevin shared with us 
after having changed the remainder fixed value.

^ permalink raw reply

* Re: [PATCH net-next] bridge: add vlan id to mdb notifications
From: Roopa Prabhu @ 2014-11-28 10:18 UTC (permalink / raw)
  To: Thomas Graf
  Cc: Stephen Hemminger, vyasevich, netdev, wkok, gospo, jtoppins,
	sashok, Shrijeet Mukherjee
In-Reply-To: <20141126215327.GC13786@casper.infradead.org>

On 11/26/14, 1:53 PM, Thomas Graf wrote:
> On 11/26/14 at 12:40pm, Roopa Prabhu wrote:
>> I have always wondered, if binary netlink attributes have this restriction,
>> they should be discouraged. especially when the other extensible option is
>> to add them as a separate netlink attribute.
> This is pretty much exactly the reason why attributes have been
> introduced ;-)

thanks for confirming that thomas :). Will resubmit the patch with vlan 
information as a separate attribute.

^ permalink raw reply

* Re: [patch net-next v3 02/17] net: make vid as a parameter for ndo_fdb_add/ndo_fdb_del
From: Roopa Prabhu @ 2014-11-28 10:14 UTC (permalink / raw)
  To: Scott Feldman
  Cc: Jamal Hadi Salim, John Fastabend, Jiri Pirko, Netdev,
	David S. Miller, nhorman@tuxdriver.com, Andy Gospodarek,
	Thomas Graf, dborkman@redhat.com, ogerlitz@mellanox.com,
	jesse@nicira.com, pshelar@nicira.com, azhou@nicira.com,
	ben@decadent.org.uk, stephen@networkplumber.org,
	Kirsher, Jeffrey T, vyasevic@redhat.com, Cong Wang, Eric Dumazet,
	Florian Fainelli, John Linville, "j
In-Reply-To: <CAE4R7bD1Hi+fnKp-Sg6Tn4AcpOu2pwgEYecP8LVA1ioO4FkgRQ@mail.gmail.com>

On 11/25/14, 6:36 PM, Scott Feldman wrote:
> On Tue, Nov 25, 2014 at 6:50 AM, Jamal Hadi Salim <jhs@mojatatu.com> wrote:
>> On 11/25/14 11:30, John Fastabend wrote:
>>> On 11/25/2014 08:18 AM, Jamal Hadi Salim wrote:
>>>> On 11/25/14 11:01, John Fastabend wrote:
>>>>> On 11/25/2014 07:38 AM, Jamal Hadi Salim wrote:
>>>>>> On 11/25/14 05:28, Jiri Pirko wrote:
>>>>>>> Do the work of parsing NDA_VLAN directly in rtnetlink code, pass
>>>>>>> simple
>>>>>>> u16 vid to drivers from there.
>>>>>>>
>>
>>> Actually (after having some coffee) this becomes much more useful
>>> if you return which items failed. Then you can slam the hardware
>>> with your 100 entries, probably a lot more then that, and come back
>>> later and clean it up.
>>>
>> Yes, that is the general use case.
>> Unfortunately at the moment we only return codes on a netlink set
>> direction - but would be a beauty if we could return what succeeded
>> and didnt in some form of vector.
>> Note: all is not lost because you can always do a get afterwards and
>> find what is missing if you got a return code of "partial success".
>> Just a little less efficient..
>>
>>
>>> We return a bitmask of which operations were successful. So if SW fails
>>> we have both bits cleared and we abort. When SW is successful we set the
>>> SW bit and try to program the HW. If its sucessful we set the HW bit if
>>> its not we abort with an err. Converting this to (1) is not much work
>>> just skip the abort.
>>>
>> Ok, guess i am gonna have to go stare at the code some more.
>> I thought we returned one of the error codes?
>> A bitmask would work for a single entry - because you have two
>> options add to h/ware and/or s/ware. So response is easy to encode.
>> But if i have 1000 and they are sparsely populated (think an indexed
>> table and i have indices 1, 23, 45, etc), then a bitmask would be
>> hard to use.
> I'm confused by this discussion.  Do I have this right: You want to
> send 1000 RTM_NEWNEIGHs to PF_BRIDGE with both NTF_MASTER and NTF_SELF
> set such that 1000 new FBD entries are installed in both (SW) the
> bridge's FDB and (HW) the port driver's FDB.  My first confusion is
> why do you want these FBD entries in bridge's FDB?  We're offloading
> the switching to HW so HW should be handling fwd plane.  If ctrl pkt
> make it to SW, it can learn those FDB entries; no need for manual
> install of FDB entry in SW.  It seems to me you only want to use
> NTF_SELF to install the FDB entry in HW using the port driver.  And an
> error code is returned for that install.  Since there is only one
> target (NTF_SELF) there is no need for bitmask return.
>
scott, we do have such usecase today. ie , a fdb entry with both 
NTF_MASTER and NTF_SELF set.
And these fdb entries can come from an external controller. The path to 
get them to the hw is via the kernel.
The controller can use `bridge fdb add` to add the fdb entries to the 
kernel (with NTF_MASTER) and also indicate in the same message to add 
the fdb entry to hw (with NTF_SELF). And in this model it is assumed 
that the kernel fdb and hw fdb are in sync.

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox