netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next v1 0/9] bonding: remove vlan special handling
@ 2013-08-26 20:32 Veaceslav Falico
  2013-08-26 20:32 ` [PATCH net-next v1 1/9] net: add netdev_upper_get_next_dev(dev, iter) Veaceslav Falico
                   ` (8 more replies)
  0 siblings, 9 replies; 20+ messages in thread
From: Veaceslav Falico @ 2013-08-26 20:32 UTC (permalink / raw)
  To: netdev
  Cc: Jay Vosburgh, Andy Gospodarek, David S. Miller, Eric Dumazet,
	Jiri Pirko, Alexander Duyck, Cong Wang, Veaceslav Falico

v1: Per Jiri's advice, remove the exported netdev_upper struct to keep it
    inside dev.c only, and instead implement a macro to iterate over the
    list and return only net_device *.

The aim of this patchset is to remove bondings' own vlan handling as much
as possible and replace it with the netdev upper device functionality.

This is achieved by adding a helper function to upper dev list handling -
netdev_upper_get_next_dev(dev, iter), which returns the next device after
the list_head **iter, and sets *iter to the next list_head *. This patchset
also adds netdev_for_each_upper_dev(dev, upper, iter), which iterates
through the whole dev->upper_dev_list, setting upper to the net_device.
The only special treatment of vlans remains in rlb code.

This patchset solves several issues with bonding, simplifies it overall,
RCUify further and exports upper list functions for any other users which
might also want to get rid of its own vlan_lists.

I'm testing it continuously currently, no issues found, will update on
anything.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: "David S. Miller" <davem@davemloft.net>
CC: Eric Dumazet <edumazet@google.com>
CC: Jiri Pirko <jiri@resnulli.us>
CC: Alexander Duyck <alexander.h.duyck@intel.com>
CC: Cong Wang <amwang@redhat.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>

---
 drivers/net/bonding/bond_alb.c  |   75 ++++++------
 drivers/net/bonding/bond_alb.h  |    2 -
 drivers/net/bonding/bond_main.c |  244 ++++++++-------------------------------
 drivers/net/bonding/bonding.h   |   21 +++-
 include/linux/netdevice.h       |   10 ++
 net/core/dev.c                  |   25 ++++
 6 files changed, 133 insertions(+), 244 deletions(-)

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH net-next v1 1/9] net: add netdev_upper_get_next_dev(dev, iter)
  2013-08-26 20:32 [PATCH net-next v1 0/9] bonding: remove vlan special handling Veaceslav Falico
@ 2013-08-26 20:32 ` Veaceslav Falico
  2013-08-26 20:57   ` Jiri Pirko
  2013-08-26 20:32 ` [PATCH net-next v1 2/9] net: add netdev_for_each_upper_dev() Veaceslav Falico
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 20+ messages in thread
From: Veaceslav Falico @ 2013-08-26 20:32 UTC (permalink / raw)
  To: netdev
  Cc: Veaceslav Falico, David S. Miller, Eric Dumazet, Jiri Pirko,
	Alexander Duyck, Cong Wang

This function returns the next dev in the dev->upper_dev_list after the
struct list_head **iter position, and updates *iter accordingly. Returns
NULL if there are no devices left.

v1: new patch

CC: "David S. Miller" <davem@davemloft.net>
CC: Eric Dumazet <edumazet@google.com>
CC: Jiri Pirko <jiri@resnulli.us>
CC: Alexander Duyck <alexander.h.duyck@intel.com>
CC: Cong Wang <amwang@redhat.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---
 net/core/dev.c |   25 +++++++++++++++++++++++++
 1 files changed, 25 insertions(+), 0 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 1ed2b66..566e99a 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4477,6 +4477,31 @@ struct net_device *netdev_master_upper_dev_get(struct net_device *dev)
 }
 EXPORT_SYMBOL(netdev_master_upper_dev_get);
 
+/* netdev_upper_get_next_dev - Get the next dev from upper list
+ * @dev: device
+ * @iter: list_head ** of the current position
+ *
+ * Gets the next device from the dev's upper list, starting from iter
+ * position. The caller must hold RCU read lock.
+ */
+struct net_device *netdev_upper_get_next_dev(struct net_device *dev,
+					     struct list_head **iter)
+{
+	struct netdev_upper *upper;
+
+	WARN_ON_ONCE(!rcu_read_lock_held());
+
+	upper = list_entry_rcu((*iter)->next, struct netdev_upper, list);
+
+	if (&upper->list == &dev->upper_dev_list)
+		return NULL;
+
+	*iter = &upper->list;
+
+	return upper->dev;
+}
+EXPORT_SYMBOL(netdev_upper_get_next_dev);
+
 /**
  * netdev_master_upper_dev_get_rcu - Get master upper device
  * @dev: device
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH net-next v1 2/9] net: add netdev_for_each_upper_dev()
  2013-08-26 20:32 [PATCH net-next v1 0/9] bonding: remove vlan special handling Veaceslav Falico
  2013-08-26 20:32 ` [PATCH net-next v1 1/9] net: add netdev_upper_get_next_dev(dev, iter) Veaceslav Falico
@ 2013-08-26 20:32 ` Veaceslav Falico
  2013-08-26 20:32 ` [PATCH net-next v1 3/9] bonding: use netdev_upper list in bond_vlan_used Veaceslav Falico
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 20+ messages in thread
From: Veaceslav Falico @ 2013-08-26 20:32 UTC (permalink / raw)
  To: netdev
  Cc: Veaceslav Falico, David S. Miller, Eric Dumazet, Jiri Pirko,
	Alexander Duyck, Cong Wang

The new macro netdev_for_each_upper_dev(dev, upper, iter) iterates through
the dev->upper_dev_list starting from the first element, using the
netdev_upper_get_next_dev(dev, &iter).

Must be called under RCU read lock.

v1: new patch

CC: "David S. Miller" <davem@davemloft.net>
CC: Eric Dumazet <edumazet@google.com>
CC: Jiri Pirko <jiri@resnulli.us>
CC: Alexander Duyck <alexander.h.duyck@intel.com>
CC: Cong Wang <amwang@redhat.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---
 include/linux/netdevice.h |   10 ++++++++++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 077363d..85590b6 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2767,6 +2767,16 @@ extern int		bpf_jit_enable;
 extern bool netdev_has_upper_dev(struct net_device *dev,
 				 struct net_device *upper_dev);
 extern bool netdev_has_any_upper_dev(struct net_device *dev);
+extern struct net_device *netdev_upper_get_next_dev(struct net_device *dev,
+						    struct list_head **iter);
+
+/* iterate through upper list, must be called under RCU read lock */
+#define netdev_for_each_upper_dev(dev, upper, iter) \
+	for (iter = &(dev)->upper_dev_list, \
+	     upper = netdev_upper_get_next_dev(dev, &(iter)); \
+	     upper; \
+	     upper = netdev_upper_get_next_dev(dev, &(iter)))
+
 extern struct net_device *netdev_master_upper_dev_get(struct net_device *dev);
 extern struct net_device *netdev_master_upper_dev_get_rcu(struct net_device *dev);
 extern int netdev_upper_dev_link(struct net_device *dev,
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH net-next v1 3/9] bonding: use netdev_upper list in bond_vlan_used
  2013-08-26 20:32 [PATCH net-next v1 0/9] bonding: remove vlan special handling Veaceslav Falico
  2013-08-26 20:32 ` [PATCH net-next v1 1/9] net: add netdev_upper_get_next_dev(dev, iter) Veaceslav Falico
  2013-08-26 20:32 ` [PATCH net-next v1 2/9] net: add netdev_for_each_upper_dev() Veaceslav Falico
@ 2013-08-26 20:32 ` Veaceslav Falico
  2013-08-26 20:32 ` [PATCH net-next v1 4/9] bonding: make bond_arp_send_all use upper device list Veaceslav Falico
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 20+ messages in thread
From: Veaceslav Falico @ 2013-08-26 20:32 UTC (permalink / raw)
  To: netdev; +Cc: Veaceslav Falico, Jay Vosburgh, Andy Gospodarek

Convert bond_vlan_used() to traverse the upper device list to see if we
have any vlans above us. It's protected by rcu, and in case we are holding
rtnl_lock we should call vlan_uses_dev() instead - it's faster.

v1: use netdev_for_each_upper_dev()

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---
 drivers/net/bonding/bonding.h |   15 ++++++++++++++-
 1 files changed, 14 insertions(+), 1 deletions(-)

diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h
index 4bf52d5..9392e00 100644
--- a/drivers/net/bonding/bonding.h
+++ b/drivers/net/bonding/bonding.h
@@ -267,9 +267,22 @@ struct bonding {
 #endif /* CONFIG_DEBUG_FS */
 };
 
+/* if we hold rtnl_lock() - call vlan_uses_dev() */
 static inline bool bond_vlan_used(struct bonding *bond)
 {
-	return !list_empty(&bond->vlan_list);
+	struct net_device *upper;
+	struct list_head *iter;
+
+	rcu_read_lock();
+	netdev_for_each_upper_dev(bond->dev, upper, iter) {
+		if (upper->priv_flags & IFF_802_1Q_VLAN) {
+			rcu_read_unlock();
+			return true;
+		}
+	}
+	rcu_read_unlock();
+
+	return false;
 }
 
 #define bond_slave_get_rcu(dev) \
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH net-next v1 4/9] bonding: make bond_arp_send_all use upper device list
  2013-08-26 20:32 [PATCH net-next v1 0/9] bonding: remove vlan special handling Veaceslav Falico
                   ` (2 preceding siblings ...)
  2013-08-26 20:32 ` [PATCH net-next v1 3/9] bonding: use netdev_upper list in bond_vlan_used Veaceslav Falico
@ 2013-08-26 20:32 ` Veaceslav Falico
  2013-08-26 20:32 ` [PATCH net-next v1 5/9] bonding: convert bond_has_this_ip() to use upper devices Veaceslav Falico
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 20+ messages in thread
From: Veaceslav Falico @ 2013-08-26 20:32 UTC (permalink / raw)
  To: netdev; +Cc: Veaceslav Falico, Jay Vosburgh, Andy Gospodarek

Currently, bond_arp_send_all() is aware only of vlans, which breaks
configurations like bond <- bridge (or any other 'upper' device) with IP
(which is quite a common scenario for virt setups).

To fix this we convert the bond_arp_send_all() to first verify if the rt
device is the bond itself, and if not - to go through its list of upper
devices to see if any of them match the route device for the target. If the
match is a vlan device - we also save its vlan_id and tag it in
bond_arp_send().

Also, clean the function a bit to be more readable.

v1: use netdev_for_each_upper_dev()

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---
 drivers/net/bonding/bond_main.c |   84 ++++++++++++++------------------------
 1 files changed, 31 insertions(+), 53 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 7407e65..40eb250 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2444,30 +2444,16 @@ static void bond_arp_send(struct net_device *slave_dev, int arp_op, __be32 dest_
 
 static void bond_arp_send_all(struct bonding *bond, struct slave *slave)
 {
-	int i, vlan_id;
-	__be32 *targets = bond->params.arp_targets;
-	struct vlan_entry *vlan;
-	struct net_device *vlan_dev = NULL;
+	struct net_device *upper;
+	struct list_head *iter;
 	struct rtable *rt;
+	__be32 *targets = bond->params.arp_targets, addr;
+	int i, vlan_id;
 
-	for (i = 0; (i < BOND_MAX_ARP_TARGETS); i++) {
-		__be32 addr;
-		if (!targets[i])
-			break;
+	for (i = 0; i < BOND_MAX_ARP_TARGETS && targets[i]; i++) {
 		pr_debug("basa: target %pI4\n", &targets[i]);
-		if (!bond_vlan_used(bond)) {
-			pr_debug("basa: empty vlan: arp_send\n");
-			addr = bond_confirm_addr(bond->dev, targets[i], 0);
-			bond_arp_send(slave->dev, ARPOP_REQUEST, targets[i],
-				      addr, 0);
-			continue;
-		}
 
-		/*
-		 * If VLANs are configured, we do a route lookup to
-		 * determine which VLAN interface would be used, so we
-		 * can tag the ARP with the proper VLAN tag.
-		 */
+		/* Find out through which dev should the packet go */
 		rt = ip_route_output(dev_net(bond->dev), targets[i], 0,
 				     RTO_ONLINK, 0);
 		if (IS_ERR(rt)) {
@@ -2478,47 +2464,39 @@ static void bond_arp_send_all(struct bonding *bond, struct slave *slave)
 			continue;
 		}
 
-		/*
-		 * This target is not on a VLAN
-		 */
-		if (rt->dst.dev == bond->dev) {
-			ip_rt_put(rt);
-			pr_debug("basa: rtdev == bond->dev: arp_send\n");
-			addr = bond_confirm_addr(bond->dev, targets[i], 0);
-			bond_arp_send(slave->dev, ARPOP_REQUEST, targets[i],
-				      addr, 0);
-			continue;
-		}
-
 		vlan_id = 0;
-		list_for_each_entry(vlan, &bond->vlan_list, vlan_list) {
-			rcu_read_lock();
-			vlan_dev = __vlan_find_dev_deep(bond->dev,
-							htons(ETH_P_8021Q),
-							vlan->vlan_id);
-			rcu_read_unlock();
-			if (vlan_dev == rt->dst.dev) {
-				vlan_id = vlan->vlan_id;
-				pr_debug("basa: vlan match on %s %d\n",
-				       vlan_dev->name, vlan_id);
-				break;
-			}
-		}
 
-		if (vlan_id && vlan_dev) {
-			ip_rt_put(rt);
-			addr = bond_confirm_addr(vlan_dev, targets[i], 0);
-			bond_arp_send(slave->dev, ARPOP_REQUEST, targets[i],
-				      addr, vlan_id);
-			continue;
+		/* bond device itself */
+		if (rt->dst.dev == bond->dev)
+			goto found;
+
+		/* search for upper device, like vlan/bridge/team/etc */
+		rcu_read_lock();
+		netdev_for_each_upper_dev(bond->dev, upper, iter) {
+			if (upper == rt->dst.dev) {
+				/* if it's a vlan - get its VID */
+				if (is_vlan_dev(rt->dst.dev))
+					vlan_id = vlan_dev_vlan_id(rt->dst.dev);
+
+				rcu_read_unlock();
+				goto found;
+			}
 		}
+		rcu_read_unlock();
 
-		if (net_ratelimit()) {
+		/* Not our device - skip */
+		if (net_ratelimit())
 			pr_warning("%s: no path to arp_ip_target %pI4 via rt.dev %s\n",
 				   bond->dev->name, &targets[i],
 				   rt->dst.dev ? rt->dst.dev->name : "NULL");
-		}
 		ip_rt_put(rt);
+		continue;
+
+found:
+		addr = bond_confirm_addr(rt->dst.dev, targets[i], 0);
+		ip_rt_put(rt);
+		bond_arp_send(slave->dev, ARPOP_REQUEST, targets[i],
+			      addr, vlan_id);
 	}
 }
 
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH net-next v1 5/9] bonding: convert bond_has_this_ip() to use upper devices
  2013-08-26 20:32 [PATCH net-next v1 0/9] bonding: remove vlan special handling Veaceslav Falico
                   ` (3 preceding siblings ...)
  2013-08-26 20:32 ` [PATCH net-next v1 4/9] bonding: make bond_arp_send_all use upper device list Veaceslav Falico
@ 2013-08-26 20:32 ` Veaceslav Falico
  2013-08-26 20:53   ` Jiri Pirko
  2013-08-26 20:32 ` [PATCH net-next v1 6/9] bonding: use vlan_uses_dev() in __bond_release_one() Veaceslav Falico
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 20+ messages in thread
From: Veaceslav Falico @ 2013-08-26 20:32 UTC (permalink / raw)
  To: netdev; +Cc: Veaceslav Falico, Jay Vosburgh, Andy Gospodarek

Currently, bond_has_this_ip() is aware only of vlan upper devices, and thus
will return false if the address is associated with the upper bridge or any
other device, and thus will break the arp logic.

Fix this by using the upper device list. For every upper device we verify
if the address associated with it is our address, and if yes - return true.

v1: use netdev_for_each_upper_dev()

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---
 drivers/net/bonding/bond_main.c |   25 +++++++++++++------------
 1 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 40eb250..dbac1ce 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2392,24 +2392,25 @@ re_arm:
 	}
 }
 
-static int bond_has_this_ip(struct bonding *bond, __be32 ip)
+static bool bond_has_this_ip(struct bonding *bond, __be32 ip)
 {
-	struct vlan_entry *vlan;
-	struct net_device *vlan_dev;
+	struct net_device *upper;
+	struct list_head *iter;
+	bool ret = false;
 
 	if (ip == bond_confirm_addr(bond->dev, 0, ip))
-		return 1;
+		return true;
 
-	list_for_each_entry(vlan, &bond->vlan_list, vlan_list) {
-		rcu_read_lock();
-		vlan_dev = __vlan_find_dev_deep(bond->dev, htons(ETH_P_8021Q),
-						vlan->vlan_id);
-		rcu_read_unlock();
-		if (vlan_dev && ip == bond_confirm_addr(vlan_dev, 0, ip))
-			return 1;
+	rcu_read_lock();
+	netdev_for_each_upper_dev(bond->dev, upper, iter) {
+		if (ip == bond_confirm_addr(upper, 0, ip)) {
+			ret = true;
+			break;
+		}
 	}
+	rcu_read_unlock();
 
-	return 0;
+	return ret;
 }
 
 /*
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH net-next v1 6/9] bonding: use vlan_uses_dev() in __bond_release_one()
  2013-08-26 20:32 [PATCH net-next v1 0/9] bonding: remove vlan special handling Veaceslav Falico
                   ` (4 preceding siblings ...)
  2013-08-26 20:32 ` [PATCH net-next v1 5/9] bonding: convert bond_has_this_ip() to use upper devices Veaceslav Falico
@ 2013-08-26 20:32 ` Veaceslav Falico
  2013-08-26 20:32 ` [PATCH net-next v1 7/9] bonding: split alb_send_learning_packets() Veaceslav Falico
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 20+ messages in thread
From: Veaceslav Falico @ 2013-08-26 20:32 UTC (permalink / raw)
  To: netdev; +Cc: Veaceslav Falico, Jay Vosburgh, Andy Gospodarek

We always hold the rtnl_lock() in __bond_release_one(), so use
vlan_uses_dev() instead of bond_vlan_used().

v1: no change

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---
 drivers/net/bonding/bond_main.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index dbac1ce..aaf14a0 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -1954,7 +1954,7 @@ static int __bond_release_one(struct net_device *bond_dev,
 		bond_set_carrier(bond);
 		eth_hw_addr_random(bond_dev);
 
-		if (bond_vlan_used(bond)) {
+		if (vlan_uses_dev(bond_dev)) {
 			pr_warning("%s: Warning: clearing HW address of %s while it still has VLANs.\n",
 				   bond_dev->name, bond_dev->name);
 			pr_warning("%s: When re-adding slaves, make sure the bond's HW address matches its VLANs'.\n",
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH net-next v1 7/9] bonding: split alb_send_learning_packets()
  2013-08-26 20:32 [PATCH net-next v1 0/9] bonding: remove vlan special handling Veaceslav Falico
                   ` (5 preceding siblings ...)
  2013-08-26 20:32 ` [PATCH net-next v1 6/9] bonding: use vlan_uses_dev() in __bond_release_one() Veaceslav Falico
@ 2013-08-26 20:32 ` Veaceslav Falico
  2013-08-26 20:32 ` [PATCH net-next v1 8/9] bonding: make alb_send_learning_packets() use upper dev list Veaceslav Falico
  2013-08-26 20:32 ` [PATCH net-next v1 9/9] bonding: remove vlan_list/current_alb_vlan Veaceslav Falico
  8 siblings, 0 replies; 20+ messages in thread
From: Veaceslav Falico @ 2013-08-26 20:32 UTC (permalink / raw)
  To: netdev; +Cc: Veaceslav Falico, Jay Vosburgh, Andy Gospodarek

Create alb_send_lp_vid(), which will handle the skb/lp creation, vlan
tagging and sending, and use it in alb_send_learning_packets().

This way all the logic remains in alb_send_learning_packets(), which
becomes a lot more cleaner and easier to understand.

v1: no change

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---
 drivers/net/bonding/bond_alb.c |   59 +++++++++++++++++++++++----------------
 1 files changed, 35 insertions(+), 24 deletions(-)

diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c
index 3a5db7b..3ca3c85 100644
--- a/drivers/net/bonding/bond_alb.c
+++ b/drivers/net/bonding/bond_alb.c
@@ -971,35 +971,53 @@ static void rlb_clear_vlan(struct bonding *bond, unsigned short vlan_id)
 
 /*********************** tlb/rlb shared functions *********************/
 
-static void alb_send_learning_packets(struct slave *slave, u8 mac_addr[])
+static void alb_send_lp_vid(struct slave *slave, u8 mac_addr[],
+			    u16 vid)
 {
-	struct bonding *bond = bond_get_bond_by_slave(slave);
 	struct learning_pkt pkt;
+	struct sk_buff *skb;
 	int size = sizeof(struct learning_pkt);
-	int i;
+	char *data;
 
 	memset(&pkt, 0, size);
 	memcpy(pkt.mac_dst, mac_addr, ETH_ALEN);
 	memcpy(pkt.mac_src, mac_addr, ETH_ALEN);
 	pkt.type = cpu_to_be16(ETH_P_LOOP);
 
-	for (i = 0; i < MAX_LP_BURST; i++) {
-		struct sk_buff *skb;
-		char *data;
+	skb = dev_alloc_skb(size);
+	if (!skb)
+		return;
 
-		skb = dev_alloc_skb(size);
+	data = skb_put(skb, size);
+	memcpy(data, &pkt, size);
+
+	skb_reset_mac_header(skb);
+	skb->network_header = skb->mac_header + ETH_HLEN;
+	skb->protocol = pkt.type;
+	skb->priority = TC_PRIO_CONTROL;
+	skb->dev = slave->dev;
+
+	if (vid) {
+		skb = vlan_put_tag(skb, htons(ETH_P_8021Q), vid);
 		if (!skb) {
+			pr_err("%s: Error: failed to insert VLAN tag\n",
+			       slave->bond->dev->name);
 			return;
 		}
+	}
 
-		data = skb_put(skb, size);
-		memcpy(data, &pkt, size);
+	dev_queue_xmit(skb);
+}
 
-		skb_reset_mac_header(skb);
-		skb->network_header = skb->mac_header + ETH_HLEN;
-		skb->protocol = pkt.type;
-		skb->priority = TC_PRIO_CONTROL;
-		skb->dev = slave->dev;
+
+static void alb_send_learning_packets(struct slave *slave, u8 mac_addr[])
+{
+	struct bonding *bond = bond_get_bond_by_slave(slave);
+	u16 vlan_id;
+	int i;
+
+	for (i = 0; i < MAX_LP_BURST; i++) {
+		vlan_id = 0;
 
 		if (bond_vlan_used(bond)) {
 			struct vlan_entry *vlan;
@@ -1008,20 +1026,13 @@ static void alb_send_learning_packets(struct slave *slave, u8 mac_addr[])
 					      bond->alb_info.current_alb_vlan);
 
 			bond->alb_info.current_alb_vlan = vlan;
-			if (!vlan) {
-				kfree_skb(skb);
+			if (!vlan)
 				continue;
-			}
 
-			skb = vlan_put_tag(skb, htons(ETH_P_8021Q), vlan->vlan_id);
-			if (!skb) {
-				pr_err("%s: Error: failed to insert VLAN tag\n",
-				       bond->dev->name);
-				continue;
-			}
+			vlan_id = vlan->vlan_id;
 		}
 
-		dev_queue_xmit(skb);
+		alb_send_lp_vid(slave, mac_addr, vlan_id);
 	}
 }
 
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH net-next v1 8/9] bonding: make alb_send_learning_packets() use upper dev list
  2013-08-26 20:32 [PATCH net-next v1 0/9] bonding: remove vlan special handling Veaceslav Falico
                   ` (6 preceding siblings ...)
  2013-08-26 20:32 ` [PATCH net-next v1 7/9] bonding: split alb_send_learning_packets() Veaceslav Falico
@ 2013-08-26 20:32 ` Veaceslav Falico
  2013-08-26 20:32 ` [PATCH net-next v1 9/9] bonding: remove vlan_list/current_alb_vlan Veaceslav Falico
  8 siblings, 0 replies; 20+ messages in thread
From: Veaceslav Falico @ 2013-08-26 20:32 UTC (permalink / raw)
  To: netdev; +Cc: Veaceslav Falico, Jay Vosburgh, Andy Gospodarek

Currently, if there are vlans on top of bond, alb_send_learning_packets()
will never send LPs from the bond itself (i.e. untagged), which might leave
untagged clients unupdated.

Also, the 'circular vlan' logic (i.e. update only MAX_LP_BURST vlans at a
time, and save the last vlan for the next update) is really suboptimal - in
case of lots of vlans it will take a lot of time to update every vlan. It
is also never called in any hot path and sends only a few small packets -
thus the optimization by itself is useless.

So remove the whole current_alb_vlan/MAX_LP_BURST logic from
alb_send_learning_packets(). Instead, we'll first send a packet untagged
and then traverse the upper dev list, sending a tagged packet for each vlan
found. Also, remove the MAX_LP_BURST define - we already don't need it.

v1: use netdev_for_each_upper_dev()

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---
 drivers/net/bonding/bond_alb.c |   33 +++++++++++++--------------------
 drivers/net/bonding/bond_alb.h |    1 -
 2 files changed, 13 insertions(+), 21 deletions(-)

diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c
index 3ca3c85..676d6d6 100644
--- a/drivers/net/bonding/bond_alb.c
+++ b/drivers/net/bonding/bond_alb.c
@@ -1013,27 +1013,20 @@ static void alb_send_lp_vid(struct slave *slave, u8 mac_addr[],
 static void alb_send_learning_packets(struct slave *slave, u8 mac_addr[])
 {
 	struct bonding *bond = bond_get_bond_by_slave(slave);
-	u16 vlan_id;
-	int i;
-
-	for (i = 0; i < MAX_LP_BURST; i++) {
-		vlan_id = 0;
-
-		if (bond_vlan_used(bond)) {
-			struct vlan_entry *vlan;
+	struct net_device *upper;
+	struct list_head *iter;
 
-			vlan = bond_next_vlan(bond,
-					      bond->alb_info.current_alb_vlan);
-
-			bond->alb_info.current_alb_vlan = vlan;
-			if (!vlan)
-				continue;
-
-			vlan_id = vlan->vlan_id;
-		}
+	/* send untagged */
+	alb_send_lp_vid(slave, mac_addr, 0);
 
-		alb_send_lp_vid(slave, mac_addr, vlan_id);
+	/* loop through vlans and send one packet for each */
+	rcu_read_lock();
+	netdev_for_each_upper_dev(bond->dev, upper, iter) {
+		if (upper->priv_flags & IFF_802_1Q_VLAN)
+			alb_send_lp_vid(slave, mac_addr,
+					vlan_dev_vlan_id(upper));
 	}
+	rcu_read_unlock();
 }
 
 static int alb_set_slave_mac_addr(struct slave *slave, u8 addr[])
diff --git a/drivers/net/bonding/bond_alb.h b/drivers/net/bonding/bond_alb.h
index e7a5b8b..e139172 100644
--- a/drivers/net/bonding/bond_alb.h
+++ b/drivers/net/bonding/bond_alb.h
@@ -53,7 +53,6 @@ struct slave;
 
 
 #define TLB_NULL_INDEX		0xffffffff
-#define MAX_LP_BURST		3
 
 /* rlb defs */
 #define RLB_HASH_TABLE_SIZE	256
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH net-next v1 9/9] bonding: remove vlan_list/current_alb_vlan
  2013-08-26 20:32 [PATCH net-next v1 0/9] bonding: remove vlan special handling Veaceslav Falico
                   ` (7 preceding siblings ...)
  2013-08-26 20:32 ` [PATCH net-next v1 8/9] bonding: make alb_send_learning_packets() use upper dev list Veaceslav Falico
@ 2013-08-26 20:32 ` Veaceslav Falico
  8 siblings, 0 replies; 20+ messages in thread
From: Veaceslav Falico @ 2013-08-26 20:32 UTC (permalink / raw)
  To: netdev; +Cc: Veaceslav Falico, Jay Vosburgh, Andy Gospodarek

Currently there are no real users of vlan_list/current_alb_vlan, only the
helpers which maintain them, so remove them.

v1: no change

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---
 drivers/net/bonding/bond_alb.c  |    5 --
 drivers/net/bonding/bond_alb.h  |    1 -
 drivers/net/bonding/bond_main.c |  133 +--------------------------------------
 drivers/net/bonding/bonding.h   |    6 --
 4 files changed, 2 insertions(+), 143 deletions(-)

diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c
index 676d6d6..17e2cef 100644
--- a/drivers/net/bonding/bond_alb.c
+++ b/drivers/net/bonding/bond_alb.c
@@ -1763,11 +1763,6 @@ int bond_alb_set_mac_address(struct net_device *bond_dev, void *addr)
 
 void bond_alb_clear_vlan(struct bonding *bond, unsigned short vlan_id)
 {
-	if (bond->alb_info.current_alb_vlan &&
-	    (bond->alb_info.current_alb_vlan->vlan_id == vlan_id)) {
-		bond->alb_info.current_alb_vlan = NULL;
-	}
-
 	if (bond->alb_info.rlb_enabled) {
 		rlb_clear_vlan(bond, vlan_id);
 	}
diff --git a/drivers/net/bonding/bond_alb.h b/drivers/net/bonding/bond_alb.h
index e139172..e02c9c5 100644
--- a/drivers/net/bonding/bond_alb.h
+++ b/drivers/net/bonding/bond_alb.h
@@ -169,7 +169,6 @@ struct alb_bond_info {
 						 * rx traffic should be
 						 * rebalanced
 						 */
-	struct vlan_entry	*current_alb_vlan;
 };
 
 int bond_alb_initialize(struct bonding *bond, int rlb_enabled);
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index aaf14a0..3757558 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -283,116 +283,6 @@ const char *bond_mode_name(int mode)
 /*---------------------------------- VLAN -----------------------------------*/
 
 /**
- * bond_add_vlan - add a new vlan id on bond
- * @bond: bond that got the notification
- * @vlan_id: the vlan id to add
- *
- * Returns -ENOMEM if allocation failed.
- */
-static int bond_add_vlan(struct bonding *bond, unsigned short vlan_id)
-{
-	struct vlan_entry *vlan;
-
-	pr_debug("bond: %s, vlan id %d\n",
-		 (bond ? bond->dev->name : "None"), vlan_id);
-
-	vlan = kzalloc(sizeof(struct vlan_entry), GFP_KERNEL);
-	if (!vlan)
-		return -ENOMEM;
-
-	INIT_LIST_HEAD(&vlan->vlan_list);
-	vlan->vlan_id = vlan_id;
-
-	write_lock_bh(&bond->lock);
-
-	list_add_tail(&vlan->vlan_list, &bond->vlan_list);
-
-	write_unlock_bh(&bond->lock);
-
-	pr_debug("added VLAN ID %d on bond %s\n", vlan_id, bond->dev->name);
-
-	return 0;
-}
-
-/**
- * bond_del_vlan - delete a vlan id from bond
- * @bond: bond that got the notification
- * @vlan_id: the vlan id to delete
- *
- * returns -ENODEV if @vlan_id was not found in @bond.
- */
-static int bond_del_vlan(struct bonding *bond, unsigned short vlan_id)
-{
-	struct vlan_entry *vlan;
-	int res = -ENODEV;
-
-	pr_debug("bond: %s, vlan id %d\n", bond->dev->name, vlan_id);
-
-	block_netpoll_tx();
-	write_lock_bh(&bond->lock);
-
-	list_for_each_entry(vlan, &bond->vlan_list, vlan_list) {
-		if (vlan->vlan_id == vlan_id) {
-			list_del(&vlan->vlan_list);
-
-			if (bond_is_lb(bond))
-				bond_alb_clear_vlan(bond, vlan_id);
-
-			pr_debug("removed VLAN ID %d from bond %s\n",
-				 vlan_id, bond->dev->name);
-
-			kfree(vlan);
-
-			res = 0;
-			goto out;
-		}
-	}
-
-	pr_debug("couldn't find VLAN ID %d in bond %s\n",
-		 vlan_id, bond->dev->name);
-
-out:
-	write_unlock_bh(&bond->lock);
-	unblock_netpoll_tx();
-	return res;
-}
-
-/**
- * bond_next_vlan - safely skip to the next item in the vlans list.
- * @bond: the bond we're working on
- * @curr: item we're advancing from
- *
- * Returns %NULL if list is empty, bond->next_vlan if @curr is %NULL,
- * or @curr->next otherwise (even if it is @curr itself again).
- *
- * Caller must hold bond->lock
- */
-struct vlan_entry *bond_next_vlan(struct bonding *bond, struct vlan_entry *curr)
-{
-	struct vlan_entry *next, *last;
-
-	if (list_empty(&bond->vlan_list))
-		return NULL;
-
-	if (!curr) {
-		next = list_entry(bond->vlan_list.next,
-				  struct vlan_entry, vlan_list);
-	} else {
-		last = list_entry(bond->vlan_list.prev,
-				  struct vlan_entry, vlan_list);
-		if (last == curr) {
-			next = list_entry(bond->vlan_list.next,
-					  struct vlan_entry, vlan_list);
-		} else {
-			next = list_entry(curr->vlan_list.next,
-					  struct vlan_entry, vlan_list);
-		}
-	}
-
-	return next;
-}
-
-/**
  * bond_dev_queue_xmit - Prepare skb for xmit.
  *
  * @bond: bond device that got this skb for tx.
@@ -451,13 +341,6 @@ static int bond_vlan_rx_add_vid(struct net_device *bond_dev,
 			goto unwind;
 	}
 
-	res = bond_add_vlan(bond, vid);
-	if (res) {
-		pr_err("%s: Error: Failed to add vlan id %d\n",
-		       bond_dev->name, vid);
-		goto unwind;
-	}
-
 	return 0;
 
 unwind:
@@ -478,17 +361,12 @@ static int bond_vlan_rx_kill_vid(struct net_device *bond_dev,
 {
 	struct bonding *bond = netdev_priv(bond_dev);
 	struct slave *slave;
-	int res;
 
 	bond_for_each_slave(bond, slave)
 		vlan_vid_del(slave->dev, proto, vid);
 
-	res = bond_del_vlan(bond, vid);
-	if (res) {
-		pr_err("%s: Error: Failed to remove vlan id %d\n",
-		       bond_dev->name, vid);
-		return res;
-	}
+	if (bond_is_lb(bond))
+		bond_alb_clear_vlan(bond, vid);
 
 	return 0;
 }
@@ -4121,7 +3999,6 @@ static void bond_setup(struct net_device *bond_dev)
 
 	/* Initialize pointers */
 	bond->dev = bond_dev;
-	INIT_LIST_HEAD(&bond->vlan_list);
 
 	/* Initialize the device entry points */
 	ether_setup(bond_dev);
@@ -4174,7 +4051,6 @@ static void bond_uninit(struct net_device *bond_dev)
 {
 	struct bonding *bond = netdev_priv(bond_dev);
 	struct slave *slave, *tmp_slave;
-	struct vlan_entry *vlan, *tmp;
 
 	bond_netpoll_cleanup(bond_dev);
 
@@ -4186,11 +4062,6 @@ static void bond_uninit(struct net_device *bond_dev)
 	list_del(&bond->bond_list);
 
 	bond_debug_unregister(bond);
-
-	list_for_each_entry_safe(vlan, tmp, &bond->vlan_list, vlan_list) {
-		list_del(&vlan->vlan_list);
-		kfree(vlan);
-	}
 }
 
 /*------------------------- Module initialization ---------------------------*/
diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h
index 9392e00..a26b134 100644
--- a/drivers/net/bonding/bonding.h
+++ b/drivers/net/bonding/bonding.h
@@ -185,11 +185,6 @@ struct bond_parm_tbl {
 
 #define BOND_MAX_MODENAME_LEN 20
 
-struct vlan_entry {
-	struct list_head vlan_list;
-	unsigned short vlan_id;
-};
-
 struct slave {
 	struct net_device *dev; /* first - useful for panic debug */
 	struct list_head list;
@@ -254,7 +249,6 @@ struct bonding {
 	struct   ad_bond_info ad_info;
 	struct   alb_bond_info alb_info;
 	struct   bond_params params;
-	struct   list_head vlan_list;
 	struct   workqueue_struct *wq;
 	struct   delayed_work mii_work;
 	struct   delayed_work arp_work;
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH net-next v1 5/9] bonding: convert bond_has_this_ip() to use upper devices
  2013-08-26 20:32 ` [PATCH net-next v1 5/9] bonding: convert bond_has_this_ip() to use upper devices Veaceslav Falico
@ 2013-08-26 20:53   ` Jiri Pirko
  2013-08-27 11:16     ` Veaceslav Falico
  0 siblings, 1 reply; 20+ messages in thread
From: Jiri Pirko @ 2013-08-26 20:53 UTC (permalink / raw)
  To: Veaceslav Falico; +Cc: netdev, Jay Vosburgh, Andy Gospodarek

Mon, Aug 26, 2013 at 10:32:38PM CEST, vfalico@redhat.com wrote:
>Currently, bond_has_this_ip() is aware only of vlan upper devices, and thus
>will return false if the address is associated with the upper bridge or any
>other device, and thus will break the arp logic.
>
>Fix this by using the upper device list. For every upper device we verify
>if the address associated with it is our address, and if yes - return true.
>
>v1: use netdev_for_each_upper_dev()
>
>CC: Jay Vosburgh <fubar@us.ibm.com>
>CC: Andy Gospodarek <andy@greyhouse.net>
>Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
>---
> drivers/net/bonding/bond_main.c |   25 +++++++++++++------------
> 1 files changed, 13 insertions(+), 12 deletions(-)
>
>diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>index 40eb250..dbac1ce 100644
>--- a/drivers/net/bonding/bond_main.c
>+++ b/drivers/net/bonding/bond_main.c
>@@ -2392,24 +2392,25 @@ re_arm:
> 	}
> }
> 
>-static int bond_has_this_ip(struct bonding *bond, __be32 ip)
>+static bool bond_has_this_ip(struct bonding *bond, __be32 ip)
> {
>-	struct vlan_entry *vlan;
>-	struct net_device *vlan_dev;
>+	struct net_device *upper;
>+	struct list_head *iter;
>+	bool ret = false;
> 
> 	if (ip == bond_confirm_addr(bond->dev, 0, ip))
>-		return 1;
>+		return true;
> 
>-	list_for_each_entry(vlan, &bond->vlan_list, vlan_list) {
>-		rcu_read_lock();
>-		vlan_dev = __vlan_find_dev_deep(bond->dev, htons(ETH_P_8021Q),
>-						vlan->vlan_id);
>-		rcu_read_unlock();
>-		if (vlan_dev && ip == bond_confirm_addr(vlan_dev, 0, ip))
>-			return 1;
>+	rcu_read_lock();
>+	netdev_for_each_upper_dev(bond->dev, upper, iter) {
>+		if (ip == bond_confirm_addr(upper, 0, ip)) {
>+			ret = true;
>+			break;
>+		}

You need the same recursion __vlan_find_dev_deep() is doing. If you do
not do that, you will miss anything over the first upper level.


> 	}
>+	rcu_read_unlock();
> 
>-	return 0;
>+	return ret;
> }
> 
> /*
>-- 
>1.7.1
>
>--
>To unsubscribe from this list: send the line "unsubscribe netdev" in
>the body of a message to majordomo@vger.kernel.org
>More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH net-next v1 1/9] net: add netdev_upper_get_next_dev(dev, iter)
  2013-08-26 20:32 ` [PATCH net-next v1 1/9] net: add netdev_upper_get_next_dev(dev, iter) Veaceslav Falico
@ 2013-08-26 20:57   ` Jiri Pirko
  2013-08-27 10:42     ` Veaceslav Falico
  0 siblings, 1 reply; 20+ messages in thread
From: Jiri Pirko @ 2013-08-26 20:57 UTC (permalink / raw)
  To: Veaceslav Falico
  Cc: netdev, David S. Miller, Eric Dumazet, Alexander Duyck, Cong Wang

Mon, Aug 26, 2013 at 10:32:34PM CEST, vfalico@redhat.com wrote:
>This function returns the next dev in the dev->upper_dev_list after the
>struct list_head **iter position, and updates *iter accordingly. Returns
>NULL if there are no devices left.
>
>v1: new patch
>
>CC: "David S. Miller" <davem@davemloft.net>
>CC: Eric Dumazet <edumazet@google.com>
>CC: Jiri Pirko <jiri@resnulli.us>
>CC: Alexander Duyck <alexander.h.duyck@intel.com>
>CC: Cong Wang <amwang@redhat.com>
>Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
>---
> net/core/dev.c |   25 +++++++++++++++++++++++++
> 1 files changed, 25 insertions(+), 0 deletions(-)
>
>diff --git a/net/core/dev.c b/net/core/dev.c
>index 1ed2b66..566e99a 100644
>--- a/net/core/dev.c
>+++ b/net/core/dev.c
>@@ -4477,6 +4477,31 @@ struct net_device *netdev_master_upper_dev_get(struct net_device *dev)
> }
> EXPORT_SYMBOL(netdev_master_upper_dev_get);
> 
>+/* netdev_upper_get_next_dev - Get the next dev from upper list
>+ * @dev: device
>+ * @iter: list_head ** of the current position
>+ *
>+ * Gets the next device from the dev's upper list, starting from iter
>+ * position. The caller must hold RCU read lock.
>+ */
>+struct net_device *netdev_upper_get_next_dev(struct net_device *dev,
>+					     struct list_head **iter)

		This should be probably rather named
		"netdev_upper_get_next_dev_rcu" That way it is clear
		right away. Also if you introduce non-rcu variant in
		future, you won't introduce confusion :)


>+{
>+	struct netdev_upper *upper;
>+
>+	WARN_ON_ONCE(!rcu_read_lock_held());
>+
>+	upper = list_entry_rcu((*iter)->next, struct netdev_upper, list);
>+
>+	if (&upper->list == &dev->upper_dev_list)
>+		return NULL;
>+
>+	*iter = &upper->list;
>+
>+	return upper->dev;
>+}
>+EXPORT_SYMBOL(netdev_upper_get_next_dev);
>+
> /**
>  * netdev_master_upper_dev_get_rcu - Get master upper device
>  * @dev: device
>-- 
>1.7.1
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH net-next v1 1/9] net: add netdev_upper_get_next_dev(dev, iter)
  2013-08-26 20:57   ` Jiri Pirko
@ 2013-08-27 10:42     ` Veaceslav Falico
  0 siblings, 0 replies; 20+ messages in thread
From: Veaceslav Falico @ 2013-08-27 10:42 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: netdev, David S. Miller, Eric Dumazet, Alexander Duyck, Cong Wang

On Mon, Aug 26, 2013 at 10:57:48PM +0200, Jiri Pirko wrote:
>Mon, Aug 26, 2013 at 10:32:34PM CEST, vfalico@redhat.com wrote:
...snip...
>>+struct net_device *netdev_upper_get_next_dev(struct net_device *dev,
>>+					     struct list_head **iter)
>
>		This should be probably rather named
>		"netdev_upper_get_next_dev_rcu" That way it is clear
>		right away. Also if you introduce non-rcu variant in
>		future, you won't introduce confusion :)

Agreed, it will be easier, will do in v2.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH net-next v1 5/9] bonding: convert bond_has_this_ip() to use upper devices
  2013-08-26 20:53   ` Jiri Pirko
@ 2013-08-27 11:16     ` Veaceslav Falico
  2013-08-27 11:25       ` Jiri Pirko
  0 siblings, 1 reply; 20+ messages in thread
From: Veaceslav Falico @ 2013-08-27 11:16 UTC (permalink / raw)
  To: Jiri Pirko; +Cc: netdev, Jay Vosburgh, Andy Gospodarek

On Mon, Aug 26, 2013 at 10:53:38PM +0200, Jiri Pirko wrote:
>Mon, Aug 26, 2013 at 10:32:38PM CEST, vfalico@redhat.com wrote:
...snip...
>>+	rcu_read_lock();
>>+	netdev_for_each_upper_dev(bond->dev, upper, iter) {
>>+		if (ip == bond_confirm_addr(upper, 0, ip)) {
>>+			ret = true;
>>+			break;
>>+		}
>
>You need the same recursion __vlan_find_dev_deep() is doing. If you do
>not do that, you will miss anything over the first upper level.

Good point, and it's true for other uses also - bond_arp_send_all(), for
example, will also miss anything that's higher than the first upper level.

I can't think of a use case scenario when we would need only the first
upper level - so maybe we should either make netdev_for_each_upper_dev()
recursive by default (I don't know how it can be done easily, tbh, without
modifying the existing code), or add something like:

diff --git a/net/core/dev.c b/net/core/dev.c
index 566e99a..4a4468f 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4387,6 +4387,31 @@ static void __append_search_uppers(struct list_head *search_list,
  	}
  }
  
+struct net_device *netdev_upper_recursive_do_rcu(struct net_device *dev,
+						 struct net_device *orig_dev,
+						 bool (*f)(struct net_device *,
+							   struct net_device *))
+{
+	struct netdev_upper *upper;
+	struct net_device *ret = NULL;
+
+	list_for_each_entry_rcu(upper, &dev->upper_dev_list, list) {
+		if (f(orig_dev, upper->dev)) {
+			ret = upper->dev;
+			break;
+		}
+
+		if (!list_empty(&upper->dev->upper_dev_list)) {
+			ret = netdev_upper_recursive_do_rcu(upper->dev,
+							    orig_dev, f);
+			if (ret)
+				break;
+		}
+	}
+
+	return ret;
+}
+
  static bool __netdev_search_upper_dev(struct net_device *dev,
  				      struct net_device *upper_dev)
  {

How do you think?

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH net-next v1 5/9] bonding: convert bond_has_this_ip() to use upper devices
  2013-08-27 11:16     ` Veaceslav Falico
@ 2013-08-27 11:25       ` Jiri Pirko
  2013-08-27 11:53         ` Veaceslav Falico
  2013-08-27 18:10         ` Veaceslav Falico
  0 siblings, 2 replies; 20+ messages in thread
From: Jiri Pirko @ 2013-08-27 11:25 UTC (permalink / raw)
  To: Veaceslav Falico; +Cc: netdev, Jay Vosburgh, Andy Gospodarek

Tue, Aug 27, 2013 at 01:16:48PM CEST, vfalico@redhat.com wrote:
>On Mon, Aug 26, 2013 at 10:53:38PM +0200, Jiri Pirko wrote:
>>Mon, Aug 26, 2013 at 10:32:38PM CEST, vfalico@redhat.com wrote:
>...snip...
>>>+	rcu_read_lock();
>>>+	netdev_for_each_upper_dev(bond->dev, upper, iter) {
>>>+		if (ip == bond_confirm_addr(upper, 0, ip)) {
>>>+			ret = true;
>>>+			break;
>>>+		}
>>
>>You need the same recursion __vlan_find_dev_deep() is doing. If you do
>>not do that, you will miss anything over the first upper level.
>
>Good point, and it's true for other uses also - bond_arp_send_all(), for
>example, will also miss anything that's higher than the first upper level.
>
>I can't think of a use case scenario when we would need only the first
>upper level - so maybe we should either make netdev_for_each_upper_dev()
>recursive by default (I don't know how it can be done easily, tbh, without
>modifying the existing code), or add something like:
>
>diff --git a/net/core/dev.c b/net/core/dev.c
>index 566e99a..4a4468f 100644
>--- a/net/core/dev.c
>+++ b/net/core/dev.c
>@@ -4387,6 +4387,31 @@ static void __append_search_uppers(struct list_head *search_list,
> 	}
> }
>+struct net_device *netdev_upper_recursive_do_rcu(struct net_device *dev,
>+						 struct net_device *orig_dev,
>+						 bool (*f)(struct net_device *,
>+							   struct net_device *))
>+{
>+	struct netdev_upper *upper;
>+	struct net_device *ret = NULL;
>+
>+	list_for_each_entry_rcu(upper, &dev->upper_dev_list, list) {
>+		if (f(orig_dev, upper->dev)) {
>+			ret = upper->dev;
>+			break;
>+		}
>+
>+		if (!list_empty(&upper->dev->upper_dev_list)) {
>+			ret = netdev_upper_recursive_do_rcu(upper->dev,
>+							    orig_dev, f);
>+			if (ret)
>+				break;
>+		}
>+	}
>+
>+	return ret;
>+}
>+
> static bool __netdev_search_upper_dev(struct net_device *dev,
> 				      struct net_device *upper_dev)
> {
>
>How do you think?

I do not like this. How about to put all levels to upper_dev list and
mark those who are not direct (not level1) ? Then we can use single list
for all purposes.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH net-next v1 5/9] bonding: convert bond_has_this_ip() to use upper devices
  2013-08-27 11:25       ` Jiri Pirko
@ 2013-08-27 11:53         ` Veaceslav Falico
  2013-08-27 18:10         ` Veaceslav Falico
  1 sibling, 0 replies; 20+ messages in thread
From: Veaceslav Falico @ 2013-08-27 11:53 UTC (permalink / raw)
  To: Jiri Pirko; +Cc: netdev, Jay Vosburgh, Andy Gospodarek

On Tue, Aug 27, 2013 at 01:25:29PM +0200, Jiri Pirko wrote:
>Tue, Aug 27, 2013 at 01:16:48PM CEST, vfalico@redhat.com wrote:
>>On Mon, Aug 26, 2013 at 10:53:38PM +0200, Jiri Pirko wrote:
>>>Mon, Aug 26, 2013 at 10:32:38PM CEST, vfalico@redhat.com wrote:
>>...snip...
>>>>+	rcu_read_lock();
>>>>+	netdev_for_each_upper_dev(bond->dev, upper, iter) {
>>>>+		if (ip == bond_confirm_addr(upper, 0, ip)) {
>>>>+			ret = true;
>>>>+			break;
>>>>+		}
>>>
>>>You need the same recursion __vlan_find_dev_deep() is doing. If you do
>>>not do that, you will miss anything over the first upper level.
>>
>>Good point, and it's true for other uses also - bond_arp_send_all(), for
>>example, will also miss anything that's higher than the first upper level.
>>
>>I can't think of a use case scenario when we would need only the first
>>upper level - so maybe we should either make netdev_for_each_upper_dev()
>>recursive by default (I don't know how it can be done easily, tbh, without
>>modifying the existing code), or add something like:
>>
>>diff --git a/net/core/dev.c b/net/core/dev.c
>>index 566e99a..4a4468f 100644
>>--- a/net/core/dev.c
>>+++ b/net/core/dev.c
>>@@ -4387,6 +4387,31 @@ static void __append_search_uppers(struct list_head *search_list,
>> 	}
>> }
>>+struct net_device *netdev_upper_recursive_do_rcu(struct net_device *dev,
>>+						 struct net_device *orig_dev,
>>+						 bool (*f)(struct net_device *,
>>+							   struct net_device *))
>>+{
>>+	struct netdev_upper *upper;
>>+	struct net_device *ret = NULL;
>>+
>>+	list_for_each_entry_rcu(upper, &dev->upper_dev_list, list) {
>>+		if (f(orig_dev, upper->dev)) {
>>+			ret = upper->dev;
>>+			break;
>>+		}
>>+
>>+		if (!list_empty(&upper->dev->upper_dev_list)) {
>>+			ret = netdev_upper_recursive_do_rcu(upper->dev,
>>+							    orig_dev, f);
>>+			if (ret)
>>+				break;
>>+		}
>>+	}
>>+
>>+	return ret;
>>+}
>>+
>> static bool __netdev_search_upper_dev(struct net_device *dev,
>> 				      struct net_device *upper_dev)
>> {
>>
>>How do you think?
>
>I do not like this. How about to put all levels to upper_dev list and
>mark those who are not direct (not level1) ? Then we can use single list
>for all purposes.

I don't see how it can be done on attach/removal of upper devices, cause we
don't have a way to know the 'lower' devices, and will break scenarios like

bond -> bridge ->+ vlan

when vlan is added, we can't update bond's upper_dev_list.

And if we'll start doing it 'on the fly', while searching, by using
search_list (or something new), we'll be racy and require locking, not just
RCU.

Am I again missing something? :)

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH net-next v1 5/9] bonding: convert bond_has_this_ip() to use upper devices
  2013-08-27 11:25       ` Jiri Pirko
  2013-08-27 11:53         ` Veaceslav Falico
@ 2013-08-27 18:10         ` Veaceslav Falico
  2013-08-28 12:00           ` Veaceslav Falico
  2013-08-28 14:56           ` Vlad Yasevich
  1 sibling, 2 replies; 20+ messages in thread
From: Veaceslav Falico @ 2013-08-27 18:10 UTC (permalink / raw)
  To: Jiri Pirko; +Cc: netdev, Jay Vosburgh, Andy Gospodarek

On Tue, Aug 27, 2013 at 01:25:29PM +0200, Jiri Pirko wrote:
>Tue, Aug 27, 2013 at 01:16:48PM CEST, vfalico@redhat.com wrote:
>>On Mon, Aug 26, 2013 at 10:53:38PM +0200, Jiri Pirko wrote:
>>>Mon, Aug 26, 2013 at 10:32:38PM CEST, vfalico@redhat.com wrote:
>>...snip...
>>>>+	rcu_read_lock();
>>>>+	netdev_for_each_upper_dev(bond->dev, upper, iter) {
>>>>+		if (ip == bond_confirm_addr(upper, 0, ip)) {
>>>>+			ret = true;
>>>>+			break;
>>>>+		}
>>>
>>>You need the same recursion __vlan_find_dev_deep() is doing. If you do
>>>not do that, you will miss anything over the first upper level.
>>
>>Good point, and it's true for other uses also - bond_arp_send_all(), for
>>example, will also miss anything that's higher than the first upper level.
>>
>>I can't think of a use case scenario when we would need only the first
>>upper level - so maybe we should either make netdev_for_each_upper_dev()
>>recursive by default (I don't know how it can be done easily, tbh, without
>>modifying the existing code), or add something like:
>>
>>diff --git a/net/core/dev.c b/net/core/dev.c
>>index 566e99a..4a4468f 100644
>>--- a/net/core/dev.c
>>+++ b/net/core/dev.c
>>@@ -4387,6 +4387,31 @@ static void __append_search_uppers(struct list_head *search_list,
>> 	}
>> }
>>+struct net_device *netdev_upper_recursive_do_rcu(struct net_device *dev,
>>+						 struct net_device *orig_dev,
>>+						 bool (*f)(struct net_device *,
>>+							   struct net_device *))
>>+{
>>+	struct netdev_upper *upper;
>>+	struct net_device *ret = NULL;
>>+
>>+	list_for_each_entry_rcu(upper, &dev->upper_dev_list, list) {
>>+		if (f(orig_dev, upper->dev)) {
>>+			ret = upper->dev;
>>+			break;
>>+		}
>>+
>>+		if (!list_empty(&upper->dev->upper_dev_list)) {
>>+			ret = netdev_upper_recursive_do_rcu(upper->dev,
>>+							    orig_dev, f);
>>+			if (ret)
>>+				break;
>>+		}
>>+	}
>>+
>>+	return ret;
>>+}
>>+
>> static bool __netdev_search_upper_dev(struct net_device *dev,
>> 				      struct net_device *upper_dev)
>> {
>>
>>How do you think?
>
>I do not like this. How about to put all levels to upper_dev list and
>mark those who are not direct (not level1) ? Then we can use single list
>for all purposes.

I've looked at the code a bit more and I really don't see a way to do
non-recursive, RCUed way to traverse the whole list of upper devices.

I see three ways to handle this situation:

1) The one that I've posted, recursive search and calling a provided
function (the function should also get as a parameter a user-provided void
*pointer). It's, indeed, a bit hacky, however will work perfectly.

2) Implementing the search (recursive) in bonding (or any further device)
itself. Less intrusive, however it'll be code duplication actually for
point 1).

3) Adding lower_dev_list, populating it accordingly, and also adding an int
distance to the netdev_upper (or, with this approach, rather netdev_adjacent
or something like that), which will help to implement your idea - a device
will have lower/upper_dev_list populated with all lower/upper devices and
their distance (i.e. distance == 1 means that it's first level of
lower/upper device). With this approach, we might also afterwards get rid
of slave lists from 'grouping' devices like bonding/team/bridge/etc. and,
thus, the locking.

Now I'd rather go with 1), but if you don't like it - I can go with 2).
And, if 3) sounds ok, I can implement it also and try to get rid of bonding
slave list (hopefully).

Do you have any other ideas/thoughts?

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH net-next v1 5/9] bonding: convert bond_has_this_ip() to use upper devices
  2013-08-27 18:10         ` Veaceslav Falico
@ 2013-08-28 12:00           ` Veaceslav Falico
  2013-08-28 14:56           ` Vlad Yasevich
  1 sibling, 0 replies; 20+ messages in thread
From: Veaceslav Falico @ 2013-08-28 12:00 UTC (permalink / raw)
  To: Jiri Pirko; +Cc: netdev, Jay Vosburgh, Andy Gospodarek

On Tue, Aug 27, 2013 at 08:10:01PM +0200, Veaceslav Falico wrote:
...snip...
>3) Adding lower_dev_list, populating it accordingly, and also adding an int
>distance to the netdev_upper (or, with this approach, rather netdev_adjacent
>or something like that), which will help to implement your idea - a device
>will have lower/upper_dev_list populated with all lower/upper devices and
>their distance (i.e. distance == 1 means that it's first level of
>lower/upper device). With this approach, we might also afterwards get rid
>of slave lists from 'grouping' devices like bonding/team/bridge/etc. and,
>thus, the locking.

Just FYI, I've taken this approach in the next version, seems to (suddenly)
work fine, FWIW...

I'll send the v2 in few minutes.

Thanks again!

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH net-next v1 5/9] bonding: convert bond_has_this_ip() to use upper devices
  2013-08-27 18:10         ` Veaceslav Falico
  2013-08-28 12:00           ` Veaceslav Falico
@ 2013-08-28 14:56           ` Vlad Yasevich
  2013-08-28 16:32             ` Veaceslav Falico
  1 sibling, 1 reply; 20+ messages in thread
From: Vlad Yasevich @ 2013-08-28 14:56 UTC (permalink / raw)
  To: Veaceslav Falico; +Cc: Jiri Pirko, netdev, Jay Vosburgh, Andy Gospodarek

On 08/27/2013 02:10 PM, Veaceslav Falico wrote:
> On Tue, Aug 27, 2013 at 01:25:29PM +0200, Jiri Pirko wrote:
>> Tue, Aug 27, 2013 at 01:16:48PM CEST, vfalico@redhat.com wrote:
>>> On Mon, Aug 26, 2013 at 10:53:38PM +0200, Jiri Pirko wrote:
>>>> Mon, Aug 26, 2013 at 10:32:38PM CEST, vfalico@redhat.com wrote:
>>> ...snip...
>>>>> +    rcu_read_lock();
>>>>> +    netdev_for_each_upper_dev(bond->dev, upper, iter) {
>>>>> +        if (ip == bond_confirm_addr(upper, 0, ip)) {
>>>>> +            ret = true;
>>>>> +            break;
>>>>> +        }
>>>>
>>>> You need the same recursion __vlan_find_dev_deep() is doing. If you do
>>>> not do that, you will miss anything over the first upper level.
>>>
>>> Good point, and it's true for other uses also - bond_arp_send_all(), for
>>> example, will also miss anything that's higher than the first upper
>>> level.
>>>
>>> I can't think of a use case scenario when we would need only the first
>>> upper level - so maybe we should either make netdev_for_each_upper_dev()
>>> recursive by default (I don't know how it can be done easily, tbh,
>>> without
>>> modifying the existing code), or add something like:
>>>
>>> diff --git a/net/core/dev.c b/net/core/dev.c
>>> index 566e99a..4a4468f 100644
>>> --- a/net/core/dev.c
>>> +++ b/net/core/dev.c
>>> @@ -4387,6 +4387,31 @@ static void __append_search_uppers(struct
>>> list_head *search_list,
>>>     }
>>> }
>>> +struct net_device *netdev_upper_recursive_do_rcu(struct net_device
>>> *dev,
>>> +                         struct net_device *orig_dev,
>>> +                         bool (*f)(struct net_device *,
>>> +                               struct net_device *))
>>> +{
>>> +    struct netdev_upper *upper;
>>> +    struct net_device *ret = NULL;
>>> +
>>> +    list_for_each_entry_rcu(upper, &dev->upper_dev_list, list) {
>>> +        if (f(orig_dev, upper->dev)) {
>>> +            ret = upper->dev;
>>> +            break;
>>> +        }
>>> +
>>> +        if (!list_empty(&upper->dev->upper_dev_list)) {
>>> +            ret = netdev_upper_recursive_do_rcu(upper->dev,
>>> +                                orig_dev, f);
>>> +            if (ret)
>>> +                break;
>>> +        }
>>> +    }
>>> +
>>> +    return ret;
>>> +}
>>> +
>>> static bool __netdev_search_upper_dev(struct net_device *dev,
>>>                       struct net_device *upper_dev)
>>> {
>>>
>>> How do you think?
>>
>> I do not like this. How about to put all levels to upper_dev list and
>> mark those who are not direct (not level1) ? Then we can use single list
>> for all purposes.
>
> I've looked at the code a bit more and I really don't see a way to do
> non-recursive, RCUed way to traverse the whole list of upper devices.
>
> I see three ways to handle this situation:
>
> 1) The one that I've posted, recursive search and calling a provided
> function (the function should also get as a parameter a user-provided void
> *pointer). It's, indeed, a bit hacky, however will work perfectly.
>
> 2) Implementing the search (recursive) in bonding (or any further device)
> itself. Less intrusive, however it'll be code duplication actually for
> point 1).
>
> 3) Adding lower_dev_list, populating it accordingly, and also adding an int
> distance to the netdev_upper (or, with this approach, rather
> netdev_adjacent
> or something like that), which will help to implement your idea - a device
> will have lower/upper_dev_list populated with all lower/upper devices and
> their distance (i.e. distance == 1 means that it's first level of
> lower/upper device). With this approach, we might also afterwards get rid
> of slave lists from 'grouping' devices like bonding/team/bridge/etc. and,
> thus, the locking.
>
> Now I'd rather go with 1), but if you don't like it - I can go with 2).
> And, if 3) sounds ok, I can implement it also and try to get rid of bonding
> slave list (hopefully).
>
> Do you have any other ideas/thoughts?

I've been playing with approach 2 for some work I am doing for
macvtap-over-bond, but I like you idea for 3 better.

It would make things simpler in the long run.

-vlad

> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH net-next v1 5/9] bonding: convert bond_has_this_ip() to use upper devices
  2013-08-28 14:56           ` Vlad Yasevich
@ 2013-08-28 16:32             ` Veaceslav Falico
  0 siblings, 0 replies; 20+ messages in thread
From: Veaceslav Falico @ 2013-08-28 16:32 UTC (permalink / raw)
  To: Vlad Yasevich; +Cc: Jiri Pirko, netdev, Jay Vosburgh, Andy Gospodarek

On Wed, Aug 28, 2013 at 10:56:32AM -0400, Vlad Yasevich wrote:
...snip...
>I've been playing with approach 2 for some work I am doing for
>macvtap-over-bond, but I like you idea for 3 better.
>
>It would make things simpler in the long run.

I've just sent the third version of the 3rd approach, which implements
lower/upper device lists for every device that are guaranteed, under RCU,
tu have all the upper/lower devices linked to the device. It also permits
to distinguish the 'neighbour' devices (i.e. linked on the first level,
i.e. direct-linked) from the 'distant' ones, which are not directly linked
but rather inherited.

It also implements the first function (and the only one that I've needed
to get rid of vlan lists in bonding) - netdev_for_each_upper_dev_rcu(),
which can be used as a loop:

struct net_device *upper;
struct list_head *iter;

rcu_read_lock();
netdev_for_each_upper_dev_rcu(dev, upper, iter) {
	... do whatever you want with the upper dev ...
}
rcu_read_unlock();


In case you'll need, you can easily implement the same way
netdev_for_each_lower_dev_rcu().

Hope that helps.

>
>-vlad

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2013-08-28 16:34 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-08-26 20:32 [PATCH net-next v1 0/9] bonding: remove vlan special handling Veaceslav Falico
2013-08-26 20:32 ` [PATCH net-next v1 1/9] net: add netdev_upper_get_next_dev(dev, iter) Veaceslav Falico
2013-08-26 20:57   ` Jiri Pirko
2013-08-27 10:42     ` Veaceslav Falico
2013-08-26 20:32 ` [PATCH net-next v1 2/9] net: add netdev_for_each_upper_dev() Veaceslav Falico
2013-08-26 20:32 ` [PATCH net-next v1 3/9] bonding: use netdev_upper list in bond_vlan_used Veaceslav Falico
2013-08-26 20:32 ` [PATCH net-next v1 4/9] bonding: make bond_arp_send_all use upper device list Veaceslav Falico
2013-08-26 20:32 ` [PATCH net-next v1 5/9] bonding: convert bond_has_this_ip() to use upper devices Veaceslav Falico
2013-08-26 20:53   ` Jiri Pirko
2013-08-27 11:16     ` Veaceslav Falico
2013-08-27 11:25       ` Jiri Pirko
2013-08-27 11:53         ` Veaceslav Falico
2013-08-27 18:10         ` Veaceslav Falico
2013-08-28 12:00           ` Veaceslav Falico
2013-08-28 14:56           ` Vlad Yasevich
2013-08-28 16:32             ` Veaceslav Falico
2013-08-26 20:32 ` [PATCH net-next v1 6/9] bonding: use vlan_uses_dev() in __bond_release_one() Veaceslav Falico
2013-08-26 20:32 ` [PATCH net-next v1 7/9] bonding: split alb_send_learning_packets() Veaceslav Falico
2013-08-26 20:32 ` [PATCH net-next v1 8/9] bonding: make alb_send_learning_packets() use upper dev list Veaceslav Falico
2013-08-26 20:32 ` [PATCH net-next v1 9/9] bonding: remove vlan_list/current_alb_vlan Veaceslav Falico

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).