From mboxrd@z Thu Jan  1 00:00:00 1970
From: Jay Vosburgh <jay.vosburgh@canonical.com>
Subject: Re: [PATCH v2 net] bonding: Fix stacked device detection in arp monitoring
Date: Wed, 07 May 2014 09:43:59 -0700
Message-ID: <29645.1399481039@localhost.localdomain>
References: <1399470461-22213-1-git-send-email-vyasevic@redhat.com>
Cc: netdev@vger.kernel.org, Veaceslav Falico <vfalico@redhat.com>,
	Andy Gospodarek <andy@greyhouse.net>,
	Ding Tianhong <dingtianhong@huawei.com>,
	Patric McHardy <kaber@trash.net>
To: Vlad Yasevich <vyasevic@redhat.com>
Return-path: <netdev-owner@vger.kernel.org>
Received: from youngberry.canonical.com ([91.189.89.112]:46883 "EHLO
	youngberry.canonical.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1756032AbaEGQoG (ORCPT
	<rfc822;netdev@vger.kernel.org>); Wed, 7 May 2014 12:44:06 -0400
In-reply-to: <1399470461-22213-1-git-send-email-vyasevic@redhat.com>
Sender: netdev-owner@vger.kernel.org
List-ID: <netdev.vger.kernel.org>

Vlad Yasevich <vyasevic@redhat.com> wrote:

>Prior to commit fbd929f2dce460456807a51e18d623db3db9f077
>	bonding: support QinQ for bond arp interval
>
>the arp monitoring code allowed for proper detection of devices
>stacked on top of vlans.  Since the above commit, the
>code can still detect a device stacked on top of single
>vlan, but not a device stacked on top of Q-in-Q configuration.
>The search will only set the inner vlan tag if the route
>device is the vlan device.  However, this is not always the
>case, as it is possible to extend the stacked configuration.
>
>With this patch it is possible to provision devices on
>top Q-in-Q vlan configuration that should be used as
>a source of ARP monitoring information.
>
>For example:
>ip link add link bond0 vlan10 type vlan proto 802.1q id 10
>ip link add link vlan10 vlan100 type vlan proto 802.1q id 100
>ip link add link vlan100 type macvlan
>
>Note:  This patch limites the number of stacked VLANs to 2,
>just like before.  The original, however had another issue
>in that if we had more then 2 levels of VLANs, we would end
>up generating incorrectly tagged traffic.  This is no longer
>possible.
>
>Fixes: fbd929f2dce460456807a51e18d623db3db9f077 (bonding: support QinQ for bond arp interval)
>CC: Jay Vosburgh <j.vosburgh@gmail.com>
>CC: Veaceslav Falico <vfalico@redhat.com>
>CC: Andy Gospodarek <andy@greyhouse.net>
>CC: Ding Tianhong <dingtianhong@huawei.com>
>CC: Patric McHardy <kaber@trash.net>
>Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
>---
>v2->v1:
>* Changed the function name to better describe what the function is doing.
>  We are not just finding the stack of vlan devices, we are also verifything
>  the path between the bonding device and the route output device.
>* Added some more commenets about what the function is doing.
>* Fixed an issue with multiple peer vlans.
>* Removed all occurances of 'inner' and 'outer' and replaced it with tag
>  array.

	I think you may have misunderstood my prior comment; I meant
that I liked the "inner" and "outer" names better than "tag[0]" and
"tag[1]".

	I did notice that the inner and outer parameters could be
removed from bond_arp_send as well, but, again, I found the "inner" and
"outer" names more descriptive than tag[0] or tag[1]; perhaps a #define
for the magic numbers (0 = "outer", 1 = "inner" and 2 = "max nesting"),
or at least a comment that says straight up "tag[0] is the outer tag,
tag[1] is the inner tag (if there are two tags)" is in order.

> drivers/net/bonding/bond_main.c | 114 ++++++++++++++++++----------------------
> 1 file changed, 52 insertions(+), 62 deletions(-)
>
>diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>index 9d08e00..f592f96 100644
>--- a/drivers/net/bonding/bond_main.c
>+++ b/drivers/net/bonding/bond_main.c
>@@ -2126,8 +2126,7 @@ static bool bond_has_this_ip(struct bonding *bond, __be32 ip)
>  */
> static void bond_arp_send(struct net_device *slave_dev, int arp_op,
> 			  __be32 dest_ip, __be32 src_ip,
>-			  struct bond_vlan_tag *inner,
>-			  struct bond_vlan_tag *outer)
>+			  struct bond_vlan_tag *tags)
> {
> 	struct sk_buff *skb;
> 
>@@ -2141,12 +2140,12 @@ static void bond_arp_send(struct net_device *slave_dev, int arp_op,
> 		net_err_ratelimited("ARP packet allocation failed\n");
> 		return;
> 	}
>-	if (outer->vlan_id) {
>-		if (inner->vlan_id) {
>+	if (tags[0].vlan_id) {
>+		if (tags[1].vlan_id) {
> 			pr_debug("inner tag: proto %X vid %X\n",
>-				 ntohs(inner->vlan_proto), inner->vlan_id);
>-			skb = __vlan_put_tag(skb, inner->vlan_proto,
>-					     inner->vlan_id);
>+				 ntohs(tags[1].vlan_proto), tags[1].vlan_id);
>+			skb = __vlan_put_tag(skb, tags[1].vlan_proto,
>+					     tags[1].vlan_id);
> 			if (!skb) {
> 				net_err_ratelimited("failed to insert inner VLAN tag\n");
> 				return;
>@@ -2154,8 +2153,8 @@ static void bond_arp_send(struct net_device *slave_dev, int arp_op,
> 		}
> 
> 		pr_debug("outer reg: proto %X vid %X\n",
>-			 ntohs(outer->vlan_proto), outer->vlan_id);
>-		skb = vlan_put_tag(skb, outer->vlan_proto, outer->vlan_id);
>+			 ntohs(tags[0].vlan_proto), tags[0].vlan_id);
>+		skb = vlan_put_tag(skb, tags[0].vlan_proto, tags[0].vlan_id);
> 		if (!skb) {
> 			net_err_ratelimited("failed to insert outer VLAN tag\n");
> 			return;
>@@ -2164,22 +2163,52 @@ static void bond_arp_send(struct net_device *slave_dev, int arp_op,
> 	arp_xmit(skb);
> }
> 
>+/* Check to make sure that @end device is stacked on top of the @start
>+ * device.  Invofrmation about any intervening vlans are saved into
>+ * the @tag array.  @idx parametet specifies how many vlans deep we are
>+ * are currently looking. We currently only support 2 levels of vlan stacking.
>+ * Return true if we have a valid stacking configuration.  Otherwise false.
>+ */

	Spelling nits: "Information" and "parameter".

>+static bool bond_check_path(struct net_device *start, struct net_device *end,
>+			    struct bond_vlan_tag *tag, int idx)
>+{
>+	struct net_device *upper;
>+	struct list_head  *iter;
>+
>+	/* We do not support more then 2 levels of VLAN nesting */
>+	if (idx >= 2)
>+		return false;
>+
>+	netdev_for_each_all_upper_dev_rcu(start, upper, iter) {
>+		if (is_vlan_dev(upper)) {
>+			tag[idx].vlan_proto = vlan_dev_vlan_proto(upper);
>+			tag[idx].vlan_id = vlan_dev_vlan_id(upper);
>+		}
>+		if (upper == end)
>+			return true;
>+
>+		/* Look at the devices list  of 'upper' only if it is a
>+		 * vlan device.
>+		 */
>+		if (is_vlan_dev(upper) &&
>+		    bond_check_path(upper, end, tag, idx+1))
>+			return true;

	This may or may not be a realistic configuration, but will this
function traverse correctly if there is some other device type between
the two vlans?  E.g., eth0 -> bond0 -> vlan100 -> bridge -> vlan200,
where "vlan200" is the "end" device holding the IP address from the
route lookup.  It need not be a bridge in there, but I think this would
be a legal configuration.

	-J

>+	}
>+	return false;
>+}
>+
> 
> static void bond_arp_send_all(struct bonding *bond, struct slave *slave)
> {
>-	struct net_device *upper, *vlan_upper;
>-	struct list_head *iter, *vlan_iter;
> 	struct rtable *rt;
>-	struct bond_vlan_tag inner, outer;
>+	struct bond_vlan_tag tags[2];
> 	__be32 *targets = bond->params.arp_targets, addr;
> 	int i;
>+	bool ret;
> 
> 	for (i = 0; i < BOND_MAX_ARP_TARGETS && targets[i]; i++) {
> 		pr_debug("basa: target %pI4\n", &targets[i]);
>-		inner.vlan_proto = 0;
>-		inner.vlan_id = 0;
>-		outer.vlan_proto = 0;
>-		outer.vlan_id = 0;
>+		memset(tags, 0, sizeof(tags));
> 
> 		/* Find out through which dev should the packet go */
> 		rt = ip_route_output(dev_net(bond->dev), targets[i], 0,
>@@ -2192,7 +2221,8 @@ static void bond_arp_send_all(struct bonding *bond, struct slave *slave)
> 				net_warn_ratelimited("%s: no route to arp_ip_target %pI4 and arp_validate is set\n",
> 						     bond->dev->name,
> 						     &targets[i]);
>-			bond_arp_send(slave->dev, ARPOP_REQUEST, targets[i], 0, &inner, &outer);
>+			bond_arp_send(slave->dev, ARPOP_REQUEST, targets[i],
>+				      0, tags);
> 			continue;
> 		}
> 
>@@ -2201,52 +2231,12 @@ static void bond_arp_send_all(struct bonding *bond, struct slave *slave)
> 			goto found;
> 
> 		rcu_read_lock();
>-		/* first we search only for vlan devices. for every vlan
>-		 * found we verify its upper dev list, searching for the
>-		 * rt->dst.dev. If found we save the tag of the vlan and
>-		 * proceed to send the packet.
>-		 */
>-		netdev_for_each_all_upper_dev_rcu(bond->dev, vlan_upper,
>-						  vlan_iter) {
>-			if (!is_vlan_dev(vlan_upper))
>-				continue;
>-
>-			if (vlan_upper == rt->dst.dev) {
>-				outer.vlan_proto = vlan_dev_vlan_proto(vlan_upper);
>-				outer.vlan_id = vlan_dev_vlan_id(vlan_upper);
>-				rcu_read_unlock();
>-				goto found;
>-			}
>-			netdev_for_each_all_upper_dev_rcu(vlan_upper, upper,
>-							  iter) {
>-				if (upper == rt->dst.dev) {
>-					/* If the upper dev is a vlan dev too,
>-					 *  set the vlan tag to inner tag.
>-					 */
>-					if (is_vlan_dev(upper)) {
>-						inner.vlan_proto = vlan_dev_vlan_proto(upper);
>-						inner.vlan_id = vlan_dev_vlan_id(upper);
>-					}
>-					outer.vlan_proto = vlan_dev_vlan_proto(vlan_upper);
>-					outer.vlan_id = vlan_dev_vlan_id(vlan_upper);
>-					rcu_read_unlock();
>-					goto found;
>-				}
>-			}
>-		}
>-
>-		/* if the device we're looking for is not on top of any of
>-		 * our upper vlans, then just search for any dev that
>-		 * matches, and in case it's a vlan - save the id
>-		 */
>-		netdev_for_each_all_upper_dev_rcu(bond->dev, upper, iter) {
>-			if (upper == rt->dst.dev) {
>-				rcu_read_unlock();
>-				goto found;
>-			}
>-		}
>+		ret = bond_check_path(bond->dev, rt->dst.dev, tags, 0);
> 		rcu_read_unlock();
> 
>+		if (ret)
>+			goto found;
>+
> 		/* Not our device - skip */
> 		pr_debug("%s: no path to arp_ip_target %pI4 via rt.dev %s\n",
> 			 bond->dev->name, &targets[i],
>@@ -2259,7 +2249,7 @@ found:
> 		addr = bond_confirm_addr(rt->dst.dev, targets[i], 0);
> 		ip_rt_put(rt);
> 		bond_arp_send(slave->dev, ARPOP_REQUEST, targets[i],
>-			      addr, &inner, &outer);
>+			      addr, tags);
> 	}
> }
> 
>-- 
>1.9.0
>

---
	-Jay Vosburgh, jay.vosburgh@canonical.com