netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next 0/2] bonding: support QinQ for bond arp interval
@ 2014-03-17 12:06 Ding Tianhong
  2014-03-17 12:06 ` [PATCH net-next 1/2] vlan: make a new function vlan_dev_vlan_proto() and export Ding Tianhong
  2014-03-17 12:06 ` [PATCH net-next 2/2] bonding: support QinQ for bond arp interval Ding Tianhong
  0 siblings, 2 replies; 9+ messages in thread
From: Ding Tianhong @ 2014-03-17 12:06 UTC (permalink / raw)
  To: fubar, vfalico, andy, kaber; +Cc: davem, netdev

Ding Tianhong (2):
  vlan: make a new function  vlan_dev_vlan_proto() and export
  bonding: support QinQ for bond arp interval

 drivers/net/bonding/bond_main.c | 61 ++++++++++++++++++++++++++++++-----------
 drivers/net/bonding/bonding.h   |  5 ++++
 include/linux/if_vlan.h         |  7 +++++
 net/8021q/vlan_core.c           |  6 ++++
 4 files changed, 63 insertions(+), 16 deletions(-)

-- 
1.8.0

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH net-next 1/2] vlan: make a new function  vlan_dev_vlan_proto() and export
  2014-03-17 12:06 [PATCH net-next 0/2] bonding: support QinQ for bond arp interval Ding Tianhong
@ 2014-03-17 12:06 ` Ding Tianhong
  2014-03-17 12:06 ` [PATCH net-next 2/2] bonding: support QinQ for bond arp interval Ding Tianhong
  1 sibling, 0 replies; 9+ messages in thread
From: Ding Tianhong @ 2014-03-17 12:06 UTC (permalink / raw)
  To: fubar, vfalico, andy, kaber; +Cc: davem, netdev

The vlan support 2 proto: 802.1q and 802.1ad, so make a new function
called vlan_dev_vlan_proto() which could return the vlan proto for
input dev.

Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
---
 include/linux/if_vlan.h | 7 +++++++
 net/8021q/vlan_core.c   | 6 ++++++
 2 files changed, 13 insertions(+)

diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h
index bbedfb5..967a428 100644
--- a/include/linux/if_vlan.h
+++ b/include/linux/if_vlan.h
@@ -110,6 +110,7 @@ extern struct net_device *__vlan_find_dev_deep(struct net_device *real_dev,
 					       __be16 vlan_proto, u16 vlan_id);
 extern struct net_device *vlan_dev_real_dev(const struct net_device *dev);
 extern u16 vlan_dev_vlan_id(const struct net_device *dev);
+extern __be16 vlan_dev_vlan_proto(const struct net_device *dev);
 
 /**
  *	struct vlan_priority_tci_mapping - vlan egress priority mappings
@@ -216,6 +217,12 @@ static inline u16 vlan_dev_vlan_id(const struct net_device *dev)
 	return 0;
 }
 
+static inline __be16 vlan_dev_vlan_proto(const struct net_device *dev)
+{
+	BUG();
+	return 0;
+}
+
 static inline u16 vlan_dev_get_egress_qos_mask(struct net_device *dev,
 					       u32 skprio)
 {
diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c
index 35b3c19..3c32bd2 100644
--- a/net/8021q/vlan_core.c
+++ b/net/8021q/vlan_core.c
@@ -106,6 +106,12 @@ u16 vlan_dev_vlan_id(const struct net_device *dev)
 }
 EXPORT_SYMBOL(vlan_dev_vlan_id);
 
+__be16 vlan_dev_vlan_proto(const struct net_device *dev)
+{
+	return vlan_dev_priv(dev)->vlan_proto;
+}
+EXPORT_SYMBOL(vlan_dev_vlan_proto);
+
 static struct sk_buff *vlan_reorder_header(struct sk_buff *skb)
 {
 	if (skb_cow(skb, skb_headroom(skb)) < 0)
-- 
1.8.0

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH net-next 2/2] bonding: support QinQ for bond arp interval
  2014-03-17 12:06 [PATCH net-next 0/2] bonding: support QinQ for bond arp interval Ding Tianhong
  2014-03-17 12:06 ` [PATCH net-next 1/2] vlan: make a new function vlan_dev_vlan_proto() and export Ding Tianhong
@ 2014-03-17 12:06 ` Ding Tianhong
  2014-03-17 17:46   ` Jay Vosburgh
  1 sibling, 1 reply; 9+ messages in thread
From: Ding Tianhong @ 2014-03-17 12:06 UTC (permalink / raw)
  To: fubar, vfalico, andy, kaber; +Cc: davem, netdev

The bond send arp request to indicate that the slave is active, and if the bond dev
is a vlan dev, it will set the vlan tag in skb to notice the vlan group, but the
bond could only send a skb with 802.1q proto, not support for QinQ.

So add outer tag for lower vlan tag and inner tag for upper vlan tag to support QinQ,
The new skb will be consist of two vlan tag just like this:

dst mac | src mac | outer vlan tag | inner vlan tag | data | .....

If We don't need QinQ, the inner vlan tag could be set to 0 and use outer vlan tag
 as a normal vlan group.

Using "ip link" to configure the bond for QinQ and add test log:

ip link add link bond0  bond0.20 type vlan proto 802.1ad id 20
ip link add link bond0.20  bond0.20.200 type vlan proto 802.1q id 200

ifconfig bond0.20 11.11.20.36/24
ifconfig bond0.20.200 11.11.200.36/24

echo +11.11.200.37 > /sys/class/net/bond0/bonding/arp_ip_target

90:e2:ba:07:4a:5c (oui Unknown) > Broadcast, ethertype 802.1Q-QinQ (0x88a8),length 50: vlan 20, p 0,ethertype 802.1Q, vlan 200, p 0, ethertype ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 11.11.200.37 tell 11.11.200.36, length 28

90:e2:ba:06:f9:86 (oui Unknown) > 90:e2:ba:07:4a:5c (oui Unknown), ethertype 802.1Q-QinQ (0x88a8), length 50: vlan 20, p 0, ethertype 802.1Q, vlan 200, p 0, ethertype ARP, Ethernet (len 6), IPv4 (len 4), Reply 11.11.200.37 is-at 90:e2:ba:06:f9:86 (oui Unknown), length 28

Cc: Jay Vosburgh <fubar@us.ibm.com>
Cc: Veaceslav Falico <vfalico@redhat.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
---
 drivers/net/bonding/bond_main.c | 61 ++++++++++++++++++++++++++++++-----------
 drivers/net/bonding/bonding.h   |  5 ++++
 2 files changed, 50 insertions(+), 16 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 324389b..068f983 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2124,12 +2124,15 @@ static bool bond_has_this_ip(struct bonding *bond, __be32 ip)
  * switches in VLAN mode (especially if ports are configured as
  * "native" to a VLAN) might not pass non-tagged frames.
  */
-static void bond_arp_send(struct net_device *slave_dev, int arp_op, __be32 dest_ip, __be32 src_ip, unsigned short vlan_id)
+static void bond_arp_send(struct net_device *slave_dev, int arp_op,
+			  __be32 dest_ip, __be32 src_ip,
+			  struct bond_vlan_tag *inner,
+			  struct bond_vlan_tag *outer)
 {
 	struct sk_buff *skb;
 
-	pr_debug("arp %d on slave %s: dst %pI4 src %pI4 vid %d\n",
-		 arp_op, slave_dev->name, &dest_ip, &src_ip, vlan_id);
+	pr_debug("arp %d on slave %s: dst %pI4 src %pI4\n",
+		 arp_op, slave_dev->name, &dest_ip, &src_ip);
 
 	skb = arp_create(arp_op, ETH_P_ARP, dest_ip, slave_dev, src_ip,
 			 NULL, slave_dev->dev_addr, NULL);
@@ -2138,10 +2141,22 @@ static void bond_arp_send(struct net_device *slave_dev, int arp_op, __be32 dest_
 		pr_err("ARP packet allocation failed\n");
 		return;
 	}
-	if (vlan_id) {
-		skb = vlan_put_tag(skb, htons(ETH_P_8021Q), vlan_id);
+	if (outer->vlan_id) {
+		if (inner->vlan_id) {
+			pr_debug("inner tag: proto %X vid %X\n",
+				 ntohs(inner->vlan_proto), inner->vlan_id);
+			skb = __vlan_put_tag(skb, inner->vlan_proto, inner->vlan_id);
+			if (!skb) {
+				pr_err("failed to insert inner VLAN tag\n");
+				return;
+			}
+		}
+
+		pr_debug("outer reg: proto %X vid %X\n",
+			 ntohs(outer->vlan_proto), outer->vlan_id);
+		skb = vlan_put_tag(skb, outer->vlan_proto, outer->vlan_id);
 		if (!skb) {
-			pr_err("failed to insert VLAN tag\n");
+			pr_err("failed to insert outer VLAN tag\n");
 			return;
 		}
 	}
@@ -2154,11 +2169,16 @@ static void bond_arp_send_all(struct bonding *bond, struct slave *slave)
 	struct net_device *upper, *vlan_upper;
 	struct list_head *iter, *vlan_iter;
 	struct rtable *rt;
+	struct bond_vlan_tag inner, outer;
 	__be32 *targets = bond->params.arp_targets, addr;
-	int i, vlan_id;
+	int i;
 
 	for (i = 0; i < BOND_MAX_ARP_TARGETS && targets[i]; i++) {
 		pr_debug("basa: target %pI4\n", &targets[i]);
+		inner.vlan_proto = 0;
+		inner.vlan_id = 0;
+		outer.vlan_proto = 0;
+		outer.vlan_id = 0;
 
 		/* Find out through which dev should the packet go */
 		rt = ip_route_output(dev_net(bond->dev), targets[i], 0,
@@ -2170,12 +2190,10 @@ static void bond_arp_send_all(struct bonding *bond, struct slave *slave)
 			if (bond->params.arp_validate && net_ratelimit())
 				pr_warn("%s: no route to arp_ip_target %pI4 and arp_validate is set\n",
 					bond->dev->name, &targets[i]);
-			bond_arp_send(slave->dev, ARPOP_REQUEST, targets[i], 0, 0);
+			bond_arp_send(slave->dev, ARPOP_REQUEST, targets[i], 0, &inner, &outer);
 			continue;
 		}
 
-		vlan_id = 0;
-
 		/* bond device itself */
 		if (rt->dst.dev == bond->dev)
 			goto found;
@@ -2192,10 +2210,25 @@ static void bond_arp_send_all(struct bonding *bond, struct slave *slave)
 						  vlan_iter) {
 			if (!is_vlan_dev(vlan_upper))
 				continue;
+
+			if (vlan_upper == rt->dst.dev) {
+				outer.vlan_proto = vlan_dev_vlan_proto(vlan_upper);
+				outer.vlan_id = vlan_dev_vlan_id(vlan_upper);
+				rcu_read_unlock();
+				goto found;
+			}
 			netdev_for_each_all_upper_dev_rcu(vlan_upper, upper,
 							  iter) {
 				if (upper == rt->dst.dev) {
-					vlan_id = vlan_dev_vlan_id(vlan_upper);
+					/* If the upper dev is a vlan dev too,
+					 *  set the vlan tag to inner tag.
+					 */
+					if (is_vlan_dev(upper)) {
+						inner.vlan_proto = vlan_dev_vlan_proto(upper);
+						inner.vlan_id = vlan_dev_vlan_id(upper);
+					}
+					outer.vlan_proto = vlan_dev_vlan_proto(vlan_upper);
+					outer.vlan_id = vlan_dev_vlan_id(vlan_upper);
 					rcu_read_unlock();
 					goto found;
 				}
@@ -2208,10 +2241,6 @@ static void bond_arp_send_all(struct bonding *bond, struct slave *slave)
 		 */
 		netdev_for_each_all_upper_dev_rcu(bond->dev, upper, iter) {
 			if (upper == rt->dst.dev) {
-				/* if it's a vlan - get its VID */
-				if (is_vlan_dev(upper))
-					vlan_id = vlan_dev_vlan_id(upper);
-
 				rcu_read_unlock();
 				goto found;
 			}
@@ -2230,7 +2259,7 @@ found:
 		addr = bond_confirm_addr(rt->dst.dev, targets[i], 0);
 		ip_rt_put(rt);
 		bond_arp_send(slave->dev, ARPOP_REQUEST, targets[i],
-			      addr, vlan_id);
+			      addr, &inner, &outer);
 	}
 }
 
diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h
index 0896f1d..b8bdd0a 100644
--- a/drivers/net/bonding/bonding.h
+++ b/drivers/net/bonding/bonding.h
@@ -266,6 +266,11 @@ struct bonding {
 #define bond_slave_get_rtnl(dev) \
 	((struct slave *) rtnl_dereference(dev->rx_handler_data))
 
+struct bond_vlan_tag {
+	__be16		vlan_proto;
+	unsigned short	vlan_id;
+};
+
 /**
  * Returns NULL if the net_device does not belong to any of the bond's slaves
  *
-- 
1.8.0

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next 2/2] bonding: support QinQ for bond arp interval
  2014-03-17 12:06 ` [PATCH net-next 2/2] bonding: support QinQ for bond arp interval Ding Tianhong
@ 2014-03-17 17:46   ` Jay Vosburgh
  2014-03-18  1:59     ` Ding Tianhong
  2014-03-20  8:24     ` Michal Kubecek
  0 siblings, 2 replies; 9+ messages in thread
From: Jay Vosburgh @ 2014-03-17 17:46 UTC (permalink / raw)
  To: Ding Tianhong; +Cc: vfalico, andy, kaber, davem, netdev

Ding Tianhong <dingtianhong@huawei.com> wrote:

>The bond send arp request to indicate that the slave is active, and if the bond dev
>is a vlan dev, it will set the vlan tag in skb to notice the vlan group, but the
>bond could only send a skb with 802.1q proto, not support for QinQ.
>
>So add outer tag for lower vlan tag and inner tag for upper vlan tag to support QinQ,
>The new skb will be consist of two vlan tag just like this:
>
>dst mac | src mac | outer vlan tag | inner vlan tag | data | .....
>
>If We don't need QinQ, the inner vlan tag could be set to 0 and use outer vlan tag
> as a normal vlan group.
>
>Using "ip link" to configure the bond for QinQ and add test log:
>
>ip link add link bond0  bond0.20 type vlan proto 802.1ad id 20
>ip link add link bond0.20  bond0.20.200 type vlan proto 802.1q id 200

	Is this nesting backwards?  The way I read it (and the way I
recall that VLANs nest), "bond0.20" is the "regular" VLAN, i.e., if we
just have bond0.20 it would be a standard 802.1q (ethertype 0x8100)
VLAN.  Adding the second VLAN, .200 in this example, would be the second
(outer) tag, and would be the 802.1ad (ethertype 0x88a8) tag.

	In other words, adding a VLAN to an already existing VLAN makes
the newly added VLAN the "outer" and the already existing VLAN the
"inner."  Am I confused?

>ifconfig bond0.20 11.11.20.36/24
>ifconfig bond0.20.200 11.11.200.36/24
>
>echo +11.11.200.37 > /sys/class/net/bond0/bonding/arp_ip_target
>
>90:e2:ba:07:4a:5c (oui Unknown) > Broadcast, ethertype 802.1Q-QinQ (0x88a8),length 50: vlan 20, p 0,ethertype 802.1Q, vlan 200, p 0, ethertype ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 11.11.200.37 tell 11.11.200.36, length 28
>
>90:e2:ba:06:f9:86 (oui Unknown) > 90:e2:ba:07:4a:5c (oui Unknown), ethertype 802.1Q-QinQ (0x88a8), length 50: vlan 20, p 0, ethertype 802.1Q, vlan 200, p 0, ethertype ARP, Ethernet (len 6), IPv4 (len 4), Reply 11.11.200.37 is-at 90:e2:ba:06:f9:86 (oui Unknown), length 28
>
>Cc: Jay Vosburgh <fubar@us.ibm.com>
>Cc: Veaceslav Falico <vfalico@redhat.com>
>Cc: Andy Gospodarek <andy@greyhouse.net>
>Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
>---
> drivers/net/bonding/bond_main.c | 61 ++++++++++++++++++++++++++++++-----------
> drivers/net/bonding/bonding.h   |  5 ++++
> 2 files changed, 50 insertions(+), 16 deletions(-)
>
>diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>index 324389b..068f983 100644
>--- a/drivers/net/bonding/bond_main.c
>+++ b/drivers/net/bonding/bond_main.c
>@@ -2124,12 +2124,15 @@ static bool bond_has_this_ip(struct bonding *bond, __be32 ip)
>  * switches in VLAN mode (especially if ports are configured as
>  * "native" to a VLAN) might not pass non-tagged frames.
>  */
>-static void bond_arp_send(struct net_device *slave_dev, int arp_op, __be32 dest_ip, __be32 src_ip, unsigned short vlan_id)
>+static void bond_arp_send(struct net_device *slave_dev, int arp_op,
>+			  __be32 dest_ip, __be32 src_ip,
>+			  struct bond_vlan_tag *inner,
>+			  struct bond_vlan_tag *outer)
> {
> 	struct sk_buff *skb;
>
>-	pr_debug("arp %d on slave %s: dst %pI4 src %pI4 vid %d\n",
>-		 arp_op, slave_dev->name, &dest_ip, &src_ip, vlan_id);
>+	pr_debug("arp %d on slave %s: dst %pI4 src %pI4\n",
>+		 arp_op, slave_dev->name, &dest_ip, &src_ip);
>
> 	skb = arp_create(arp_op, ETH_P_ARP, dest_ip, slave_dev, src_ip,
> 			 NULL, slave_dev->dev_addr, NULL);
>@@ -2138,10 +2141,22 @@ static void bond_arp_send(struct net_device *slave_dev, int arp_op, __be32 dest_
> 		pr_err("ARP packet allocation failed\n");
> 		return;
> 	}
>-	if (vlan_id) {
>-		skb = vlan_put_tag(skb, htons(ETH_P_8021Q), vlan_id);
>+	if (outer->vlan_id) {
>+		if (inner->vlan_id) {
>+			pr_debug("inner tag: proto %X vid %X\n",
>+				 ntohs(inner->vlan_proto), inner->vlan_id);
>+			skb = __vlan_put_tag(skb, inner->vlan_proto, inner->vlan_id);
>+			if (!skb) {
>+				pr_err("failed to insert inner VLAN tag\n");

	This should be rate limited and not a straight pr_err; it's
going to spam the log if the system is out of free memory (the only
reason __vlan_put_tag can fail at the moment).  The existing code works
the same way, but it's wrong, too.

>+				return;
>+			}
>+		}
>+
>+		pr_debug("outer reg: proto %X vid %X\n",
>+			 ntohs(outer->vlan_proto), outer->vlan_id);
>+		skb = vlan_put_tag(skb, outer->vlan_proto, outer->vlan_id);
> 		if (!skb) {
>-			pr_err("failed to insert VLAN tag\n");
>+			pr_err("failed to insert outer VLAN tag\n");

	Same comment here.

> 			return;
> 		}
> 	}
>@@ -2154,11 +2169,16 @@ static void bond_arp_send_all(struct bonding *bond, struct slave *slave)
> 	struct net_device *upper, *vlan_upper;
> 	struct list_head *iter, *vlan_iter;
> 	struct rtable *rt;
>+	struct bond_vlan_tag inner, outer;
> 	__be32 *targets = bond->params.arp_targets, addr;
>-	int i, vlan_id;
>+	int i;
>
> 	for (i = 0; i < BOND_MAX_ARP_TARGETS && targets[i]; i++) {
> 		pr_debug("basa: target %pI4\n", &targets[i]);
>+		inner.vlan_proto = 0;
>+		inner.vlan_id = 0;
>+		outer.vlan_proto = 0;
>+		outer.vlan_id = 0;
>
> 		/* Find out through which dev should the packet go */
> 		rt = ip_route_output(dev_net(bond->dev), targets[i], 0,
>@@ -2170,12 +2190,10 @@ static void bond_arp_send_all(struct bonding *bond, struct slave *slave)
> 			if (bond->params.arp_validate && net_ratelimit())
> 				pr_warn("%s: no route to arp_ip_target %pI4 and arp_validate is set\n",
> 					bond->dev->name, &targets[i]);
>-			bond_arp_send(slave->dev, ARPOP_REQUEST, targets[i], 0, 0);
>+			bond_arp_send(slave->dev, ARPOP_REQUEST, targets[i], 0, &inner, &outer);
> 			continue;
> 		}
>
>-		vlan_id = 0;
>-
> 		/* bond device itself */
> 		if (rt->dst.dev == bond->dev)
> 			goto found;

	Also, in the code a few lines after the above, there's a "TODO:
QinQ?"  comment.  If you're adding QinQ support, we should probably
remove the TODO reminder.

	-J

>@@ -2192,10 +2210,25 @@ static void bond_arp_send_all(struct bonding *bond, struct slave *slave)
> 						  vlan_iter) {
> 			if (!is_vlan_dev(vlan_upper))
> 				continue;
>+
>+			if (vlan_upper == rt->dst.dev) {
>+				outer.vlan_proto = vlan_dev_vlan_proto(vlan_upper);
>+				outer.vlan_id = vlan_dev_vlan_id(vlan_upper);
>+				rcu_read_unlock();
>+				goto found;
>+			}
> 			netdev_for_each_all_upper_dev_rcu(vlan_upper, upper,
> 							  iter) {
> 				if (upper == rt->dst.dev) {
>-					vlan_id = vlan_dev_vlan_id(vlan_upper);
>+					/* If the upper dev is a vlan dev too,
>+					 *  set the vlan tag to inner tag.
>+					 */
>+					if (is_vlan_dev(upper)) {
>+						inner.vlan_proto = vlan_dev_vlan_proto(upper);
>+						inner.vlan_id = vlan_dev_vlan_id(upper);
>+					}
>+					outer.vlan_proto = vlan_dev_vlan_proto(vlan_upper);
>+					outer.vlan_id = vlan_dev_vlan_id(vlan_upper);
> 					rcu_read_unlock();
> 					goto found;
> 				}
>@@ -2208,10 +2241,6 @@ static void bond_arp_send_all(struct bonding *bond, struct slave *slave)
> 		 */
> 		netdev_for_each_all_upper_dev_rcu(bond->dev, upper, iter) {
> 			if (upper == rt->dst.dev) {
>-				/* if it's a vlan - get its VID */
>-				if (is_vlan_dev(upper))
>-					vlan_id = vlan_dev_vlan_id(upper);
>-
> 				rcu_read_unlock();
> 				goto found;
> 			}
>@@ -2230,7 +2259,7 @@ found:
> 		addr = bond_confirm_addr(rt->dst.dev, targets[i], 0);
> 		ip_rt_put(rt);
> 		bond_arp_send(slave->dev, ARPOP_REQUEST, targets[i],
>-			      addr, vlan_id);
>+			      addr, &inner, &outer);
> 	}
> }
>
>diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h
>index 0896f1d..b8bdd0a 100644
>--- a/drivers/net/bonding/bonding.h
>+++ b/drivers/net/bonding/bonding.h
>@@ -266,6 +266,11 @@ struct bonding {
> #define bond_slave_get_rtnl(dev) \
> 	((struct slave *) rtnl_dereference(dev->rx_handler_data))
>
>+struct bond_vlan_tag {
>+	__be16		vlan_proto;
>+	unsigned short	vlan_id;
>+};
>+
> /**
>  * Returns NULL if the net_device does not belong to any of the bond's slaves
>  *
>-- 
>1.8.0

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next 2/2] bonding: support QinQ for bond arp interval
  2014-03-17 17:46   ` Jay Vosburgh
@ 2014-03-18  1:59     ` Ding Tianhong
  2014-03-20  8:24     ` Michal Kubecek
  1 sibling, 0 replies; 9+ messages in thread
From: Ding Tianhong @ 2014-03-18  1:59 UTC (permalink / raw)
  To: Jay Vosburgh; +Cc: vfalico, andy, kaber, davem, netdev

On 2014/3/18 1:46, Jay Vosburgh wrote:
> Ding Tianhong <dingtianhong@huawei.com> wrote:
> 
>> The bond send arp request to indicate that the slave is active, and if the bond dev
>> is a vlan dev, it will set the vlan tag in skb to notice the vlan group, but the
>> bond could only send a skb with 802.1q proto, not support for QinQ.
>>
>> So add outer tag for lower vlan tag and inner tag for upper vlan tag to support QinQ,
>> The new skb will be consist of two vlan tag just like this:
>>
>> dst mac | src mac | outer vlan tag | inner vlan tag | data | .....
>>
>> If We don't need QinQ, the inner vlan tag could be set to 0 and use outer vlan tag
>> as a normal vlan group.
>>
>> Using "ip link" to configure the bond for QinQ and add test log:
>>
>> ip link add link bond0  bond0.20 type vlan proto 802.1ad id 20
>> ip link add link bond0.20  bond0.20.200 type vlan proto 802.1q id 200
> 
> 	Is this nesting backwards?  The way I read it (and the way I
> recall that VLANs nest), "bond0.20" is the "regular" VLAN, i.e., if we
> just have bond0.20 it would be a standard 802.1q (ethertype 0x8100)
> VLAN.  Adding the second VLAN, .200 in this example, would be the second
> (outer) tag, and would be the 802.1ad (ethertype 0x88a8) tag.
> 
> 	In other words, adding a VLAN to an already existing VLAN makes
> the newly added VLAN the "outer" and the already existing VLAN the
> "inner."  Am I confused?
> 

http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=8ad227ff89a7e6f05d07cd0acfd95ed3a24450ca

So I think the first vlan would be the 802.1ad as the "outer" vlan tag to support transmit skb in network(internet) and the 
second vlan would be the 802.1q as the "inner" vlan tag to support in vlan group,   and it could work well just
like the patch said.

>> ifconfig bond0.20 11.11.20.36/24
>> ifconfig bond0.20.200 11.11.200.36/24
>>
>> echo +11.11.200.37 > /sys/class/net/bond0/bonding/arp_ip_target
>>
>> 90:e2:ba:07:4a:5c (oui Unknown) > Broadcast, ethertype 802.1Q-QinQ (0x88a8),length 50: vlan 20, p 0,ethertype 802.1Q, vlan 200, p 0, ethertype ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 11.11.200.37 tell 11.11.200.36, length 28
>>
>> 90:e2:ba:06:f9:86 (oui Unknown) > 90:e2:ba:07:4a:5c (oui Unknown), ethertype 802.1Q-QinQ (0x88a8), length 50: vlan 20, p 0, ethertype 802.1Q, vlan 200, p 0, ethertype ARP, Ethernet (len 6), IPv4 (len 4), Reply 11.11.200.37 is-at 90:e2:ba:06:f9:86 (oui Unknown), length 28
>>
>> Cc: Jay Vosburgh <fubar@us.ibm.com>
>> Cc: Veaceslav Falico <vfalico@redhat.com>
>> Cc: Andy Gospodarek <andy@greyhouse.net>
>> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
>> ---
>> drivers/net/bonding/bond_main.c | 61 ++++++++++++++++++++++++++++++-----------
>> drivers/net/bonding/bonding.h   |  5 ++++
>> 2 files changed, 50 insertions(+), 16 deletions(-)
>>
>> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>> index 324389b..068f983 100644
>> --- a/drivers/net/bonding/bond_main.c
>> +++ b/drivers/net/bonding/bond_main.c
>> @@ -2124,12 +2124,15 @@ static bool bond_has_this_ip(struct bonding *bond, __be32 ip)
>>  * switches in VLAN mode (especially if ports are configured as
>>  * "native" to a VLAN) might not pass non-tagged frames.
>>  */
>> -static void bond_arp_send(struct net_device *slave_dev, int arp_op, __be32 dest_ip, __be32 src_ip, unsigned short vlan_id)
>> +static void bond_arp_send(struct net_device *slave_dev, int arp_op,
>> +			  __be32 dest_ip, __be32 src_ip,
>> +			  struct bond_vlan_tag *inner,
>> +			  struct bond_vlan_tag *outer)
>> {
>> 	struct sk_buff *skb;
>>
>> -	pr_debug("arp %d on slave %s: dst %pI4 src %pI4 vid %d\n",
>> -		 arp_op, slave_dev->name, &dest_ip, &src_ip, vlan_id);
>> +	pr_debug("arp %d on slave %s: dst %pI4 src %pI4\n",
>> +		 arp_op, slave_dev->name, &dest_ip, &src_ip);
>>
>> 	skb = arp_create(arp_op, ETH_P_ARP, dest_ip, slave_dev, src_ip,
>> 			 NULL, slave_dev->dev_addr, NULL);
>> @@ -2138,10 +2141,22 @@ static void bond_arp_send(struct net_device *slave_dev, int arp_op, __be32 dest_
>> 		pr_err("ARP packet allocation failed\n");
>> 		return;
>> 	}
>> -	if (vlan_id) {
>> -		skb = vlan_put_tag(skb, htons(ETH_P_8021Q), vlan_id);
>> +	if (outer->vlan_id) {
>> +		if (inner->vlan_id) {
>> +			pr_debug("inner tag: proto %X vid %X\n",
>> +				 ntohs(inner->vlan_proto), inner->vlan_id);
>> +			skb = __vlan_put_tag(skb, inner->vlan_proto, inner->vlan_id);
>> +			if (!skb) {
>> +				pr_err("failed to insert inner VLAN tag\n");
> 
> 	This should be rate limited and not a straight pr_err; it's
> going to spam the log if the system is out of free memory (the only
> reason __vlan_put_tag can fail at the moment).  The existing code works
> the same way, but it's wrong, too.
> 
Yes, I will modify them.

>> +				return;
>> +			}
>> +		}
>> +
>> +		pr_debug("outer reg: proto %X vid %X\n",
>> +			 ntohs(outer->vlan_proto), outer->vlan_id);
>> +		skb = vlan_put_tag(skb, outer->vlan_proto, outer->vlan_id);
>> 		if (!skb) {
>> -			pr_err("failed to insert VLAN tag\n");
>> +			pr_err("failed to insert outer VLAN tag\n");
> 
> 	Same comment here.
> 
>> 			return;
>> 		}
>> 	}
>> @@ -2154,11 +2169,16 @@ static void bond_arp_send_all(struct bonding *bond, struct slave *slave)
>> 	struct net_device *upper, *vlan_upper;
>> 	struct list_head *iter, *vlan_iter;
>> 	struct rtable *rt;
>> +	struct bond_vlan_tag inner, outer;
>> 	__be32 *targets = bond->params.arp_targets, addr;
>> -	int i, vlan_id;
>> +	int i;
>>
>> 	for (i = 0; i < BOND_MAX_ARP_TARGETS && targets[i]; i++) {
>> 		pr_debug("basa: target %pI4\n", &targets[i]);
>> +		inner.vlan_proto = 0;
>> +		inner.vlan_id = 0;
>> +		outer.vlan_proto = 0;
>> +		outer.vlan_id = 0;
>>
>> 		/* Find out through which dev should the packet go */
>> 		rt = ip_route_output(dev_net(bond->dev), targets[i], 0,
>> @@ -2170,12 +2190,10 @@ static void bond_arp_send_all(struct bonding *bond, struct slave *slave)
>> 			if (bond->params.arp_validate && net_ratelimit())
>> 				pr_warn("%s: no route to arp_ip_target %pI4 and arp_validate is set\n",
>> 					bond->dev->name, &targets[i]);
>> -			bond_arp_send(slave->dev, ARPOP_REQUEST, targets[i], 0, 0);
>> +			bond_arp_send(slave->dev, ARPOP_REQUEST, targets[i], 0, &inner, &outer);
>> 			continue;
>> 		}
>>
>> -		vlan_id = 0;
>> -
>> 		/* bond device itself */
>> 		if (rt->dst.dev == bond->dev)
>> 			goto found;
> 
> 	Also, in the code a few lines after the above, there's a "TODO:
> QinQ?"  comment.  If you're adding QinQ support, we should probably
> remove the TODO reminder.
>
> 	-J
> 
Yes, I miss it, new patch to clean it, thanks.

Regards
Ding

>> @@ -2192,10 +2210,25 @@ static void bond_arp_send_all(struct bonding *bond, struct slave *slave)
>> 						  vlan_iter) {
>> 			if (!is_vlan_dev(vlan_upper))
>> 				continue;
>> +
>> +			if (vlan_upper == rt->dst.dev) {
>> +				outer.vlan_proto = vlan_dev_vlan_proto(vlan_upper);
>> +				outer.vlan_id = vlan_dev_vlan_id(vlan_upper);
>> +				rcu_read_unlock();
>> +				goto found;
>> +			}
>> 			netdev_for_each_all_upper_dev_rcu(vlan_upper, upper,
>> 							  iter) {
>> 				if (upper == rt->dst.dev) {
>> -					vlan_id = vlan_dev_vlan_id(vlan_upper);
>> +					/* If the upper dev is a vlan dev too,
>> +					 *  set the vlan tag to inner tag.
>> +					 */
>> +					if (is_vlan_dev(upper)) {
>> +						inner.vlan_proto = vlan_dev_vlan_proto(upper);
>> +						inner.vlan_id = vlan_dev_vlan_id(upper);
>> +					}
>> +					outer.vlan_proto = vlan_dev_vlan_proto(vlan_upper);
>> +					outer.vlan_id = vlan_dev_vlan_id(vlan_upper);
>> 					rcu_read_unlock();
>> 					goto found;
>> 				}
>> @@ -2208,10 +2241,6 @@ static void bond_arp_send_all(struct bonding *bond, struct slave *slave)
>> 		 */
>> 		netdev_for_each_all_upper_dev_rcu(bond->dev, upper, iter) {
>> 			if (upper == rt->dst.dev) {
>> -				/* if it's a vlan - get its VID */
>> -				if (is_vlan_dev(upper))
>> -					vlan_id = vlan_dev_vlan_id(upper);
>> -
>> 				rcu_read_unlock();
>> 				goto found;
>> 			}
>> @@ -2230,7 +2259,7 @@ found:
>> 		addr = bond_confirm_addr(rt->dst.dev, targets[i], 0);
>> 		ip_rt_put(rt);
>> 		bond_arp_send(slave->dev, ARPOP_REQUEST, targets[i],
>> -			      addr, vlan_id);
>> +			      addr, &inner, &outer);
>> 	}
>> }
>>
>> diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h
>> index 0896f1d..b8bdd0a 100644
>> --- a/drivers/net/bonding/bonding.h
>> +++ b/drivers/net/bonding/bonding.h
>> @@ -266,6 +266,11 @@ struct bonding {
>> #define bond_slave_get_rtnl(dev) \
>> 	((struct slave *) rtnl_dereference(dev->rx_handler_data))
>>
>> +struct bond_vlan_tag {
>> +	__be16		vlan_proto;
>> +	unsigned short	vlan_id;
>> +};
>> +
>> /**
>>  * Returns NULL if the net_device does not belong to any of the bond's slaves
>>  *
>> -- 
>> 1.8.0
> 
> ---
> 	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com
> 
> 
> .
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next 2/2] bonding: support QinQ for bond arp interval
  2014-03-17 17:46   ` Jay Vosburgh
  2014-03-18  1:59     ` Ding Tianhong
@ 2014-03-20  8:24     ` Michal Kubecek
  2014-03-20  9:29       ` Ding Tianhong
  1 sibling, 1 reply; 9+ messages in thread
From: Michal Kubecek @ 2014-03-20  8:24 UTC (permalink / raw)
  To: Jay Vosburgh; +Cc: Ding Tianhong, vfalico, andy, kaber, davem, netdev

On Mon, Mar 17, 2014 at 10:46:04AM -0700, Jay Vosburgh wrote:
> Ding Tianhong <dingtianhong@huawei.com> wrote:
> >
> >ip link add link bond0  bond0.20 type vlan proto 802.1ad id 20
> >ip link add link bond0.20  bond0.20.200 type vlan proto 802.1q id 200
> 
> 	Is this nesting backwards?  The way I read it (and the way I
> recall that VLANs nest), "bond0.20" is the "regular" VLAN, i.e., if we
> just have bond0.20 it would be a standard 802.1q (ethertype 0x8100)
> VLAN.  Adding the second VLAN, .200 in this example, would be the second
> (outer) tag, and would be the 802.1ad (ethertype 0x88a8) tag.
> 
> 	In other words, adding a VLAN to an already existing VLAN makes
> the newly added VLAN the "outer" and the already existing VLAN the
> "inner."  Am I confused?

I don't think so. My understanding is that when sending a packet to
a vlan device, it is tagged (according to its "proto") and passed to its
underlying device.

So in the case above, sending a packet to bond0.20.200 would add an
802.1q tag and pass the result to bond0.20 which would add an 802.1ad
tag and pass the result to bond0. Which is the way it should work.

                                                       Michal Kubecek

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next 2/2] bonding: support QinQ for bond arp interval
  2014-03-20  8:24     ` Michal Kubecek
@ 2014-03-20  9:29       ` Ding Tianhong
  2014-03-20  9:40         ` Michal Kubecek
  0 siblings, 1 reply; 9+ messages in thread
From: Ding Tianhong @ 2014-03-20  9:29 UTC (permalink / raw)
  To: Michal Kubecek, Jay Vosburgh; +Cc: vfalico, andy, kaber, davem, netdev

On 2014/3/20 16:24, Michal Kubecek wrote:
> On Mon, Mar 17, 2014 at 10:46:04AM -0700, Jay Vosburgh wrote:
>> Ding Tianhong <dingtianhong@huawei.com> wrote:
>>>
>>> ip link add link bond0  bond0.20 type vlan proto 802.1ad id 20
>>> ip link add link bond0.20  bond0.20.200 type vlan proto 802.1q id 200
>>
>> 	Is this nesting backwards?  The way I read it (and the way I
>> recall that VLANs nest), "bond0.20" is the "regular" VLAN, i.e., if we
>> just have bond0.20 it would be a standard 802.1q (ethertype 0x8100)
>> VLAN.  Adding the second VLAN, .200 in this example, would be the second
>> (outer) tag, and would be the 802.1ad (ethertype 0x88a8) tag.
>>
>> 	In other words, adding a VLAN to an already existing VLAN makes
>> the newly added VLAN the "outer" and the already existing VLAN the
>> "inner."  Am I confused?
> 
> I don't think so. My understanding is that when sending a packet to
> a vlan device, it is tagged (according to its "proto") and passed to its
> underlying device.
> 
> So in the case above, sending a packet to bond0.20.200 would add an
> 802.1q tag and pass the result to bond0.20 which would add an 802.1ad
> tag and pass the result to bond0. Which is the way it should work.
> 
>                                                        Michal Kubecek

Hi Michal:

I agree with your analysis of QinQ for vlan in common usage, but I think you miss
something for how the arp interval works, the bonding need to create a new
arp request with vlan tag to confirm that the slave should be active or unactive,  
the skb could not be passed to bond0.20.200, we have to build QinQ skb ourselves.

Ding

> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next 2/2] bonding: support QinQ for bond arp interval
  2014-03-20  9:29       ` Ding Tianhong
@ 2014-03-20  9:40         ` Michal Kubecek
  2014-03-20 10:21           ` Ding Tianhong
  0 siblings, 1 reply; 9+ messages in thread
From: Michal Kubecek @ 2014-03-20  9:40 UTC (permalink / raw)
  To: Ding Tianhong; +Cc: Jay Vosburgh, vfalico, andy, kaber, davem, netdev

On Thu, Mar 20, 2014 at 05:29:39PM +0800, Ding Tianhong wrote:
> On 2014/3/20 16:24, Michal Kubecek wrote:
> > On Mon, Mar 17, 2014 at 10:46:04AM -0700, Jay Vosburgh wrote:
> >> Ding Tianhong <dingtianhong@huawei.com> wrote:
> >>>
> >>> ip link add link bond0  bond0.20 type vlan proto 802.1ad id 20
> >>> ip link add link bond0.20  bond0.20.200 type vlan proto 802.1q id 200
> >>
> >> 	Is this nesting backwards?  The way I read it (and the way I
> >> recall that VLANs nest), "bond0.20" is the "regular" VLAN, i.e., if we
> >> just have bond0.20 it would be a standard 802.1q (ethertype 0x8100)
> >> VLAN.  Adding the second VLAN, .200 in this example, would be the second
> >> (outer) tag, and would be the 802.1ad (ethertype 0x88a8) tag.
> >>
> >> 	In other words, adding a VLAN to an already existing VLAN makes
> >> the newly added VLAN the "outer" and the already existing VLAN the
> >> "inner."  Am I confused?
> > 
> > I don't think so. My understanding is that when sending a packet to
> > a vlan device, it is tagged (according to its "proto") and passed to its
> > underlying device.
> > 
> > So in the case above, sending a packet to bond0.20.200 would add an
> > 802.1q tag and pass the result to bond0.20 which would add an 802.1ad
> > tag and pass the result to bond0. Which is the way it should work.
> 
> I agree with your analysis of QinQ for vlan in common usage, but I think you miss
> something for how the arp interval works, the bonding need to create a new
> arp request with vlan tag to confirm that the slave should be active or unactive,  
> the skb could not be passed to bond0.20.200, we have to build QinQ skb ourselves.

My comment referred only to the way stacked interfaces are set up in the
quoted example, not to the original issue.

                                                         Michal Kubecek

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next 2/2] bonding: support QinQ for bond arp interval
  2014-03-20  9:40         ` Michal Kubecek
@ 2014-03-20 10:21           ` Ding Tianhong
  0 siblings, 0 replies; 9+ messages in thread
From: Ding Tianhong @ 2014-03-20 10:21 UTC (permalink / raw)
  To: Michal Kubecek; +Cc: Jay Vosburgh, vfalico, andy, kaber, davem, netdev

On 2014/3/20 17:40, Michal Kubecek wrote:
> On Thu, Mar 20, 2014 at 05:29:39PM +0800, Ding Tianhong wrote:
>> On 2014/3/20 16:24, Michal Kubecek wrote:
>>> On Mon, Mar 17, 2014 at 10:46:04AM -0700, Jay Vosburgh wrote:
>>>> Ding Tianhong <dingtianhong@huawei.com> wrote:
>>>>>
>>>>> ip link add link bond0  bond0.20 type vlan proto 802.1ad id 20
>>>>> ip link add link bond0.20  bond0.20.200 type vlan proto 802.1q id 200
>>>>
>>>> 	Is this nesting backwards?  The way I read it (and the way I
>>>> recall that VLANs nest), "bond0.20" is the "regular" VLAN, i.e., if we
>>>> just have bond0.20 it would be a standard 802.1q (ethertype 0x8100)
>>>> VLAN.  Adding the second VLAN, .200 in this example, would be the second
>>>> (outer) tag, and would be the 802.1ad (ethertype 0x88a8) tag.
>>>>
>>>> 	In other words, adding a VLAN to an already existing VLAN makes
>>>> the newly added VLAN the "outer" and the already existing VLAN the
>>>> "inner."  Am I confused?
>>>
>>> I don't think so. My understanding is that when sending a packet to
>>> a vlan device, it is tagged (according to its "proto") and passed to its
>>> underlying device.
>>>
>>> So in the case above, sending a packet to bond0.20.200 would add an
>>> 802.1q tag and pass the result to bond0.20 which would add an 802.1ad
>>> tag and pass the result to bond0. Which is the way it should work.
>>
>> I agree with your analysis of QinQ for vlan in common usage, but I think you miss
>> something for how the arp interval works, the bonding need to create a new
>> arp request with vlan tag to confirm that the slave should be active or unactive,  
>> the skb could not be passed to bond0.20.200, we have to build QinQ skb ourselves.
> 
> My comment referred only to the way stacked interfaces are set up in the
> quoted example, not to the original issue.
> 
>                                                          Michal Kubecek
> 

Sorry for that, I misunderstood.:)

Ding

> 
> .
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2014-03-20 10:23 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-03-17 12:06 [PATCH net-next 0/2] bonding: support QinQ for bond arp interval Ding Tianhong
2014-03-17 12:06 ` [PATCH net-next 1/2] vlan: make a new function vlan_dev_vlan_proto() and export Ding Tianhong
2014-03-17 12:06 ` [PATCH net-next 2/2] bonding: support QinQ for bond arp interval Ding Tianhong
2014-03-17 17:46   ` Jay Vosburgh
2014-03-18  1:59     ` Ding Tianhong
2014-03-20  8:24     ` Michal Kubecek
2014-03-20  9:29       ` Ding Tianhong
2014-03-20  9:40         ` Michal Kubecek
2014-03-20 10:21           ` Ding Tianhong

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).