[PATCH net-next-2.6] bonding: allow arp_ip_targets to be on a separate vlan from bond device

netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH net-next-2.6] bonding: allow arp_ip_targets to be on a separate vlan from bond device
@ 2009-11-30 20:14 Andy Gospodarek
  2009-11-30 20:22 ` Patrick McHardy
                   ` (2 more replies)
  0 siblings, 3 replies; 25+ messages in thread
From: Andy Gospodarek @ 2009-11-30 20:14 UTC (permalink / raw)
  To: netdev, fubar


This allows a bond device to specify an arp_ip_target as a host that is
not on the same vlan as the base bond device.  A configuration like
this, now works:

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond0 qlen 1000
    link/ether 00:13:21:be:33:e9 brd ff:ff:ff:ff:ff:ff
3: eth0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond0 qlen 1000
    link/ether 00:13:21:be:33:e9 brd ff:ff:ff:ff:ff:ff
8: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue
    link/ether 00:13:21:be:33:e9 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::213:21ff:febe:33e9/64 scope link
       valid_lft forever preferred_lft forever
9: bond0.100@bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue
    link/ether 00:13:21:be:33:e9 brd ff:ff:ff:ff:ff:ff
    inet 10.0.100.2/24 brd 10.0.100.255 scope global bond0.100
    inet6 fe80::213:21ff:febe:33e9/64 scope link
       valid_lft forever preferred_lft forever

Ethernet Channel Bonding Driver: v3.5.0 (November 4, 2008)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: eth1
MII Status: up
MII Polling Interval (ms): 0
Up Delay (ms): 0
Down Delay (ms): 0
ARP Polling Interval (ms): 1000
ARP IP target/s (n.n.n.n form): 10.0.100.1

Slave Interface: eth1
MII Status: up
Link Failure Count: 1
Permanent HW addr: 00:40:05:30:ff:30

Slave Interface: eth0
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:13:21:be:33:e9

I've tested this on 2.6.31 with devices and support VLAN acceleration
and those that do not as well as a backport on 2.6.18 with success.

Signed-off-by: Andy Gospodarek <andy@greyhouse.net>

---
 drivers/net/bonding/bond_main.c |    3 +++
 net/8021q/vlan_core.c           |    2 ++
 net/core/dev.c                  |    6 +++---
 3 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 726bd75..aee8973 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2691,6 +2691,9 @@ static int bond_arp_rcv(struct sk_buff *skb, struct net_device *dev, struct pack
 	unsigned char *arp_ptr;
 	__be32 sip, tip;
 
+	while (dev->priv_flags & IFF_802_1Q_VLAN)
+		dev = vlan_dev_real_dev(dev);
+
 	if (!(dev->priv_flags & IFF_BONDING) || !(dev->flags & IFF_MASTER))
 		goto out;
 
diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c
index e75a2f3..8d8a778 100644
--- a/net/8021q/vlan_core.c
+++ b/net/8021q/vlan_core.c
@@ -14,6 +14,7 @@ int __vlan_hwaccel_rx(struct sk_buff *skb, struct vlan_group *grp,
 	if (skb_bond_should_drop(skb))
 		goto drop;
 
+	skb->skb_iif = skb->dev->ifindex;
 	__vlan_hwaccel_put_tag(skb, vlan_tci);
 	skb->dev = vlan_group_get_device(grp, vlan_tci & VLAN_VID_MASK);
 
@@ -85,6 +86,7 @@ vlan_gro_common(struct napi_struct *napi, struct vlan_group *grp,
 	if (skb_bond_should_drop(skb))
 		goto drop;
 
+	skb->skb_iif = skb->dev->ifindex;
 	__vlan_hwaccel_put_tag(skb, vlan_tci);
 	skb->dev = vlan_group_get_device(grp, vlan_tci & VLAN_VID_MASK);
 
diff --git a/net/core/dev.c b/net/core/dev.c
index 5d131c2..eeee269 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2439,8 +2439,8 @@ int netif_receive_skb(struct sk_buff *skb)
 		skb->skb_iif = skb->dev->ifindex;
 
 	null_or_orig = NULL;
-	orig_dev = skb->dev;
-	if (orig_dev->master) {
+	orig_dev = __dev_get_by_index(dev_net(skb->dev),skb->skb_iif);
+	if (orig_dev->master && !(skb->dev->priv_flags & IFF_802_1Q_VLAN)) {
 		if (skb_bond_should_drop(skb))
 			null_or_orig = orig_dev; /* deliver only exact match */
 		else
@@ -2492,7 +2492,7 @@ ncls:
 			&ptype_base[ntohs(type) & PTYPE_HASH_MASK], list) {
 		if (ptype->type == type &&
 		    (ptype->dev == null_or_orig || ptype->dev == skb->dev ||
-		     ptype->dev == orig_dev)) {
+		     ptype->dev == orig_dev || ptype->dev == orig_dev->master)) {
 			if (pt_prev)
 				ret = deliver_skb(skb, pt_prev, orig_dev);
 			pt_prev = ptype;
-- 
1.6.2.5


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH net-next-2.6] bonding: allow arp_ip_targets to be on a separate vlan from bond device
  2009-11-30 20:14 [PATCH net-next-2.6] bonding: allow arp_ip_targets to be on a separate vlan from bond device Andy Gospodarek
@ 2009-11-30 20:22 ` Patrick McHardy
  2009-11-30 20:53   ` Andy Gospodarek
  2009-12-01  0:00 ` Jay Vosburgh
  2009-12-02  5:13 ` Eric Dumazet
  2 siblings, 1 reply; 25+ messages in thread
From: Patrick McHardy @ 2009-11-30 20:22 UTC (permalink / raw)
  To: Andy Gospodarek; +Cc: netdev, fubar

Andy Gospodarek wrote:
> diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c
> index e75a2f3..8d8a778 100644
> --- a/net/8021q/vlan_core.c
> +++ b/net/8021q/vlan_core.c
> @@ -14,6 +14,7 @@ int __vlan_hwaccel_rx(struct sk_buff *skb, struct vlan_group *grp,
>  	if (skb_bond_should_drop(skb))
>  		goto drop;
>  
> +	skb->skb_iif = skb->dev->ifindex;
>  	__vlan_hwaccel_put_tag(skb, vlan_tci);
>  	skb->dev = vlan_group_get_device(grp, vlan_tci & VLAN_VID_MASK);
>  
> @@ -85,6 +86,7 @@ vlan_gro_common(struct napi_struct *napi, struct vlan_group *grp,
>  	if (skb_bond_should_drop(skb))
>  		goto drop;
>  
> +	skb->skb_iif = skb->dev->ifindex;
>  	__vlan_hwaccel_put_tag(skb, vlan_tci);
>  	skb->dev = vlan_group_get_device(grp, vlan_tci & VLAN_VID_MASK);
>  

How about pulling the skb->iif assignment in netif_receive_skb() up
before the vlan_hwaccel_do_receive() call instead? I'd actually call
this a bug fix since hardware accelerated devices should not differ
from non-accelerated devices in their iif value.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH net-next-2.6] bonding: allow arp_ip_targets to be on a separate vlan from bond device
  2009-11-30 20:22 ` Patrick McHardy
@ 2009-11-30 20:53   ` Andy Gospodarek
  2009-12-01  9:42     ` Patrick McHardy
  0 siblings, 1 reply; 25+ messages in thread
From: Andy Gospodarek @ 2009-11-30 20:53 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Andy Gospodarek, netdev, fubar

On Mon, Nov 30, 2009 at 09:22:33PM +0100, Patrick McHardy wrote:
> Andy Gospodarek wrote:
> > diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c
> > index e75a2f3..8d8a778 100644
> > --- a/net/8021q/vlan_core.c
> > +++ b/net/8021q/vlan_core.c
> > @@ -14,6 +14,7 @@ int __vlan_hwaccel_rx(struct sk_buff *skb, struct vlan_group *grp,
> >  	if (skb_bond_should_drop(skb))
> >  		goto drop;
> >  
> > +	skb->skb_iif = skb->dev->ifindex;
> >  	__vlan_hwaccel_put_tag(skb, vlan_tci);
> >  	skb->dev = vlan_group_get_device(grp, vlan_tci & VLAN_VID_MASK);
> >  
> > @@ -85,6 +86,7 @@ vlan_gro_common(struct napi_struct *napi, struct vlan_group *grp,
> >  	if (skb_bond_should_drop(skb))
> >  		goto drop;
> >  
> > +	skb->skb_iif = skb->dev->ifindex;
> >  	__vlan_hwaccel_put_tag(skb, vlan_tci);
> >  	skb->dev = vlan_group_get_device(grp, vlan_tci & VLAN_VID_MASK);
> >  
> 
> How about pulling the skb->iif assignment in netif_receive_skb() up
> before the vlan_hwaccel_do_receive() call instead?

I'm not sure how that will help.  My goal with the two changes in
vlan_core.c was to capture the input device index before we lost it --
which is precisely what happens when coming up through a driver like
bnx2 that supports vlan acceleration:

(NAPI poll)
bnx2_poll
  bnx2_poll_work
    bnx2_rx_int
      vlan_hwaccel_receive_skb
        __vlan_hwaccel_rx
          (skb->dev set to vlan device)
          netif_receive_skb

In that case, setting skb_iif in netif_receive_skb would be too late.
The original dev would no longer be known.

> I'd actually call
> this a bug fix since hardware accelerated devices should not differ
> from non-accelerated devices in their iif value.

This dances the line between bugfix and feature since the arp monitoring
component was never really advertised as working, but you are correct
that it could be a bug since skb_iif should be consistent depending the
the devices capabilities.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH net-next-2.6] bonding: allow arp_ip_targets to be on a separate vlan from bond device
  2009-11-30 20:53   ` Andy Gospodarek
@ 2009-12-01  9:42     ` Patrick McHardy
  0 siblings, 0 replies; 25+ messages in thread
From: Patrick McHardy @ 2009-12-01  9:42 UTC (permalink / raw)
  To: Andy Gospodarek; +Cc: netdev, fubar

Andy Gospodarek wrote:
> On Mon, Nov 30, 2009 at 09:22:33PM +0100, Patrick McHardy wrote:
>> Andy Gospodarek wrote:
>>> diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c
>>> index e75a2f3..8d8a778 100644
>>> --- a/net/8021q/vlan_core.c
>>> +++ b/net/8021q/vlan_core.c
>>> @@ -14,6 +14,7 @@ int __vlan_hwaccel_rx(struct sk_buff *skb, struct vlan_group *grp,
>>>  	if (skb_bond_should_drop(skb))
>>>  		goto drop;
>>>  
>>> +	skb->skb_iif = skb->dev->ifindex;
>>>  	__vlan_hwaccel_put_tag(skb, vlan_tci);
>>>  	skb->dev = vlan_group_get_device(grp, vlan_tci & VLAN_VID_MASK);
>>>  
>>> @@ -85,6 +86,7 @@ vlan_gro_common(struct napi_struct *napi, struct vlan_group *grp,
>>>  	if (skb_bond_should_drop(skb))
>>>  		goto drop;
>>>  
>>> +	skb->skb_iif = skb->dev->ifindex;
>>>  	__vlan_hwaccel_put_tag(skb, vlan_tci);
>>>  	skb->dev = vlan_group_get_device(grp, vlan_tci & VLAN_VID_MASK);
>>>  
>> How about pulling the skb->iif assignment in netif_receive_skb() up
>> before the vlan_hwaccel_do_receive() call instead?
> 
> I'm not sure how that will help.  My goal with the two changes in
> vlan_core.c was to capture the input device index before we lost it --
> which is precisely what happens when coming up through a driver like
> bnx2 that supports vlan acceleration:
> 
> (NAPI poll)
> bnx2_poll
>   bnx2_poll_work
>     bnx2_rx_int
>       vlan_hwaccel_receive_skb
>         __vlan_hwaccel_rx
>           (skb->dev set to vlan device)
>           netif_receive_skb
> 
> In that case, setting skb_iif in netif_receive_skb would be too late.
> The original dev would no longer be known.

Right, I've misread your patch, sorry.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH net-next-2.6] bonding: allow arp_ip_targets to be on a separate vlan from bond device
  2009-11-30 20:14 [PATCH net-next-2.6] bonding: allow arp_ip_targets to be on a separate vlan from bond device Andy Gospodarek
  2009-11-30 20:22 ` Patrick McHardy
@ 2009-12-01  0:00 ` Jay Vosburgh
  2009-12-01  1:21   ` Andy Gospodarek
  2009-12-02  5:13 ` Eric Dumazet
  2 siblings, 1 reply; 25+ messages in thread
From: Jay Vosburgh @ 2009-12-01  0:00 UTC (permalink / raw)
  To: Andy Gospodarek; +Cc: netdev

Andy Gospodarek <andy@greyhouse.net> wrote:

>This allows a bond device to specify an arp_ip_target as a host that is
>not on the same vlan as the base bond device.  A configuration like
>this, now works:
>
>1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue
>    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
>    inet 127.0.0.1/8 scope host lo
>    inet6 ::1/128 scope host
>       valid_lft forever preferred_lft forever
>2: eth1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond0 qlen 1000
>    link/ether 00:13:21:be:33:e9 brd ff:ff:ff:ff:ff:ff
>3: eth0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond0 qlen 1000
>    link/ether 00:13:21:be:33:e9 brd ff:ff:ff:ff:ff:ff
>8: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue
>    link/ether 00:13:21:be:33:e9 brd ff:ff:ff:ff:ff:ff
>    inet6 fe80::213:21ff:febe:33e9/64 scope link
>       valid_lft forever preferred_lft forever
>9: bond0.100@bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue
>    link/ether 00:13:21:be:33:e9 brd ff:ff:ff:ff:ff:ff
>    inet 10.0.100.2/24 brd 10.0.100.255 scope global bond0.100
>    inet6 fe80::213:21ff:febe:33e9/64 scope link
>       valid_lft forever preferred_lft forever

	I'm not quite clear here on exactly what it is that doesn't
work.

	Putting the arp_ip_target on a VLAN destination already works
(and has for a long time); I just checked against a 2.6.32-rc to make
sure I wasn't misremembering.

	Perhaps there's some nuance of "not on the same vlan as the base
bond device" that I'm missing.  What I see working before me is, e.g., a
bond0.777 VLAN interface atop a regular bond0 active-backup with a
couple of slaves; bond0 may or may not have an IP address of its own.
The arp_ip_target destination is on VLAN 777 somewhere.

	Is this what your patch is meant to enable, or is it something
different?  I'm pulling down today's net-next to see if this is
something that broke recently.

	-J

>Ethernet Channel Bonding Driver: v3.5.0 (November 4, 2008)
>
>Bonding Mode: fault-tolerance (active-backup)
>Primary Slave: None
>Currently Active Slave: eth1
>MII Status: up
>MII Polling Interval (ms): 0
>Up Delay (ms): 0
>Down Delay (ms): 0
>ARP Polling Interval (ms): 1000
>ARP IP target/s (n.n.n.n form): 10.0.100.1
>
>Slave Interface: eth1
>MII Status: up
>Link Failure Count: 1
>Permanent HW addr: 00:40:05:30:ff:30
>
>Slave Interface: eth0
>MII Status: up
>Link Failure Count: 0
>Permanent HW addr: 00:13:21:be:33:e9
>
>I've tested this on 2.6.31 with devices and support VLAN acceleration
>and those that do not as well as a backport on 2.6.18 with success.
>
>Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
>
>---
> drivers/net/bonding/bond_main.c |    3 +++
> net/8021q/vlan_core.c           |    2 ++
> net/core/dev.c                  |    6 +++---
> 3 files changed, 8 insertions(+), 3 deletions(-)
>
>diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>index 726bd75..aee8973 100644
>--- a/drivers/net/bonding/bond_main.c
>+++ b/drivers/net/bonding/bond_main.c
>@@ -2691,6 +2691,9 @@ static int bond_arp_rcv(struct sk_buff *skb, struct net_device *dev, struct pack
> 	unsigned char *arp_ptr;
> 	__be32 sip, tip;
>
>+	while (dev->priv_flags & IFF_802_1Q_VLAN)
>+		dev = vlan_dev_real_dev(dev);
>+
> 	if (!(dev->priv_flags & IFF_BONDING) || !(dev->flags & IFF_MASTER))
> 		goto out;
>
>diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c
>index e75a2f3..8d8a778 100644
>--- a/net/8021q/vlan_core.c
>+++ b/net/8021q/vlan_core.c
>@@ -14,6 +14,7 @@ int __vlan_hwaccel_rx(struct sk_buff *skb, struct vlan_group *grp,
> 	if (skb_bond_should_drop(skb))
> 		goto drop;
>
>+	skb->skb_iif = skb->dev->ifindex;
> 	__vlan_hwaccel_put_tag(skb, vlan_tci);
> 	skb->dev = vlan_group_get_device(grp, vlan_tci & VLAN_VID_MASK);
>
>@@ -85,6 +86,7 @@ vlan_gro_common(struct napi_struct *napi, struct vlan_group *grp,
> 	if (skb_bond_should_drop(skb))
> 		goto drop;
>
>+	skb->skb_iif = skb->dev->ifindex;
> 	__vlan_hwaccel_put_tag(skb, vlan_tci);
> 	skb->dev = vlan_group_get_device(grp, vlan_tci & VLAN_VID_MASK);
>
>diff --git a/net/core/dev.c b/net/core/dev.c
>index 5d131c2..eeee269 100644
>--- a/net/core/dev.c
>+++ b/net/core/dev.c
>@@ -2439,8 +2439,8 @@ int netif_receive_skb(struct sk_buff *skb)
> 		skb->skb_iif = skb->dev->ifindex;
>
> 	null_or_orig = NULL;
>-	orig_dev = skb->dev;
>-	if (orig_dev->master) {
>+	orig_dev = __dev_get_by_index(dev_net(skb->dev),skb->skb_iif);
>+	if (orig_dev->master && !(skb->dev->priv_flags & IFF_802_1Q_VLAN)) {
> 		if (skb_bond_should_drop(skb))
> 			null_or_orig = orig_dev; /* deliver only exact match */
> 		else
>@@ -2492,7 +2492,7 @@ ncls:
> 			&ptype_base[ntohs(type) & PTYPE_HASH_MASK], list) {
> 		if (ptype->type == type &&
> 		    (ptype->dev == null_or_orig || ptype->dev == skb->dev ||
>-		     ptype->dev == orig_dev)) {
>+		     ptype->dev == orig_dev || ptype->dev == orig_dev->master)) {
> 			if (pt_prev)
> 				ret = deliver_skb(skb, pt_prev, orig_dev);
> 			pt_prev = ptype;
>-- 
>1.6.2.5
>

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH net-next-2.6] bonding: allow arp_ip_targets to be on a separate vlan from bond device
  2009-12-01  0:00 ` Jay Vosburgh
@ 2009-12-01  1:21   ` Andy Gospodarek
  2009-12-01  1:57     ` Jay Vosburgh
  2009-12-01  4:11     ` [PATCH net-next-2.6] bonding: allow arp_ip_targets to be on a separate vlan from bond device Eric Dumazet
  0 siblings, 2 replies; 25+ messages in thread
From: Andy Gospodarek @ 2009-12-01  1:21 UTC (permalink / raw)
  To: Jay Vosburgh; +Cc: netdev

On Mon, Nov 30, 2009 at 04:00:38PM -0800, Jay Vosburgh wrote:
> Andy Gospodarek <andy@greyhouse.net> wrote:
> 
> >This allows a bond device to specify an arp_ip_target as a host that is
> >not on the same vlan as the base bond device.  A configuration like
> >this, now works:
> >
> >1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue
> >    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
> >    inet 127.0.0.1/8 scope host lo
> >    inet6 ::1/128 scope host
> >       valid_lft forever preferred_lft forever
> >2: eth1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond0 qlen 1000
> >    link/ether 00:13:21:be:33:e9 brd ff:ff:ff:ff:ff:ff
> >3: eth0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond0 qlen 1000
> >    link/ether 00:13:21:be:33:e9 brd ff:ff:ff:ff:ff:ff
> >8: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue
> >    link/ether 00:13:21:be:33:e9 brd ff:ff:ff:ff:ff:ff
> >    inet6 fe80::213:21ff:febe:33e9/64 scope link
> >       valid_lft forever preferred_lft forever
> >9: bond0.100@bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue
> >    link/ether 00:13:21:be:33:e9 brd ff:ff:ff:ff:ff:ff
> >    inet 10.0.100.2/24 brd 10.0.100.255 scope global bond0.100
> >    inet6 fe80::213:21ff:febe:33e9/64 scope link
> >       valid_lft forever preferred_lft forever
> 
> 	I'm not quite clear here on exactly what it is that doesn't
> work.
> 
> 	Putting the arp_ip_target on a VLAN destination already works
> (and has for a long time); I just checked against a 2.6.32-rc to make
> sure I wasn't misremembering.
> 
> 	Perhaps there's some nuance of "not on the same vlan as the base
> bond device" that I'm missing.  What I see working before me is, e.g., a
> bond0.777 VLAN interface atop a regular bond0 active-backup with a
> couple of slaves; bond0 may or may not have an IP address of its own.
> The arp_ip_target destination is on VLAN 777 somewhere.

Do you have net.ipv4.conf.all.arp_ignore set to 0 and/or an IP address
assigned on bond0?  I can easily reproduce this with no IP on bond0 and
net.ipv4.conf.all.arp_ignore = 1.

I can't say for sure that the sysctl setting makes a difference, but I
have that on all my test rigs, so it's worth mentioning.

> 	Is this what your patch is meant to enable, or is it something
> different?  I'm pulling down today's net-next to see if this is
> something that broke recently.
> 

I first tested and found the problem while running 2.6.30-rc series
after it was reported to be a problem on RHEL5.  It's not clear how long
it has been broken, but this situation is odd enough that it probably
never worked as it was never tested.


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH net-next-2.6] bonding: allow arp_ip_targets to be on a separate vlan from bond device
  2009-12-01  1:21   ` Andy Gospodarek
@ 2009-12-01  1:57     ` Jay Vosburgh
  2009-12-01 14:44       ` Andy Gospodarek
  2009-12-01  4:11     ` [PATCH net-next-2.6] bonding: allow arp_ip_targets to be on a separate vlan from bond device Eric Dumazet
  1 sibling, 1 reply; 25+ messages in thread
From: Jay Vosburgh @ 2009-12-01  1:57 UTC (permalink / raw)
  To: Andy Gospodarek; +Cc: netdev

Andy Gospodarek <andy@greyhouse.net> wrote:

>On Mon, Nov 30, 2009 at 04:00:38PM -0800, Jay Vosburgh wrote:
>> Andy Gospodarek <andy@greyhouse.net> wrote:
>> 
>> >This allows a bond device to specify an arp_ip_target as a host that is
>> >not on the same vlan as the base bond device.  A configuration like
>> >this, now works:
>> >
>> >1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue
>> >    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
>> >    inet 127.0.0.1/8 scope host lo
>> >    inet6 ::1/128 scope host
>> >       valid_lft forever preferred_lft forever
>> >2: eth1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond0 qlen 1000
>> >    link/ether 00:13:21:be:33:e9 brd ff:ff:ff:ff:ff:ff
>> >3: eth0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond0 qlen 1000
>> >    link/ether 00:13:21:be:33:e9 brd ff:ff:ff:ff:ff:ff
>> >8: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue
>> >    link/ether 00:13:21:be:33:e9 brd ff:ff:ff:ff:ff:ff
>> >    inet6 fe80::213:21ff:febe:33e9/64 scope link
>> >       valid_lft forever preferred_lft forever
>> >9: bond0.100@bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue
>> >    link/ether 00:13:21:be:33:e9 brd ff:ff:ff:ff:ff:ff
>> >    inet 10.0.100.2/24 brd 10.0.100.255 scope global bond0.100
>> >    inet6 fe80::213:21ff:febe:33e9/64 scope link
>> >       valid_lft forever preferred_lft forever
>> 
>> 	I'm not quite clear here on exactly what it is that doesn't
>> work.
>> 
>> 	Putting the arp_ip_target on a VLAN destination already works
>> (and has for a long time); I just checked against a 2.6.32-rc to make
>> sure I wasn't misremembering.
>> 
>> 	Perhaps there's some nuance of "not on the same vlan as the base
>> bond device" that I'm missing.  What I see working before me is, e.g., a
>> bond0.777 VLAN interface atop a regular bond0 active-backup with a
>> couple of slaves; bond0 may or may not have an IP address of its own.
>> The arp_ip_target destination is on VLAN 777 somewhere.
>
>Do you have net.ipv4.conf.all.arp_ignore set to 0 and/or an IP address
>assigned on bond0?  I can easily reproduce this with no IP on bond0 and
>net.ipv4.conf.all.arp_ignore = 1.
>
>I can't say for sure that the sysctl setting makes a difference, but I
>have that on all my test rigs, so it's worth mentioning.
>
>> 	Is this what your patch is meant to enable, or is it something
>> different?  I'm pulling down today's net-next to see if this is
>> something that broke recently.
>> 
>
>I first tested and found the problem while running 2.6.30-rc series
>after it was reported to be a problem on RHEL5.  It's not clear how long
>it has been broken, but this situation is odd enough that it probably
>never worked as it was never tested.

	I tried it with both arp_ignore set to 1 and 0, and with the
bond0 interface with and without an IP address.  It works fine in all
four cases.  I'm using net-next-2.6 pulled earlier today; it claims to
be 2.6.32-rc7.

	I've tested "ARP monitor over VLAN" in the past, so it's worked
for me before.  Heck, it's working right now.

	I thought maybe you have "arp_validate" enabled (which doesn't
work over a VLAN), but your patch doesn't help there, so presumably not.
Fixing that is a totally separate adventure into hook-ville; I'd briefly
hoped you'd found a better way.

	When it's failing, are you getting any messages in dmesg?  I'm
wondering specifically about any of the various routing-related things
that bond_arp_send_all might kick out.

	Do you have any of the other arp_whatever sysctls enabled?  My
configuration has them all set to zero; maybe one of those is messing
something up.

	-J

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH net-next-2.6] bonding: allow arp_ip_targets to be on a separate vlan from bond device
  2009-12-01  1:57     ` Jay Vosburgh
@ 2009-12-01 14:44       ` Andy Gospodarek
  2009-12-01 21:28         ` Jay Vosburgh
  0 siblings, 1 reply; 25+ messages in thread
From: Andy Gospodarek @ 2009-12-01 14:44 UTC (permalink / raw)
  To: Jay Vosburgh; +Cc: netdev

On Mon, Nov 30, 2009 at 05:57:15PM -0800, Jay Vosburgh wrote:
> Andy Gospodarek <andy@greyhouse.net> wrote:
> 
> >On Mon, Nov 30, 2009 at 04:00:38PM -0800, Jay Vosburgh wrote:
> >> Andy Gospodarek <andy@greyhouse.net> wrote:
> >> 
> >> >This allows a bond device to specify an arp_ip_target as a host that is
> >> >not on the same vlan as the base bond device.  A configuration like
> >> >this, now works:
> >> >
> >> >1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue
> >> >    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
> >> >    inet 127.0.0.1/8 scope host lo
> >> >    inet6 ::1/128 scope host
> >> >       valid_lft forever preferred_lft forever
> >> >2: eth1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond0 qlen 1000
> >> >    link/ether 00:13:21:be:33:e9 brd ff:ff:ff:ff:ff:ff
> >> >3: eth0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond0 qlen 1000
> >> >    link/ether 00:13:21:be:33:e9 brd ff:ff:ff:ff:ff:ff
> >> >8: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue
> >> >    link/ether 00:13:21:be:33:e9 brd ff:ff:ff:ff:ff:ff
> >> >    inet6 fe80::213:21ff:febe:33e9/64 scope link
> >> >       valid_lft forever preferred_lft forever
> >> >9: bond0.100@bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue
> >> >    link/ether 00:13:21:be:33:e9 brd ff:ff:ff:ff:ff:ff
> >> >    inet 10.0.100.2/24 brd 10.0.100.255 scope global bond0.100
> >> >    inet6 fe80::213:21ff:febe:33e9/64 scope link
> >> >       valid_lft forever preferred_lft forever
> >> 
> >> 	I'm not quite clear here on exactly what it is that doesn't
> >> work.
> >> 
> >> 	Putting the arp_ip_target on a VLAN destination already works
> >> (and has for a long time); I just checked against a 2.6.32-rc to make
> >> sure I wasn't misremembering.
> >> 
> >> 	Perhaps there's some nuance of "not on the same vlan as the base
> >> bond device" that I'm missing.  What I see working before me is, e.g., a
> >> bond0.777 VLAN interface atop a regular bond0 active-backup with a
> >> couple of slaves; bond0 may or may not have an IP address of its own.
> >> The arp_ip_target destination is on VLAN 777 somewhere.
> >
> >Do you have net.ipv4.conf.all.arp_ignore set to 0 and/or an IP address
> >assigned on bond0?  I can easily reproduce this with no IP on bond0 and
> >net.ipv4.conf.all.arp_ignore = 1.
> >
> >I can't say for sure that the sysctl setting makes a difference, but I
> >have that on all my test rigs, so it's worth mentioning.
> >
> >> 	Is this what your patch is meant to enable, or is it something
> >> different?  I'm pulling down today's net-next to see if this is
> >> something that broke recently.
> >> 
> >
> >I first tested and found the problem while running 2.6.30-rc series
> >after it was reported to be a problem on RHEL5.  It's not clear how long
> >it has been broken, but this situation is odd enough that it probably
> >never worked as it was never tested.
> 
> 	I tried it with both arp_ignore set to 1 and 0, and with the
> bond0 interface with and without an IP address.  It works fine in all
> four cases.  I'm using net-next-2.6 pulled earlier today; it claims to
> be 2.6.32-rc7.
> 
> 	I've tested "ARP monitor over VLAN" in the past, so it's worked
> for me before.  Heck, it's working right now.
> 
> 	I thought maybe you have "arp_validate" enabled (which doesn't
> work over a VLAN), but your patch doesn't help there, so presumably not.
> Fixing that is a totally separate adventure into hook-ville; I'd briefly
> hoped you'd found a better way.
> 

I am using arp_validate, actually.  I forgot that the arp_validate
option doesn't show up in the output of /proc/net/bonding/bondX and I
intended to have that in the subject, but somehow dropped it.

Here is a bit more detail on my setup:

# grep -v ^# /etc/sysconfig/network-scripts/ifcfg-bond0 
DEVICE=bond0
BOOTPROTO=none
ONBOOT=no
BONDING_OPTS="mode=active-backup arp_interval=1000 arp_ip_target=10.0.100.1 arp_validate=3"

(The 'active' and 'backup' arp_validate options work just as well as
'all.')

Here is the dmesg output for the above config:

bonding: bond0: Warning: failed to get speed and duplex from eth3, assumed to be 100Mb/sec and Full.
bonding: bond0: enslaving eth3 as a backup interface with an up link.
bonding: bond0: setting mode to active-backup (1).
bonding: bond0: Setting ARP monitoring interval to 1000.
bonding: bond0: ARP target 10.0.100.1 is already present
bonding: bond0: setting arp_validate to all (3).
bnx2: eth2 NIC Copper Link is Up, 100 Mbps full duplex, receive & transmit flow control ON
bnx2: eth3 NIC Copper Link is Up, 100 Mbps full duplex, receive & transmit flow control ON
bonding: bond0: link status definitely down for interface eth2, disabling it
bonding: bond0: making interface eth3 the new active one.
bonding: bond0: no route to arp_ip_target 10.0.100.1
bonding: bond0: link status definitely up for interface eth2.
bond0: no IPv6 routers present
bond0.100: no IPv6 routers present

# cat /proc/net/bonding/bond0 
Ethernet Channel Bonding Driver: v3.5.0 (November 4, 2008)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: eth3
MII Status: up
MII Polling Interval (ms): 0
Up Delay (ms): 0
Down Delay (ms): 0
ARP Polling Interval (ms): 1000
ARP IP target/s (n.n.n.n form): 10.0.100.1

Slave Interface: eth2
MII Status: up
Link Failure Count: 1
Permanent HW addr: 00:10:18:36:0a:d4

Slave Interface: eth3
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:10:18:36:0a:d6

> 	When it's failing, are you getting any messages in dmesg?  I'm
> wondering specifically about any of the various routing-related things
> that bond_arp_send_all might kick out.
> 

When it doesn't work, dmesg just looks like this:

bonding: bond0: Warning: failed to get speed and duplex from eth3, assumed to be 100Mb/sec and Full.
bonding: bond0: enslaving eth3 as a backup interface with an up link.
bonding: bond0: setting mode to active-backup (1).
bonding: bond0: Setting ARP monitoring interval to 1000.
bonding: bond0: ARP target 10.0.100.1 is already present
bonding: bond0: setting arp_validate to all (3).
bnx2: eth2 NIC Copper Link is Up, 100 Mbps full duplex, receive & transmit flow control ON
bnx2: eth3 NIC Copper Link is Up, 100 Mbps full duplex, receive & transmit flow control ON
bonding: bond0: link status definitely down for interface eth2, disabling it
bonding: bond0: making interface eth3 the new active one.
bonding: bond0: no route to arp_ip_target 10.0.100.1
bonding: bond0: link status definitely down for interface eth3, disabling it
bonding: bond0: now running without any active interface !
bond0: no IPv6 routers present
bond0.100: no IPv6 routers present

# cat /proc/net/bonding/bond0 
Ethernet Channel Bonding Driver: v3.5.0 (November 4, 2008)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: None
MII Status: down
MII Polling Interval (ms): 0
Up Delay (ms): 0
Down Delay (ms): 0
ARP Polling Interval (ms): 1000
ARP IP target/s (n.n.n.n form): 10.0.100.1

Slave Interface: eth2
MII Status: down
Link Failure Count: 1
Permanent HW addr: 00:10:18:36:0a:d4

Slave Interface: eth3
MII Status: down
Link Failure Count: 1
Permanent HW addr: 00:10:18:36:0a:d6

Though I think that information won't be that useful now that I've
actually explained this patch makes the arp_validate options work with a
vlan setup like this.


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH net-next-2.6] bonding: allow arp_ip_targets to be on a separate vlan from bond device
  2009-12-01 14:44       ` Andy Gospodarek
@ 2009-12-01 21:28         ` Jay Vosburgh
  2009-12-01 23:07           ` Andy Gospodarek
  2009-12-02 21:24           ` Andy Gospodarek
  0 siblings, 2 replies; 25+ messages in thread
From: Jay Vosburgh @ 2009-12-01 21:28 UTC (permalink / raw)
  To: Andy Gospodarek; +Cc: netdev

Andy Gospodarek <andy@greyhouse.net> wrote:
[...]
>I am using arp_validate, actually.  I forgot that the arp_validate
>option doesn't show up in the output of /proc/net/bonding/bondX and I
>intended to have that in the subject, but somehow dropped it.

	Ok, I was doing it wrong earlier; it works with arp_validate.
I'm seeing one problem with tcpdump, though, which I'll get to in a
minute.

	Could you update the summary / changelog message to mention that
this patch fixes the specific case of arp_validate + arp_ip_target on
VLAN?

	Second, in regards to this:

--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2439,8 +2439,8 @@ int netif_receive_skb(struct sk_buff *skb)
 		skb->skb_iif = skb->dev->ifindex;

 	null_or_orig = NULL;
-	orig_dev = skb->dev;
-	if (orig_dev->master) {
+	orig_dev = __dev_get_by_index(dev_net(skb->dev),skb->skb_iif);
+	if (orig_dev->master && !(skb->dev->priv_flags & IFF_802_1Q_VLAN)) {
 		if (skb_bond_should_drop(skb))
 			null_or_orig = orig_dev; /* deliver only exact match */
 		else

	Would it be useful to add a comment to the effect that VLAN
packets are run through skb_bond_should_drop at the VLAN layer?

	Lastly, in regards to this:

@@ -2492,7 +2492,7 @@ ncls:
 			&ptype_base[ntohs(type) & PTYPE_HASH_MASK], list) {
 		if (ptype->type == type &&
 		    (ptype->dev == null_or_orig || ptype->dev == skb->dev ||
-		     ptype->dev == orig_dev)) {
+		     ptype->dev == orig_dev || ptype->dev == orig_dev->master)) {
 			if (pt_prev)
 				ret = deliver_skb(skb, pt_prev, orig_dev);
 			pt_prev = ptype;

	This is presumably here because orig_dev will now be the actual
slave the packet arrived on, but we want to additionally deliver to the
master, correct?

	Lastly, tcpdump.

	This patch appears to affect what traffic tcpdump of a slave or
the bonding master itself will capture.  Previously, tcpdump of the
active slave would see only the transmitted packets sent over the bond,
and tcpdump of the inactive slave would see incoming Ethernet-layer
multicast or broadcasts sent to its switch port.  Tcpdump on the master
would see all sent and non-VLAN received traffic, and tcpdump of the
VLAN interface over the master would see just the VLAN traffic.

	After this change, tcpdump of the active slave or of the bonding
master (bond0) sees both sent and received traffic for the VLAN, but
nothing for the non-VLAN traffic other than incoming broadcast /
multicasts.  This holds true whether or not a VLAN is configured.

	I added a "ptype->dev == orig_dev->master" test to the ptype_all
receive block in netif_receive_skb, but it didn't help.  At the moment,
I'm not exactly sure why tcpdump breaks.

	-J

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH net-next-2.6] bonding: allow arp_ip_targets to be on a separate vlan from bond device
  2009-12-01 21:28         ` Jay Vosburgh
@ 2009-12-01 23:07           ` Andy Gospodarek
  2009-12-02 21:24           ` Andy Gospodarek
  1 sibling, 0 replies; 25+ messages in thread
From: Andy Gospodarek @ 2009-12-01 23:07 UTC (permalink / raw)
  To: Jay Vosburgh; +Cc: netdev

On Tue, Dec 01, 2009 at 01:28:13PM -0800, Jay Vosburgh wrote:
> Andy Gospodarek <andy@greyhouse.net> wrote:
> [...]
> >I am using arp_validate, actually.  I forgot that the arp_validate
> >option doesn't show up in the output of /proc/net/bonding/bondX and I
> >intended to have that in the subject, but somehow dropped it.
> 
> 	Ok, I was doing it wrong earlier; it works with arp_validate.
> I'm seeing one problem with tcpdump, though, which I'll get to in a
> minute.
> 
> 	Could you update the summary / changelog message to mention that
> this patch fixes the specific case of arp_validate + arp_ip_target on
> VLAN?
> 

Yep, I would be glad to do that.

> 	Second, in regards to this:
> 
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -2439,8 +2439,8 @@ int netif_receive_skb(struct sk_buff *skb)
>  		skb->skb_iif = skb->dev->ifindex;
> 
>  	null_or_orig = NULL;
> -	orig_dev = skb->dev;
> -	if (orig_dev->master) {
> +	orig_dev = __dev_get_by_index(dev_net(skb->dev),skb->skb_iif);
> +	if (orig_dev->master && !(skb->dev->priv_flags & IFF_802_1Q_VLAN)) {
>  		if (skb_bond_should_drop(skb))
>  			null_or_orig = orig_dev; /* deliver only exact match */
>  		else
> 
> 	Would it be useful to add a comment to the effect that VLAN
> packets are run through skb_bond_should_drop at the VLAN layer?
> 
> 	Lastly, in regards to this:
> 
> @@ -2492,7 +2492,7 @@ ncls:
>  			&ptype_base[ntohs(type) & PTYPE_HASH_MASK], list) {
>  		if (ptype->type == type &&
>  		    (ptype->dev == null_or_orig || ptype->dev == skb->dev ||
> -		     ptype->dev == orig_dev)) {
> +		     ptype->dev == orig_dev || ptype->dev == orig_dev->master)) {
>  			if (pt_prev)
>  				ret = deliver_skb(skb, pt_prev, orig_dev);
>  			pt_prev = ptype;
> 
> 	This is presumably here because orig_dev will now be the actual
> slave the packet arrived on, but we want to additionally deliver to the
> master, correct?

Yes, that is exactly right.

> 
> 	Lastly, tcpdump.
> 
> 	This patch appears to affect what traffic tcpdump of a slave or
> the bonding master itself will capture.  Previously, tcpdump of the
> active slave would see only the transmitted packets sent over the bond,
> and tcpdump of the inactive slave would see incoming Ethernet-layer
> multicast or broadcasts sent to its switch port.  Tcpdump on the master
> would see all sent and non-VLAN received traffic, and tcpdump of the
> VLAN interface over the master would see just the VLAN traffic.
> 
> 	After this change, tcpdump of the active slave or of the bonding
> master (bond0) sees both sent and received traffic for the VLAN, but
> nothing for the non-VLAN traffic other than incoming broadcast /
> multicasts.  This holds true whether or not a VLAN is configured.
> 
> 	I added a "ptype->dev == orig_dev->master" test to the ptype_all
> receive block in netif_receive_skb, but it didn't help.  At the moment,
> I'm not exactly sure why tcpdump breaks.
> 

This is interesting; I'll investigate.

> 	-J
> 
> ---
> 	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH net-next-2.6] bonding: allow arp_ip_targets to be on a separate vlan from bond device
  2009-12-01 21:28         ` Jay Vosburgh
  2009-12-01 23:07           ` Andy Gospodarek
@ 2009-12-02 21:24           ` Andy Gospodarek
  2009-12-07 18:13             ` Andy Gospodarek
  1 sibling, 1 reply; 25+ messages in thread
From: Andy Gospodarek @ 2009-12-02 21:24 UTC (permalink / raw)
  To: Jay Vosburgh; +Cc: netdev

On Tue, Dec 01, 2009 at 01:28:13PM -0800, Jay Vosburgh wrote:
> Andy Gospodarek <andy@greyhouse.net> wrote:
> [...]
> >I am using arp_validate, actually.  I forgot that the arp_validate
> >option doesn't show up in the output of /proc/net/bonding/bondX and I
> >intended to have that in the subject, but somehow dropped it.
> 
> 	Ok, I was doing it wrong earlier; it works with arp_validate.
> I'm seeing one problem with tcpdump, though, which I'll get to in a
> minute.
> 
> 	Could you update the summary / changelog message to mention that
> this patch fixes the specific case of arp_validate + arp_ip_target on
> VLAN?
> 
> 	Second, in regards to this:
> 
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -2439,8 +2439,8 @@ int netif_receive_skb(struct sk_buff *skb)
>  		skb->skb_iif = skb->dev->ifindex;
> 
>  	null_or_orig = NULL;
> -	orig_dev = skb->dev;
> -	if (orig_dev->master) {
> +	orig_dev = __dev_get_by_index(dev_net(skb->dev),skb->skb_iif);
> +	if (orig_dev->master && !(skb->dev->priv_flags & IFF_802_1Q_VLAN)) {
>  		if (skb_bond_should_drop(skb))
>  			null_or_orig = orig_dev; /* deliver only exact match */
>  		else
> 
> 	Would it be useful to add a comment to the effect that VLAN
> packets are run through skb_bond_should_drop at the VLAN layer?
> 
> 	Lastly, in regards to this:
> 
> @@ -2492,7 +2492,7 @@ ncls:
>  			&ptype_base[ntohs(type) & PTYPE_HASH_MASK], list) {
>  		if (ptype->type == type &&
>  		    (ptype->dev == null_or_orig || ptype->dev == skb->dev ||
> -		     ptype->dev == orig_dev)) {
> +		     ptype->dev == orig_dev || ptype->dev == orig_dev->master)) {
>  			if (pt_prev)
>  				ret = deliver_skb(skb, pt_prev, orig_dev);
>  			pt_prev = ptype;
> 
> 	This is presumably here because orig_dev will now be the actual
> slave the packet arrived on, but we want to additionally deliver to the
> master, correct?
> 
> 	Lastly, tcpdump.
> 
> 	This patch appears to affect what traffic tcpdump of a slave or
> the bonding master itself will capture.  Previously, tcpdump of the
> active slave would see only the transmitted packets sent over the bond,
> and tcpdump of the inactive slave would see incoming Ethernet-layer
> multicast or broadcasts sent to its switch port.  Tcpdump on the master
> would see all sent and non-VLAN received traffic, and tcpdump of the
> VLAN interface over the master would see just the VLAN traffic.
> 
> 	After this change, tcpdump of the active slave or of the bonding
> master (bond0) sees both sent and received traffic for the VLAN, but
> nothing for the non-VLAN traffic other than incoming broadcast /
> multicasts.  This holds true whether or not a VLAN is configured.
> 
> 	I added a "ptype->dev == orig_dev->master" test to the ptype_all
> receive block in netif_receive_skb, but it didn't help.  At the moment,
> I'm not exactly sure why tcpdump breaks.
> 

Jay,

The issue was that that orig_dev was getting set to the active slave, so
your running tcpdump on the active slave made the conditional inside
this loop:

        list_for_each_entry_rcu(ptype, &ptype_all, list) {
                if (ptype->dev == null_or_orig || ptype->dev == skb->dev ||
                    ptype->dev == orig_dev) {
                        if (pt_prev)
                                ret = deliver_skb(skb, pt_prev, orig_dev);
                        pt_prev = ptype;
                }
        }

hit and deliver_skb was being called for all traffic coming toward
bond0.<vid>.  I'm not completely happy with this solutoin, but I think
it resolves both the original problem I was trying to solve and the
regression you discovered with your original patch.  Let me know if you
see everything working now like I do.

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 726bd75..b1e3b2f 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2697,6 +2697,19 @@ static int bond_arp_rcv(struct sk_buff *skb, struct net_device *dev, struct pack
 	bond = netdev_priv(dev);
 	read_lock(&bond->lock);
 
+	/*
+	 * We may have dev passed in as a vlan device, so make sure to get to the
+	 * core netdev before continuing.
+	 */
+	if (dev->priv_flags & IFF_802_1Q_VLAN) {
+		dev = vlan_dev_real_dev(dev);
+		/*
+		 * Don't necessarily trust passed in orig_dev since vlan accelerated
+		 * netdevs and bonding don't play well together.
+		 */
+		orig_dev = __dev_get_by_index(dev_net(skb->dev),skb->skb_iif);
+	}
+
 	pr_debug("bond_arp_rcv: bond %s skb->dev %s orig_dev %s\n",
 		bond->dev->name, skb->dev ? skb->dev->name : "NULL",
 		orig_dev ? orig_dev->name : "NULL");
diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c
index e75a2f3..8d8a778 100644
--- a/net/8021q/vlan_core.c
+++ b/net/8021q/vlan_core.c
@@ -14,6 +14,7 @@ int __vlan_hwaccel_rx(struct sk_buff *skb, struct vlan_group *grp,
 	if (skb_bond_should_drop(skb))
 		goto drop;
 
+	skb->skb_iif = skb->dev->ifindex;
 	__vlan_hwaccel_put_tag(skb, vlan_tci);
 	skb->dev = vlan_group_get_device(grp, vlan_tci & VLAN_VID_MASK);
 
@@ -85,6 +86,7 @@ vlan_gro_common(struct napi_struct *napi, struct vlan_group *grp,
 	if (skb_bond_should_drop(skb))
 		goto drop;
 
+	skb->skb_iif = skb->dev->ifindex;
 	__vlan_hwaccel_put_tag(skb, vlan_tci);
 	skb->dev = vlan_group_get_device(grp, vlan_tci & VLAN_VID_MASK);
 
diff --git a/net/core/dev.c b/net/core/dev.c
index 5d131c2..9c3ba0d 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2421,6 +2421,7 @@ int netif_receive_skb(struct sk_buff *skb)
 {
 	struct packet_type *ptype, *pt_prev;
 	struct net_device *orig_dev;
+	struct net_device *bond_dev;
 	struct net_device *null_or_orig;
 	int ret = NET_RX_DROP;
 	__be16 type;
@@ -2487,12 +2488,24 @@ ncls:
 	if (!skb)
 		goto out;
 
+	/*
+	 * A bonding interface with a VLAN on top doesn't play nicely when the
+	 * netdev in the bond is capable of stripping the VLAN tag for us.
+	 * Knowing the base bond device is important in the event that bond
+	 * control frames arrive with a VLAN tag, but need to be serviced by
+	 * a hook installed for the base bond device.
+	 */
+	bond_dev = skb->dev;
+	if ((bond_dev->priv_flags & IFF_802_1Q_VLAN) &&
+	    (vlan_dev_real_dev(bond_dev)->priv_flags & IFF_BONDING))
+		bond_dev = vlan_dev_real_dev(bond_dev);
+
 	type = skb->protocol;
 	list_for_each_entry_rcu(ptype,
 			&ptype_base[ntohs(type) & PTYPE_HASH_MASK], list) {
 		if (ptype->type == type &&
 		    (ptype->dev == null_or_orig || ptype->dev == skb->dev ||
-		     ptype->dev == orig_dev)) {
+		     ptype->dev == orig_dev || ptype->dev == bond_dev)) {
 			if (pt_prev)
 				ret = deliver_skb(skb, pt_prev, orig_dev);
 			pt_prev = ptype;

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH net-next-2.6] bonding: allow arp_ip_targets to be on a separate vlan from bond device
  2009-12-02 21:24           ` Andy Gospodarek
@ 2009-12-07 18:13             ` Andy Gospodarek
  2009-12-07 18:24               ` Eric Dumazet
  0 siblings, 1 reply; 25+ messages in thread
From: Andy Gospodarek @ 2009-12-07 18:13 UTC (permalink / raw)
  To: Jay Vosburgh, netdev

On Wed, Dec 02, 2009 at 04:24:49PM -0500, Andy Gospodarek wrote:
> On Tue, Dec 01, 2009 at 01:28:13PM -0800, Jay Vosburgh wrote:
> > Andy Gospodarek <andy@greyhouse.net> wrote:
> > [...]
> > >I am using arp_validate, actually.  I forgot that the arp_validate
> > >option doesn't show up in the output of /proc/net/bonding/bondX and I
> > >intended to have that in the subject, but somehow dropped it.
> > 
> > 	Ok, I was doing it wrong earlier; it works with arp_validate.
> > I'm seeing one problem with tcpdump, though, which I'll get to in a
> > minute.
> > 
> > 	Could you update the summary / changelog message to mention that
> > this patch fixes the specific case of arp_validate + arp_ip_target on
> > VLAN?
> > 
> > 	Second, in regards to this:
> > 
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -2439,8 +2439,8 @@ int netif_receive_skb(struct sk_buff *skb)
> >  		skb->skb_iif = skb->dev->ifindex;
> > 
> >  	null_or_orig = NULL;
> > -	orig_dev = skb->dev;
> > -	if (orig_dev->master) {
> > +	orig_dev = __dev_get_by_index(dev_net(skb->dev),skb->skb_iif);
> > +	if (orig_dev->master && !(skb->dev->priv_flags & IFF_802_1Q_VLAN)) {
> >  		if (skb_bond_should_drop(skb))
> >  			null_or_orig = orig_dev; /* deliver only exact match */
> >  		else
> > 
> > 	Would it be useful to add a comment to the effect that VLAN
> > packets are run through skb_bond_should_drop at the VLAN layer?
> > 
> > 	Lastly, in regards to this:
> > 
> > @@ -2492,7 +2492,7 @@ ncls:
> >  			&ptype_base[ntohs(type) & PTYPE_HASH_MASK], list) {
> >  		if (ptype->type == type &&
> >  		    (ptype->dev == null_or_orig || ptype->dev == skb->dev ||
> > -		     ptype->dev == orig_dev)) {
> > +		     ptype->dev == orig_dev || ptype->dev == orig_dev->master)) {
> >  			if (pt_prev)
> >  				ret = deliver_skb(skb, pt_prev, orig_dev);
> >  			pt_prev = ptype;
> > 
> > 	This is presumably here because orig_dev will now be the actual
> > slave the packet arrived on, but we want to additionally deliver to the
> > master, correct?
> > 
> > 	Lastly, tcpdump.
> > 
> > 	This patch appears to affect what traffic tcpdump of a slave or
> > the bonding master itself will capture.  Previously, tcpdump of the
> > active slave would see only the transmitted packets sent over the bond,
> > and tcpdump of the inactive slave would see incoming Ethernet-layer
> > multicast or broadcasts sent to its switch port.  Tcpdump on the master
> > would see all sent and non-VLAN received traffic, and tcpdump of the
> > VLAN interface over the master would see just the VLAN traffic.
> > 
> > 	After this change, tcpdump of the active slave or of the bonding
> > master (bond0) sees both sent and received traffic for the VLAN, but
> > nothing for the non-VLAN traffic other than incoming broadcast /
> > multicasts.  This holds true whether or not a VLAN is configured.
> > 
> > 	I added a "ptype->dev == orig_dev->master" test to the ptype_all
> > receive block in netif_receive_skb, but it didn't help.  At the moment,
> > I'm not exactly sure why tcpdump breaks.
> > 
> 
> Jay,
> 
> The issue was that that orig_dev was getting set to the active slave, so
> your running tcpdump on the active slave made the conditional inside
> this loop:
> 
>         list_for_each_entry_rcu(ptype, &ptype_all, list) {
>                 if (ptype->dev == null_or_orig || ptype->dev == skb->dev ||
>                     ptype->dev == orig_dev) {
>                         if (pt_prev)
>                                 ret = deliver_skb(skb, pt_prev, orig_dev);
>                         pt_prev = ptype;
>                 }
>         }
> 
> hit and deliver_skb was being called for all traffic coming toward
> bond0.<vid>.  I'm not completely happy with this solutoin, but I think
> it resolves both the original problem I was trying to solve and the
> regression you discovered with your original patch.  Let me know if you
> see everything working now like I do.
> 
> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> index 726bd75..b1e3b2f 100644
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -2697,6 +2697,19 @@ static int bond_arp_rcv(struct sk_buff *skb, struct net_device *dev, struct pack
>  	bond = netdev_priv(dev);
>  	read_lock(&bond->lock);
>  
> +	/*
> +	 * We may have dev passed in as a vlan device, so make sure to get to the
> +	 * core netdev before continuing.
> +	 */
> +	if (dev->priv_flags & IFF_802_1Q_VLAN) {
> +		dev = vlan_dev_real_dev(dev);
> +		/*
> +		 * Don't necessarily trust passed in orig_dev since vlan accelerated
> +		 * netdevs and bonding don't play well together.
> +		 */
> +		orig_dev = __dev_get_by_index(dev_net(skb->dev),skb->skb_iif);
> +	}
> +
>  	pr_debug("bond_arp_rcv: bond %s skb->dev %s orig_dev %s\n",
>  		bond->dev->name, skb->dev ? skb->dev->name : "NULL",
>  		orig_dev ? orig_dev->name : "NULL");
> diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c
> index e75a2f3..8d8a778 100644
> --- a/net/8021q/vlan_core.c
> +++ b/net/8021q/vlan_core.c
> @@ -14,6 +14,7 @@ int __vlan_hwaccel_rx(struct sk_buff *skb, struct vlan_group *grp,
>  	if (skb_bond_should_drop(skb))
>  		goto drop;
>  
> +	skb->skb_iif = skb->dev->ifindex;
>  	__vlan_hwaccel_put_tag(skb, vlan_tci);
>  	skb->dev = vlan_group_get_device(grp, vlan_tci & VLAN_VID_MASK);
>  
> @@ -85,6 +86,7 @@ vlan_gro_common(struct napi_struct *napi, struct vlan_group *grp,
>  	if (skb_bond_should_drop(skb))
>  		goto drop;
>  
> +	skb->skb_iif = skb->dev->ifindex;
>  	__vlan_hwaccel_put_tag(skb, vlan_tci);
>  	skb->dev = vlan_group_get_device(grp, vlan_tci & VLAN_VID_MASK);
>  
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 5d131c2..9c3ba0d 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -2421,6 +2421,7 @@ int netif_receive_skb(struct sk_buff *skb)
>  {
>  	struct packet_type *ptype, *pt_prev;
>  	struct net_device *orig_dev;
> +	struct net_device *bond_dev;
>  	struct net_device *null_or_orig;
>  	int ret = NET_RX_DROP;
>  	__be16 type;
> @@ -2487,12 +2488,24 @@ ncls:
>  	if (!skb)
>  		goto out;
>  
> +	/*
> +	 * A bonding interface with a VLAN on top doesn't play nicely when the
> +	 * netdev in the bond is capable of stripping the VLAN tag for us.
> +	 * Knowing the base bond device is important in the event that bond
> +	 * control frames arrive with a VLAN tag, but need to be serviced by
> +	 * a hook installed for the base bond device.
> +	 */
> +	bond_dev = skb->dev;
> +	if ((bond_dev->priv_flags & IFF_802_1Q_VLAN) &&
> +	    (vlan_dev_real_dev(bond_dev)->priv_flags & IFF_BONDING))
> +		bond_dev = vlan_dev_real_dev(bond_dev);
> +
>  	type = skb->protocol;
>  	list_for_each_entry_rcu(ptype,
>  			&ptype_base[ntohs(type) & PTYPE_HASH_MASK], list) {
>  		if (ptype->type == type &&
>  		    (ptype->dev == null_or_orig || ptype->dev == skb->dev ||
> -		     ptype->dev == orig_dev)) {
> +		     ptype->dev == orig_dev || ptype->dev == bond_dev)) {
>  			if (pt_prev)
>  				ret = deliver_skb(skb, pt_prev, orig_dev);
>  			pt_prev = ptype;

Any thoughts on the updated patch, Jay?

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH net-next-2.6] bonding: allow arp_ip_targets to be on a separate vlan from bond device
  2009-12-07 18:13             ` Andy Gospodarek
@ 2009-12-07 18:24               ` Eric Dumazet
  2009-12-09 22:01                 ` [PATCH net-next-2.6 v3] bonding: allow arp_ip_targets on separate vlans to use arp validation Andy Gospodarek
  0 siblings, 1 reply; 25+ messages in thread
From: Eric Dumazet @ 2009-12-07 18:24 UTC (permalink / raw)
  To: Andy Gospodarek; +Cc: Jay Vosburgh, netdev

Andy Gospodarek a écrit :

>> Jay,
>>
>> The issue was that that orig_dev was getting set to the active slave, so
>> your running tcpdump on the active slave made the conditional inside
>> this loop:
>>
>>         list_for_each_entry_rcu(ptype, &ptype_all, list) {
>>                 if (ptype->dev == null_or_orig || ptype->dev == skb->dev ||
>>                     ptype->dev == orig_dev) {
>>                         if (pt_prev)
>>                                 ret = deliver_skb(skb, pt_prev, orig_dev);
>>                         pt_prev = ptype;
>>                 }
>>         }
>>
>> hit and deliver_skb was being called for all traffic coming toward
>> bond0.<vid>.  I'm not completely happy with this solutoin, but I think
>> it resolves both the original problem I was trying to solve and the
>> regression you discovered with your original patch.  Let me know if you
>> see everything working now like I do.
>>
>> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>> index 726bd75..b1e3b2f 100644
>> --- a/drivers/net/bonding/bond_main.c
>> +++ b/drivers/net/bonding/bond_main.c
>> @@ -2697,6 +2697,19 @@ static int bond_arp_rcv(struct sk_buff *skb, struct net_device *dev, struct pack
>>  	bond = netdev_priv(dev);
>>  	read_lock(&bond->lock);
>>  
>> +	/*
>> +	 * We may have dev passed in as a vlan device, so make sure to get to the
>> +	 * core netdev before continuing.
>> +	 */
>> +	if (dev->priv_flags & IFF_802_1Q_VLAN) {
>> +		dev = vlan_dev_real_dev(dev);
>> +		/*
>> +		 * Don't necessarily trust passed in orig_dev since vlan accelerated
>> +		 * netdevs and bonding don't play well together.
>> +		 */
>> +		orig_dev = __dev_get_by_index(dev_net(skb->dev),skb->skb_iif);
>> +	}
>> +

> 
> Any thoughts on the updated patch, Jay?

Unfortunately you still use __dev_get_by_index()
in a non safe context.


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH net-next-2.6 v3] bonding: allow arp_ip_targets on separate vlans to use arp validation
  2009-12-07 18:24               ` Eric Dumazet
@ 2009-12-09 22:01                 ` Andy Gospodarek
  2009-12-11  5:17                   ` Jay Vosburgh
  0 siblings, 1 reply; 25+ messages in thread
From: Andy Gospodarek @ 2009-12-09 22:01 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Andy Gospodarek, Jay Vosburgh, netdev

On Mon, Dec 07, 2009 at 07:24:57PM +0100, Eric Dumazet wrote:
> Andy Gospodarek a écrit :
> 
> >> Jay,
> >>
> >> The issue was that that orig_dev was getting set to the active slave, so
> >> your running tcpdump on the active slave made the conditional inside
> >> this loop:
> >>
> >>         list_for_each_entry_rcu(ptype, &ptype_all, list) {
> >>                 if (ptype->dev == null_or_orig || ptype->dev == skb->dev ||
> >>                     ptype->dev == orig_dev) {
> >>                         if (pt_prev)
> >>                                 ret = deliver_skb(skb, pt_prev, orig_dev);
> >>                         pt_prev = ptype;
> >>                 }
> >>         }
> >>
> >> hit and deliver_skb was being called for all traffic coming toward
> >> bond0.<vid>.  I'm not completely happy with this solutoin, but I think
> >> it resolves both the original problem I was trying to solve and the
> >> regression you discovered with your original patch.  Let me know if you
> >> see everything working now like I do.
> >>
> >> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> >> index 726bd75..b1e3b2f 100644
> >> --- a/drivers/net/bonding/bond_main.c
> >> +++ b/drivers/net/bonding/bond_main.c
> >> @@ -2697,6 +2697,19 @@ static int bond_arp_rcv(struct sk_buff *skb, struct net_device *dev, struct pack
> >>  	bond = netdev_priv(dev);
> >>  	read_lock(&bond->lock);
> >>  
> >> +	/*
> >> +	 * We may have dev passed in as a vlan device, so make sure to get to the
> >> +	 * core netdev before continuing.
> >> +	 */
> >> +	if (dev->priv_flags & IFF_802_1Q_VLAN) {
> >> +		dev = vlan_dev_real_dev(dev);
> >> +		/*
> >> +		 * Don't necessarily trust passed in orig_dev since vlan accelerated
> >> +		 * netdevs and bonding don't play well together.
> >> +		 */
> >> +		orig_dev = __dev_get_by_index(dev_net(skb->dev),skb->skb_iif);
> >> +	}
> >> +
> 
> > 
> > Any thoughts on the updated patch, Jay?
> 
> Unfortunately you still use __dev_get_by_index()
> in a non safe context.
> 

I wasn't completely happy with the last patch, so I reworked it a bit
and addressed your concern regarding __dev_get_by_index usage.  I
replaced it with dev_get_by_index_rcu. This should be safe since all
calls to deliver_skb are protected by rcu_read_lock.  I also decided to
have the device modification happen in the frame handler rather than in
netif_receive_skb.  Here is the updated patch:

[PATCH net-next-2.6 v3] bonding: allow arp_ip_targets on separate vlans to use arp validation

This allows a bond device to specify an arp_ip_target as a host that is
not on the same vlan as the base bond device and still use arp
validation.  A configuration like this, now works:

BONDING_OPTS="mode=active-backup arp_interval=1000 arp_ip_target=10.0.100.1 arp_validate=3" 

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond0 qlen 1000
    link/ether 00:13:21:be:33:e9 brd ff:ff:ff:ff:ff:ff
3: eth0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond0 qlen 1000
    link/ether 00:13:21:be:33:e9 brd ff:ff:ff:ff:ff:ff
8: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue
    link/ether 00:13:21:be:33:e9 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::213:21ff:febe:33e9/64 scope link
       valid_lft forever preferred_lft forever
9: bond0.100@bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue
    link/ether 00:13:21:be:33:e9 brd ff:ff:ff:ff:ff:ff
    inet 10.0.100.2/24 brd 10.0.100.255 scope global bond0.100
    inet6 fe80::213:21ff:febe:33e9/64 scope link
       valid_lft forever preferred_lft forever
    
Ethernet Channel Bonding Driver: v3.5.0 (November 4, 2008)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: eth1
MII Status: up
MII Polling Interval (ms): 0
Up Delay (ms): 0
Down Delay (ms): 0
ARP Polling Interval (ms): 1000
ARP IP target/s (n.n.n.n form): 10.0.100.1
    
Slave Interface: eth1
MII Status: up
Link Failure Count: 1
Permanent HW addr: 00:40:05:30:ff:30
    
Slave Interface: eth0
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:13:21:be:33:e9

---
 drivers/net/bonding/bond_main.c |   11 +++++++++++
 net/8021q/vlan_core.c           |    2 ++
 net/core/dev.c                  |   17 +++++++++++++++--
 3 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index af9b9c4..537e365 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2691,6 +2691,17 @@ static int bond_arp_rcv(struct sk_buff *skb, struct net_device *dev, struct pack
 	unsigned char *arp_ptr;
 	__be32 sip, tip;
 
+	if (dev->priv_flags & IFF_802_1Q_VLAN) {
+		/*
+		 * When using VLANS and bonding, dev and oriv_dev may be
+		 * incorrect if the physical interface supports VLAN
+		 * acceleration.  With this change ARP validation now
+		 * works for hosts only reachable on the VLAN interface.
+		 */
+		dev = vlan_dev_real_dev(dev);
+		orig_dev = dev_get_by_index_rcu(dev_net(skb->dev),skb->skb_iif);
+	}
+
 	if (!(dev->priv_flags & IFF_BONDING) || !(dev->flags & IFF_MASTER))
 		goto out;
 
diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c
index e75a2f3..c0316e0 100644
--- a/net/8021q/vlan_core.c
+++ b/net/8021q/vlan_core.c
@@ -14,6 +14,7 @@ int __vlan_hwaccel_rx(struct sk_buff *skb, struct vlan_group *grp,
 	if (skb_bond_should_drop(skb))
 		goto drop;
 
+	skb->skb_iif = skb->dev->ifindex;
 	__vlan_hwaccel_put_tag(skb, vlan_tci);
 	skb->dev = vlan_group_get_device(grp, vlan_tci & VLAN_VID_MASK);
 
@@ -85,6 +86,7 @@ vlan_gro_common(struct napi_struct *napi, struct vlan_group *grp,
 	if (skb_bond_should_drop(skb))
 		goto drop;
 
+	skb->skb_iif = skb->dev->ifindex;
 	__vlan_hwaccel_put_tag(skb, vlan_tci);
 	skb->dev = vlan_group_get_device(grp, vlan_tci & VLAN_VID_MASK);
 
diff --git a/net/core/dev.c b/net/core/dev.c
index c36a17a..ef62d2d 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2422,6 +2422,7 @@ int netif_receive_skb(struct sk_buff *skb)
 	struct packet_type *ptype, *pt_prev;
 	struct net_device *orig_dev;
 	struct net_device *null_or_orig;
+	struct net_device *null_or_bond;
 	int ret = NET_RX_DROP;
 	__be16 type;
 
@@ -2487,12 +2488,24 @@ ncls:
 	if (!skb)
 		goto out;
 
+	/*
+	 * Make sure frames received on VLAN interfaces stacked on
+	 * bonding interfaces still make their way to any base bonding
+	 * device that may have registered for a specific ptype.  The
+	 * handler will have to adjust skb->dev and orig_dev though.
+	 */
+	null_or_bond = NULL;
+	if ((skb->dev->priv_flags & IFF_802_1Q_VLAN) &&
+	    (vlan_dev_real_dev(skb->dev)->priv_flags & IFF_BONDING)) {
+		null_or_bond = vlan_dev_real_dev(skb->dev);
+	}
+
 	type = skb->protocol;
 	list_for_each_entry_rcu(ptype,
 			&ptype_base[ntohs(type) & PTYPE_HASH_MASK], list) {
 		if (ptype->type == type &&
-		    (ptype->dev == null_or_orig || ptype->dev == skb->dev ||
-		     ptype->dev == orig_dev)) {
+		    (ptype->dev == null_or_orig || ptype->dev == null_or_bond ||
+		     ptype->dev == skb->dev || ptype->dev == orig_dev)) {
 			if (pt_prev)
 				ret = deliver_skb(skb, pt_prev, orig_dev);
 			pt_prev = ptype;
-- 
1.6.2.5


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH net-next-2.6 v3] bonding: allow arp_ip_targets on separate vlans to use arp validation
  2009-12-09 22:01                 ` [PATCH net-next-2.6 v3] bonding: allow arp_ip_targets on separate vlans to use arp validation Andy Gospodarek
@ 2009-12-11  5:17                   ` Jay Vosburgh
  2009-12-14 20:48                     ` [PATCH net-next-2.6 v4] " Andy Gospodarek
  0 siblings, 1 reply; 25+ messages in thread
From: Jay Vosburgh @ 2009-12-11  5:17 UTC (permalink / raw)
  To: Andy Gospodarek; +Cc: Eric Dumazet, netdev

Andy Gospodarek <andy@greyhouse.net> wrote:

>[...]  Here is the updated patch:
>
>[PATCH net-next-2.6 v3] bonding: allow arp_ip_targets on separate vlans to use arp validation
>
>This allows a bond device to specify an arp_ip_target as a host that is
>not on the same vlan as the base bond device and still use arp
>validation.  A configuration like this, now works:
[...]

	I'm testing with one modification to your patch (the change from
your patch is below).  The gist of this change is to use "null_or_orig"
instead of adding a new variable "null_or_bond."  I believe this is
safe, as null_or_orig should currently only be set for non-VLAN traffic
(VLAN traffic won't pass the "orig_dev->master" test; the VLAN code
itself does the skb_bond_should_drop stuff), and the null_or_bond is/was
only used for VLAN traffic.

	This patch has a debug printk in it right now for testing until
I'm sure I'm not confused.

	Thoughts?

	-J

diff --git a/net/core/dev.c b/net/core/dev.c
index 0c96321..ac47be9 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2422,7 +2422,6 @@ int netif_receive_skb(struct sk_buff *skb)
 	struct packet_type *ptype, *pt_prev;
 	struct net_device *orig_dev;
 	struct net_device *null_or_orig;
-	struct net_device *null_or_bond;
 	int ret = NET_RX_DROP;
 	__be16 type;
 
@@ -2494,17 +2493,18 @@ ncls:
 	 * device that may have registered for a specific ptype.  The
 	 * handler will have to adjust skb->dev and orig_dev though.
 	 */
-	null_or_bond = NULL;
 	if ((skb->dev->priv_flags & IFF_802_1Q_VLAN) &&
 	    (vlan_dev_real_dev(skb->dev)->priv_flags & IFF_BONDING)) {
-		null_or_bond = vlan_dev_real_dev(skb->dev);
+		if (null_or_orig)
+			printk(KERN_ERR "BAD: n_o_o %p %s\n", null_or_orig,
+			       null_or_orig->name);
+		null_or_orig = vlan_dev_real_dev(skb->dev);
 	}
 
 	type = skb->protocol;
 	list_for_each_entry_rcu(ptype,
 			&ptype_base[ntohs(type) & PTYPE_HASH_MASK], list) {
-		if (ptype->type == type &&
-		    (ptype->dev == null_or_orig || ptype->dev == null_or_bond ||
+		if (ptype->type == type && (ptype->dev == null_or_orig ||
 		     ptype->dev == skb->dev || ptype->dev == orig_dev)) {
 			if (pt_prev)
 				ret = deliver_skb(skb, pt_prev, orig_dev);


---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH net-next-2.6 v4] bonding: allow arp_ip_targets on separate vlans to use arp validation
  2009-12-11  5:17                   ` Jay Vosburgh
@ 2009-12-14 20:48                     ` Andy Gospodarek
  2009-12-26  2:22                       ` David Miller
  0 siblings, 1 reply; 25+ messages in thread
From: Andy Gospodarek @ 2009-12-14 20:48 UTC (permalink / raw)
  To: Jay Vosburgh; +Cc: Andy Gospodarek, Eric Dumazet, netdev

On Thu, Dec 10, 2009 at 09:17:12PM -0800, Jay Vosburgh wrote:
> Andy Gospodarek <andy@greyhouse.net> wrote:
> 
> >[...]  Here is the updated patch:
> >
> >[PATCH net-next-2.6 v3] bonding: allow arp_ip_targets on separate vlans to use arp validation
> >
> >This allows a bond device to specify an arp_ip_target as a host that is
> >not on the same vlan as the base bond device and still use arp
> >validation.  A configuration like this, now works:
> [...]
> 
> 	I'm testing with one modification to your patch (the change from
> your patch is below).  The gist of this change is to use "null_or_orig"
> instead of adding a new variable "null_or_bond."  I believe this is
> safe, as null_or_orig should currently only be set for non-VLAN traffic
> (VLAN traffic won't pass the "orig_dev->master" test; the VLAN code
> itself does the skb_bond_should_drop stuff), and the null_or_bond is/was
> only used for VLAN traffic.
> 

Overall I think this is a reasonable way to do it.  I hesitated to
overload null_or_orig, but this certainly seems to pass my tests.  Here
is an updated patch with the printk removed (I'm not sure how you want
to attribute this, so I'll let you add your own signed-off-by line).

[PATCH net-next-2.6 v4] bonding: allow arp_ip_targets on separate vlans to use arp validation

This allows a bond device to specify an arp_ip_target as a host that is
not on the same vlan as the base bond device and still use arp
validation.  A configuration like this, now works:

BONDING_OPTS="mode=active-backup arp_interval=1000 arp_ip_target=10.0.100.1 arp_validate=3" 

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond0 qlen 1000
    link/ether 00:13:21:be:33:e9 brd ff:ff:ff:ff:ff:ff
3: eth0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond0 qlen 1000
    link/ether 00:13:21:be:33:e9 brd ff:ff:ff:ff:ff:ff
8: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue
    link/ether 00:13:21:be:33:e9 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::213:21ff:febe:33e9/64 scope link
       valid_lft forever preferred_lft forever
9: bond0.100@bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue
    link/ether 00:13:21:be:33:e9 brd ff:ff:ff:ff:ff:ff
    inet 10.0.100.2/24 brd 10.0.100.255 scope global bond0.100
    inet6 fe80::213:21ff:febe:33e9/64 scope link
       valid_lft forever preferred_lft forever
    
Ethernet Channel Bonding Driver: v3.6.0 (September 26, 2009)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: eth1
MII Status: up
MII Polling Interval (ms): 0
Up Delay (ms): 0
Down Delay (ms): 0
ARP Polling Interval (ms): 1000
ARP IP target/s (n.n.n.n form): 10.0.100.1
    
Slave Interface: eth1
MII Status: up
Link Failure Count: 1
Permanent HW addr: 00:40:05:30:ff:30
    
Slave Interface: eth0
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:13:21:be:33:e9


Signed-off-by: Andy Gospodarek <andy@greyhouse.net>

---
 drivers/net/bonding/bond_main.c |   11 +++++++++++
 net/8021q/vlan_core.c           |    2 ++
 net/core/dev.c                  |   20 +++++++++++++++++---
 3 files changed, 30 insertions(+), 3 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index af9b9c4..537e365 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2691,6 +2691,17 @@ static int bond_arp_rcv(struct sk_buff *skb, struct net_device *dev, struct pack
 	unsigned char *arp_ptr;
 	__be32 sip, tip;
 
+	if (dev->priv_flags & IFF_802_1Q_VLAN) {
+		/*
+		 * When using VLANS and bonding, dev and oriv_dev may be
+		 * incorrect if the physical interface supports VLAN
+		 * acceleration.  With this change ARP validation now
+		 * works for hosts only reachable on the VLAN interface.
+		 */
+		dev = vlan_dev_real_dev(dev);
+		orig_dev = dev_get_by_index_rcu(dev_net(skb->dev),skb->skb_iif);
+	}
+
 	if (!(dev->priv_flags & IFF_BONDING) || !(dev->flags & IFF_MASTER))
 		goto out;
 
diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c
index e75a2f3..c0316e0 100644
--- a/net/8021q/vlan_core.c
+++ b/net/8021q/vlan_core.c
@@ -14,6 +14,7 @@ int __vlan_hwaccel_rx(struct sk_buff *skb, struct vlan_group *grp,
 	if (skb_bond_should_drop(skb))
 		goto drop;
 
+	skb->skb_iif = skb->dev->ifindex;
 	__vlan_hwaccel_put_tag(skb, vlan_tci);
 	skb->dev = vlan_group_get_device(grp, vlan_tci & VLAN_VID_MASK);
 
@@ -85,6 +86,7 @@ vlan_gro_common(struct napi_struct *napi, struct vlan_group *grp,
 	if (skb_bond_should_drop(skb))
 		goto drop;
 
+	skb->skb_iif = skb->dev->ifindex;
 	__vlan_hwaccel_put_tag(skb, vlan_tci);
 	skb->dev = vlan_group_get_device(grp, vlan_tci & VLAN_VID_MASK);
 
diff --git a/net/core/dev.c b/net/core/dev.c
index c36a17a..a674f84 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2487,12 +2487,26 @@ ncls:
 	if (!skb)
 		goto out;
 
+	/*
+	 * Make sure frames received on VLAN interfaces stacked on
+	 * bonding interfaces still make their way to any base bonding
+	 * device that may have registered for a specific ptype.  The
+	 * handler may have to adjust skb->dev and orig_dev.
+	 *
+	 * null_or_orig can be overloaded since it will not be set when
+	 * using VLANs on top of bonding.  Putting it here prevents
+	 * disturbing the ptype_all handlers above.
+	 */
+	if ((skb->dev->priv_flags & IFF_802_1Q_VLAN) &&
+	    (vlan_dev_real_dev(skb->dev)->priv_flags & IFF_BONDING)) {
+		null_or_orig = vlan_dev_real_dev(skb->dev);
+	}
+
 	type = skb->protocol;
 	list_for_each_entry_rcu(ptype,
 			&ptype_base[ntohs(type) & PTYPE_HASH_MASK], list) {
-		if (ptype->type == type &&
-		    (ptype->dev == null_or_orig || ptype->dev == skb->dev ||
-		     ptype->dev == orig_dev)) {
+		if (ptype->type == type && (ptype->dev == null_or_orig ||
+		     ptype->dev == skb->dev || ptype->dev == orig_dev)) {
 			if (pt_prev)
 				ret = deliver_skb(skb, pt_prev, orig_dev);
 			pt_prev = ptype;

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH net-next-2.6 v4] bonding: allow arp_ip_targets on separate vlans to use arp validation
  2009-12-14 20:48                     ` [PATCH net-next-2.6 v4] " Andy Gospodarek
@ 2009-12-26  2:22                       ` David Miller
  2009-12-28 15:26                         ` Andy Gospodarek
  0 siblings, 1 reply; 25+ messages in thread
From: David Miller @ 2009-12-26  2:22 UTC (permalink / raw)
  To: andy; +Cc: fubar, eric.dumazet, netdev

From: Andy Gospodarek <andy@greyhouse.net>
Date: Mon, 14 Dec 2009 15:48:58 -0500

> [PATCH net-next-2.6 v4] bonding: allow arp_ip_targets on separate vlans to use arp validation
 ...
> Signed-off-by: Andy Gospodarek <andy@greyhouse.net>

What is the status of this patch?

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH net-next-2.6 v4] bonding: allow arp_ip_targets on separate vlans to use arp validation
  2009-12-26  2:22                       ` David Miller
@ 2009-12-28 15:26                         ` Andy Gospodarek
  2009-12-28 15:33                           ` David Miller
  0 siblings, 1 reply; 25+ messages in thread
From: Andy Gospodarek @ 2009-12-28 15:26 UTC (permalink / raw)
  To: David Miller; +Cc: andy, fubar, eric.dumazet, netdev

On Fri, Dec 25, 2009 at 06:22:30PM -0800, David Miller wrote:
> From: Andy Gospodarek <andy@greyhouse.net>
> Date: Mon, 14 Dec 2009 15:48:58 -0500
> 
> > [PATCH net-next-2.6 v4] bonding: allow arp_ip_targets on separate vlans to use arp validation
>  ...
> > Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
> 
> What is the status of this patch?

Dave,

This patch should be ready for inclusion.

I've addressed Jay's concerns related tcpdumps on the various
interfaces, so that there is no difference between the way things behave
before and after the patch.

I also addressed Eric's concerns regarding the liability of calling
__dev_get_by_index and switched to the shiny, new dev_get_by_index_rcu.

A backported version has also seen some testing on a 2.6.18-based kernel
and the results have been positive.

Thanks for following up,

-andy

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH net-next-2.6 v4] bonding: allow arp_ip_targets on separate vlans to use arp validation
  2009-12-28 15:26                         ` Andy Gospodarek
@ 2009-12-28 15:33                           ` David Miller
  2009-12-28 21:55                             ` Jay Vosburgh
  0 siblings, 1 reply; 25+ messages in thread
From: David Miller @ 2009-12-28 15:33 UTC (permalink / raw)
  To: andy; +Cc: fubar, eric.dumazet, netdev

From: Andy Gospodarek <andy@greyhouse.net>
Date: Mon, 28 Dec 2009 10:26:32 -0500

> This patch should be ready for inclusion.

Yes, but I haven't seen ACK's from Jay or Eric.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH net-next-2.6 v4] bonding: allow arp_ip_targets on separate vlans to use arp validation
  2009-12-28 15:33                           ` David Miller
@ 2009-12-28 21:55                             ` Jay Vosburgh
  2009-12-29  0:51                               ` Andy Gospodarek
  2010-01-04  5:19                               ` David Miller
  0 siblings, 2 replies; 25+ messages in thread
From: Jay Vosburgh @ 2009-12-28 21:55 UTC (permalink / raw)
  To: David Miller; +Cc: andy, eric.dumazet, netdev

David Miller <davem@davemloft.net> wrote:

>From: Andy Gospodarek <andy@greyhouse.net>
>Date: Mon, 28 Dec 2009 10:26:32 -0500
>
>> This patch should be ready for inclusion.
>
>Yes, but I haven't seen ACK's from Jay or Eric.

	I was testing with the final patch just before my personal
holiday adventures began, and was having one problem: the ARP monitor
part worked fine on the VLAN (the slaves were up) but regular traffic on
that same VLAN, e.g., ping, was dropped on receive at the bond.  The
peer (without bonding) saw the traffic, and responded, but it appeared
that ARP replies from the peer (again, over the VLAN) weren't making it
to the ARP table (at the bond end), so everything that depended on the
ARP table wasn't working.

	I'm willing to believe this is a personal problem at my end, but
I haven't determined the cause, and won't have the opportunity until
next week (after my house has drained of relatives).

	If Andy isn't seeing the above misbehavior, I'm fine with
assuming it's my problem and applying the patch; if I find something
actually wrong with the patch next week we can address it then.

Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>

	-J

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH net-next-2.6 v4] bonding: allow arp_ip_targets on separate vlans to use arp validation
  2009-12-28 21:55                             ` Jay Vosburgh
@ 2009-12-29  0:51                               ` Andy Gospodarek
  2010-01-04  5:19                               ` David Miller
  1 sibling, 0 replies; 25+ messages in thread
From: Andy Gospodarek @ 2009-12-29  0:51 UTC (permalink / raw)
  To: Jay Vosburgh; +Cc: David Miller, andy, eric.dumazet, netdev

On Mon, Dec 28, 2009 at 01:55:33PM -0800, Jay Vosburgh wrote:
> David Miller <davem@davemloft.net> wrote:
> 
> >From: Andy Gospodarek <andy@greyhouse.net>
> >Date: Mon, 28 Dec 2009 10:26:32 -0500
> >
> >> This patch should be ready for inclusion.
> >
> >Yes, but I haven't seen ACK's from Jay or Eric.
> 
> 	I was testing with the final patch just before my personal
> holiday adventures began, and was having one problem: the ARP monitor
> part worked fine on the VLAN (the slaves were up) but regular traffic on
> that same VLAN, e.g., ping, was dropped on receive at the bond.  The
> peer (without bonding) saw the traffic, and responded, but it appeared
> that ARP replies from the peer (again, over the VLAN) weren't making it
> to the ARP table (at the bond end), so everything that depended on the
> ARP table wasn't working.
> 
> 	I'm willing to believe this is a personal problem at my end, but
> I haven't determined the cause, and won't have the opportunity until
> next week (after my house has drained of relatives).
> 
> 	If Andy isn't seeing the above misbehavior, I'm fine with
> assuming it's my problem and applying the patch; if I find something
> actually wrong with the patch next week we can address it then.
> 

>From time to time, I would see something similar would be if I was
testing without VLANs and ended up with a stale entry in the remote
hosts ARP table that would cause the response traffic to get sent
without a VLAN tag.

I was able to pass tagged traffic when the bond was up and always made
sure to scp my tested patches over that link to the machine that would
post them.  It seems like good practice to exercise the changes I'm
making during the posting process. :)


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH net-next-2.6 v4] bonding: allow arp_ip_targets on separate vlans to use arp validation
  2009-12-28 21:55                             ` Jay Vosburgh
  2009-12-29  0:51                               ` Andy Gospodarek
@ 2010-01-04  5:19                               ` David Miller
  1 sibling, 0 replies; 25+ messages in thread
From: David Miller @ 2010-01-04  5:19 UTC (permalink / raw)
  To: fubar; +Cc: andy, eric.dumazet, netdev

From: Jay Vosburgh <fubar@us.ibm.com>
Date: Mon, 28 Dec 2009 13:55:33 -0800

> 	If Andy isn't seeing the above misbehavior, I'm fine with
> assuming it's my problem and applying the patch; if I find something
> actually wrong with the patch next week we can address it then.
> 
> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>

Fair enough, applied to net-next-2.6, thanks.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH net-next-2.6] bonding: allow arp_ip_targets to be on a separate vlan from bond device
  2009-12-01  1:21   ` Andy Gospodarek
  2009-12-01  1:57     ` Jay Vosburgh
@ 2009-12-01  4:11     ` Eric Dumazet
  1 sibling, 0 replies; 25+ messages in thread
From: Eric Dumazet @ 2009-12-01  4:11 UTC (permalink / raw)
  To: Andy Gospodarek; +Cc: Jay Vosburgh, netdev

Andy Gospodarek a écrit :
> On Mon, Nov 30, 2009 at 04:00:38PM -0800, Jay Vosburgh wrote:
>> Andy Gospodarek <andy@greyhouse.net> wrote:
>>
>>> This allows a bond device to specify an arp_ip_target as a host that is
>>> not on the same vlan as the base bond device.  A configuration like
>>> this, now works:
>>>
>>> 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue
>>>    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
>>>    inet 127.0.0.1/8 scope host lo
>>>    inet6 ::1/128 scope host
>>>       valid_lft forever preferred_lft forever
>>> 2: eth1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond0 qlen 1000
>>>    link/ether 00:13:21:be:33:e9 brd ff:ff:ff:ff:ff:ff
>>> 3: eth0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond0 qlen 1000
>>>    link/ether 00:13:21:be:33:e9 brd ff:ff:ff:ff:ff:ff
>>> 8: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue
>>>    link/ether 00:13:21:be:33:e9 brd ff:ff:ff:ff:ff:ff
>>>    inet6 fe80::213:21ff:febe:33e9/64 scope link
>>>       valid_lft forever preferred_lft forever
>>> 9: bond0.100@bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue
>>>    link/ether 00:13:21:be:33:e9 brd ff:ff:ff:ff:ff:ff
>>>    inet 10.0.100.2/24 brd 10.0.100.255 scope global bond0.100
>>>    inet6 fe80::213:21ff:febe:33e9/64 scope link
>>>       valid_lft forever preferred_lft forever
>> 	I'm not quite clear here on exactly what it is that doesn't
>> work.
>>
>> 	Putting the arp_ip_target on a VLAN destination already works
>> (and has for a long time); I just checked against a 2.6.32-rc to make
>> sure I wasn't misremembering.
>>
>> 	Perhaps there's some nuance of "not on the same vlan as the base
>> bond device" that I'm missing.  What I see working before me is, e.g., a
>> bond0.777 VLAN interface atop a regular bond0 active-backup with a
>> couple of slaves; bond0 may or may not have an IP address of its own.
>> The arp_ip_target destination is on VLAN 777 somewhere.
> 
> Do you have net.ipv4.conf.all.arp_ignore set to 0 and/or an IP address
> assigned on bond0?  I can easily reproduce this with no IP on bond0 and
> net.ipv4.conf.all.arp_ignore = 1.
> 
> I can't say for sure that the sysctl setting makes a difference, but I
> have that on all my test rigs, so it's worth mentioning.
> 
>> 	Is this what your patch is meant to enable, or is it something
>> different?  I'm pulling down today's net-next to see if this is
>> something that broke recently.
>>
> 
> I first tested and found the problem while running 2.6.30-rc series
> after it was reported to be a problem on RHEL5.  It's not clear how long
> it has been broken, but this situation is odd enough that it probably
> never worked as it was never tested.
> 

Strange, I cannot reproduce the problem here, with bnx2/tg3 adapters,
and net-next-2.6. What is your NIC driver ?

# ifconfig bond0
bond0     Link encap:Ethernet  HWaddr 00:1E:0B:EC:D3:D2
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:61787693 errors:0 dropped:0 overruns:0 frame:0
          TX packets:61894413 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:750916661 (716.1 Mb)  TX bytes:991607878 (945.6 Mb)

# ifconfig vlan.103
vlan.103  Link encap:Ethernet  HWaddr 00:1E:0B:EC:D3:D2
          inet addr:192.168.20.110  Bcast:0.0.0.0  Mask:255.255.255.0
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:236770 errors:0 dropped:0 overruns:0 frame:0
          TX packets:182838 errors:0 dropped:6 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          RX bytes:13697089 (13.0 Mb)  TX bytes:206093300 (196.5 Mb)

# ifconfig vlan.825
vlan.825  Link encap:Ethernet  HWaddr 00:1E:0B:EC:D3:D2
          inet addr:10.170.73.123  Bcast:0.0.0.0  Mask:255.255.255.128
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:61549795 errors:0 dropped:0 overruns:0 frame:0
          TX packets:61584427 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:3672807415 (3502.6 Mb)  TX bytes:283329099 (270.2 Mb)

# ifconfig eth1   (bnx2)
eth1      Link encap:Ethernet  HWaddr 00:1E:0B:EC:D3:D2
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:1596 errors:0 dropped:0 overruns:0 frame:0
          TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:110466 (107.8 Kb)  TX bytes:512 (512.0 b)
          Interrupt:16 Memory:fa000000-fa012800

# ifconfig eth2    (tg3)
eth2      Link encap:Ethernet  HWaddr 00:1E:0B:EC:D3:D2
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:61786578 errors:0 dropped:0 overruns:0 frame:0
          TX packets:61894510 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:750842426 (716.0 Mb)  TX bytes:991618552 (945.6 Mb)
          Interrupt:19

# cat /proc/sys/net/ipv4/conf/all/arp_ignore
1
# cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.6.0 (September 26, 2009)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: eth2
MII Status: up
MII Polling Interval (ms): 0
Up Delay (ms): 0
Down Delay (ms): 0
ARP Polling Interval (ms): 1000
ARP IP target/s (n.n.n.n form): 192.168.20.254

Slave Interface: eth1
MII Status: up
Link Failure Count: 9
Permanent HW addr: 00:1e:0b:ec:d3:d2

Slave Interface: eth2
MII Status: up
Link Failure Count: 6
Permanent HW addr: 00:1e:0b:92:78:50

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH net-next-2.6] bonding: allow arp_ip_targets to be on a separate vlan from bond device
  2009-11-30 20:14 [PATCH net-next-2.6] bonding: allow arp_ip_targets to be on a separate vlan from bond device Andy Gospodarek
  2009-11-30 20:22 ` Patrick McHardy
  2009-12-01  0:00 ` Jay Vosburgh
@ 2009-12-02  5:13 ` Eric Dumazet
  2009-12-02 16:38   ` Andy Gospodarek
  2 siblings, 1 reply; 25+ messages in thread
From: Eric Dumazet @ 2009-12-02  5:13 UTC (permalink / raw)
  To: Andy Gospodarek; +Cc: netdev, fubar

Andy Gospodarek a écrit :
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 5d131c2..eeee269 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -2439,8 +2439,8 @@ int netif_receive_skb(struct sk_buff *skb)
>  		skb->skb_iif = skb->dev->ifindex;
>  
>  	null_or_orig = NULL;
> -	orig_dev = skb->dev;
> -	if (orig_dev->master) {
> +	orig_dev = __dev_get_by_index(dev_net(skb->dev),skb->skb_iif);
> +	if (orig_dev->master && !(skb->dev->priv_flags & IFF_802_1Q_VLAN)) {
>  		if (skb_bond_should_drop(skb))
>  			null_or_orig = orig_dev; /* deliver only exact match */
>  		else


Please dont add a __dev_get_by_index() call in netif_receive_skb()

First, this is not safe against writers.

Second its very expensive on some setups.

Eventually, you could do something like :


Change prototypes of netif_receive_skb() to include the orig_dev in arguments.
Change vlan code to call __netif_receive_skb(skb, orig_dev)
instead of netif_received_skb(skb)

int __netif_receive_skb(struct sk_buff *skb, struct net_device *orig_dev)
{
...
	if (!skb->skb_iif) {
		skb->skb_iif = orig_dev->ifindex;
	} else {
		if (orig_dev->master &&  ...) 
	}

}

int netif_receive_skb(struct sk_buff *skb)
{
	return __netif_receive_skb(skb, skb->dev);
}


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH net-next-2.6] bonding: allow arp_ip_targets to be on a separate vlan from bond device
  2009-12-02  5:13 ` Eric Dumazet
@ 2009-12-02 16:38   ` Andy Gospodarek
  0 siblings, 0 replies; 25+ messages in thread
From: Andy Gospodarek @ 2009-12-02 16:38 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Andy Gospodarek, netdev, fubar

On Wed, Dec 02, 2009 at 06:13:15AM +0100, Eric Dumazet wrote:
> Andy Gospodarek a écrit :
> > diff --git a/net/core/dev.c b/net/core/dev.c
> > index 5d131c2..eeee269 100644
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -2439,8 +2439,8 @@ int netif_receive_skb(struct sk_buff *skb)
> >  		skb->skb_iif = skb->dev->ifindex;
> >  
> >  	null_or_orig = NULL;
> > -	orig_dev = skb->dev;
> > -	if (orig_dev->master) {
> > +	orig_dev = __dev_get_by_index(dev_net(skb->dev),skb->skb_iif);
> > +	if (orig_dev->master && !(skb->dev->priv_flags & IFF_802_1Q_VLAN)) {
> >  		if (skb_bond_should_drop(skb))
> >  			null_or_orig = orig_dev; /* deliver only exact match */
> >  		else
> 
> 
> Please dont add a __dev_get_by_index() call in netif_receive_skb()
> 
> First, this is not safe against writers.
> 
> Second its very expensive on some setups.
> 
> Eventually, you could do something like :
> 
> 
> Change prototypes of netif_receive_skb() to include the orig_dev in arguments.
> Change vlan code to call __netif_receive_skb(skb, orig_dev)
> instead of netif_received_skb(skb)
> 
> int __netif_receive_skb(struct sk_buff *skb, struct net_device *orig_dev)
> {
> ...
> 	if (!skb->skb_iif) {
> 		skb->skb_iif = orig_dev->ifindex;
> 	} else {
> 		if (orig_dev->master &&  ...) 
> 	}
> 
> }
> 
> int netif_receive_skb(struct sk_buff *skb)
> {
> 	return __netif_receive_skb(skb, skb->dev);
> }
> 

That is an interesting thought.  I'll investigate this as well.


^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2010-01-04  5:19 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-11-30 20:14 [PATCH net-next-2.6] bonding: allow arp_ip_targets to be on a separate vlan from bond device Andy Gospodarek
2009-11-30 20:22 ` Patrick McHardy
2009-11-30 20:53   ` Andy Gospodarek
2009-12-01  9:42     ` Patrick McHardy
2009-12-01  0:00 ` Jay Vosburgh
2009-12-01  1:21   ` Andy Gospodarek
2009-12-01  1:57     ` Jay Vosburgh
2009-12-01 14:44       ` Andy Gospodarek
2009-12-01 21:28         ` Jay Vosburgh
2009-12-01 23:07           ` Andy Gospodarek
2009-12-02 21:24           ` Andy Gospodarek
2009-12-07 18:13             ` Andy Gospodarek
2009-12-07 18:24               ` Eric Dumazet
2009-12-09 22:01                 ` [PATCH net-next-2.6 v3] bonding: allow arp_ip_targets on separate vlans to use arp validation Andy Gospodarek
2009-12-11  5:17                   ` Jay Vosburgh
2009-12-14 20:48                     ` [PATCH net-next-2.6 v4] " Andy Gospodarek
2009-12-26  2:22                       ` David Miller
2009-12-28 15:26                         ` Andy Gospodarek
2009-12-28 15:33                           ` David Miller
2009-12-28 21:55                             ` Jay Vosburgh
2009-12-29  0:51                               ` Andy Gospodarek
2010-01-04  5:19                               ` David Miller
2009-12-01  4:11     ` [PATCH net-next-2.6] bonding: allow arp_ip_targets to be on a separate vlan from bond device Eric Dumazet
2009-12-02  5:13 ` Eric Dumazet
2009-12-02 16:38   ` Andy Gospodarek

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).