* Re: more findings/questions on vlans/bonds
2009-04-07 9:14 more findings/questions on vlans/bonds Or Gerlitz
2009-04-07 9:16 ` Or Gerlitz
@ 2009-04-07 14:13 ` Jay Vosburgh
2009-04-16 14:17 ` Or Gerlitz
2009-04-23 21:04 ` Jay Vosburgh
2 siblings, 1 reply; 8+ messages in thread
From: Jay Vosburgh @ 2009-04-07 14:13 UTC (permalink / raw)
To: Or Gerlitz; +Cc: netdev
Or Gerlitz <ogerlitz@voltaire.com> wrote:
[...]
>I hope that you can help clarify what's the correct/supported method
>to work with vlans and bonds, with 2.6.29 I see that one can either
>
>[I pasted in the fixed text here]
> - vlan a bond (bond0.4001 over bond0 over eth0/1)
> - bond vlans (e.g bond0 over eth0/1.4001 over eth0/1)
I'll have to look into this, but my recollection is that you
should be able to do either of the above. I haven't tested the second
flavor lately, but I believe it should work. It's arguably only better
than the first alternative if, for some reason, the user wants to either
load balance or fail over across multiple VLANs, but (if all slaves are
on the same VLAN) the two above scenarios should be functionally
equivalent.
>I played a bit with bonding vlans (2.6.29 active-backup mode) and it
>doesn't seem to work - specifically, I noted that bonding doesn't issue
>fail-over after I changed the current slave link status to down ("ifconfig
>eth0.4001 down"). I suspect that the carrier based link monitoring scheme
>is broken wrt to vlan devices - e.g I found that at least from sysfs
>perspective the vlan device carrier isn't available:
>
>$ cat /sys/class/net/eth0.4001/carrier
>cat: /sys/class/net/eth0.4001/carrier: Invalid argument
>
>I also played with a post 2.6.29 cut of Linus tree, and I couldn't bond vlans.
>I thought that the reason may be that a vlan device has already master -
>but I found out that its wrong and vlans don't have the IFF_SLAVE bit set
>in their flags bit mask. So this may be either something wrong on my side
>or a bug introduced after 2.6.29 or just something which isn't supported,
>e.g following my findings with 2.6.29 above...
-J
---
-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com
^ permalink raw reply [flat|nested] 8+ messages in thread* Re: more findings/questions on vlans/bonds
2009-04-07 9:14 more findings/questions on vlans/bonds Or Gerlitz
2009-04-07 9:16 ` Or Gerlitz
2009-04-07 14:13 ` Jay Vosburgh
@ 2009-04-23 21:04 ` Jay Vosburgh
2009-04-24 15:22 ` Patrick McHardy
` (2 more replies)
2 siblings, 3 replies; 8+ messages in thread
From: Jay Vosburgh @ 2009-04-23 21:04 UTC (permalink / raw)
To: Or Gerlitz; +Cc: netdev, kaber
[ Added VLAN maintainer to cc, patch enclosed ]
Or Gerlitz <ogerlitz@voltaire.com> wrote:
>I hope that you can help clarify what's the correct/supported method
>to work with vlans and bonds, with 2.6.29 I see that one can either
>
> - vlan a bond (bond0.4001 over bond0 over eth0/1.4001)
> - bond vlans (e.g bond0 over eth0/1.4001)
>
>I played a bit with bonding vlans (2.6.29 active-backup mode) and it
>doesn't seem to work - specifically, I noted that bonding doesn't issue
>fail-over after I changed the current slave link status to down ("ifconfig
>eth0.4001 down"). I suspect that the carrier based link monitoring scheme
>is broken wrt to vlan devices - e.g I found that at least from sysfs
>perspective the vlan device carrier isn't available:
>
>$ cat /sys/class/net/eth0.4001/carrier
>cat: /sys/class/net/eth0.4001/carrier: Invalid argument
Ok, I just got a minute to fool with this, and it appears to
sort of work for me on a relatively recent cut of net-2.6 (approximately
2.6.30-rc1). I didn't try 2.6.29 exactly, but the only issues I had
with the version I used are as described below.
I suspect the "carrier: Invalid argument" response you describe
above is a result of the show_carrier function in net-sysfs.c, which
returns EINVAL if the device is not netif_running. If the device (vlan
or not) is running, the carrier is returned.
Anyway, configuring a bond over two VLAN devices (bond0 ->
eth{0,1}.600 -> eth0{0,1}), it configures and comes up fine.
If I pull the cable for eth0 or eth1, the vlan device's carrier
state updates correctly, and bonding fails over.
However, if I set eth0.VLAN or eth0 itself administratively
down, no failover takes place. This is apparently because the VLAN
device doesn't adjust its carrier state for UP or DOWN events from the
underlying real device or for open/close of the VLAN device itself. It
does update the carrier state for CHANGE events (from linkwatch, for
example).
So, here's a patch with a description. This resolves the
bonding problem for me; it might have other undesirable side effects,
though. It's debatable whether it's really necessary, but it does make
the VLAN device behave more like a real network device in this case:
[PATCH] vlan: update vlan carrier state for admin up/down
Currently, the VLAN event handler does not adjust the VLAN
device's carrier state when the real device or the VLAN device is set
administratively up or down.
The following patch adds a transfer of operating state from the
real device to the VLAN device when the real device is administratively
set up or down, and sets the carrier state up or down during init, open
and close of the VLAN device.
This permits observers above the VLAN device that care about the
carrier state (bonding's link monitor, for example) to receive updates
for administrative changes by more closely mimicing the behavior of real
devices.
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
diff --git a/net/8021q/vlan.c b/net/8021q/vlan.c
index 2b7390e..d1e1054 100644
--- a/net/8021q/vlan.c
+++ b/net/8021q/vlan.c
@@ -492,6 +492,7 @@ static int vlan_device_event(struct notifier_block *unused, unsigned long event,
continue;
dev_change_flags(vlandev, flgs & ~IFF_UP);
+ vlan_transfer_operstate(dev, vlandev);
}
break;
@@ -507,6 +508,7 @@ static int vlan_device_event(struct notifier_block *unused, unsigned long event,
continue;
dev_change_flags(vlandev, flgs | IFF_UP);
+ vlan_transfer_operstate(dev, vlandev);
}
break;
diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
index 1b34135..2ce6658 100644
--- a/net/8021q/vlan_dev.c
+++ b/net/8021q/vlan_dev.c
@@ -462,6 +462,7 @@ static int vlan_dev_open(struct net_device *dev)
if (vlan->flags & VLAN_FLAG_GVRP)
vlan_gvrp_request_join(dev);
+ netif_carrier_on(dev);
return 0;
clear_allmulti:
@@ -471,6 +472,7 @@ del_unicast:
if (compare_ether_addr(dev->dev_addr, real_dev->dev_addr))
dev_unicast_delete(real_dev, dev->dev_addr, ETH_ALEN);
out:
+ netif_carrier_off(dev);
return err;
}
@@ -492,6 +494,7 @@ static int vlan_dev_stop(struct net_device *dev)
if (compare_ether_addr(dev->dev_addr, real_dev->dev_addr))
dev_unicast_delete(real_dev, dev->dev_addr, dev->addr_len);
+ netif_carrier_off(dev);
return 0;
}
@@ -612,6 +615,8 @@ static int vlan_dev_init(struct net_device *dev)
struct net_device *real_dev = vlan_dev_info(dev)->real_dev;
int subclass = 0;
+ netif_carrier_off(dev);
+
/* IFF_BROADCAST|IFF_MULTICAST; ??? */
dev->flags = real_dev->flags & ~(IFF_UP | IFF_PROMISC | IFF_ALLMULTI);
dev->iflink = real_dev->ifindex;
^ permalink raw reply related [flat|nested] 8+ messages in thread