netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH V4 0/10] net/bonding: ADD IPoIB support for the bonding driver
@ 2007-08-20 15:34 Moni Shoua
  2007-08-20 15:42 ` [ofa-general] [PATCH V4 1/10] IB/ipoib: Export call to call_netdevice_notifiers and add new private flag Moni Shoua
                   ` (11 more replies)
  0 siblings, 12 replies; 17+ messages in thread
From: Moni Shoua @ 2007-08-20 15:34 UTC (permalink / raw)
  To: rdreier, davem, fubar; +Cc: netdev, general

This patch series is the fourth version (see below link to V3) of the 
suggested changes to the bonding driver so it would be able to support 
non ARPHRD_ETHER netdevices for its High-Availability (active-backup) mode. 

The motivation is to enable the bonding driver on its HA mode to work with 
the IP over Infiniband (IPoIB) driver. With these patches I was able to enslave 
IPoIB netdevices and run TCP, UDP, IP (UDP) Multicast and ICMP traffic with 
fail-over and fail-back working fine. The working environment was the net-2.6 git. 

More over, as IPoIB is also the IB ARP provider for the RDMA CM driver which 
is used by native IB ULPs whose addressing scheme is based on IP (e.g. iSER, 
SDP, Lustre, NFSoRDMA, RDS), bonding support for IPoIB devices **enables** HA 
for these ULPs. This holds as when the ULP is informed by the IB HW on the 
failure of the current IB connection, it just need to reconnect, where the 
bonding device will now issue the IB ARP over the active IPoIB slave. 

This series also includes patches to the IPoIB driver that fix some fix 
some neighboring related issues. 

Major changes from the previous version:

1) Addressing the issue of safety when unloading the IPoIB module before
the bonding module
2) style changes


Links to earlier discussion: 

1. A discussion in netdev about bonding support for IPoIB.
http://lists.openwall.net/netdev/2006/11/30/46

2. A discussion in openfabrics regarding changes in the IPoIB that 
enable using it as a slave for bonding.
http://lists.openfabrics.org/pipermail/general/2007-July/038914.html


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [ofa-general] [PATCH V4 1/10] IB/ipoib: Export call to call_netdevice_notifiers and add new private flag
  2007-08-20 15:34 [PATCH V4 0/10] net/bonding: ADD IPoIB support for the bonding driver Moni Shoua
@ 2007-08-20 15:42 ` Moni Shoua
  2007-08-20 15:43 ` [ofa-general] [PATCH V4 2/10] IB/ipoib: Notify the world before doing unregister Moni Shoua
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Moni Shoua @ 2007-08-20 15:42 UTC (permalink / raw)
  To: rdreier, davem, fubar; +Cc: netdev, general

Export the call to raw_notifier_call_chain so modules can send notifications
on netdev events to the netdev_chain.
Add IFF_SLAVE_DETACH to the list of priv_flags for net_device.
This flag is set by a slave that is about to unregisster from the kernel.

Both changes are used in bonding slaves that wish to inform the bonding master
about coming detachment.

Signed-off-by: Moni Shoua <monis@voltaire.com>
---
 include/linux/if.h |    1 +
 net/core/dev.c     |    1 +
 2 files changed, 2 insertions(+)

Index: net-2.6/net/core/dev.c
===================================================================
--- net-2.6.orig/net/core/dev.c	2007-08-15 10:09:02.000000000 +0300
+++ net-2.6/net/core/dev.c	2007-08-15 10:53:00.832543390 +0300
@@ -1148,6 +1148,7 @@ int call_netdevice_notifiers(unsigned lo
 {
 	return raw_notifier_call_chain(&netdev_chain, val, v);
 }
+EXPORT_SYMBOL(call_netdevice_notifiers);
 
 /* When > 0 there are consumers of rx skb time stamps */
 static atomic_t netstamp_needed = ATOMIC_INIT(0);
Index: net-2.6/include/linux/if.h
===================================================================
--- net-2.6.orig/include/linux/if.h	2007-08-20 14:30:39.000000000 +0300
+++ net-2.6/include/linux/if.h	2007-08-20 14:31:06.625174369 +0300
@@ -61,6 +61,7 @@
 #define IFF_MASTER_ALB	0x10		/* bonding master, balance-alb.	*/
 #define IFF_BONDING	0x20		/* bonding master or slave	*/
 #define IFF_SLAVE_NEEDARP 0x40		/* need ARPs for validation	*/
+#define IFF_SLAVE_DETACH 0x80		/* slave is about to unregister */
 
 #define IF_GET_IFACE	0x0001		/* for querying only */
 #define IF_GET_PROTO	0x0002

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [ofa-general] [PATCH V4 2/10] IB/ipoib: Notify the world before doing unregister
  2007-08-20 15:34 [PATCH V4 0/10] net/bonding: ADD IPoIB support for the bonding driver Moni Shoua
  2007-08-20 15:42 ` [ofa-general] [PATCH V4 1/10] IB/ipoib: Export call to call_netdevice_notifiers and add new private flag Moni Shoua
@ 2007-08-20 15:43 ` Moni Shoua
  2007-08-20 15:44 ` [ofa-general] [PATCH V4 3/10] IB/ipoib: Bound the net device to the ipoib_neigh structue Moni Shoua
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Moni Shoua @ 2007-08-20 15:43 UTC (permalink / raw)
  To: rdreier, davem, fubar; +Cc: netdev, general

When the bonding device enslaves IPoIB devices it takes pointers to
functions in the ib_ipoib module. This is fine as long as the ib_ipoib
nodule remains loaded while the references to its functions exist.
So, to help bonding do a cleanup on time, when the IPoIB net device is a 
slave of a bonding master, let the master know that the IPoIB device is
about to unregister (but before calling unregister).

Signed-off-by: Moni Shoua <monis@voltaire.com>
---
 drivers/infiniband/ulp/ipoib/ipoib_main.c |   15 +++++++++++++++
 1 files changed, 15 insertions(+)

Index: net-2.6/drivers/infiniband/ulp/ipoib/ipoib_main.c
===================================================================
--- net-2.6.orig/drivers/infiniband/ulp/ipoib/ipoib_main.c	2007-08-20 14:29:29.522209580 +0300
+++ net-2.6/drivers/infiniband/ulp/ipoib/ipoib_main.c	2007-08-20 14:43:03.432162133 +0300
@@ -48,6 +48,7 @@
 #include <linux/in.h>
 
 #include <net/dst.h>
+#include <linux/netdevice.h>
 
 MODULE_AUTHOR("Roland Dreier");
 MODULE_DESCRIPTION("IP-over-InfiniBand net driver");
@@ -772,6 +773,18 @@ static void ipoib_timeout(struct net_dev
 	/* XXX reset QP, etc. */
 }
 
+static int ipoib_slave_detach(struct net_device *dev)
+{
+	int ret = 0;
+	if (dev->flags & IFF_SLAVE) {
+		dev->priv_flags |= IFF_SLAVE_DETACH;
+		rtnl_lock();
+		ret = call_netdevice_notifiers(NETDEV_CHANGE, dev);
+		rtnl_unlock();
+	}
+	return ret;
+}
+
 static int ipoib_hard_header(struct sk_buff *skb,
 			     struct net_device *dev,
 			     unsigned short type,
@@ -921,6 +934,7 @@ void ipoib_dev_cleanup(struct net_device
 
 	/* Delete any child interfaces first */
 	list_for_each_entry_safe(cpriv, tcpriv, &priv->child_intfs, list) {
+		ipoib_slave_detach(cpriv->dev);
 		unregister_netdev(cpriv->dev);
 		ipoib_dev_cleanup(cpriv->dev);
 		free_netdev(cpriv->dev);
@@ -1208,6 +1222,7 @@ static void ipoib_remove_one(struct ib_d
 		ib_unregister_event_handler(&priv->event_handler);
 		flush_scheduled_work();
 
+		ipoib_slave_detach(priv->dev);
 		unregister_netdev(priv->dev);
 		ipoib_dev_cleanup(priv->dev);
 		free_netdev(priv->dev);

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [ofa-general] [PATCH V4 3/10] IB/ipoib: Bound the net device to the ipoib_neigh structue
  2007-08-20 15:34 [PATCH V4 0/10] net/bonding: ADD IPoIB support for the bonding driver Moni Shoua
  2007-08-20 15:42 ` [ofa-general] [PATCH V4 1/10] IB/ipoib: Export call to call_netdevice_notifiers and add new private flag Moni Shoua
  2007-08-20 15:43 ` [ofa-general] [PATCH V4 2/10] IB/ipoib: Notify the world before doing unregister Moni Shoua
@ 2007-08-20 15:44 ` Moni Shoua
  2007-08-20 15:46 ` [PATCH V4 4/10] IB/ipoib: Verify address handle validity on send Moni Shoua
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Moni Shoua @ 2007-08-20 15:44 UTC (permalink / raw)
  To: rdreier, davem, fubar; +Cc: netdev, general

IPoIB uses a two layer neighboring scheme, such that for each struct neighbour
whose device is an ipoib one, there is a struct ipoib_neigh buddy which is
created on demand at the tx flow by an ipoib_neigh_alloc(skb->dst->neighbour)
call.

When using the bonding driver, neighbours are created by the net stack on behalf
of the bonding (master) device. On the tx flow the bonding code gets an skb such
that skb->dev points to the master device, it changes this skb to point on the
slave device and calls the slave hard_start_xmit function.

Under this scheme, ipoib_neigh_destructor assumption that for each struct
neighbour it gets, n->dev is an ipoib device and hence netdev_priv(n->dev)
can be casted to struct ipoib_dev_priv is buggy.

To fix it, this patch adds a dev field to struct ipoib_neigh which is used
instead of the struct neighbour dev one, when n->dev->flags has the
IFF_MASTER bit set.

Signed-off-by: Moni Shoua <monis@voltaire.com>
Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
---
 drivers/infiniband/ulp/ipoib/ipoib.h           |    4 +++-
 drivers/infiniband/ulp/ipoib/ipoib_main.c      |   17 +++++++++++++++--
 drivers/infiniband/ulp/ipoib/ipoib_multicast.c |    3 ++-
 3 files changed, 20 insertions(+), 4 deletions(-)

Index: net-2.6/drivers/infiniband/ulp/ipoib/ipoib.h
===================================================================
--- net-2.6.orig/drivers/infiniband/ulp/ipoib/ipoib.h	2007-08-15 10:09:00.000000000 +0300
+++ net-2.6/drivers/infiniband/ulp/ipoib/ipoib.h	2007-08-15 10:53:52.756348574 +0300
@@ -328,6 +328,7 @@ struct ipoib_neigh {
 	struct sk_buff_head queue;
 
 	struct neighbour   *neighbour;
+	struct net_device *dev;
 
 	struct list_head    list;
 };
@@ -344,7 +345,8 @@ static inline struct ipoib_neigh **to_ip
 				     INFINIBAND_ALEN, sizeof(void *));
 }
 
-struct ipoib_neigh *ipoib_neigh_alloc(struct neighbour *neigh);
+struct ipoib_neigh *ipoib_neigh_alloc(struct neighbour *neigh,
+				      struct net_device *dev);
 void ipoib_neigh_free(struct net_device *dev, struct ipoib_neigh *neigh);
 
 extern struct workqueue_struct *ipoib_workqueue;
Index: net-2.6/drivers/infiniband/ulp/ipoib/ipoib_main.c
===================================================================
--- net-2.6.orig/drivers/infiniband/ulp/ipoib/ipoib_main.c	2007-08-15 10:53:28.000000000 +0300
+++ net-2.6/drivers/infiniband/ulp/ipoib/ipoib_main.c	2007-08-15 10:53:52.757348397 +0300
@@ -511,7 +511,7 @@ static void neigh_add_path(struct sk_buf
 	struct ipoib_path *path;
 	struct ipoib_neigh *neigh;
 
-	neigh = ipoib_neigh_alloc(skb->dst->neighbour);
+	neigh = ipoib_neigh_alloc(skb->dst->neighbour, skb->dev);
 	if (!neigh) {
 		++priv->stats.tx_dropped;
 		dev_kfree_skb_any(skb);
@@ -830,6 +830,17 @@ static void ipoib_neigh_cleanup(struct n
 	unsigned long flags;
 	struct ipoib_ah *ah = NULL;
 
+	if (n->dev->flags & IFF_MASTER) {
+		/* n->dev is not an IPoIB device and we have
+			to take priv from elsewhere */
+		neigh = *to_ipoib_neigh(n);
+		if (neigh) {
+			priv = netdev_priv(neigh->dev);
+			ipoib_dbg(priv, "neigh_destructor for bonding device: %s\n",
+				  n->dev->name);
+		} else
+			return;
+	}
 	ipoib_dbg(priv,
 		  "neigh_cleanup for %06x " IPOIB_GID_FMT "\n",
 		  IPOIB_QPN(n->ha),
@@ -851,7 +862,8 @@ static void ipoib_neigh_cleanup(struct n
 		ipoib_put_ah(ah);
 }
 
-struct ipoib_neigh *ipoib_neigh_alloc(struct neighbour *neighbour)
+struct ipoib_neigh *ipoib_neigh_alloc(struct neighbour *neighbour,
+				      struct net_device *dev)
 {
 	struct ipoib_neigh *neigh;
 
@@ -860,6 +872,7 @@ struct ipoib_neigh *ipoib_neigh_alloc(st
 		return NULL;
 
 	neigh->neighbour = neighbour;
+	neigh->dev = dev;
 	*to_ipoib_neigh(neighbour) = neigh;
 	skb_queue_head_init(&neigh->queue);
 	ipoib_cm_set(neigh, NULL);
Index: net-2.6/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
===================================================================
--- net-2.6.orig/drivers/infiniband/ulp/ipoib/ipoib_multicast.c	2007-08-15 10:09:00.000000000 +0300
+++ net-2.6/drivers/infiniband/ulp/ipoib/ipoib_multicast.c	2007-08-15 10:53:52.758348220 +0300
@@ -727,7 +727,8 @@ out:
 		if (skb->dst            &&
 		    skb->dst->neighbour &&
 		    !*to_ipoib_neigh(skb->dst->neighbour)) {
-			struct ipoib_neigh *neigh = ipoib_neigh_alloc(skb->dst->neighbour);
+			struct ipoib_neigh *neigh = ipoib_neigh_alloc(skb->dst->neighbour,
+									skb->dev);
 
 			if (neigh) {
 				kref_get(&mcast->ah->ref);

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH V4 4/10] IB/ipoib: Verify address handle validity on send
  2007-08-20 15:34 [PATCH V4 0/10] net/bonding: ADD IPoIB support for the bonding driver Moni Shoua
                   ` (2 preceding siblings ...)
  2007-08-20 15:44 ` [ofa-general] [PATCH V4 3/10] IB/ipoib: Bound the net device to the ipoib_neigh structue Moni Shoua
@ 2007-08-20 15:46 ` Moni Shoua
  2007-08-20 15:48 ` [PATCH V4 5/10] net/bonding: Enable bonding to enslave non ARPHRD_ETHER Moni Shoua
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Moni Shoua @ 2007-08-20 15:46 UTC (permalink / raw)
  To: rdreier, davem, fubar; +Cc: netdev, general

When the bonding device senses a carrier loss of its active slave it replaces
that slave with a new one. In between the times when the carrier of an IPoIB
device goes down and ipoib_neigh is destroyed, it is possible that the
bonding driver will send a packet on a new slave that uses an old ipoib_neigh.
This patch detects and prevents this from happenning.

Signed-off-by: Moni Shoua <monis@voltaire.com>
Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
---
 drivers/infiniband/ulp/ipoib/ipoib_main.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

Index: net-2.6/drivers/infiniband/ulp/ipoib/ipoib_main.c
===================================================================
--- net-2.6.orig/drivers/infiniband/ulp/ipoib/ipoib_main.c	2007-08-15 10:53:52.000000000 +0300
+++ net-2.6/drivers/infiniband/ulp/ipoib/ipoib_main.c	2007-08-15 10:54:03.959364640 +0300
@@ -686,9 +686,10 @@ static int ipoib_start_xmit(struct sk_bu
 				goto out;
 			}
 		} else if (neigh->ah) {
-			if (unlikely(memcmp(&neigh->dgid.raw,
+			if (unlikely((memcmp(&neigh->dgid.raw,
 					    skb->dst->neighbour->ha + 4,
-					    sizeof(union ib_gid)))) {
+					    sizeof(union ib_gid))) ||
+						 (neigh->dev != dev))) {
 				spin_lock(&priv->lock);
 				/*
 				 * It's safe to call ipoib_put_ah() inside


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH V4 5/10] net/bonding: Enable bonding to enslave non ARPHRD_ETHER
  2007-08-20 15:34 [PATCH V4 0/10] net/bonding: ADD IPoIB support for the bonding driver Moni Shoua
                   ` (3 preceding siblings ...)
  2007-08-20 15:46 ` [PATCH V4 4/10] IB/ipoib: Verify address handle validity on send Moni Shoua
@ 2007-08-20 15:48 ` Moni Shoua
  2007-08-20 15:49 ` [PATCH V4 6/10] net/bonding: Enable bonding to enslave netdevices not supporting set_mac_address() Moni Shoua
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Moni Shoua @ 2007-08-20 15:48 UTC (permalink / raw)
  To: rdreier, davem, fubar; +Cc: netdev, general

This patch changes some of the bond netdevice attributes and functions
to be that of the active slave for the case of the enslaved device not being
of ARPHRD_ETHER type. Basically it overrides those setting done by ether_setup(),
which are netdevice **type** dependent and hence might be not appropriate for
devices of other types. It also enforces mutual exclusion on bonding slaves
from dissimilar ether types, as was concluded over the v1 discussion.

IPoIB (see Documentation/infiniband/ipoib.txt) MAC address is made of a 3 bytes
IB QP (Queue Pair) number and 16 bytes IB port GID (Global ID) of the port this
IPoIB device is bounded to. The QP is a resource created by the IB HW and the
GID is an identifier burned into the HCA (i have omitted here some details which
are not important for the bonding RFC).

Signed-off-by: Moni Shoua <monis@voltaire.com>
Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
---
 drivers/net/bonding/bond_main.c |   39 +++++++++++++++++++++++++++++++++++++++
 1 files changed, 39 insertions(+)

Index: net-2.6/drivers/net/bonding/bond_main.c
===================================================================
--- net-2.6.orig/drivers/net/bonding/bond_main.c	2007-08-15 10:08:59.000000000 +0300
+++ net-2.6/drivers/net/bonding/bond_main.c	2007-08-15 10:54:13.424688411 +0300
@@ -1237,6 +1237,26 @@ static int bond_compute_features(struct 
 	return 0;
 }
 
+
+static void bond_setup_by_slave(struct net_device *bond_dev,
+				struct net_device *slave_dev)
+{
+	bond_dev->hard_header	        = slave_dev->hard_header;
+	bond_dev->rebuild_header        = slave_dev->rebuild_header;
+	bond_dev->hard_header_cache	= slave_dev->hard_header_cache;
+	bond_dev->header_cache_update   = slave_dev->header_cache_update;
+	bond_dev->hard_header_parse	= slave_dev->hard_header_parse;
+
+	bond_dev->neigh_setup           = slave_dev->neigh_setup;
+
+	bond_dev->type		    = slave_dev->type;
+	bond_dev->hard_header_len   = slave_dev->hard_header_len;
+	bond_dev->addr_len	    = slave_dev->addr_len;
+
+	memcpy(bond_dev->broadcast, slave_dev->broadcast,
+		slave_dev->addr_len);
+}
+
 /* enslave device <slave> to bond device <master> */
 int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
 {
@@ -1311,6 +1331,25 @@ int bond_enslave(struct net_device *bond
 		goto err_undo_flags;
 	}
 
+	/* set bonding device ether type by slave - bonding netdevices are
+	 * created with ether_setup, so when the slave type is not ARPHRD_ETHER
+	 * there is a need to override some of the type dependent attribs/funcs.
+	 *
+	 * bond ether type mutual exclusion - don't allow slaves of dissimilar
+	 * ether type (eg ARPHRD_ETHER and ARPHRD_INFINIBAND) share the same bond
+	 */
+	if (bond->slave_cnt == 0) {
+		if (slave_dev->type != ARPHRD_ETHER)
+			bond_setup_by_slave(bond_dev, slave_dev);
+	} else if (bond_dev->type != slave_dev->type) {
+		printk(KERN_ERR DRV_NAME ": %s ether type (%d) is different "
+			"from other slaves (%d), can not enslave it.\n",
+			slave_dev->name,
+			slave_dev->type, bond_dev->type);
+			res = -EINVAL;
+			goto err_undo_flags;
+	}
+
 	if (slave_dev->set_mac_address == NULL) {
 		printk(KERN_ERR DRV_NAME
 			": %s: Error: The slave device you specified does "


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH V4 6/10] net/bonding: Enable bonding to enslave netdevices not supporting set_mac_address()
  2007-08-20 15:34 [PATCH V4 0/10] net/bonding: ADD IPoIB support for the bonding driver Moni Shoua
                   ` (4 preceding siblings ...)
  2007-08-20 15:48 ` [PATCH V4 5/10] net/bonding: Enable bonding to enslave non ARPHRD_ETHER Moni Shoua
@ 2007-08-20 15:49 ` Moni Shoua
  2007-08-20 15:51 ` [PATCH V4 7/10] net/bonding: Enable IP multicast for bonding IPoIB devices Moni Shoua
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Moni Shoua @ 2007-08-20 15:49 UTC (permalink / raw)
  To: rdreier, davem, fubar; +Cc: netdev, general

This patch allows for enslaving netdevices which do not support
the set_mac_address() function. In that case the bond mac address is the one
of the active slave, where remote peers are notified on the mac address
(neighbour) change by Gratuitous ARP sent by bonding when fail-over occurs
(this is already done by the bonding code).

Signed-off-by: Moni Shoua <monis@voltaire.com>
Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
---
 drivers/net/bonding/bond_main.c |   87 +++++++++++++++++++++++++++-------------
 drivers/net/bonding/bonding.h   |    1 
 2 files changed, 60 insertions(+), 28 deletions(-)

Index: net-2.6/drivers/net/bonding/bond_main.c
===================================================================
--- net-2.6.orig/drivers/net/bonding/bond_main.c	2007-08-15 10:54:13.000000000 +0300
+++ net-2.6/drivers/net/bonding/bond_main.c	2007-08-15 10:54:41.971632881 +0300
@@ -1095,6 +1095,14 @@ void bond_change_active_slave(struct bon
 		if (new_active) {
 			bond_set_slave_active_flags(new_active);
 		}
+
+		/* when bonding does not set the slave MAC address, the bond MAC
+		 * address is the one of the active slave.
+		 */
+		if (new_active && !bond->do_set_mac_addr)
+			memcpy(bond->dev->dev_addr,  new_active->dev->dev_addr,
+				new_active->dev->addr_len);
+
 		bond_send_gratuitous_arp(bond);
 	}
 }
@@ -1351,13 +1359,22 @@ int bond_enslave(struct net_device *bond
 	}
 
 	if (slave_dev->set_mac_address == NULL) {
-		printk(KERN_ERR DRV_NAME
-			": %s: Error: The slave device you specified does "
-			"not support setting the MAC address. "
-			"Your kernel likely does not support slave "
-			"devices.\n", bond_dev->name);
-  		res = -EOPNOTSUPP;
-		goto err_undo_flags;
+		if (bond->slave_cnt == 0) {
+			printk(KERN_WARNING DRV_NAME
+				": %s: Warning: The first slave device you "
+				"specified does not support setting the MAC "
+				"address. This bond MAC address would be that "
+				"of the active slave.\n", bond_dev->name);
+			bond->do_set_mac_addr = 0;
+		} else if (bond->do_set_mac_addr) {
+			printk(KERN_ERR DRV_NAME
+				": %s: Error: The slave device you specified "
+				"does not support setting the MAC addres,."
+				"but this bond uses this practice. \n"
+				, bond_dev->name);
+			res = -EOPNOTSUPP;
+			goto err_undo_flags;
+		}
 	}
 
 	new_slave = kzalloc(sizeof(struct slave), GFP_KERNEL);
@@ -1378,16 +1395,18 @@ int bond_enslave(struct net_device *bond
 	 */
 	memcpy(new_slave->perm_hwaddr, slave_dev->dev_addr, ETH_ALEN);
 
-	/*
-	 * Set slave to master's mac address.  The application already
-	 * set the master's mac address to that of the first slave
-	 */
-	memcpy(addr.sa_data, bond_dev->dev_addr, bond_dev->addr_len);
-	addr.sa_family = slave_dev->type;
-	res = dev_set_mac_address(slave_dev, &addr);
-	if (res) {
-		dprintk("Error %d calling set_mac_address\n", res);
-		goto err_free;
+	if (bond->do_set_mac_addr) {
+		/*
+		 * Set slave to master's mac address.  The application already
+		 * set the master's mac address to that of the first slave
+		 */
+		memcpy(addr.sa_data, bond_dev->dev_addr, bond_dev->addr_len);
+		addr.sa_family = slave_dev->type;
+		res = dev_set_mac_address(slave_dev, &addr);
+		if (res) {
+			dprintk("Error %d calling set_mac_address\n", res);
+			goto err_free;
+		}
 	}
 
 	res = netdev_set_master(slave_dev, bond_dev);
@@ -1612,9 +1631,11 @@ err_close:
 	dev_close(slave_dev);
 
 err_restore_mac:
-	memcpy(addr.sa_data, new_slave->perm_hwaddr, ETH_ALEN);
-	addr.sa_family = slave_dev->type;
-	dev_set_mac_address(slave_dev, &addr);
+	if (bond->do_set_mac_addr) {
+		memcpy(addr.sa_data, new_slave->perm_hwaddr, ETH_ALEN);
+		addr.sa_family = slave_dev->type;
+		dev_set_mac_address(slave_dev, &addr);
+	}
 
 err_free:
 	kfree(new_slave);
@@ -1792,10 +1813,12 @@ int bond_release(struct net_device *bond
 	/* close slave before restoring its mac address */
 	dev_close(slave_dev);
 
-	/* restore original ("permanent") mac address */
-	memcpy(addr.sa_data, slave->perm_hwaddr, ETH_ALEN);
-	addr.sa_family = slave_dev->type;
-	dev_set_mac_address(slave_dev, &addr);
+	if (bond->do_set_mac_addr) {
+		/* restore original ("permanent") mac address */
+		memcpy(addr.sa_data, slave->perm_hwaddr, ETH_ALEN);
+		addr.sa_family = slave_dev->type;
+		dev_set_mac_address(slave_dev, &addr);
+	}
 
 	slave_dev->priv_flags &= ~(IFF_MASTER_8023AD | IFF_MASTER_ALB |
 				   IFF_SLAVE_INACTIVE | IFF_BONDING |
@@ -1882,10 +1905,12 @@ static int bond_release_all(struct net_d
 		/* close slave before restoring its mac address */
 		dev_close(slave_dev);
 
-		/* restore original ("permanent") mac address*/
-		memcpy(addr.sa_data, slave->perm_hwaddr, ETH_ALEN);
-		addr.sa_family = slave_dev->type;
-		dev_set_mac_address(slave_dev, &addr);
+		if (bond->do_set_mac_addr) {
+			/* restore original ("permanent") mac address*/
+			memcpy(addr.sa_data, slave->perm_hwaddr, ETH_ALEN);
+			addr.sa_family = slave_dev->type;
+			dev_set_mac_address(slave_dev, &addr);
+		}
 
 		slave_dev->priv_flags &= ~(IFF_MASTER_8023AD | IFF_MASTER_ALB |
 					   IFF_SLAVE_INACTIVE);
@@ -3922,6 +3947,9 @@ static int bond_set_mac_address(struct n
 
 	dprintk("bond=%p, name=%s\n", bond, (bond_dev ? bond_dev->name : "None"));
 
+	if (!bond->do_set_mac_addr)
+		return -EOPNOTSUPP;
+
 	if (!is_valid_ether_addr(sa->sa_data)) {
 		return -EADDRNOTAVAIL;
 	}
@@ -4312,6 +4340,9 @@ static int bond_init(struct net_device *
 	bond_create_proc_entry(bond);
 #endif
 
+	/* set do_set_mac_addr to true on startup */
+	bond->do_set_mac_addr = 1;
+
 	list_add_tail(&bond->bond_list, &bond_dev_list);
 
 	return 0;
Index: net-2.6/drivers/net/bonding/bonding.h
===================================================================
--- net-2.6.orig/drivers/net/bonding/bonding.h	2007-08-15 10:08:58.000000000 +0300
+++ net-2.6/drivers/net/bonding/bonding.h	2007-08-15 10:55:34.359354833 +0300
@@ -185,6 +185,7 @@ struct bonding {
 	struct   timer_list mii_timer;
 	struct   timer_list arp_timer;
 	s8       kill_timers;
+	s8       do_set_mac_addr;
 	struct   net_device_stats stats;
 #ifdef CONFIG_PROC_FS
 	struct   proc_dir_entry *proc_entry;


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH V4 7/10] net/bonding: Enable IP multicast for bonding IPoIB devices
  2007-08-20 15:34 [PATCH V4 0/10] net/bonding: ADD IPoIB support for the bonding driver Moni Shoua
                   ` (5 preceding siblings ...)
  2007-08-20 15:49 ` [PATCH V4 6/10] net/bonding: Enable bonding to enslave netdevices not supporting set_mac_address() Moni Shoua
@ 2007-08-20 15:51 ` Moni Shoua
  2007-08-20 15:52 ` [PATCH V4 8/10] net/bonding: Handlle wrong assumptions that slave is always an Ethernet device Moni Shoua
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Moni Shoua @ 2007-08-20 15:51 UTC (permalink / raw)
  To: rdreier, davem, fubar; +Cc: netdev, general

Allow to enslave devices when the bonding device is not up. Over the discussion
held at the previous post this seemed to be the most clean way to go, where it
is not expected to cause instabilities.

Normally, the bonding driver is UP before any enslavement takes place.
Once a netdevice is UP, the network stack acts to have it join some multicast groups
(eg the all-hosts 224.0.0.1). Now, since ether_setup() have set the bonding device
type to be ARPHRD_ETHER and address len to be ETHER_ALEN, the net core code
computes a wrong multicast link address. This is b/c ip_eth_mc_map() is called
where for multicast joins taking place after the enslavement another ip_xxx_mc_map()
is called (eg ip_ib_mc_map() when the bond type is ARPHRD_INFINIBAND)

Signed-off-by: Moni Shoua <monis@voltaire.com>
Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
---
 drivers/net/bonding/bond_main.c  |    5 +++--
 drivers/net/bonding/bond_sysfs.c |    6 ++----
 2 files changed, 5 insertions(+), 6 deletions(-)

Index: net-2.6/drivers/net/bonding/bond_main.c
===================================================================
--- net-2.6.orig/drivers/net/bonding/bond_main.c	2007-08-15 10:54:41.000000000 +0300
+++ net-2.6/drivers/net/bonding/bond_main.c	2007-08-15 10:55:48.431862446 +0300
@@ -1285,8 +1285,9 @@ int bond_enslave(struct net_device *bond
 
 	/* bond must be initialized by bond_open() before enslaving */
 	if (!(bond_dev->flags & IFF_UP)) {
-		dprintk("Error, master_dev is not up\n");
-		return -EPERM;
+		printk(KERN_WARNING DRV_NAME
+			" %s: master_dev is not up in bond_enslave\n",
+			bond_dev->name);
 	}
 
 	/* already enslaved */
Index: net-2.6/drivers/net/bonding/bond_sysfs.c
===================================================================
--- net-2.6.orig/drivers/net/bonding/bond_sysfs.c	2007-08-15 10:08:58.000000000 +0300
+++ net-2.6/drivers/net/bonding/bond_sysfs.c	2007-08-15 10:55:48.432862269 +0300
@@ -266,11 +266,9 @@ static ssize_t bonding_store_slaves(stru
 
 	/* Quick sanity check -- is the bond interface up? */
 	if (!(bond->dev->flags & IFF_UP)) {
-		printk(KERN_ERR DRV_NAME
-		       ": %s: Unable to update slaves because interface is down.\n",
+		printk(KERN_WARNING DRV_NAME
+		       ": %s: doing slave updates when interface is down.\n",
 		       bond->dev->name);
-		ret = -EPERM;
-		goto out;
 	}
 
 	/* Note:  We can't hold bond->lock here, as bond_create grabs it. */


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH V4 8/10] net/bonding: Handlle wrong assumptions that slave is always an Ethernet device
  2007-08-20 15:34 [PATCH V4 0/10] net/bonding: ADD IPoIB support for the bonding driver Moni Shoua
                   ` (6 preceding siblings ...)
  2007-08-20 15:51 ` [PATCH V4 7/10] net/bonding: Enable IP multicast for bonding IPoIB devices Moni Shoua
@ 2007-08-20 15:52 ` Moni Shoua
       [not found]   ` <416.1188343604@death>
  2007-08-20 15:53 ` [ofa-general] PATCH V4 9/10] net/bonding: Delay sending of gratuitous ARP to avoid failure Moni Shoua
                   ` (3 subsequent siblings)
  11 siblings, 1 reply; 17+ messages in thread
From: Moni Shoua @ 2007-08-20 15:52 UTC (permalink / raw)
  To: rdreier, davem, fubar; +Cc: netdev, general

bonding sometimes uses Ethernet constants (such as MTU and address length) which
are not good when it enslaves non Ethernet devices (such as InfiniBand).

Signed-off-by: Moni Shoua <monis@voltaire.com>
---
 drivers/net/bonding/bond_main.c  |    3 ++-
 drivers/net/bonding/bond_sysfs.c |   19 +++++++++++++------
 drivers/net/bonding/bonding.h    |    1 +
 3 files changed, 16 insertions(+), 7 deletions(-)

Index: net-2.6/drivers/net/bonding/bond_main.c
===================================================================
--- net-2.6.orig/drivers/net/bonding/bond_main.c	2007-08-15 10:55:48.000000000 +0300
+++ net-2.6/drivers/net/bonding/bond_main.c	2007-08-20 14:29:11.911298577 +0300
@@ -1224,7 +1224,8 @@ static int bond_compute_features(struct 
 	struct slave *slave;
 	struct net_device *bond_dev = bond->dev;
 	unsigned long features = bond_dev->features;
-	unsigned short max_hard_header_len = ETH_HLEN;
+	unsigned short max_hard_header_len = max((u16)ETH_HLEN,
+						bond_dev->hard_header_len);
 	int i;
 
 	features &= ~(NETIF_F_ALL_CSUM | BOND_VLAN_FEATURES);
Index: net-2.6/drivers/net/bonding/bond_sysfs.c
===================================================================
--- net-2.6.orig/drivers/net/bonding/bond_sysfs.c	2007-08-15 10:55:48.000000000 +0300
+++ net-2.6/drivers/net/bonding/bond_sysfs.c	2007-08-15 12:14:41.152469089 +0300
@@ -164,9 +164,7 @@ static ssize_t bonding_store_bonds(struc
 				printk(KERN_INFO DRV_NAME
 					": %s is being deleted...\n",
 					bond->dev->name);
-				bond_deinit(bond->dev);
-		        	bond_destroy_sysfs_entry(bond);
-				unregister_netdevice(bond->dev);
+				bond_destroy(bond);
 				rtnl_unlock();
 				goto out;
 			}
@@ -260,6 +258,7 @@ static ssize_t bonding_store_slaves(stru
 	char command[IFNAMSIZ + 1] = { 0, };
 	char *ifname;
 	int i, res, found, ret = count;
+	u32 original_mtu;
 	struct slave *slave;
 	struct net_device *dev = NULL;
 	struct bonding *bond = to_bond(d);
@@ -325,6 +324,7 @@ static ssize_t bonding_store_slaves(stru
 		}
 
 		/* Set the slave's MTU to match the bond */
+		original_mtu = dev->mtu;
 		if (dev->mtu != bond->dev->mtu) {
 			if (dev->change_mtu) {
 				res = dev->change_mtu(dev,
@@ -339,6 +339,9 @@ static ssize_t bonding_store_slaves(stru
 		}
 		rtnl_lock();
 		res = bond_enslave(bond->dev, dev);
+		bond_for_each_slave(bond, slave, i)
+			if (strnicmp(slave->dev->name, ifname, IFNAMSIZ) == 0)
+				slave->original_mtu = original_mtu;
 		rtnl_unlock();
 		if (res) {
 			ret = res;
@@ -351,13 +354,17 @@ static ssize_t bonding_store_slaves(stru
 		bond_for_each_slave(bond, slave, i)
 			if (strnicmp(slave->dev->name, ifname, IFNAMSIZ) == 0) {
 				dev = slave->dev;
+				original_mtu = slave->original_mtu;
 				break;
 			}
 		if (dev) {
 			printk(KERN_INFO DRV_NAME ": %s: Removing slave %s\n",
 				bond->dev->name, dev->name);
 			rtnl_lock();
-			res = bond_release(bond->dev, dev);
+			if (bond->setup_by_slave)
+				res = bond_release_and_destroy(bond->dev, dev);
+			else
+				res = bond_release(bond->dev, dev);
 			rtnl_unlock();
 			if (res) {
 				ret = res;
@@ -365,9 +372,9 @@ static ssize_t bonding_store_slaves(stru
 			}
 			/* set the slave MTU to the default */
 			if (dev->change_mtu) {
-				dev->change_mtu(dev, 1500);
+				dev->change_mtu(dev, original_mtu);
 			} else {
-				dev->mtu = 1500;
+				dev->mtu = original_mtu;
 			}
 		}
 		else {
Index: net-2.6/drivers/net/bonding/bonding.h
===================================================================
--- net-2.6.orig/drivers/net/bonding/bonding.h	2007-08-15 10:55:34.000000000 +0300
+++ net-2.6/drivers/net/bonding/bonding.h	2007-08-20 14:29:11.912298402 +0300
@@ -156,6 +156,7 @@ struct slave {
 	s8     link;    /* one of BOND_LINK_XXXX */
 	s8     state;   /* one of BOND_STATE_XXXX */
 	u32    original_flags;
+	u32    original_mtu;
 	u32    link_failure_count;
 	u16    speed;
 	u8     duplex;


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [ofa-general] PATCH V4 9/10] net/bonding: Delay sending of gratuitous ARP to avoid failure
  2007-08-20 15:34 [PATCH V4 0/10] net/bonding: ADD IPoIB support for the bonding driver Moni Shoua
                   ` (7 preceding siblings ...)
  2007-08-20 15:52 ` [PATCH V4 8/10] net/bonding: Handlle wrong assumptions that slave is always an Ethernet device Moni Shoua
@ 2007-08-20 15:53 ` Moni Shoua
  2007-08-20 15:58 ` [PATCH V4 10/10] net/bonding: Destroy bonding master when last slave is gone Moni Shoua
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Moni Shoua @ 2007-08-20 15:53 UTC (permalink / raw)
  To: rdreier, davem, fubar; +Cc: netdev, general

Delay sending a gratuitous_arp when LINK_STATE_LINKWATCH_PENDING bit
in dev->state field is on. This improves the chances for the arp packet to
be transmitted.

Signed-off-by: Moni Shoua <monis@voltaire.com>
---
 drivers/net/bonding/bond_main.c |   24 +++++++++++++++++++++---
 drivers/net/bonding/bonding.h   |    1 +
 2 files changed, 22 insertions(+), 3 deletions(-)

Index: net-2.6/drivers/net/bonding/bond_main.c
===================================================================
--- net-2.6.orig/drivers/net/bonding/bond_main.c	2007-08-15 10:56:33.000000000 +0300
+++ net-2.6/drivers/net/bonding/bond_main.c	2007-08-15 11:04:37.221123652 +0300
@@ -1102,8 +1102,14 @@ void bond_change_active_slave(struct bon
 		if (new_active && !bond->do_set_mac_addr)
 			memcpy(bond->dev->dev_addr,  new_active->dev->dev_addr,
 				new_active->dev->addr_len);
-
-		bond_send_gratuitous_arp(bond);
+		if (bond->curr_active_slave &&
+			test_bit(__LINK_STATE_LINKWATCH_PENDING,
+					&bond->curr_active_slave->dev->state)) {
+			dprintk("delaying gratuitous arp on %s\n",
+				bond->curr_active_slave->dev->name);
+			bond->send_grat_arp = 1;
+		} else
+			bond_send_gratuitous_arp(bond);
 	}
 }
 
@@ -2083,6 +2089,17 @@ void bond_mii_monitor(struct net_device 
 	 * program could monitor the link itself if needed.
 	 */
 
+	if (bond->send_grat_arp) {
+		if (bond->curr_active_slave && test_bit(__LINK_STATE_LINKWATCH_PENDING,
+				&bond->curr_active_slave->dev->state))
+			dprintk("Needs to send gratuitous arp but not yet\n");
+		else {
+			dprintk("sending delayed gratuitous arp on on %s\n",
+				bond->curr_active_slave->dev->name);
+			bond_send_gratuitous_arp(bond);
+			bond->send_grat_arp = 0;
+		}
+	}
 	read_lock(&bond->curr_slave_lock);
 	oldcurrent = bond->curr_active_slave;
 	read_unlock(&bond->curr_slave_lock);
@@ -2484,7 +2501,7 @@ static void bond_send_gratuitous_arp(str
 
 	if (bond->master_ip) {
 		bond_arp_send(slave->dev, ARPOP_REPLY, bond->master_ip,
-				  bond->master_ip, 0);
+				bond->master_ip, 0);
 	}
 
 	list_for_each_entry(vlan, &bond->vlan_list, vlan_list) {
@@ -4293,6 +4310,7 @@ static int bond_init(struct net_device *
 	bond->current_arp_slave = NULL;
 	bond->primary_slave = NULL;
 	bond->dev = bond_dev;
+	bond->send_grat_arp = 0;
 	INIT_LIST_HEAD(&bond->vlan_list);
 
 	/* Initialize the device entry points */
Index: net-2.6/drivers/net/bonding/bonding.h
===================================================================
--- net-2.6.orig/drivers/net/bonding/bonding.h	2007-08-15 10:56:33.000000000 +0300
+++ net-2.6/drivers/net/bonding/bonding.h	2007-08-15 11:05:41.516451497 +0300
@@ -187,6 +187,7 @@ struct bonding {
 	struct   timer_list arp_timer;
 	s8       kill_timers;
 	s8       do_set_mac_addr;
+	s8	 send_grat_arp;
 	struct   net_device_stats stats;
 #ifdef CONFIG_PROC_FS
 	struct   proc_dir_entry *proc_entry;

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH V4 10/10] net/bonding: Destroy bonding master when last slave is gone
  2007-08-20 15:34 [PATCH V4 0/10] net/bonding: ADD IPoIB support for the bonding driver Moni Shoua
                   ` (8 preceding siblings ...)
  2007-08-20 15:53 ` [ofa-general] PATCH V4 9/10] net/bonding: Delay sending of gratuitous ARP to avoid failure Moni Shoua
@ 2007-08-20 15:58 ` Moni Shoua
       [not found]   ` <3403.1188343986@death>
  2007-09-01 20:19 ` [ofa-general] [PATCH V4 0/10] net/bonding: ADD IPoIB support for the bonding driver Or Gerlitz
  2007-09-10 14:31 ` Moni Shoua
  11 siblings, 1 reply; 17+ messages in thread
From: Moni Shoua @ 2007-08-20 15:58 UTC (permalink / raw)
  To: rdreier, davem, fubar; +Cc: netdev, general

When bonding enslaves non Ethernet devices it takes pointers to functions 
in the module that owns the slaves. In this case it becomes unsafe
to keep the bonding master registered after last slave was unenslaved 
because we don't know if the pointers are still valid.  Destroying the bond when slave_cnt is zero
ensures that these functions be used anymore.

Signed-off-by: Moni Shoua <monis@voltaire.com>
---
 drivers/net/bonding/bond_main.c |   45 +++++++++++++++++++++++++++++++++++++++-
 drivers/net/bonding/bonding.h   |    3 ++
 2 files changed, 47 insertions(+), 1 deletion(-)

Index: net-2.6/drivers/net/bonding/bond_main.c
===================================================================
--- net-2.6.orig/drivers/net/bonding/bond_main.c	2007-08-20 14:43:17.123702132 +0300
+++ net-2.6/drivers/net/bonding/bond_main.c	2007-08-20 14:43:17.850571535 +0300
@@ -1256,6 +1256,7 @@ static int bond_compute_features(struct 
 static void bond_setup_by_slave(struct net_device *bond_dev,
 				struct net_device *slave_dev)
 {
+	struct bonding *bond = bond_dev->priv;
 	bond_dev->hard_header	        = slave_dev->hard_header;
 	bond_dev->rebuild_header        = slave_dev->rebuild_header;
 	bond_dev->hard_header_cache	= slave_dev->hard_header_cache;
@@ -1270,6 +1271,7 @@ static void bond_setup_by_slave(struct n
 
 	memcpy(bond_dev->broadcast, slave_dev->broadcast,
 		slave_dev->addr_len);
+	bond->setup_by_slave = 1;
 }
 
 /* enslave device <slave> to bond device <master> */
@@ -1838,6 +1840,35 @@ int bond_release(struct net_device *bond
 }
 
 /*
+* Destroy a bonding device.
+* Must be under rtnl_lock when this function is called.
+*/
+void bond_destroy(struct bonding *bond)
+{
+	bond_deinit(bond->dev);
+	bond_destroy_sysfs_entry(bond);
+	unregister_netdevice(bond->dev);
+}
+
+/*
+* First release a slave and than destroy the bond if no more slaves iare left.
+* Must be under rtnl_lock when this function is called.
+*/
+int  bond_release_and_destroy(struct net_device *bond_dev, struct net_device *slave_dev)
+{
+	struct bonding *bond = bond_dev->priv;
+	int ret;
+
+	ret = bond_release(bond_dev, slave_dev);
+	if ((ret == 0) && (bond->slave_cnt == 0)) {
+		printk(KERN_INFO DRV_NAME " %s: destroying bond for.\n",
+					bond_dev->name);
+		bond_destroy(bond);
+	}
+	return ret;
+}
+
+/*
  * This function releases all slaves.
  */
 static int bond_release_all(struct net_device *bond_dev)
@@ -3322,7 +3353,11 @@ static int bond_slave_netdev_event(unsig
 	switch (event) {
 	case NETDEV_UNREGISTER:
 		if (bond_dev) {
-			bond_release(bond_dev, slave_dev);
+			dprintk("slave %s unregisters\n", slave_dev->name);
+			if (bond->setup_by_slave)
+				bond_release_and_destroy(bond_dev, slave_dev);
+			else
+				bond_release(bond_dev, slave_dev);
 		}
 		break;
 	case NETDEV_CHANGE:
@@ -3331,6 +3366,13 @@ static int bond_slave_netdev_event(unsig
 		 * sets up a hierarchical bond, then rmmod's
 		 * one of the slave bonding devices?
 		 */
+		if (slave_dev->priv_flags & IFF_SLAVE_DETACH) {
+			dprintk("slave %s detaching\n", slave_dev->name);
+			if (bond->setup_by_slave)
+				bond_release_and_destroy(bond_dev, slave_dev);
+			else
+				bond_release(bond_dev, slave_dev);
+		}
 		break;
 	case NETDEV_DOWN:
 		/*
@@ -4311,6 +4353,7 @@ static int bond_init(struct net_device *
 	bond->primary_slave = NULL;
 	bond->dev = bond_dev;
 	bond->send_grat_arp = 0;
+	bond->setup_by_slave = 0;
 	INIT_LIST_HEAD(&bond->vlan_list);
 
 	/* Initialize the device entry points */
Index: net-2.6/drivers/net/bonding/bonding.h
===================================================================
--- net-2.6.orig/drivers/net/bonding/bonding.h	2007-08-20 14:43:17.123702132 +0300
+++ net-2.6/drivers/net/bonding/bonding.h	2007-08-20 14:47:52.845180870 +0300
@@ -188,6 +188,7 @@ struct bonding {
 	s8       kill_timers;
 	s8       do_set_mac_addr;
 	s8	 send_grat_arp;
+	s8	 setup_by_slave;
 	struct   net_device_stats stats;
 #ifdef CONFIG_PROC_FS
 	struct   proc_dir_entry *proc_entry;
@@ -295,6 +296,8 @@ static inline void bond_unset_master_alb
 struct vlan_entry *bond_next_vlan(struct bonding *bond, struct vlan_entry *curr);
 int bond_dev_queue_xmit(struct bonding *bond, struct sk_buff *skb, struct net_device *slave_dev);
 int bond_create(char *name, struct bond_params *params, struct bonding **newbond);
+void bond_destroy(struct bonding *bond);
+int  bond_release_and_destroy(struct net_device *bond_dev, struct net_device *slave_dev);
 void bond_deinit(struct net_device *bond_dev);
 int bond_create_sysfs(void);
 void bond_destroy_sysfs(void);


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [ofa-general] Re: [PATCH V4 8/10] net/bonding: Handlle wrong assumptions that slave is always an Ethernet device
       [not found]   ` <416.1188343604@death>
@ 2007-08-29 13:37     ` Moni Shoua
  0 siblings, 0 replies; 17+ messages in thread
From: Moni Shoua @ 2007-08-29 13:37 UTC (permalink / raw)
  To: Jay Vosburgh; +Cc: Moni Shoua, netdev, rdreier, davem, general

Jay Vosburgh wrote:
> Moni Shoua <monis@voltaire.com> wrote:
> 
>> bonding sometimes uses Ethernet constants (such as MTU and address length) which
>> are not good when it enslaves non Ethernet devices (such as InfiniBand).
>>
>> Signed-off-by: Moni Shoua <monis@voltaire.com>
>> ---
>> drivers/net/bonding/bond_main.c  |    3 ++-
>> drivers/net/bonding/bond_sysfs.c |   19 +++++++++++++------
>> drivers/net/bonding/bonding.h    |    1 +
>> 3 files changed, 16 insertions(+), 7 deletions(-)
>>
>> Index: net-2.6/drivers/net/bonding/bond_main.c
>> ===================================================================
>> --- net-2.6.orig/drivers/net/bonding/bond_main.c	2007-08-15 10:55:48.000000000 +0300
>> +++ net-2.6/drivers/net/bonding/bond_main.c	2007-08-20 14:29:11.911298577 +0300
>> @@ -1224,7 +1224,8 @@ static int bond_compute_features(struct 
>> 	struct slave *slave;
>> 	struct net_device *bond_dev = bond->dev;
>> 	unsigned long features = bond_dev->features;
>> -	unsigned short max_hard_header_len = ETH_HLEN;
>> +	unsigned short max_hard_header_len = max((u16)ETH_HLEN,
>> +						bond_dev->hard_header_len);
> 
> 	Since non-IB bonding masters are run through ether_setup, which
> sets hard_header_len to ETH_HLEN, the max() is probably unnecessary, and
> I think this could just be bond_dev->hard_header_len.
> 
This is true except for the case where there are no slaves left.
In that case max_hard_header_len has equals to the initialization value.


       bond_for_each_slave(bond, slave, i) {
                features = netdev_compute_features(features,
                                                   slave->dev->features);
                if (slave->dev->hard_header_len > max_hard_header_len)
                        max_hard_header_len = slave->dev->hard_header_len;
        }

        features |= (bond_dev->features & BOND_VLAN_FEATURES);
        bond_dev->features = features;
        bond_dev->hard_header_len = max_hard_header_len;



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [ofa-general] Re: [PATCH V4 10/10] net/bonding: Destroy bonding master when last slave is gone
       [not found]   ` <3403.1188343986@death>
@ 2007-08-29 14:06     ` Moni Shoua
  2007-08-29 19:50       ` Jay Vosburgh
  0 siblings, 1 reply; 17+ messages in thread
From: Moni Shoua @ 2007-08-29 14:06 UTC (permalink / raw)
  To: Jay Vosburgh; +Cc: rdreier, davem, general, netdev

Jay Vosburgh wrote:
> Moni Shoua <monis@voltaire.com> wrote:
> 
>> When bonding enslaves non Ethernet devices it takes pointers to functions 
>> in the module that owns the slaves. In this case it becomes unsafe
>> to keep the bonding master registered after last slave was unenslaved 
>> because we don't know if the pointers are still valid.  Destroying the bond when slave_cnt is zero
>> ensures that these functions be used anymore.
> 
> 	Would it not be simpler to run the bonding master through
> ether_setup() again when the final slave is released (to reset all of
> the pointers to their "ethernet" values)?  I'm presuming here the
> pointers of questionable validity are the ones set in the
> bond_setup_by_slave() copied from the slave_dev->hard_header, et al.
> 
> 	Having the bonding master disappear (but only sometimes) after
> the last slave is removed is a semantic change I'd rather not introduce
> if it's not necessary.

Thanks for the comments.

Having the master disappear is one way I could think of to solve the problem of leaving
the bonding module with pointers to illegal addresses.
The other way is to increase the usage count, with try_module_get(), of the module which owns of the slave.
To do that I  have to restore the field  owner in structure net_device (it was removed in 2.6).
Do you prefer the second approach? I wasn't sure about that.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [ofa-general] Re: [PATCH V4 10/10] net/bonding: Destroy bonding master when last slave is gone
  2007-08-29 14:06     ` [ofa-general] " Moni Shoua
@ 2007-08-29 19:50       ` Jay Vosburgh
  2007-09-02 11:32         ` Moni Shoua
  0 siblings, 1 reply; 17+ messages in thread
From: Jay Vosburgh @ 2007-08-29 19:50 UTC (permalink / raw)
  To: Moni Shoua; +Cc: Moni Shoua, netdev, rdreier, davem, general

Moni Shoua <monisonlists@gmail.com> wrote:

>Jay Vosburgh wrote:
>> Moni Shoua <monis@voltaire.com> wrote:
>> 
>>> When bonding enslaves non Ethernet devices it takes pointers to functions 
>>> in the module that owns the slaves. In this case it becomes unsafe
>>> to keep the bonding master registered after last slave was unenslaved 
>>> because we don't know if the pointers are still valid.  Destroying the bond when slave_cnt is zero
>>> ensures that these functions be used anymore.
>> 
>> 	Would it not be simpler to run the bonding master through
>> ether_setup() again when the final slave is released (to reset all of
>> the pointers to their "ethernet" values)?  I'm presuming here the
>> pointers of questionable validity are the ones set in the
>> bond_setup_by_slave() copied from the slave_dev->hard_header, et al.
>> 
>> 	Having the bonding master disappear (but only sometimes) after
>> the last slave is removed is a semantic change I'd rather not introduce
>> if it's not necessary.
>
>Thanks for the comments.
>
>Having the master disappear is one way I could think of to solve the problem of leaving
>the bonding module with pointers to illegal addresses.
>The other way is to increase the usage count, with try_module_get(), of the module which owns of the slave.
>To do that I  have to restore the field  owner in structure net_device (it was removed in 2.6).

	What I was asking above is really whether or not it's feasible
to simply reset the affected pointers back to the "ethernet" values from
ether_setup().  I would think this should return the bonding master back
to the original state it started in before any slaves were added.
Unless I'm missing something; I'm willing to believe there's some
IB-specific tidbit I'm unaware of that makes this more complicated than
it seems.

	This presumes that I'm correct in thinking that the pointers
you're talking about (as being unsafe after removal of last slave) are
the ones copied in your new function bond_setup_by_slave().

	I don't think it's desirable to acquire a reference to the slave
driver module.

	-J

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [ofa-general] [PATCH V4 0/10] net/bonding: ADD IPoIB support for the bonding driver
  2007-08-20 15:34 [PATCH V4 0/10] net/bonding: ADD IPoIB support for the bonding driver Moni Shoua
                   ` (9 preceding siblings ...)
  2007-08-20 15:58 ` [PATCH V4 10/10] net/bonding: Destroy bonding master when last slave is gone Moni Shoua
@ 2007-09-01 20:19 ` Or Gerlitz
  2007-09-10 14:31 ` Moni Shoua
  11 siblings, 0 replies; 17+ messages in thread
From: Or Gerlitz @ 2007-09-01 20:19 UTC (permalink / raw)
  To: rdreier, davem, fubar; +Cc: general, netdev

On 8/20/07, Moni Shoua <monis@voltaire.com> wrote:
> This patch series is the fourth version (see below link to V3) of the
> suggested changes to the bonding driver so it would be able to support
> non ARPHRD_ETHER netdevices for its High-Availability (active-backup) mode.
>
> The motivation is to enable the bonding driver on its HA mode to work with
> the IP over Infiniband (IPoIB) driver. With these patches I was able to enslave
> IPoIB netdevices and run TCP, UDP, IP (UDP) Multicast and ICMP traffic with
> fail-over and fail-back working fine. The working environment was the net-2.6 git.
>
> This series also includes patches to the IPoIB driver that fix some fix
> some neighboring related issues.
>
> Major changes from the previous version:
>
> 1) Addressing the issue of safety when unloading the IPoIB module before
> the bonding module
> 2) style changes
>
>
> Links to earlier discussion:
>
> 1. A discussion in netdev about bonding support for IPoIB.
> http://lists.openwall.net/netdev/2006/11/30/46
>
> 2. A discussion in openfabrics regarding changes in the IPoIB that
> enable using it as a slave for bonding.
> http://lists.openfabrics.org/pipermail/general/2007-July/038914.html

Roland, Jay, Dave

These patches hang around for about a year now where the V4 took care
of the open issues pointed during the review process. Aiming for
2.6.24, the merge window becomes close, can you provide your take
here?

thanks,

Or.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [ofa-general] Re: [PATCH V4 10/10] net/bonding: Destroy bonding master when last slave is gone
  2007-08-29 19:50       ` Jay Vosburgh
@ 2007-09-02 11:32         ` Moni Shoua
  0 siblings, 0 replies; 17+ messages in thread
From: Moni Shoua @ 2007-09-02 11:32 UTC (permalink / raw)
  To: Jay Vosburgh; +Cc: rdreier, davem, general, netdev

Jay Vosburgh wrote:
> Moni Shoua <monisonlists@gmail.com> wrote:
> 
>> Jay Vosburgh wrote:
>>> Moni Shoua <monis@voltaire.com> wrote:
>>>
>>>> When bonding enslaves non Ethernet devices it takes pointers to functions 
>>>> in the module that owns the slaves. In this case it becomes unsafe
>>>> to keep the bonding master registered after last slave was unenslaved 
>>>> because we don't know if the pointers are still valid.  Destroying the bond when slave_cnt is zero
>>>> ensures that these functions be used anymore.
>>> 	Would it not be simpler to run the bonding master through
>>> ether_setup() again when the final slave is released (to reset all of
>>> the pointers to their "ethernet" values)?  I'm presuming here the
>>> pointers of questionable validity are the ones set in the
>>> bond_setup_by_slave() copied from the slave_dev->hard_header, et al.
>>>
>>> 	Having the bonding master disappear (but only sometimes) after
>>> the last slave is removed is a semantic change I'd rather not introduce
>>> if it's not necessary.
>> Thanks for the comments.
>>
>> Having the master disappear is one way I could think of to solve the problem of leaving
>> the bonding module with pointers to illegal addresses.
>> The other way is to increase the usage count, with try_module_get(), of the module which owns of the slave.
>> To do that I  have to restore the field  owner in structure net_device (it was removed in 2.6).
> 
> 	What I was asking above is really whether or not it's feasible
> to simply reset the affected pointers back to the "ethernet" values from
> ether_setup().  I would think this should return the bonding master back
> to the original state it started in before any slaves were added.
> Unless I'm missing something; I'm willing to believe there's some
> IB-specific tidbit I'm unaware of that makes this more complicated than
> it seems.

It's possible to reset the bonding master by calling ether_setup but with one exception: its neighbors.
When enslaving IPoIB devices, the bonding master neighbors point to a destructor function in the ib_ipoib module.
When ib_ipoib goes down the neighbors of the bonding master still exist and when their turn come to die they will
try to access this function and the kernel will crash. This is why I want to destroy the bonding master before ib_ipoib
is unloaded (to kill its neighbors). For any other issue (i.e. taken pointer), ether_setup would solve the problem.

> 
> 	This presumes that I'm correct in thinking that the pointers
> you're talking about (as being unsafe after removal of last slave) are
> the ones copied in your new function bond_setup_by_slave().
> 
> 	I don't think it's desirable to acquire a reference to the slave
> driver module.
> 
> 	-J
> 
> ---
> 	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [ofa-general] [PATCH V4 0/10] net/bonding: ADD IPoIB support for the bonding driver
  2007-08-20 15:34 [PATCH V4 0/10] net/bonding: ADD IPoIB support for the bonding driver Moni Shoua
                   ` (10 preceding siblings ...)
  2007-09-01 20:19 ` [ofa-general] [PATCH V4 0/10] net/bonding: ADD IPoIB support for the bonding driver Or Gerlitz
@ 2007-09-10 14:31 ` Moni Shoua
  11 siblings, 0 replies; 17+ messages in thread
From: Moni Shoua @ 2007-09-10 14:31 UTC (permalink / raw)
  To: Michael S. Tsirkin, rdreier, davem, fubar; +Cc: netdev, general

Hi all,
This patch series is a bit neglected.
Since our goal is to have bonding support for IPoIB in kernel 2.6.24 it is 
very important for us to get comments soon.

We would appreciate if you take some time to look at this and help us push this code upstream.

thanks

 MoniS  



^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2007-09-10 14:32 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-08-20 15:34 [PATCH V4 0/10] net/bonding: ADD IPoIB support for the bonding driver Moni Shoua
2007-08-20 15:42 ` [ofa-general] [PATCH V4 1/10] IB/ipoib: Export call to call_netdevice_notifiers and add new private flag Moni Shoua
2007-08-20 15:43 ` [ofa-general] [PATCH V4 2/10] IB/ipoib: Notify the world before doing unregister Moni Shoua
2007-08-20 15:44 ` [ofa-general] [PATCH V4 3/10] IB/ipoib: Bound the net device to the ipoib_neigh structue Moni Shoua
2007-08-20 15:46 ` [PATCH V4 4/10] IB/ipoib: Verify address handle validity on send Moni Shoua
2007-08-20 15:48 ` [PATCH V4 5/10] net/bonding: Enable bonding to enslave non ARPHRD_ETHER Moni Shoua
2007-08-20 15:49 ` [PATCH V4 6/10] net/bonding: Enable bonding to enslave netdevices not supporting set_mac_address() Moni Shoua
2007-08-20 15:51 ` [PATCH V4 7/10] net/bonding: Enable IP multicast for bonding IPoIB devices Moni Shoua
2007-08-20 15:52 ` [PATCH V4 8/10] net/bonding: Handlle wrong assumptions that slave is always an Ethernet device Moni Shoua
     [not found]   ` <416.1188343604@death>
2007-08-29 13:37     ` [ofa-general] " Moni Shoua
2007-08-20 15:53 ` [ofa-general] PATCH V4 9/10] net/bonding: Delay sending of gratuitous ARP to avoid failure Moni Shoua
2007-08-20 15:58 ` [PATCH V4 10/10] net/bonding: Destroy bonding master when last slave is gone Moni Shoua
     [not found]   ` <3403.1188343986@death>
2007-08-29 14:06     ` [ofa-general] " Moni Shoua
2007-08-29 19:50       ` Jay Vosburgh
2007-09-02 11:32         ` Moni Shoua
2007-09-01 20:19 ` [ofa-general] [PATCH V4 0/10] net/bonding: ADD IPoIB support for the bonding driver Or Gerlitz
2007-09-10 14:31 ` Moni Shoua

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).