netdev.vger.kernel.org archive mirror
* [PATCH RFC] net/bonding: announce fail-over for the active-backup mode
@ 2008-05-21 12:50 Or Gerlitz
  2008-05-27  7:33 ` Or Gerlitz
  0 siblings, 1 reply; 7+ messages in thread
From: Or Gerlitz @ 2008-05-21 12:50 UTC (permalink / raw)
  To: Jay Vosburgh; +Cc: netdev

Enhance bonding to announce fail-over for the active-backup mode through the
netdev events notifier chain mechanism. Such an event can be of use to the
RDMA CM (communication manager) to keep native RDMA ULPs (e.g. NFS-RDMA, iSER)
aligned with the IP stack, in the sense that they use the same ports/links as
the stack does. It can also be used to make monitoring tools based on netlink
events aware of bonding fail-over.

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>

Index: linux-2.6.26-rc2/drivers/net/bonding/bond_main.c
===================================================================
--- linux-2.6.26-rc2.orig/drivers/net/bonding/bond_main.c	2008-05-13 10:02:22.000000000 +0300
+++ linux-2.6.26-rc2/drivers/net/bonding/bond_main.c	2008-05-15 12:29:44.000000000 +0300
@@ -1117,6 +1117,7 @@ void bond_change_active_slave(struct bon
 			bond->send_grat_arp = 1;
 		} else
 			bond_send_gratuitous_arp(bond);
+		netdev_bonding_change(bond->dev);
 	}
 }

Index: linux-2.6.26-rc2/include/linux/notifier.h
===================================================================
--- linux-2.6.26-rc2.orig/include/linux/notifier.h	2008-05-13 10:02:30.000000000 +0300
+++ linux-2.6.26-rc2/include/linux/notifier.h	2008-05-13 11:50:44.000000000 +0300
@@ -197,6 +197,7 @@ static inline int notifier_to_errno(int
 #define NETDEV_GOING_DOWN	0x0009
 #define NETDEV_CHANGENAME	0x000A
 #define NETDEV_FEAT_CHANGE	0x000B
+#define NETDEV_BONDING_FAILOVER 0x000C

 #define SYS_DOWN	0x0001	/* Notify of system down */
 #define SYS_RESTART	SYS_DOWN
Index: linux-2.6.26-rc2/include/linux/netdevice.h
===================================================================
--- linux-2.6.26-rc2.orig/include/linux/netdevice.h	2008-05-13 10:02:30.000000000 +0300
+++ linux-2.6.26-rc2/include/linux/netdevice.h	2008-05-13 11:50:20.000000000 +0300
@@ -1459,6 +1459,7 @@ extern void		__dev_addr_unsync(struct de
 extern void		dev_set_promiscuity(struct net_device *dev, int inc);
 extern void		dev_set_allmulti(struct net_device *dev, int inc);
 extern void		netdev_state_change(struct net_device *dev);
+extern void		netdev_bonding_change(struct net_device *dev);
 extern void		netdev_features_change(struct net_device *dev);
 /* Load a device via the kmod */
 extern void		dev_load(struct net *net, const char *name);
Index: linux-2.6.26-rc2/net/core/dev.c
===================================================================
--- linux-2.6.26-rc2.orig/net/core/dev.c	2008-05-13 10:02:31.000000000 +0300
+++ linux-2.6.26-rc2/net/core/dev.c	2008-05-13 11:50:49.000000000 +0300
@@ -956,6 +956,12 @@ void netdev_state_change(struct net_devi
 	}
 }

+void netdev_bonding_change(struct net_device *dev)
+{
+	call_netdevice_notifiers(NETDEV_BONDING_FAILOVER, dev);
+}
+EXPORT_SYMBOL(netdev_bonding_change);
+
 /**
  *	dev_load 	- load a network module
  *	@net: the applicable net namespace

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH RFC] net/bonding: announce fail-over for the active-backup mode
  2008-05-21 12:50 [PATCH RFC] net/bonding: announce fail-over for the active-backup mode Or Gerlitz
@ 2008-05-27  7:33 ` Or Gerlitz
  2008-05-27 12:34   ` Or Gerlitz
  2008-05-28 22:19   ` Jay Vosburgh
  0 siblings, 2 replies; 7+ messages in thread
From: Or Gerlitz @ 2008-05-27  7:33 UTC (permalink / raw)
  To: Jay Vosburgh; +Cc: Jeff Garzik, netdev

> Enhance bonding to announce fail-over for the active-backup mode through the
> netdev events notifier chain mechanism. Such an event can be of use to the
> RDMA CM (communication manager) to keep native RDMA ULPs (e.g. NFS-RDMA, iSER)
> aligned with the IP stack, in the sense that they use the same ports/links as
> the stack does. It can also be used to make monitoring tools based on netlink
> events aware of bonding fail-over.
>
> Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
[...]
> --- linux-2.6.26-rc2.orig/drivers/net/bonding/bond_main.c	2008-05-13 10:02:22.000000000 +0300
> +++ linux-2.6.26-rc2/drivers/net/bonding/bond_main.c	2008-05-15 12:29:44.000000000 +0300
> @@ -1117,6 +1117,7 @@ void bond_change_active_slave(struct bon
>  			bond->send_grat_arp = 1;
>  		} else
>  			bond_send_gratuitous_arp(bond);
> +		netdev_bonding_change(bond->dev);
>  	}
>  }
[...]
> --- linux-2.6.26-rc2.orig/net/core/dev.c	2008-05-13 10:02:31.000000000 +0300
> +++ linux-2.6.26-rc2/net/core/dev.c	2008-05-13 11:50:49.000000000 +0300
> @@ -956,6 +956,12 @@ void netdev_state_change(struct net_devi
>  	}
>  }
>
> +void netdev_bonding_change(struct net_device *dev)
> +{
> +	call_netdevice_notifiers(NETDEV_BONDING_FAILOVER, dev);
> +}
> +EXPORT_SYMBOL(netdev_bonding_change);

Hi Jay,

First, I'd be happy to get your comments on the patch.

Second, bond_change_active_slave() is called on a few occasions under write_lock_bh(),
which means it runs in atomic context. I was thinking that it might be better to
invoke call_netdevice_notifiers() from thread context, since some components may
need to allocate memory in order to deliver the event (e.g. netlink) and may assume
that the context they are called in is not atomic.

For example, I sometimes get the below "scheduling while atomic" warning, which points
at ipoib code, but the stack trace has some netlink calls which allocate an skb. I am
not sure yet what triggers this; however, I wanted to check with you and Jeff whether
there are conventions for the context of call_netdevice_notifiers(), thanks.

Or.


bonding: bond0: link status definitely down for interface ib0, disabling it
bonding: bond0: making interface ib1 the new active one.
BUG: scheduling while atomic: bond0/14237/0x10000100
Pid: 14237, comm: bond0 Not tainted 2.6.26-rc3 #4

Call Trace:
 [<ffffffff804777d7>] schedule+0x98/0x57b
 [<ffffffff80277836>] dbg_redzone1+0x16/0x1f
 [<ffffffffa0106f22>] :ib_ipoib:ipoib_start_xmit+0x445/0x459
 [<ffffffff802799c2>] kmem_cache_alloc_node+0x147/0x177
 [<ffffffff8040a939>] __alloc_skb+0x35/0x12b
 [<ffffffff8022c99b>] __cond_resched+0x1c/0x43
 [<ffffffff80477e11>] _cond_resched+0x2d/0x38
 [<ffffffff802798a0>] kmem_cache_alloc_node+0x25/0x177
 [<ffffffff8040a939>] __alloc_skb+0x35/0x12b
 [<ffffffff8041825e>] rtmsg_ifinfo+0x3a/0xd4
 [<ffffffff80418335>] rtnetlink_event+0x3d/0x41
 [<ffffffff8047b925>] notifier_call_chain+0x30/0x54
 [<ffffffffa00a3d4b>] :bonding:bond_select_active_slave+0xb9/0xe8
 [<ffffffffa00a495e>] :bonding:__bond_mii_monitor+0x43a/0x464
 [<ffffffffa00a49e6>] :bonding:bond_mii_monitor+0x5e/0xaa
 [<ffffffffa00a4988>] :bonding:bond_mii_monitor+0x0/0xaa
 [<ffffffff8023d6fa>] run_workqueue+0x7f/0x107
 [<ffffffff8023d782>] worker_thread+0x0/0xef
 [<ffffffff8023d867>] worker_thread+0xe5/0xef
 [<ffffffff8024088f>] autoremove_wake_function+0x0/0x2e
 [<ffffffff8024088f>] autoremove_wake_function+0x0/0x2e
 [<ffffffff8024055a>] kthread+0x3d/0x63
 [<ffffffff8020c068>] child_rip+0xa/0x12
 [<ffffffff8024051d>] kthread+0x0/0x63
 [<ffffffff8020c05e>] child_rip+0x0/0x12



* Re: [PATCH RFC] net/bonding: announce fail-over for the active-backup mode
  2008-05-27  7:33 ` Or Gerlitz
@ 2008-05-27 12:34   ` Or Gerlitz
  2008-05-28 22:19   ` Jay Vosburgh
  1 sibling, 0 replies; 7+ messages in thread
From: Or Gerlitz @ 2008-05-27 12:34 UTC (permalink / raw)
  To: Jay Vosburgh; +Cc: Jeff Garzik, netdev

> Second, bond_change_active_slave() is called on a few occasions under write_lock_bh(),
> which means it runs in atomic context. I was thinking that it might be better to
> invoke call_netdevice_notifiers() from thread context,

Sorry if I wasn't clear. As seen from the trace, the current code indeed delivers the event
from a thread, but the context is made atomic by the _bh locking, so my question is whether
the context should be made non-atomic before delivering the event (e.g. by queuing a work
struct to be executed later by a kernel workqueue thread, etc.).

Or


> Call Trace:
>  [<ffffffff804777d7>] schedule+0x98/0x57b
>  [<ffffffff80277836>] dbg_redzone1+0x16/0x1f
>  [<ffffffffa0106f22>] :ib_ipoib:ipoib_start_xmit+0x445/0x459
>  [<ffffffff802799c2>] kmem_cache_alloc_node+0x147/0x177
>  [<ffffffff8040a939>] __alloc_skb+0x35/0x12b
>  [<ffffffff8022c99b>] __cond_resched+0x1c/0x43
>  [<ffffffff80477e11>] _cond_resched+0x2d/0x38
>  [<ffffffff802798a0>] kmem_cache_alloc_node+0x25/0x177
>  [<ffffffff8040a939>] __alloc_skb+0x35/0x12b
>  [<ffffffff8041825e>] rtmsg_ifinfo+0x3a/0xd4
>  [<ffffffff80418335>] rtnetlink_event+0x3d/0x41
>  [<ffffffff8047b925>] notifier_call_chain+0x30/0x54
>  [<ffffffffa00a3d4b>] :bonding:bond_select_active_slave+0xb9/0xe8
>  [<ffffffffa00a495e>] :bonding:__bond_mii_monitor+0x43a/0x464
>  [<ffffffffa00a49e6>] :bonding:bond_mii_monitor+0x5e/0xaa
>  [<ffffffffa00a4988>] :bonding:bond_mii_monitor+0x0/0xaa
>  [<ffffffff8023d6fa>] run_workqueue+0x7f/0x107
>  [<ffffffff8023d782>] worker_thread+0x0/0xef
>  [<ffffffff8023d867>] worker_thread+0xe5/0xef
>  [<ffffffff8024088f>] autoremove_wake_function+0x0/0x2e
>  [<ffffffff8024088f>] autoremove_wake_function+0x0/0x2e
>  [<ffffffff8024055a>] kthread+0x3d/0x63
>  [<ffffffff8020c068>] child_rip+0xa/0x12
>  [<ffffffff8024051d>] kthread+0x0/0x63
>  [<ffffffff8020c05e>] child_rip+0x0/0x12


* Re: [PATCH RFC] net/bonding: announce fail-over for the active-backup mode
  2008-05-27  7:33 ` Or Gerlitz
  2008-05-27 12:34   ` Or Gerlitz
@ 2008-05-28 22:19   ` Jay Vosburgh
  2008-05-29 10:24     ` Or Gerlitz
  1 sibling, 1 reply; 7+ messages in thread
From: Jay Vosburgh @ 2008-05-28 22:19 UTC (permalink / raw)
  To: Or Gerlitz; +Cc: Jeff Garzik, netdev

Or Gerlitz <ogerlitz@voltaire.com> wrote:

>> Enhance bonding to announce fail-over for the active-backup mode through the
>> netdev events notifier chain mechanism. Such an event can be of use to the
>> RDMA CM (communication manager) to keep native RDMA ULPs (e.g. NFS-RDMA, iSER)
>> aligned with the IP stack, in the sense that they use the same ports/links as
>> the stack does. It can also be used to make monitoring tools based on netlink
>> events aware of bonding fail-over.
>>
>> Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
>[...]
>> --- linux-2.6.26-rc2.orig/drivers/net/bonding/bond_main.c	2008-05-13 10:02:22.000000000 +0300
>> +++ linux-2.6.26-rc2/drivers/net/bonding/bond_main.c	2008-05-15 12:29:44.000000000 +0300
>> @@ -1117,6 +1117,7 @@ void bond_change_active_slave(struct bon
>>  			bond->send_grat_arp = 1;
>>  		} else
>>  			bond_send_gratuitous_arp(bond);
>> +		netdev_bonding_change(bond->dev);
>>  	}
>>  }
>[...]
>> --- linux-2.6.26-rc2.orig/net/core/dev.c	2008-05-13 10:02:31.000000000 +0300
>> +++ linux-2.6.26-rc2/net/core/dev.c	2008-05-13 11:50:49.000000000 +0300
>> @@ -956,6 +956,12 @@ void netdev_state_change(struct net_devi
>>  	}
>>  }
>>
>> +void netdev_bonding_change(struct net_device *dev)
>> +{
>> +	call_netdevice_notifiers(NETDEV_BONDING_FAILOVER, dev);
>> +}
>> +EXPORT_SYMBOL(netdev_bonding_change);
>
>Hi Jay,
>
>First, I'll be happy to get a comment from you on the patch.

	I've been away from the office, so I'm just now looking at this.

	Philosophically speaking, I don't see a problem with adding a
notifier like this, but others higher in the food chain may have
thoughts.

	From a technical point of view, however, you'll need a much more
extensive change to deal with the locking requirements, which I'll
describe below.

>Second, bond_change_active_slave() is called on a few occasions under write_lock_bh(),
>which means it runs in atomic context. I was thinking that it might be better to
>invoke call_netdevice_notifiers() from thread context, since some components may
>need to allocate memory in order to deliver the event (e.g. netlink) and may assume
>that the context they are called in is not atomic.

	Yes, this is one of the locking cases that has been slowly
changing.  To make the change you're requesting, I suspect that all of
the remaining callers of bond_change_active_slave will have to be
converted as follows.

	The "proper" locking for calls to bond_change_active_slave is
RTNL, read_lock of bond->lock and write_lock_bh of curr_slave_lock.
Some places in the code still have other locking, and get away with it
due to special knowledge (e.g., bond_release sets the active slave to
NULL, which won't go down any paths that need the special locking).

	The reason for the locking requirements is that there are places
that must release some of these locks to get to the correct locking
context for the notifier calls (which is RTNL and nothing else).  The
main case for this is in bond_alb_handle_active_change, wherein the MAC
addresses of the slaves are moved around.  The dev_set_mac_address
function needs RTNL and nothing else, because it will issue a notifier
call.

	Your case is similar: you want to issue a notifier call during
an active-backup failover, so that notifier call will have to be made
holding RTNL and no other locks.

	I think the most maintainable way to do that is to convert the
remaining callers of bond_change_active_slave to hold the correct set of
locks, and then have bond_change_active_slave drop down to just RTNL at
the appropriate place to make the notifier call.  That may not be as
simple as it sounds, as it may open race windows.

>
>For example, I sometimes get the below "scheduling while atomic" warning, which points
>at ipoib code, but the stack trace has some netlink calls which allocate an skb. I am
>not sure yet what triggers this; however, I wanted to check with you and Jeff whether
>there are conventions for the context of call_netdevice_notifiers(), thanks.

>bonding: bond0: link status definitely down for interface ib0, disabling it
>bonding: bond0: making interface ib1 the new active one.
>BUG: scheduling while atomic: bond0/14237/0x10000100
>Pid: 14237, comm: bond0 Not tainted 2.6.26-rc3 #4
>
>Call Trace:
> [<ffffffff804777d7>] schedule+0x98/0x57b
> [<ffffffff80277836>] dbg_redzone1+0x16/0x1f
> [<ffffffffa0106f22>] :ib_ipoib:ipoib_start_xmit+0x445/0x459
> [<ffffffff802799c2>] kmem_cache_alloc_node+0x147/0x177
> [<ffffffff8040a939>] __alloc_skb+0x35/0x12b
> [<ffffffff8022c99b>] __cond_resched+0x1c/0x43
> [<ffffffff80477e11>] _cond_resched+0x2d/0x38
> [<ffffffff802798a0>] kmem_cache_alloc_node+0x25/0x177
> [<ffffffff8040a939>] __alloc_skb+0x35/0x12b
> [<ffffffff8041825e>] rtmsg_ifinfo+0x3a/0xd4
> [<ffffffff80418335>] rtnetlink_event+0x3d/0x41
> [<ffffffff8047b925>] notifier_call_chain+0x30/0x54
> [<ffffffffa00a3d4b>] :bonding:bond_select_active_slave+0xb9/0xe8
> [<ffffffffa00a495e>] :bonding:__bond_mii_monitor+0x43a/0x464
> [<ffffffffa00a49e6>] :bonding:bond_mii_monitor+0x5e/0xaa
> [<ffffffffa00a4988>] :bonding:bond_mii_monitor+0x0/0xaa
> [<ffffffff8023d6fa>] run_workqueue+0x7f/0x107
> [<ffffffff8023d782>] worker_thread+0x0/0xef
> [<ffffffff8023d867>] worker_thread+0xe5/0xef
> [<ffffffff8024088f>] autoremove_wake_function+0x0/0x2e
> [<ffffffff8024088f>] autoremove_wake_function+0x0/0x2e
> [<ffffffff8024055a>] kthread+0x3d/0x63
> [<ffffffff8020c068>] child_rip+0xa/0x12
> [<ffffffff8024051d>] kthread+0x0/0x63
> [<ffffffff8020c05e>] child_rip+0x0/0x12

	It's from the call to nlmsg_new (an inline that calls alloc_skb)
in rtmsg_ifinfo, which allocates at GFP_KERNEL.  As I recall, there are
other similar cases, so it's not simply a matter of changing
rtmsg_ifinfo.  The notifier calls have to happen with RTNL and no other
locks.

	-J

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com


* Re: [PATCH RFC] net/bonding: announce fail-over for the active-backup mode
  2008-05-28 22:19   ` Jay Vosburgh
@ 2008-05-29 10:24     ` Or Gerlitz
  2008-05-30  7:15       ` Jay Vosburgh
  0 siblings, 1 reply; 7+ messages in thread
From: Or Gerlitz @ 2008-05-29 10:24 UTC (permalink / raw)
  To: Jay Vosburgh; +Cc: Jeff Garzik, netdev

Jay Vosburgh wrote:
> Or Gerlitz <ogerlitz@voltaire.com> wrote:
>
> 	Philosophically speaking, I don't see a problem with adding a
> notifier like this, but others higher in the food chain may have
> thoughts.
OK. I will be happy to get more feedback.

> 	Your case is similar: you want to issue a notifier call during
> an active-backup failover, so that notifier call will have to be made
> holding RTNL and no other locks.
>
> 	I think the most maintainable way to do that is to convert the
> remaining callers of bond_change_active_slave to hold the correct set of
> locks, and then have bond_change_active_slave drop down to just RTNL at
> the appropriate place to make the notifier call.  That may not be as
> simple as it sounds, as it may open race windows.
>
Let's say that everyone calls bond_change_active_slave() with the correct 
locks taken, and the code that delivers the event unlocks these two 
locks, calls into the notifier chain through dev_set_xxx(), and then locks 
them again. These locks were there in the first place to protect 
something, so generally speaking I don't see why it's allowed to unlock 
them for some window of time... is it some "best effort" compromise?

Second, assuming it makes sense to have this window of time where the other 
two locks are not taken and only the RTNL one is held: is there any 
reason I can't take the approach of bond_alb_handle_active_change(), 
which, as you pointed out, releases the locks, delivers the event, and 
takes them again? Is there something different between the possible calls 
under the active-backup mode vs. the ALB mode that requires this 
deeper fix?

>> bonding: bond0: link status definitely down for interface ib0, disabling it
>> bonding: bond0: making interface ib1 the new active one.
>> BUG: scheduling while atomic: bond0/14237/0x10000100
> 	It's from the call to nlmsg_new (an inline that calls alloc_skb)
> in rtmsg_ifinfo, which allocates at GFP_KERNEL.  As I recall, there are
> other similar cases, so it's not simply a matter of changing
> rtmsg_ifinfo.  The notifier calls have to happen with RTNL and no other
> locks.
Understood, thanks for clarifying this.

Or.



* Re: [PATCH RFC] net/bonding: announce fail-over for the active-backup mode
  2008-05-29 10:24     ` Or Gerlitz
@ 2008-05-30  7:15       ` Jay Vosburgh
  2008-06-02 10:39         ` Or Gerlitz
  0 siblings, 1 reply; 7+ messages in thread
From: Jay Vosburgh @ 2008-05-30  7:15 UTC (permalink / raw)
  To: Or Gerlitz; +Cc: Jeff Garzik, netdev

Or Gerlitz <ogerlitz@voltaire.com> wrote:

>Jay Vosburgh wrote:
[...]
>> 	Your case is similar: you want to issue a notifier call during
>> an active-backup failover, so that notifier call will have to be made
>> holding RTNL and no other locks.
>>
>> 	I think the most maintainable way to do that is to convert the
>> remaining callers of bond_change_active_slave to hold the correct set of
>> locks, and then have bond_change_active_slave drop down to just RTNL at
>> the appropriate place to make the notifier call.  That may not be as
>> simple as it sounds, as it may open race windows.
>>
>Let's say that everyone calls bond_change_active_slave() with the correct
>locks taken, and the code that delivers the event unlocks these two locks,
>calls into the notifier chain through dev_set_xxx(), and then locks them
>again. These locks were there in the first place to protect something,
>so generally speaking I don't see why it's allowed to unlock them for some
>window of time... is it some "best effort" compromise?

	Not exactly.  The unlock/lock dance doesn't release RTNL, so we
know that slaves cannot be added or removed (because that requires
RTNL).  Generally speaking, slaves can't have their up/down status
changed, either, since that would also acquire RTNL.

	However, the curr_slave_lock is still needed, particularly for
the alb mode.  All of the paths that can change the curr_active_slave
now hold RTNL (which is reasonable, as it's generally a pretty rare
event).  However, not all of the paths that inspect curr_active_slave
hold RTNL, so some places (alb mode in particular) need a mutex to keep
the curr_active_slave from changing during various activities, but
holding RTNL for those things is unreasonable (e.g., bond_alb_xmit would
need it for every packet).

	It's probably possible to do away with the curr_slave_lock
entirely (and just use the bond->lock).  That will still have some
windows like this, since there will be places that hold bond->lock for
read and have to temporarily upgrade it to a write lock (which I think
would be all of the places that now hold curr_slave_lock for write).

>Second, assuming it makes sense to have this window of time where the other two
>locks are not taken and only the RTNL one is held: is there any reason I
>can't take the approach of bond_alb_handle_active_change(), which, as you
>pointed out, releases the locks, delivers the event, and takes them again?
>Is there something different between the possible calls under the
>active-backup mode vs. the ALB mode that requires this deeper fix?

	I'm suggesting you do exactly that: release the locks, do your
stuff, then reacquire the locks.

	The concern for you is that there are still a few calls to
bond_change_active_slave that will never enter the ALB path, and those
calls to bond_change_active slave don't necessarily have the correct set
of locks at present.  

	The location of your new notifier call is such that there's no
way for it to know whether bond_change_active_slave was called with the
full set of locks (e.g., from one of the link monitors), or some other
set (as from bond_release_all), so some of the remaining calls to
bond_change_active_slave will likely need to be updated with regard to
locking.

	I took a look, and I think this is how it stands: I recently
refactored the active-backup ARP monitor, and I believe that all of its
calls are now correct; this would have been the difficult one to fix, so
you timed this just right.  The load-balance ARP monitor is still wrong,
but you don't care about that one since it isn't used for active-backup
mode.

	I think the only cases that will require fixing for you are the
bond_release and bond_release_all.

	For bond_release, there are two calls to change_active: one sets
it to NULL, and the second sets it to a slave.  The first of those calls
has nominally incorrect locking; the second has proper locking.
Changing the first call to mimic the methodology of the second call
should be pretty straightforward.  Alternately, if your notifier doesn't
want to log "change to nothing" events (i.e., put it inside the "if
(new_active)" block), then you could probably get away with not fixing
the first call.

	For bond_release_all, there's just one call, and it sets the
current slave to NULL (and then goes off and frees all of the slaves).
The same caveats from the bond_release case apply here.

	-J

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com


* Re: [PATCH RFC] net/bonding: announce fail-over for the active-backup mode
  2008-05-30  7:15       ` Jay Vosburgh
@ 2008-06-02 10:39         ` Or Gerlitz
  0 siblings, 0 replies; 7+ messages in thread
From: Or Gerlitz @ 2008-06-02 10:39 UTC (permalink / raw)
  To: Jay Vosburgh; +Cc: Jeff Garzik, netdev

Jay Vosburgh wrote:
> 	I'm suggesting you do exactly that: release the locks, do your
> stuff, then reacquire the locks.
OK
> 	I took a look, and I think this is how it stands: I recently
> refactored the active-backup ARP monitor, and I believe that all of its
> calls are now correct; this would have been the difficult one to fix, so
> you timed this just right.  The load-balance ARP monitor is still wrong,
> but you don't care about that one since it isn't used for active-backup
> mode.
Sounds like I am lucky.
> 	I think the only cases that will require fixing for you are the
> bond_release and bond_release_all.
>
> 	For bond_release, there are two calls to change_active: one sets
> it to NULL, and the second sets it to a slave.  The first of those calls
> has nominally incorrect locking; the second has proper locking.
> Changing the first call to mimic the methodology of the second call
> should be pretty straightforward.  Alternately, if your notifier doesn't
> want to log "change to nothing" events (i.e., put it inside the "if
> (new_active)" block), then you could probably get away with not fixing
> the first call.
>
> 	For bond_release_all, there's just one call, and it sets the
> current slave to NULL (and then goes off and frees all of the slaves).
> The same caveats from the bond_release case apply here.

Thanks for this deep dive and guidance. I don't think there's a need to 
deliver a "change to nothing" event, so I took the approach of making 
this event happen only (under the active-backup mode AND) when 
new_active is not NULL. I will now send the patches which I tested today.

Or.



end of thread, other threads:[~2008-06-02 10:39 UTC | newest]

Thread overview: 7+ messages
2008-05-21 12:50 [PATCH RFC] net/bonding: announce fail-over for the active-backup mode Or Gerlitz
2008-05-27  7:33 ` Or Gerlitz
2008-05-27 12:34   ` Or Gerlitz
2008-05-28 22:19   ` Jay Vosburgh
2008-05-29 10:24     ` Or Gerlitz
2008-05-30  7:15       ` Jay Vosburgh
2008-06-02 10:39         ` Or Gerlitz
