netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3] bonding: 3 fixes for 2.6.24
@ 2008-01-08  1:56 Jay Vosburgh
  2008-01-08  1:56 ` [PATCH 1/3] bonding: fix locking in sysfs primary/active selection Jay Vosburgh
  2008-01-08 18:50 ` [PATCH 0/3] bonding: 3 fixes for 2.6.24 Krzysztof Oledzki
  0 siblings, 2 replies; 30+ messages in thread
From: Jay Vosburgh @ 2008-01-08  1:56 UTC (permalink / raw)
  To: netdev; +Cc: Jeff Garzik, David Miller, Andy Gospodarek, Krzysztof Oledzki

	Following are three fixes to fix locking problems and
silence locking-related warnings in the current 2.6.24-rc.

	patch 1: fix locking in sysfs primary/active selection

	Call core network functions with expected locks to
eliminate potential deadlock and silence warnings.

	patch 2: fix ASSERT_RTNL that produces spurious warnings

	Relocate ASSERT_RTNL to remove a false warning; after patch,
ASSERT is located in code that holds only RTNL (additional locks were
causing the ASSERT to trip)

	patch 3: fix locking during alb failover and slave removal

	Fix all call paths into alb_fasten_mac_swap to hold only RTNL.
Eliminates deadlock and silences warnings.

	Patches are against the current netdev-2.6#upstream branch.

	Please apply for 2.6.24.

	-J

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH 1/3] bonding: fix locking in sysfs primary/active selection
  2008-01-08  1:56 [PATCH 0/3] bonding: 3 fixes for 2.6.24 Jay Vosburgh
@ 2008-01-08  1:56 ` Jay Vosburgh
  2008-01-08  1:56   ` [PATCH 2/3] bonding: fix ASSERT_RTNL that produces spurious warnings Jay Vosburgh
  2008-01-08 18:50 ` [PATCH 0/3] bonding: 3 fixes for 2.6.24 Krzysztof Oledzki
  1 sibling, 1 reply; 30+ messages in thread
From: Jay Vosburgh @ 2008-01-08  1:56 UTC (permalink / raw)
  To: netdev
  Cc: Jeff Garzik, David Miller, Andy Gospodarek, Krzysztof Oledzki,
	Jay Vosburgh

	Fix the functions that store the primary and active slave
options via sysfs to hold the correct locks in the correct order.

	The bond_change_active_slave and bond_select_active_slave
functions both require rtnl, bond->lock for read and curr_slave_lock for
write_bh, and no other locks.  This is so that the lower level
mode-specific functions (notably for balance-alb mode) can release locks
down to just rtnl in order to call, e.g., dev_set_mac_address with the
locks it expects (rtnl only).

Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
---
 drivers/net/bonding/bond_sysfs.c |   15 ++++++++++-----
 1 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
index 11b76b3..28a2d80 100644
--- a/drivers/net/bonding/bond_sysfs.c
+++ b/drivers/net/bonding/bond_sysfs.c
@@ -1075,7 +1075,10 @@ static ssize_t bonding_store_primary(struct device *d,
 	struct slave *slave;
 	struct bonding *bond = to_bond(d);
 
-	write_lock_bh(&bond->lock);
+	rtnl_lock();
+	read_lock(&bond->lock);
+	write_lock_bh(&bond->curr_slave_lock);
+
 	if (!USES_PRIMARY(bond->params.mode)) {
 		printk(KERN_INFO DRV_NAME
 		       ": %s: Unable to set primary slave; %s is in mode %d\n",
@@ -1109,8 +1112,8 @@ static ssize_t bonding_store_primary(struct device *d,
 		}
 	}
 out:
-	write_unlock_bh(&bond->lock);
-
+	write_unlock_bh(&bond->curr_slave_lock);
+	read_unlock(&bond->lock);
 	rtnl_unlock();
 
 	return count;
@@ -1190,7 +1193,8 @@ static ssize_t bonding_store_active_slave(struct device *d,
 	struct bonding *bond = to_bond(d);
 
 	rtnl_lock();
-	write_lock_bh(&bond->lock);
+	read_lock(&bond->lock);
+	write_lock_bh(&bond->curr_slave_lock);
 
 	if (!USES_PRIMARY(bond->params.mode)) {
 		printk(KERN_INFO DRV_NAME
@@ -1247,7 +1251,8 @@ static ssize_t bonding_store_active_slave(struct device *d,
 		}
 	}
 out:
-	write_unlock_bh(&bond->lock);
+	write_unlock_bh(&bond->curr_slave_lock);
+	read_unlock(&bond->lock);
 	rtnl_unlock();
 
 	return count;
-- 
1.5.3.4.206.g58ba4-dirty


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 2/3] bonding: fix ASSERT_RTNL that produces spurious warnings
  2008-01-08  1:56 ` [PATCH 1/3] bonding: fix locking in sysfs primary/active selection Jay Vosburgh
@ 2008-01-08  1:56   ` Jay Vosburgh
  2008-01-08  1:57     ` [PATCH 3/3] bonding: fix locking during alb failover and slave removal Jay Vosburgh
  0 siblings, 1 reply; 30+ messages in thread
From: Jay Vosburgh @ 2008-01-08  1:56 UTC (permalink / raw)
  To: netdev
  Cc: Jeff Garzik, David Miller, Andy Gospodarek, Krzysztof Oledzki,
	Jay Vosburgh

	Move an ASSERT_RTNL down to where we should hold only RTNL;
the existing check produces spurious warnings because we hold additional
locks at _bh, tripping a debug warning in spin_lock_mutex().

Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
---
 drivers/net/bonding/bond_alb.c |    5 ++---
 1 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c
index 25b8dbf..9b55a12 100644
--- a/drivers/net/bonding/bond_alb.c
+++ b/drivers/net/bonding/bond_alb.c
@@ -1601,9 +1601,6 @@ void bond_alb_handle_active_change(struct bonding *bond, struct slave *new_slave
 	struct slave *swap_slave;
 	int i;
 
-	if (new_slave)
-		ASSERT_RTNL();
-
 	if (bond->curr_active_slave == new_slave) {
 		return;
 	}
@@ -1649,6 +1646,8 @@ void bond_alb_handle_active_change(struct bonding *bond, struct slave *new_slave
 	write_unlock_bh(&bond->curr_slave_lock);
 	read_unlock(&bond->lock);
 
+	ASSERT_RTNL();
+
 	/* curr_active_slave must be set before calling alb_swap_mac_addr */
 	if (swap_slave) {
 		/* swap mac address */
-- 
1.5.3.4.206.g58ba4-dirty


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 3/3] bonding: fix locking during alb failover and slave removal
  2008-01-08  1:56   ` [PATCH 2/3] bonding: fix ASSERT_RTNL that produces spurious warnings Jay Vosburgh
@ 2008-01-08  1:57     ` Jay Vosburgh
  0 siblings, 0 replies; 30+ messages in thread
From: Jay Vosburgh @ 2008-01-08  1:57 UTC (permalink / raw)
  To: netdev
  Cc: Jeff Garzik, David Miller, Andy Gospodarek, Krzysztof Oledzki,
	Jay Vosburgh

	alb_fasten_mac_swap (actually rlb_teach_disabled_mac_on_primary)
requries RTNL and no other locks.  This could cause dev_set_promiscuity
and/or dev_set_mac_address to be called with improper locking.

	Changed callers to hold only RTNL during calls to alb_fasten_mac_swap
or functions calling it.  Updated header comments in affected functions to
reflect proper reality of locking requirements.

Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
---
 drivers/net/bonding/bond_alb.c  |   18 ++++++++++++------
 drivers/net/bonding/bond_main.c |   14 ++++++++------
 2 files changed, 20 insertions(+), 12 deletions(-)

diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c
index 9b55a12..b57bc94 100644
--- a/drivers/net/bonding/bond_alb.c
+++ b/drivers/net/bonding/bond_alb.c
@@ -979,7 +979,7 @@ static void alb_swap_mac_addr(struct bonding *bond, struct slave *slave1, struct
 /*
  * Send learning packets after MAC address swap.
  *
- * Called with RTNL and bond->lock held for read.
+ * Called with RTNL and no other locks
  */
 static void alb_fasten_mac_swap(struct bonding *bond, struct slave *slave1,
 				struct slave *slave2)
@@ -987,6 +987,8 @@ static void alb_fasten_mac_swap(struct bonding *bond, struct slave *slave1,
 	int slaves_state_differ = (SLAVE_IS_OK(slave1) != SLAVE_IS_OK(slave2));
 	struct slave *disabled_slave = NULL;
 
+	ASSERT_RTNL();
+
 	/* fasten the change in the switch */
 	if (SLAVE_IS_OK(slave1)) {
 		alb_send_learning_packets(slave1, slave1->dev->dev_addr);
@@ -1031,7 +1033,7 @@ static void alb_fasten_mac_swap(struct bonding *bond, struct slave *slave1,
  * a slave that has @slave's permanet address as its current address.
  * We'll make sure that that slave no longer uses @slave's permanent address.
  *
- * Caller must hold bond lock
+ * Caller must hold RTNL and no other locks
  */
 static void alb_change_hw_addr_on_detach(struct bonding *bond, struct slave *slave)
 {
@@ -1542,7 +1544,12 @@ int bond_alb_init_slave(struct bonding *bond, struct slave *slave)
 	return 0;
 }
 
-/* Caller must hold bond lock for write */
+/*
+ * Remove slave from tlb and rlb hash tables, and fix up MAC addresses
+ * if necessary.
+ *
+ * Caller must hold RTNL and no other locks
+ */
 void bond_alb_deinit_slave(struct bonding *bond, struct slave *slave)
 {
 	if (bond->slave_cnt > 1) {
@@ -1658,12 +1665,11 @@ void bond_alb_handle_active_change(struct bonding *bond, struct slave *new_slave
 				       bond->alb_info.rlb_enabled);
 	}
 
-	read_lock(&bond->lock);
-
 	if (swap_slave) {
 		alb_fasten_mac_swap(bond, swap_slave, new_slave);
+		read_lock(&bond->lock);
 	} else {
-		/* fasten bond mac on new current slave */
+		read_lock(&bond->lock);
 		alb_send_learning_packets(new_slave, bond->dev->dev_addr);
 	}
 
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index b0b2603..77d004d 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -1746,7 +1746,9 @@ int bond_release(struct net_device *bond_dev, struct net_device *slave_dev)
 		 * has been cleared (if our_slave == old_current),
 		 * but before a new active slave is selected.
 		 */
+		write_unlock_bh(&bond->lock);
 		bond_alb_deinit_slave(bond, slave);
+		write_lock_bh(&bond->lock);
 	}
 
 	if (oldcurrent == slave) {
@@ -1905,6 +1907,12 @@ static int bond_release_all(struct net_device *bond_dev)
 		slave_dev = slave->dev;
 		bond_detach_slave(bond, slave);
 
+		/* now that the slave is detached, unlock and perform
+		 * all the undo steps that should not be called from
+		 * within a lock.
+		 */
+		write_unlock_bh(&bond->lock);
+
 		if ((bond->params.mode == BOND_MODE_TLB) ||
 		    (bond->params.mode == BOND_MODE_ALB)) {
 			/* must be called only after the slave
@@ -1915,12 +1923,6 @@ static int bond_release_all(struct net_device *bond_dev)
 
 		bond_compute_features(bond);
 
-		/* now that the slave is detached, unlock and perform
-		 * all the undo steps that should not be called from
-		 * within a lock.
-		 */
-		write_unlock_bh(&bond->lock);
-
 		bond_destroy_slave_symlinks(bond_dev, slave_dev);
 		bond_del_vlans_from_slave(bond, slave_dev);
 
-- 
1.5.3.4.206.g58ba4-dirty


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/3] bonding: 3 fixes for 2.6.24
  2008-01-08  1:56 [PATCH 0/3] bonding: 3 fixes for 2.6.24 Jay Vosburgh
  2008-01-08  1:56 ` [PATCH 1/3] bonding: fix locking in sysfs primary/active selection Jay Vosburgh
@ 2008-01-08 18:50 ` Krzysztof Oledzki
  2008-01-08 19:17   ` Andy Gospodarek
  2008-01-08 19:30   ` Jay Vosburgh
  1 sibling, 2 replies; 30+ messages in thread
From: Krzysztof Oledzki @ 2008-01-08 18:50 UTC (permalink / raw)
  To: Jay Vosburgh; +Cc: netdev, Jeff Garzik, David Miller, Andy Gospodarek

[-- Attachment #1: Type: TEXT/PLAIN, Size: 56733 bytes --]



On Mon, 7 Jan 2008, Jay Vosburgh wrote:

> 	Following are three fixes to fix locking problems and
> silence locking-related warnings in the current 2.6.24-rc.
>
> 	patch 1: fix locking in sysfs primary/active selection
>
> 	Call core network functions with expected locks to
> eliminate potential deadlock and silence warnings.
>
> 	patch 2: fix ASSERT_RTNL that produces spurious warnings
>
> 	Relocate ASSERT_RTNL to remove a false warning; after patch,
> ASSERT is located in code that holds only RTNL (additional locks were
> causing the ASSERT to trip)
>
> 	patch 3: fix locking during alb failover and slave removal
>
> 	Fix all call paths into alb_fasten_mac_swap to hold only RTNL.
> Eliminates deadlock and silences warnings.
>
> 	Patches are against the current netdev-2.6#upstream branch.
>
> 	Please apply for 2.6.24.

2.6.24-rc7 + patches #1, #2, #3:

bonding: bond0: setting mode to active-backup (1).
bonding: bond0: Setting MII monitoring interval to 100.
ADDRCONF(NETDEV_UP): bond0: link is not ready
bonding: bond0: Adding slave eth0.
e1000: eth0: e1000_watchdog: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
bonding: bond0: making interface eth0 the new active one.
bonding: bond0: first active interface up!
bonding: bond0: enslaving eth0 as an active interface with an up link.
bonding: bond0: Adding slave eth1.
ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready

=========================================================
[ INFO: possible irq lock inversion dependency detected ]
2.6.24-rc7 #1
---------------------------------------------------------
events/0/9 just changed the state of lock:
  (&mc->mca_lock){-+..}, at: [<c041258e>] mld_ifc_timer_expire+0x130/0x1fb
but this lock took another, soft-read-irq-unsafe lock in the past:
  (&bond->lock){-.--}

and interrupts could create inverse lock ordering between them.


other info that might help us debug this:
4 locks held by events/0/9:
  #0:  (events){--..}, at: [<c0133d33>] run_workqueue+0x87/0x1b6
  #1:  ((linkwatch_work).work){--..}, at: [<c0133d33>] run_workqueue+0x87/0x1b6
  #2:  (rtnl_mutex){--..}, at: [<c03ac678>] linkwatch_event+0x5/0x22
  #3:  (&ndev->lock){-.-+}, at: [<c0412475>] mld_ifc_timer_expire+0x17/0x1fb

the first lock's dependencies:
-> (&mc->mca_lock){-+..} ops: 10 {
    initial-use  at:
                         [<c0104ee2>] dump_trace+0x83/0x8d
                         [<c0142890>] __lock_acquire+0x4ba/0xc07
                         [<c0109ef2>] save_stack_trace+0x20/0x3a
                         [<c0142f95>] __lock_acquire+0xbbf/0xc07
                         [<c0412d66>] ipv6_dev_mc_inc+0x24d/0x31c
                         [<c0143056>] lock_acquire+0x79/0x93
                         [<c04129ea>] igmp6_group_added+0x18/0x11d
                         [<c043a8aa>] _spin_lock_bh+0x3b/0x64
                         [<c04129ea>] igmp6_group_added+0x18/0x11d
                         [<c04129ea>] igmp6_group_added+0x18/0x11d
                         [<c0141f93>] trace_hardirqs_on+0x122/0x14c
                         [<c0412dbc>] ipv6_dev_mc_inc+0x2a3/0x31c
                         [<c0412d66>] ipv6_dev_mc_inc+0x24d/0x31c
                         [<c0412df1>] ipv6_dev_mc_inc+0x2d8/0x31c
                         [<c0412b19>] ipv6_dev_mc_inc+0x0/0x31c
                         [<c0402168>] ipv6_add_dev+0x21c/0x24b
                         [<c040b991>] ndisc_ifinfo_sysctl_change+0x0/0x1ef
                         [<c05c5ae9>] addrconf_init+0x13/0x193
                         [<c019a04b>] proc_net_fops_create+0x10/0x21
                         [<c041a44c>] ip6_flowlabel_init+0x1e/0x20
                         [<c05c59c9>] inet6_init+0x1f0/0x2ad
                         [<c05a9499>] kernel_init+0x150/0x2b7
                         [<c05a9349>] kernel_init+0x0/0x2b7
                         [<c05a9349>] kernel_init+0x0/0x2b7
                         [<c0104baf>] kernel_thread_helper+0x7/0x10
                         [<ffffffff>] 0xffffffff
    in-softirq-W at:
                         [<c014197a>] mark_lock+0x64/0x451
                         [<c0142816>] __lock_acquire+0x440/0xc07
                         [<c0103f7b>] restore_nocheck+0x12/0x15
                         [<c0143056>] lock_acquire+0x79/0x93
                         [<c041258e>] mld_ifc_timer_expire+0x130/0x1fb
                         [<c041245e>] mld_ifc_timer_expire+0x0/0x1fb
                         [<c043a8aa>] _spin_lock_bh+0x3b/0x64
                         [<c041258e>] mld_ifc_timer_expire+0x130/0x1fb
                         [<c041258e>] mld_ifc_timer_expire+0x130/0x1fb
                         [<c041245e>] mld_ifc_timer_expire+0x0/0x1fb
                         [<c0141f7d>] trace_hardirqs_on+0x10c/0x14c
                         [<c041245e>] mld_ifc_timer_expire+0x0/0x1fb
                         [<c012e02e>] run_timer_softirq+0xfa/0x15d
                         [<c012a982>] __do_softirq+0x56/0xdb
                         [<c0141f7d>] trace_hardirqs_on+0x10c/0x14c
                         [<c012a994>] __do_softirq+0x68/0xdb
                         [<c012aa3d>] do_softirq+0x36/0x51
                         [<c012af26>] local_bh_enable_ip+0xad/0xed
                         [<c03bfa37>] rt_run_flush+0x64/0x8b
                         [<c03e9bbe>] fib_netdev_event+0x61/0x65
                         [<c013ac34>] notifier_call_chain+0x2a/0x52
                         [<c013ac7e>] raw_notifier_call_chain+0x17/0x1a
                         [<c03a340d>] netdev_state_change+0x18/0x29
                         [<c03ac645>] __linkwatch_run_queue+0x150/0x17e
                         [<c03ac690>] linkwatch_event+0x1d/0x22
                         [<c0133d87>] run_workqueue+0xdb/0x1b6
                         [<c0133d33>] run_workqueue+0x87/0x1b6
                         [<c03ac673>] linkwatch_event+0x0/0x22
                         [<c01347a7>] worker_thread+0x0/0x85
                         [<c0134820>] worker_thread+0x79/0x85
                         [<c0137255>] autoremove_wake_function+0x0/0x35
                         [<c013719e>] kthread+0x38/0x5e
                         [<c0137166>] kthread+0x0/0x5e
                         [<c0104baf>] kernel_thread_helper+0x7/0x10
                         [<ffffffff>] 0xffffffff
    hardirq-on-W at:
                         [<c01417e2>] find_usage_backwards+0xbb/0xe2
                         [<c0104ee2>] dump_trace+0x83/0x8d
                         [<c014285e>] __lock_acquire+0x488/0xc07
                         [<c0109ef2>] save_stack_trace+0x20/0x3a
                         [<c0142f95>] __lock_acquire+0xbbf/0xc07
                         [<c0412d66>] ipv6_dev_mc_inc+0x24d/0x31c
                         [<c0143056>] lock_acquire+0x79/0x93
                         [<c04129ea>] igmp6_group_added+0x18/0x11d
                         [<c043a8aa>] _spin_lock_bh+0x3b/0x64
                         [<c04129ea>] igmp6_group_added+0x18/0x11d
                         [<c04129ea>] igmp6_group_added+0x18/0x11d
                         [<c0141f93>] trace_hardirqs_on+0x122/0x14c
                         [<c0412dbc>] ipv6_dev_mc_inc+0x2a3/0x31c
                         [<c0412d66>] ipv6_dev_mc_inc+0x24d/0x31c
                         [<c0412df1>] ipv6_dev_mc_inc+0x2d8/0x31c
                         [<c0412b19>] ipv6_dev_mc_inc+0x0/0x31c
                         [<c0402168>] ipv6_add_dev+0x21c/0x24b
                         [<c040b991>] ndisc_ifinfo_sysctl_change+0x0/0x1ef
                         [<c05c5ae9>] addrconf_init+0x13/0x193
                         [<c019a04b>] proc_net_fops_create+0x10/0x21
                         [<c041a44c>] ip6_flowlabel_init+0x1e/0x20
                         [<c05c59c9>] inet6_init+0x1f0/0x2ad
                         [<c05a9499>] kernel_init+0x150/0x2b7
                         [<c05a9349>] kernel_init+0x0/0x2b7
                         [<c05a9349>] kernel_init+0x0/0x2b7
                         [<c0104baf>] kernel_thread_helper+0x7/0x10
                         [<ffffffff>] 0xffffffff
  }
  ... key      at: [<c087e2d8>] __key.30803+0x0/0x8
  -> (_xmit_ETHER){-...} ops: 8 {
     initial-use  at:
                           [<c0142890>] __lock_acquire+0x4ba/0xc07
                           [<c0143056>] lock_acquire+0x79/0x93
                           [<c03a63c9>] dev_mc_add+0x1a/0x6a
                           [<c043a8aa>] _spin_lock_bh+0x3b/0x64
                           [<c03a63c9>] dev_mc_add+0x1a/0x6a
                           [<c03a63c9>] dev_mc_add+0x1a/0x6a
                           [<c0412a28>] igmp6_group_added+0x56/0x11d
                           [<c0412dbc>] ipv6_dev_mc_inc+0x2a3/0x31c
                           [<c0410100>] icmpv6_rcv+0x6a4/0x828
                           [<c0412df1>] ipv6_dev_mc_inc+0x2d8/0x31c
                           [<c0412b19>] ipv6_dev_mc_inc+0x0/0x31c
                           [<c0402168>] ipv6_add_dev+0x21c/0x24b
                           [<c040b991>] ndisc_ifinfo_sysctl_change+0x0/0x1ef
                           [<c040274b>] addrconf_notify+0x60/0x7b7
                           [<c0142f95>] __lock_acquire+0xbbf/0xc07
                           [<c0141da0>] mark_held_locks+0x39/0x53
                           [<c043997e>] mutex_lock_nested+0x286/0x2ac
                           [<c0141f93>] trace_hardirqs_on+0x122/0x14c
                           [<c043999c>] mutex_lock_nested+0x2a4/0x2ac
                           [<c03a4875>] register_netdevice_notifier+0xe/0x126
                           [<c03a4875>] register_netdevice_notifier+0xe/0x126
                           [<c03a48b0>] register_netdevice_notifier+0x49/0x126
                           [<c05c5b83>] addrconf_init+0xad/0x193
                           [<c05c5af1>] addrconf_init+0x1b/0x193
                           [<c05c59c9>] inet6_init+0x1f0/0x2ad
                           [<c05a9499>] kernel_init+0x150/0x2b7
                           [<c05a9349>] kernel_init+0x0/0x2b7
                           [<c05a9349>] kernel_init+0x0/0x2b7
                           [<c0104baf>] kernel_thread_helper+0x7/0x10
                           [<ffffffff>] 0xffffffff
     hardirq-on-W at:
                           [<c014197a>] mark_lock+0x64/0x451
                           [<c014285e>] __lock_acquire+0x488/0xc07
                           [<c0143056>] lock_acquire+0x79/0x93
                           [<c03a63c9>] dev_mc_add+0x1a/0x6a
                           [<c043a8aa>] _spin_lock_bh+0x3b/0x64
                           [<c03a63c9>] dev_mc_add+0x1a/0x6a
                           [<c03a63c9>] dev_mc_add+0x1a/0x6a
                           [<c0412a28>] igmp6_group_added+0x56/0x11d
                           [<c0412dbc>] ipv6_dev_mc_inc+0x2a3/0x31c
                           [<c0410100>] icmpv6_rcv+0x6a4/0x828
                           [<c0412df1>] ipv6_dev_mc_inc+0x2d8/0x31c
                           [<c0412b19>] ipv6_dev_mc_inc+0x0/0x31c
                           [<c0402168>] ipv6_add_dev+0x21c/0x24b
                           [<c040b991>] ndisc_ifinfo_sysctl_change+0x0/0x1ef
                           [<c040274b>] addrconf_notify+0x60/0x7b7
                           [<c0142f95>] __lock_acquire+0xbbf/0xc07
                           [<c0141da0>] mark_held_locks+0x39/0x53
                           [<c043997e>] mutex_lock_nested+0x286/0x2ac
                           [<c0141f93>] trace_hardirqs_on+0x122/0x14c
                           [<c043999c>] mutex_lock_nested+0x2a4/0x2ac
                           [<c03a4875>] register_netdevice_notifier+0xe/0x126
                           [<c03a4875>] register_netdevice_notifier+0xe/0x126
                           [<c03a48b0>] register_netdevice_notifier+0x49/0x126
                           [<c05c5b83>] addrconf_init+0xad/0x193
                           [<c05c5af1>] addrconf_init+0x1b/0x193
                           [<c05c59c9>] inet6_init+0x1f0/0x2ad
                           [<c05a9499>] kernel_init+0x150/0x2b7
                           [<c05a9349>] kernel_init+0x0/0x2b7
                           [<c05a9349>] kernel_init+0x0/0x2b7
                           [<c0104baf>] kernel_thread_helper+0x7/0x10
                           [<ffffffff>] 0xffffffff
   }
   ... key      at: [<c087adc8>] netdev_xmit_lock_key+0x8/0x1c0
  ... acquired at:
    [<c0142df3>] __lock_acquire+0xa1d/0xc07
    [<c03a63c9>] dev_mc_add+0x1a/0x6a
    [<c0143056>] lock_acquire+0x79/0x93
    [<c03a63c9>] dev_mc_add+0x1a/0x6a
    [<c043a8aa>] _spin_lock_bh+0x3b/0x64
    [<c03a63c9>] dev_mc_add+0x1a/0x6a
    [<c03a63c9>] dev_mc_add+0x1a/0x6a
    [<c0412a28>] igmp6_group_added+0x56/0x11d
    [<c0412dbc>] ipv6_dev_mc_inc+0x2a3/0x31c
    [<c0410100>] icmpv6_rcv+0x6a4/0x828
    [<c0412df1>] ipv6_dev_mc_inc+0x2d8/0x31c
    [<c0412b19>] ipv6_dev_mc_inc+0x0/0x31c
    [<c0402168>] ipv6_add_dev+0x21c/0x24b
    [<c040b991>] ndisc_ifinfo_sysctl_change+0x0/0x1ef
    [<c040274b>] addrconf_notify+0x60/0x7b7
    [<c0142f95>] __lock_acquire+0xbbf/0xc07
    [<c0141da0>] mark_held_locks+0x39/0x53
    [<c043997e>] mutex_lock_nested+0x286/0x2ac
    [<c0141f93>] trace_hardirqs_on+0x122/0x14c
    [<c043999c>] mutex_lock_nested+0x2a4/0x2ac
    [<c03a4875>] register_netdevice_notifier+0xe/0x126
    [<c03a4875>] register_netdevice_notifier+0xe/0x126
    [<c03a48b0>] register_netdevice_notifier+0x49/0x126
    [<c05c5b83>] addrconf_init+0xad/0x193
    [<c05c5af1>] addrconf_init+0x1b/0x193
    [<c05c59c9>] inet6_init+0x1f0/0x2ad
    [<c05a9499>] kernel_init+0x150/0x2b7
    [<c05a9349>] kernel_init+0x0/0x2b7
    [<c05a9349>] kernel_init+0x0/0x2b7
    [<c0104baf>] kernel_thread_helper+0x7/0x10
    [<ffffffff>] 0xffffffff

  -> (&bonding_netdev_xmit_lock_key){-...} ops: 6 {
     initial-use  at:
                           [<c0142890>] __lock_acquire+0x4ba/0xc07
                           [<c0143056>] lock_acquire+0x79/0x93
                           [<c03a63c9>] dev_mc_add+0x1a/0x6a
                           [<c043a8aa>] _spin_lock_bh+0x3b/0x64
                           [<c03a63c9>] dev_mc_add+0x1a/0x6a
                           [<c03a63c9>] dev_mc_add+0x1a/0x6a
                           [<c0412a28>] igmp6_group_added+0x56/0x11d
                           [<c0412dbc>] ipv6_dev_mc_inc+0x2a3/0x31c
                           [<c0410100>] icmpv6_rcv+0x6a4/0x828
                           [<c0412df1>] ipv6_dev_mc_inc+0x2d8/0x31c
                           [<c0412b19>] ipv6_dev_mc_inc+0x0/0x31c
                           [<c0402168>] ipv6_add_dev+0x21c/0x24b
                           [<c040b991>] ndisc_ifinfo_sysctl_change+0x0/0x1ef
                           [<c040274b>] addrconf_notify+0x60/0x7b7
                           [<c0142f95>] __lock_acquire+0xbbf/0xc07
                           [<c0141da0>] mark_held_locks+0x39/0x53
                           [<c043997e>] mutex_lock_nested+0x286/0x2ac
                           [<c0141f93>] trace_hardirqs_on+0x122/0x14c
                           [<c043999c>] mutex_lock_nested+0x2a4/0x2ac
                           [<c03a4875>] register_netdevice_notifier+0xe/0x126
                           [<c03a4875>] register_netdevice_notifier+0xe/0x126
                           [<c03a48b0>] register_netdevice_notifier+0x49/0x126
                           [<c05c5b83>] addrconf_init+0xad/0x193
                           [<c05c5af1>] addrconf_init+0x1b/0x193
                           [<c05c59c9>] inet6_init+0x1f0/0x2ad
                           [<c05a9499>] kernel_init+0x150/0x2b7
                           [<c05a9349>] kernel_init+0x0/0x2b7
                           [<c05a9349>] kernel_init+0x0/0x2b7
                           [<c0104baf>] kernel_thread_helper+0x7/0x10
                           [<ffffffff>] 0xffffffff
     hardirq-on-W at:
                           [<c014197a>] mark_lock+0x64/0x451
                           [<c014285e>] __lock_acquire+0x488/0xc07
                           [<c0143056>] lock_acquire+0x79/0x93
                           [<c03a63c9>] dev_mc_add+0x1a/0x6a
                           [<c043a8aa>] _spin_lock_bh+0x3b/0x64
                           [<c03a63c9>] dev_mc_add+0x1a/0x6a
                           [<c03a63c9>] dev_mc_add+0x1a/0x6a
                           [<c0412a28>] igmp6_group_added+0x56/0x11d
                           [<c0412dbc>] ipv6_dev_mc_inc+0x2a3/0x31c
                           [<c0410100>] icmpv6_rcv+0x6a4/0x828
                           [<c0412df1>] ipv6_dev_mc_inc+0x2d8/0x31c
                           [<c0412b19>] ipv6_dev_mc_inc+0x0/0x31c
                           [<c0402168>] ipv6_add_dev+0x21c/0x24b
                           [<c040b991>] ndisc_ifinfo_sysctl_change+0x0/0x1ef
                           [<c040274b>] addrconf_notify+0x60/0x7b7
                           [<c0142f95>] __lock_acquire+0xbbf/0xc07
                           [<c0141da0>] mark_held_locks+0x39/0x53
                           [<c043997e>] mutex_lock_nested+0x286/0x2ac
                           [<c0141f93>] trace_hardirqs_on+0x122/0x14c
                           [<c043999c>] mutex_lock_nested+0x2a4/0x2ac
                           [<c03a4875>] register_netdevice_notifier+0xe/0x126
                           [<c03a4875>] register_netdevice_notifier+0xe/0x126
                           [<c03a48b0>] register_netdevice_notifier+0x49/0x126
                           [<c05c5b83>] addrconf_init+0xad/0x193
                           [<c05c5af1>] addrconf_init+0x1b/0x193
                           [<c05c59c9>] inet6_init+0x1f0/0x2ad
                           [<c05a9499>] kernel_init+0x150/0x2b7
                           [<c05a9349>] kernel_init+0x0/0x2b7
                           [<c05a9349>] kernel_init+0x0/0x2b7
                           [<c0104baf>] kernel_thread_helper+0x7/0x10
                           [<ffffffff>] 0xffffffff
   }
   ... key      at: [<c0877804>] bonding_netdev_xmit_lock_key+0x0/0x8
   -> (&bond->lock){-.--} ops: 99 {
      initial-use  at:
                             [<c013fe29>] put_lock_stats+0xa/0x1e
                             [<c0142890>] __lock_acquire+0x4ba/0xc07
                             [<c0141da0>] mark_held_locks+0x39/0x53
                             [<c043af0d>] _spin_unlock_irqrestore+0x40/0x58
                             [<c0143056>] lock_acquire+0x79/0x93
                             [<c02edca1>] bond_get_stats+0x28/0xd0
                             [<c043aa36>] _read_lock_bh+0x3b/0x64
                             [<c02edca1>] bond_get_stats+0x28/0xd0
                             [<c02edca1>] bond_get_stats+0x28/0xd0
                             [<c03aad0f>] rtnl_fill_ifinfo+0x2bf/0x563
                             [<c03ab28d>] rtmsg_ifinfo+0x5d/0xdf
                             [<c03ab34e>] rtnetlink_event+0x3f/0x42
                             [<c013ac34>] notifier_call_chain+0x2a/0x52
                             [<c013ac7e>] raw_notifier_call_chain+0x17/0x1a
                             [<c03a3b06>] register_netdevice+0x2a7/0x2e7
                             [<c02ed842>] bond_create+0x1f2/0x26a
                             [<c05bed76>] bonding_init+0x761/0x7ea
                             [<c05be5de>] e1000_init_module+0x45/0x7c
                             [<c05a9499>] kernel_init+0x150/0x2b7
                             [<c05a9349>] kernel_init+0x0/0x2b7
                             [<c05a9349>] kernel_init+0x0/0x2b7
                             [<c0104baf>] kernel_thread_helper+0x7/0x10
                             [<ffffffff>] 0xffffffff
      hardirq-on-W at:
                             [<c014285e>] __lock_acquire+0x488/0xc07
                             [<c01208c9>] try_to_wake_up+0x2ce/0x2d8
                             [<c0142f95>] __lock_acquire+0xbbf/0xc07
                             [<c0143056>] lock_acquire+0x79/0x93
                             [<c02eda55>] bond_set_multicast_list+0x1d/0x241
                             [<c043a96d>] _write_lock_bh+0x3b/0x64
                             [<c02eda55>] bond_set_multicast_list+0x1d/0x241
                             [<c02eda55>] bond_set_multicast_list+0x1d/0x241
                             [<c013fe29>] put_lock_stats+0xa/0x1e
                             [<c03a1e03>] __dev_set_rx_mode+0x7b/0x7d
                             [<c03a1f9d>] dev_set_rx_mode+0x23/0x36
                             [<c03a4678>] dev_open+0x5e/0x77
                             [<c03a3347>] dev_change_flags+0x9d/0x14b
                             [<c03a214b>] __dev_get_by_name+0x68/0x73
                             [<c03e4176>] devinet_ioctl+0x22b/0x536
                             [<c03a446d>] dev_ioctl+0x46f/0x5b7
                             [<c039a598>] sock_ioctl+0x167/0x18b
                             [<c039a431>] sock_ioctl+0x0/0x18b
                             [<c0172677>] do_ioctl+0x1f/0x62
                             [<c01728e7>] vfs_ioctl+0x22d/0x23f
                             [<c0141f93>] trace_hardirqs_on+0x122/0x14c
                             [<c017292c>] sys_ioctl+0x33/0x4b
                             [<c0103e92>] sysenter_past_esp+0x5f/0xa5
                             [<ffffffff>] 0xffffffff
      softirq-on-R at:
                             [<c014197a>] mark_lock+0x64/0x451
                             [<c013583a>] __kernel_text_address+0x5/0xe
                             [<c0104ee2>] dump_trace+0x83/0x8d
                             [<c014287d>] __lock_acquire+0x4a7/0xc07
                             [<c013fc6a>] save_trace+0x37/0x89
                             [<c0133d33>] run_workqueue+0x87/0x1b6
                             [<c0143056>] lock_acquire+0x79/0x93
                             [<c02eee43>] bond_mii_monitor+0x19/0x85
                             [<c043aa95>] _read_lock+0x36/0x5f
                             [<c02eee43>] bond_mii_monitor+0x19/0x85
                             [<c02eee43>] bond_mii_monitor+0x19/0x85
                             [<c0133d87>] run_workqueue+0xdb/0x1b6
                             [<c0133d33>] run_workqueue+0x87/0x1b6
                             [<c02eee2a>] bond_mii_monitor+0x0/0x85
                             [<c01347a7>] worker_thread+0x0/0x85
                             [<c0134820>] worker_thread+0x79/0x85
                             [<c0137255>] autoremove_wake_function+0x0/0x35
                             [<c013719e>] kthread+0x38/0x5e
                             [<c0137166>] kthread+0x0/0x5e
                             [<c0104baf>] kernel_thread_helper+0x7/0x10
                             [<ffffffff>] 0xffffffff
      hardirq-on-R at:
                             [<c013fdfe>] get_lock_stats+0xd/0x2e
                             [<c013fe29>] put_lock_stats+0xa/0x1e
                             [<c0142838>] __lock_acquire+0x462/0xc07
                             [<c0141da0>] mark_held_locks+0x39/0x53
                             [<c043af0d>] _spin_unlock_irqrestore+0x40/0x58
                             [<c0143056>] lock_acquire+0x79/0x93
                             [<c02edca1>] bond_get_stats+0x28/0xd0
                             [<c043aa36>] _read_lock_bh+0x3b/0x64
                             [<c02edca1>] bond_get_stats+0x28/0xd0
                             [<c02edca1>] bond_get_stats+0x28/0xd0
                             [<c03aad0f>] rtnl_fill_ifinfo+0x2bf/0x563
                             [<c03ab28d>] rtmsg_ifinfo+0x5d/0xdf
                             [<c03ab34e>] rtnetlink_event+0x3f/0x42
                             [<c013ac34>] notifier_call_chain+0x2a/0x52
                             [<c013ac7e>] raw_notifier_call_chain+0x17/0x1a
                             [<c03a3b06>] register_netdevice+0x2a7/0x2e7
                             [<c02ed842>] bond_create+0x1f2/0x26a
                             [<c05bed76>] bonding_init+0x761/0x7ea
                             [<c05be5de>] e1000_init_module+0x45/0x7c
                             [<c05a9499>] kernel_init+0x150/0x2b7
                             [<c05a9349>] kernel_init+0x0/0x2b7
                             [<c05a9349>] kernel_init+0x0/0x2b7
                             [<c0104baf>] kernel_thread_helper+0x7/0x10
                             [<ffffffff>] 0xffffffff
    }
    ... key      at: [<c08777d0>] __key.32976+0x0/0x8
    -> (_xmit_ETHER){-...} ops: 8 {
       initial-use  at:
                               [<c0142890>] __lock_acquire+0x4ba/0xc07
                               [<c0143056>] lock_acquire+0x79/0x93
                               [<c03a63c9>] dev_mc_add+0x1a/0x6a
                               [<c043a8aa>] _spin_lock_bh+0x3b/0x64
                               [<c03a63c9>] dev_mc_add+0x1a/0x6a
                               [<c03a63c9>] dev_mc_add+0x1a/0x6a
                               [<c0412a28>] igmp6_group_added+0x56/0x11d
                               [<c0412dbc>] ipv6_dev_mc_inc+0x2a3/0x31c
                               [<c0410100>] icmpv6_rcv+0x6a4/0x828
                               [<c0412df1>] ipv6_dev_mc_inc+0x2d8/0x31c
                               [<c0412b19>] ipv6_dev_mc_inc+0x0/0x31c
                               [<c0402168>] ipv6_add_dev+0x21c/0x24b
                               [<c040b991>] ndisc_ifinfo_sysctl_change+0x0/0x1ef
                               [<c040274b>] addrconf_notify+0x60/0x7b7
                               [<c0142f95>] __lock_acquire+0xbbf/0xc07
                               [<c0141da0>] mark_held_locks+0x39/0x53
                               [<c043997e>] mutex_lock_nested+0x286/0x2ac
                               [<c0141f93>] trace_hardirqs_on+0x122/0x14c
                               [<c043999c>] mutex_lock_nested+0x2a4/0x2ac
                               [<c03a4875>] register_netdevice_notifier+0xe/0x126
                               [<c03a4875>] register_netdevice_notifier+0xe/0x126
                               [<c03a48b0>] register_netdevice_notifier+0x49/0x126
                               [<c05c5b83>] addrconf_init+0xad/0x193
                               [<c05c5af1>] addrconf_init+0x1b/0x193
                               [<c05c59c9>] inet6_init+0x1f0/0x2ad
                               [<c05a9499>] kernel_init+0x150/0x2b7
                               [<c05a9349>] kernel_init+0x0/0x2b7
                               [<c05a9349>] kernel_init+0x0/0x2b7
                               [<c0104baf>] kernel_thread_helper+0x7/0x10
                               [<ffffffff>] 0xffffffff
       hardirq-on-W at:
                               [<c014197a>] mark_lock+0x64/0x451
                               [<c014285e>] __lock_acquire+0x488/0xc07
                               [<c0143056>] lock_acquire+0x79/0x93
                               [<c03a63c9>] dev_mc_add+0x1a/0x6a
                               [<c043a8aa>] _spin_lock_bh+0x3b/0x64
                               [<c03a63c9>] dev_mc_add+0x1a/0x6a
                               [<c03a63c9>] dev_mc_add+0x1a/0x6a
                               [<c0412a28>] igmp6_group_added+0x56/0x11d
                               [<c0412dbc>] ipv6_dev_mc_inc+0x2a3/0x31c
                               [<c0410100>] icmpv6_rcv+0x6a4/0x828
                               [<c0412df1>] ipv6_dev_mc_inc+0x2d8/0x31c
                               [<c0412b19>] ipv6_dev_mc_inc+0x0/0x31c
                               [<c0402168>] ipv6_add_dev+0x21c/0x24b
                               [<c040b991>] ndisc_ifinfo_sysctl_change+0x0/0x1ef
                               [<c040274b>] addrconf_notify+0x60/0x7b7
                               [<c0142f95>] __lock_acquire+0xbbf/0xc07
                               [<c0141da0>] mark_held_locks+0x39/0x53
                               [<c043997e>] mutex_lock_nested+0x286/0x2ac
                               [<c0141f93>] trace_hardirqs_on+0x122/0x14c
                               [<c043999c>] mutex_lock_nested+0x2a4/0x2ac
                               [<c03a4875>] register_netdevice_notifier+0xe/0x126
                               [<c03a4875>] register_netdevice_notifier+0xe/0x126
                               [<c03a48b0>] register_netdevice_notifier+0x49/0x126
                               [<c05c5b83>] addrconf_init+0xad/0x193
                               [<c05c5af1>] addrconf_init+0x1b/0x193
                               [<c05c59c9>] inet6_init+0x1f0/0x2ad
                               [<c05a9499>] kernel_init+0x150/0x2b7
                               [<c05a9349>] kernel_init+0x0/0x2b7
                               [<c05a9349>] kernel_init+0x0/0x2b7
                               [<c0104baf>] kernel_thread_helper+0x7/0x10
                               [<ffffffff>] 0xffffffff
     }
     ... key      at: [<c087adc8>] netdev_xmit_lock_key+0x8/0x1c0
    ... acquired at:
    [<c0142df3>] __lock_acquire+0xa1d/0xc07
    [<c03a63c9>] dev_mc_add+0x1a/0x6a
    [<c043af0d>] _spin_unlock_irqrestore+0x40/0x58
    [<c0109ef2>] save_stack_trace+0x20/0x3a
    [<c0143056>] lock_acquire+0x79/0x93
    [<c03a63c9>] dev_mc_add+0x1a/0x6a
    [<c043a8aa>] _spin_lock_bh+0x3b/0x64
    [<c03a63c9>] dev_mc_add+0x1a/0x6a
    [<c03a63c9>] dev_mc_add+0x1a/0x6a
    [<c02ee472>] bond_change_active_slave+0x1a9/0x3bf
    [<c02ec7a3>] bond_update_speed_duplex+0x26/0x65
    [<c02ee991>] bond_select_active_slave+0x97/0xd3
    [<c02ed20b>] bond_compute_features+0x45/0x84
    [<c02ef9b6>] bond_enslave+0x6a7/0x884
    [<c043999c>] mutex_lock_nested+0x2a4/0x2ac
    [<c02f61af>] bonding_store_slaves+0x1ae/0x2fb
    [<c02f6001>] bonding_store_slaves+0x0/0x2fb
    [<c02ce8b7>] dev_attr_store+0x27/0x2c
    [<c019bda1>] sysfs_write_file+0xad/0xe0
    [<c019bcf4>] sysfs_write_file+0x0/0xe0
    [<c0168e5c>] vfs_write+0x8a/0x10c
    [<c0118562>] do_page_fault+0x0/0x54a
    [<c01693e1>] sys_write+0x41/0x67
    [<c0103e92>] sysenter_past_esp+0x5f/0xa5
    [<ffffffff>] 0xffffffff

    -> (lweventlist_lock){.+..} ops: 10 {
       initial-use  at:
                               [<c014197a>] mark_lock+0x64/0x451
                               [<c0142890>] __lock_acquire+0x4ba/0xc07
                               [<c02e363c>] e1000_read_phy_reg+0x1c7/0x1d3
                               [<c02e346b>] e1000_write_phy_reg+0xb9/0xc3
                               [<c024a82e>] delay_tsc+0x25/0x3b
                               [<c0143056>] lock_acquire+0x79/0x93
                               [<c03ac41f>] linkwatch_add_event+0xd/0x2c
                               [<c043abc7>] _spin_lock_irqsave+0x3f/0x6c
                               [<c03ac41f>] linkwatch_add_event+0xd/0x2c
                               [<c03ac41f>] linkwatch_add_event+0xd/0x2c
                               [<c03ac4e3>] linkwatch_fire_event+0x25/0x37
                               [<c02e1c23>] e1000_probe+0xad1/0xbe8
                               [<c0257f1b>] pci_device_probe+0x36/0x57
                               [<c02d0e3f>] driver_probe_device+0xe1/0x15f
                               [<c043ae41>] _spin_unlock+0x25/0x3b
                               [<c0437ec6>] klist_next+0x58/0x6d
                               [<c02d0f4f>] __driver_attach+0x0/0x7f
                               [<c02d0f98>] __driver_attach+0x49/0x7f
                               [<c02d03e3>] bus_for_each_dev+0x36/0x58
                               [<c02d0c97>] driver_attach+0x16/0x18
                               [<c02d0f4f>] __driver_attach+0x0/0x7f
                               [<c02d06da>] bus_add_driver+0x6d/0x18d
                               [<c0258065>] __pci_register_driver+0x53/0x7f
                               [<c05be5de>] e1000_init_module+0x45/0x7c
                               [<c05a9499>] kernel_init+0x150/0x2b7
                               [<c05a9349>] kernel_init+0x0/0x2b7
                               [<c05a9349>] kernel_init+0x0/0x2b7
                               [<c0104baf>] kernel_thread_helper+0x7/0x10
                               [<ffffffff>] 0xffffffff
       in-softirq-W at:
                               [<c011d1c0>] __wake_up_common+0x32/0x5c
                               [<c0142816>] __lock_acquire+0x440/0xc07
                               [<c043af0d>] _spin_unlock_irqrestore+0x40/0x58
                               [<c0143056>] lock_acquire+0x79/0x93
                               [<c03ac41f>] linkwatch_add_event+0xd/0x2c
                               [<c02dfee1>] e1000_watchdog+0x0/0x5c9
                               [<c043abc7>] _spin_lock_irqsave+0x3f/0x6c
                               [<c03ac41f>] linkwatch_add_event+0xd/0x2c
                               [<c03ac41f>] linkwatch_add_event+0xd/0x2c
                               [<c03ac4e3>] linkwatch_fire_event+0x25/0x37
                               [<c03af46a>] netif_carrier_on+0x16/0x27
                               [<c02e0136>] e1000_watchdog+0x255/0x5c9
                               [<c02dfee1>] e1000_watchdog+0x0/0x5c9
                               [<c012e02e>] run_timer_softirq+0xfa/0x15d
                               [<c012a982>] __do_softirq+0x56/0xdb
                               [<c0141f7d>] trace_hardirqs_on+0x10c/0x14c
                               [<c012a994>] __do_softirq+0x68/0xdb
                               [<c012aa3d>] do_softirq+0x36/0x51
                               [<c012abe3>] irq_exit+0x43/0x4e
                               [<c0114122>] smp_apic_timer_interrupt+0x74/0x80
                               [<c0104a01>] apic_timer_interrupt+0x29/0x38
                               [<c0104a0b>] apic_timer_interrupt+0x33/0x38
                               [<c01600d8>] sys_swapon+0x254/0x9aa
                               [<c01021a6>] mwait_idle_with_hints+0x3b/0x3f
                               [<c0102447>] mwait_idle+0x0/0xf
                               [<c0102581>] cpu_idle+0x99/0xc6
                               [<c05a98c7>] start_kernel+0x2c7/0x2cf
                               [<c05a90e0>] unknown_bootoption+0x0/0x195
                               [<ffffffff>] 0xffffffff
     }
     ... key      at: [<c058a194>] lweventlist_lock+0x14/0x40
    ... acquired at:
    [<c0142df3>] __lock_acquire+0xa1d/0xc07
    [<c03ac41f>] linkwatch_add_event+0xd/0x2c
    [<c0142f95>] __lock_acquire+0xbbf/0xc07
    [<c0143056>] lock_acquire+0x79/0x93
    [<c03ac41f>] linkwatch_add_event+0xd/0x2c
    [<c043abc7>] _spin_lock_irqsave+0x3f/0x6c
    [<c03ac41f>] linkwatch_add_event+0xd/0x2c
    [<c03ac41f>] linkwatch_add_event+0xd/0x2c
    [<c03ac4e3>] linkwatch_fire_event+0x25/0x37
    [<c03af46a>] netif_carrier_on+0x16/0x27
    [<c02ede0c>] bond_set_carrier+0x31/0x55
    [<c02ee998>] bond_select_active_slave+0x9e/0xd3
    [<c02ed20b>] bond_compute_features+0x45/0x84
    [<c02ef9b6>] bond_enslave+0x6a7/0x884
    [<c043999c>] mutex_lock_nested+0x2a4/0x2ac
    [<c02f61af>] bonding_store_slaves+0x1ae/0x2fb
    [<c02f6001>] bonding_store_slaves+0x0/0x2fb
    [<c02ce8b7>] dev_attr_store+0x27/0x2c
    [<c019bda1>] sysfs_write_file+0xad/0xe0
    [<c019bcf4>] sysfs_write_file+0x0/0xe0
    [<c0168e5c>] vfs_write+0x8a/0x10c
    [<c0118562>] do_page_fault+0x0/0x54a
    [<c01693e1>] sys_write+0x41/0x67
    [<c0103e92>] sysenter_past_esp+0x5f/0xa5
    [<ffffffff>] 0xffffffff

   ... acquired at:
    [<c0142df3>] __lock_acquire+0xa1d/0xc07
    [<c02eda55>] bond_set_multicast_list+0x1d/0x241
    [<c0143056>] lock_acquire+0x79/0x93
    [<c02eda55>] bond_set_multicast_list+0x1d/0x241
    [<c043a96d>] _write_lock_bh+0x3b/0x64
    [<c02eda55>] bond_set_multicast_list+0x1d/0x241
    [<c02eda55>] bond_set_multicast_list+0x1d/0x241
    [<c013fe29>] put_lock_stats+0xa/0x1e
    [<c03a1e03>] __dev_set_rx_mode+0x7b/0x7d
    [<c03a1f9d>] dev_set_rx_mode+0x23/0x36
    [<c03a4678>] dev_open+0x5e/0x77
    [<c03a3347>] dev_change_flags+0x9d/0x14b
    [<c03a214b>] __dev_get_by_name+0x68/0x73
    [<c03e4176>] devinet_ioctl+0x22b/0x536
    [<c03a446d>] dev_ioctl+0x46f/0x5b7
    [<c039a598>] sock_ioctl+0x167/0x18b
    [<c039a431>] sock_ioctl+0x0/0x18b
    [<c0172677>] do_ioctl+0x1f/0x62
    [<c01728e7>] vfs_ioctl+0x22d/0x23f
    [<c0141f93>] trace_hardirqs_on+0x122/0x14c
    [<c017292c>] sys_ioctl+0x33/0x4b
    [<c0103e92>] sysenter_past_esp+0x5f/0xa5
    [<ffffffff>] 0xffffffff

  ... acquired at:
    [<c0142df3>] __lock_acquire+0xa1d/0xc07
    [<c03a63c9>] dev_mc_add+0x1a/0x6a
    [<c0143056>] lock_acquire+0x79/0x93
    [<c03a63c9>] dev_mc_add+0x1a/0x6a
    [<c043a8aa>] _spin_lock_bh+0x3b/0x64
    [<c03a63c9>] dev_mc_add+0x1a/0x6a
    [<c03a63c9>] dev_mc_add+0x1a/0x6a
    [<c0412a28>] igmp6_group_added+0x56/0x11d
    [<c0412dbc>] ipv6_dev_mc_inc+0x2a3/0x31c
    [<c0410100>] icmpv6_rcv+0x6a4/0x828
    [<c0412df1>] ipv6_dev_mc_inc+0x2d8/0x31c
    [<c0412b19>] ipv6_dev_mc_inc+0x0/0x31c
    [<c0402168>] ipv6_add_dev+0x21c/0x24b
    [<c040b991>] ndisc_ifinfo_sysctl_change+0x0/0x1ef
    [<c040274b>] addrconf_notify+0x60/0x7b7
    [<c0142f95>] __lock_acquire+0xbbf/0xc07
    [<c0141da0>] mark_held_locks+0x39/0x53
    [<c043997e>] mutex_lock_nested+0x286/0x2ac
    [<c0141f93>] trace_hardirqs_on+0x122/0x14c
    [<c043999c>] mutex_lock_nested+0x2a4/0x2ac
    [<c03a4875>] register_netdevice_notifier+0xe/0x126
    [<c03a4875>] register_netdevice_notifier+0xe/0x126
    [<c03a48b0>] register_netdevice_notifier+0x49/0x126
    [<c05c5b83>] addrconf_init+0xad/0x193
    [<c05c5af1>] addrconf_init+0x1b/0x193
    [<c05c59c9>] inet6_init+0x1f0/0x2ad
    [<c05a9499>] kernel_init+0x150/0x2b7
    [<c05a9349>] kernel_init+0x0/0x2b7
    [<c05a9349>] kernel_init+0x0/0x2b7
    [<c0104baf>] kernel_thread_helper+0x7/0x10
    [<ffffffff>] 0xffffffff


the second lock's dependencies:
-> (&bond->lock){-.--} ops: 99 {
    initial-use  at:
                         [<c013fe29>] put_lock_stats+0xa/0x1e
                         [<c0142890>] __lock_acquire+0x4ba/0xc07
                         [<c0141da0>] mark_held_locks+0x39/0x53
                         [<c043af0d>] _spin_unlock_irqrestore+0x40/0x58
                         [<c0143056>] lock_acquire+0x79/0x93
                         [<c02edca1>] bond_get_stats+0x28/0xd0
                         [<c043aa36>] _read_lock_bh+0x3b/0x64
                         [<c02edca1>] bond_get_stats+0x28/0xd0
                         [<c02edca1>] bond_get_stats+0x28/0xd0
                         [<c03aad0f>] rtnl_fill_ifinfo+0x2bf/0x563
                         [<c03ab28d>] rtmsg_ifinfo+0x5d/0xdf
                         [<c03ab34e>] rtnetlink_event+0x3f/0x42
                         [<c013ac34>] notifier_call_chain+0x2a/0x52
                         [<c013ac7e>] raw_notifier_call_chain+0x17/0x1a
                         [<c03a3b06>] register_netdevice+0x2a7/0x2e7
                         [<c02ed842>] bond_create+0x1f2/0x26a
                         [<c05bed76>] bonding_init+0x761/0x7ea
                         [<c05be5de>] e1000_init_module+0x45/0x7c
                         [<c05a9499>] kernel_init+0x150/0x2b7
                         [<c05a9349>] kernel_init+0x0/0x2b7
                         [<c05a9349>] kernel_init+0x0/0x2b7
                         [<c0104baf>] kernel_thread_helper+0x7/0x10
                         [<ffffffff>] 0xffffffff
    hardirq-on-W at:
                         [<c014285e>] __lock_acquire+0x488/0xc07
                         [<c01208c9>] try_to_wake_up+0x2ce/0x2d8
                         [<c0142f95>] __lock_acquire+0xbbf/0xc07
                         [<c0143056>] lock_acquire+0x79/0x93
                         [<c02eda55>] bond_set_multicast_list+0x1d/0x241
                         [<c043a96d>] _write_lock_bh+0x3b/0x64
                         [<c02eda55>] bond_set_multicast_list+0x1d/0x241
                         [<c02eda55>] bond_set_multicast_list+0x1d/0x241
                         [<c013fe29>] put_lock_stats+0xa/0x1e
                         [<c03a1e03>] __dev_set_rx_mode+0x7b/0x7d
                         [<c03a1f9d>] dev_set_rx_mode+0x23/0x36
                         [<c03a4678>] dev_open+0x5e/0x77
                         [<c03a3347>] dev_change_flags+0x9d/0x14b
                         [<c03a214b>] __dev_get_by_name+0x68/0x73
                         [<c03e4176>] devinet_ioctl+0x22b/0x536
                         [<c03a446d>] dev_ioctl+0x46f/0x5b7
                         [<c039a598>] sock_ioctl+0x167/0x18b
                         [<c039a431>] sock_ioctl+0x0/0x18b
                         [<c0172677>] do_ioctl+0x1f/0x62
                         [<c01728e7>] vfs_ioctl+0x22d/0x23f
                         [<c0141f93>] trace_hardirqs_on+0x122/0x14c
                         [<c017292c>] sys_ioctl+0x33/0x4b
                         [<c0103e92>] sysenter_past_esp+0x5f/0xa5
                         [<ffffffff>] 0xffffffff
    softirq-on-R at:
                         [<c014197a>] mark_lock+0x64/0x451
                         [<c013583a>] __kernel_text_address+0x5/0xe
                         [<c0104ee2>] dump_trace+0x83/0x8d
                         [<c014287d>] __lock_acquire+0x4a7/0xc07
                         [<c013fc6a>] save_trace+0x37/0x89
                         [<c0133d33>] run_workqueue+0x87/0x1b6
                         [<c0143056>] lock_acquire+0x79/0x93
                         [<c02eee43>] bond_mii_monitor+0x19/0x85
                         [<c043aa95>] _read_lock+0x36/0x5f
                         [<c02eee43>] bond_mii_monitor+0x19/0x85
                         [<c02eee43>] bond_mii_monitor+0x19/0x85
                         [<c0133d87>] run_workqueue+0xdb/0x1b6
                         [<c0133d33>] run_workqueue+0x87/0x1b6
                         [<c02eee2a>] bond_mii_monitor+0x0/0x85
                         [<c01347a7>] worker_thread+0x0/0x85
                         [<c0134820>] worker_thread+0x79/0x85
                         [<c0137255>] autoremove_wake_function+0x0/0x35
                         [<c013719e>] kthread+0x38/0x5e
                         [<c0137166>] kthread+0x0/0x5e
                         [<c0104baf>] kernel_thread_helper+0x7/0x10
                         [<ffffffff>] 0xffffffff
    hardirq-on-R at:
                         [<c013fdfe>] get_lock_stats+0xd/0x2e
                         [<c013fe29>] put_lock_stats+0xa/0x1e
                         [<c0142838>] __lock_acquire+0x462/0xc07
                         [<c0141da0>] mark_held_locks+0x39/0x53
                         [<c043af0d>] _spin_unlock_irqrestore+0x40/0x58
                         [<c0143056>] lock_acquire+0x79/0x93
                         [<c02edca1>] bond_get_stats+0x28/0xd0
                         [<c043aa36>] _read_lock_bh+0x3b/0x64
                         [<c02edca1>] bond_get_stats+0x28/0xd0
                         [<c02edca1>] bond_get_stats+0x28/0xd0
                         [<c03aad0f>] rtnl_fill_ifinfo+0x2bf/0x563
                         [<c03ab28d>] rtmsg_ifinfo+0x5d/0xdf
                         [<c03ab34e>] rtnetlink_event+0x3f/0x42
                         [<c013ac34>] notifier_call_chain+0x2a/0x52
                         [<c013ac7e>] raw_notifier_call_chain+0x17/0x1a
                         [<c03a3b06>] register_netdevice+0x2a7/0x2e7
                         [<c02ed842>] bond_create+0x1f2/0x26a
                         [<c05bed76>] bonding_init+0x761/0x7ea
                         [<c05be5de>] e1000_init_module+0x45/0x7c
                         [<c05a9499>] kernel_init+0x150/0x2b7
                         [<c05a9349>] kernel_init+0x0/0x2b7
                         [<c05a9349>] kernel_init+0x0/0x2b7
                         [<c0104baf>] kernel_thread_helper+0x7/0x10
                         [<ffffffff>] 0xffffffff
  }
  ... key      at: [<c08777d0>] __key.32976+0x0/0x8
  -> (_xmit_ETHER){-...} ops: 8 {
     initial-use  at:
                           [<c0142890>] __lock_acquire+0x4ba/0xc07
                           [<c0143056>] lock_acquire+0x79/0x93
                           [<c03a63c9>] dev_mc_add+0x1a/0x6a
                           [<c043a8aa>] _spin_lock_bh+0x3b/0x64
                           [<c03a63c9>] dev_mc_add+0x1a/0x6a
                           [<c03a63c9>] dev_mc_add+0x1a/0x6a
                           [<c0412a28>] igmp6_group_added+0x56/0x11d
                           [<c0412dbc>] ipv6_dev_mc_inc+0x2a3/0x31c
                           [<c0410100>] icmpv6_rcv+0x6a4/0x828
                           [<c0412df1>] ipv6_dev_mc_inc+0x2d8/0x31c
                           [<c0412b19>] ipv6_dev_mc_inc+0x0/0x31c
                           [<c0402168>] ipv6_add_dev+0x21c/0x24b
                           [<c040b991>] ndisc_ifinfo_sysctl_change+0x0/0x1ef
                           [<c040274b>] addrconf_notify+0x60/0x7b7
                           [<c0142f95>] __lock_acquire+0xbbf/0xc07
                           [<c0141da0>] mark_held_locks+0x39/0x53
                           [<c043997e>] mutex_lock_nested+0x286/0x2ac
                           [<c0141f93>] trace_hardirqs_on+0x122/0x14c
                           [<c043999c>] mutex_lock_nested+0x2a4/0x2ac
                           [<c03a4875>] register_netdevice_notifier+0xe/0x126
                           [<c03a4875>] register_netdevice_notifier+0xe/0x126
                           [<c03a48b0>] register_netdevice_notifier+0x49/0x126
                           [<c05c5b83>] addrconf_init+0xad/0x193
                           [<c05c5af1>] addrconf_init+0x1b/0x193
                           [<c05c59c9>] inet6_init+0x1f0/0x2ad
                           [<c05a9499>] kernel_init+0x150/0x2b7
                           [<c05a9349>] kernel_init+0x0/0x2b7
                           [<c05a9349>] kernel_init+0x0/0x2b7
                           [<c0104baf>] kernel_thread_helper+0x7/0x10
                           [<ffffffff>] 0xffffffff
     hardirq-on-W at:
                           [<c014197a>] mark_lock+0x64/0x451
                           [<c014285e>] __lock_acquire+0x488/0xc07
                           [<c0143056>] lock_acquire+0x79/0x93
                           [<c03a63c9>] dev_mc_add+0x1a/0x6a
                           [<c043a8aa>] _spin_lock_bh+0x3b/0x64
                           [<c03a63c9>] dev_mc_add+0x1a/0x6a
                           [<c03a63c9>] dev_mc_add+0x1a/0x6a
                           [<c0412a28>] igmp6_group_added+0x56/0x11d
                           [<c0412dbc>] ipv6_dev_mc_inc+0x2a3/0x31c
                           [<c0410100>] icmpv6_rcv+0x6a4/0x828
                           [<c0412df1>] ipv6_dev_mc_inc+0x2d8/0x31c
                           [<c0412b19>] ipv6_dev_mc_inc+0x0/0x31c
                           [<c0402168>] ipv6_add_dev+0x21c/0x24b
                           [<c040b991>] ndisc_ifinfo_sysctl_change+0x0/0x1ef
                           [<c040274b>] addrconf_notify+0x60/0x7b7
                           [<c0142f95>] __lock_acquire+0xbbf/0xc07
                           [<c0141da0>] mark_held_locks+0x39/0x53
                           [<c043997e>] mutex_lock_nested+0x286/0x2ac
                           [<c0141f93>] trace_hardirqs_on+0x122/0x14c
                           [<c043999c>] mutex_lock_nested+0x2a4/0x2ac
                           [<c03a4875>] register_netdevice_notifier+0xe/0x126
                           [<c03a4875>] register_netdevice_notifier+0xe/0x126
                           [<c03a48b0>] register_netdevice_notifier+0x49/0x126
                           [<c05c5b83>] addrconf_init+0xad/0x193
                           [<c05c5af1>] addrconf_init+0x1b/0x193
                           [<c05c59c9>] inet6_init+0x1f0/0x2ad
                           [<c05a9499>] kernel_init+0x150/0x2b7
                           [<c05a9349>] kernel_init+0x0/0x2b7
                           [<c05a9349>] kernel_init+0x0/0x2b7
                           [<c0104baf>] kernel_thread_helper+0x7/0x10
                           [<ffffffff>] 0xffffffff
   }
   ... key      at: [<c087adc8>] netdev_xmit_lock_key+0x8/0x1c0
  ... acquired at:
    [<c0142df3>] __lock_acquire+0xa1d/0xc07
    [<c03a63c9>] dev_mc_add+0x1a/0x6a
    [<c043af0d>] _spin_unlock_irqrestore+0x40/0x58
    [<c0109ef2>] save_stack_trace+0x20/0x3a
    [<c0143056>] lock_acquire+0x79/0x93
    [<c03a63c9>] dev_mc_add+0x1a/0x6a
    [<c043a8aa>] _spin_lock_bh+0x3b/0x64
    [<c03a63c9>] dev_mc_add+0x1a/0x6a
    [<c03a63c9>] dev_mc_add+0x1a/0x6a
    [<c02ee472>] bond_change_active_slave+0x1a9/0x3bf
    [<c02ec7a3>] bond_update_speed_duplex+0x26/0x65
    [<c02ee991>] bond_select_active_slave+0x97/0xd3
    [<c02ed20b>] bond_compute_features+0x45/0x84
    [<c02ef9b6>] bond_enslave+0x6a7/0x884
    [<c043999c>] mutex_lock_nested+0x2a4/0x2ac
    [<c02f61af>] bonding_store_slaves+0x1ae/0x2fb
    [<c02f6001>] bonding_store_slaves+0x0/0x2fb
    [<c02ce8b7>] dev_attr_store+0x27/0x2c
    [<c019bda1>] sysfs_write_file+0xad/0xe0
    [<c019bcf4>] sysfs_write_file+0x0/0xe0
    [<c0168e5c>] vfs_write+0x8a/0x10c
    [<c0118562>] do_page_fault+0x0/0x54a
    [<c01693e1>] sys_write+0x41/0x67
    [<c0103e92>] sysenter_past_esp+0x5f/0xa5
    [<ffffffff>] 0xffffffff

  -> (lweventlist_lock){.+..} ops: 10 {
     initial-use  at:
                           [<c014197a>] mark_lock+0x64/0x451
                           [<c0142890>] __lock_acquire+0x4ba/0xc07
                           [<c02e363c>] e1000_read_phy_reg+0x1c7/0x1d3
                           [<c02e346b>] e1000_write_phy_reg+0xb9/0xc3
                           [<c024a82e>] delay_tsc+0x25/0x3b
                           [<c0143056>] lock_acquire+0x79/0x93
                           [<c03ac41f>] linkwatch_add_event+0xd/0x2c
                           [<c043abc7>] _spin_lock_irqsave+0x3f/0x6c
                           [<c03ac41f>] linkwatch_add_event+0xd/0x2c
                           [<c03ac41f>] linkwatch_add_event+0xd/0x2c
                           [<c03ac4e3>] linkwatch_fire_event+0x25/0x37
                           [<c02e1c23>] e1000_probe+0xad1/0xbe8
                           [<c0257f1b>] pci_device_probe+0x36/0x57
                           [<c02d0e3f>] driver_probe_device+0xe1/0x15f
                           [<c043ae41>] _spin_unlock+0x25/0x3b
                           [<c0437ec6>] klist_next+0x58/0x6d
                           [<c02d0f4f>] __driver_attach+0x0/0x7f
                           [<c02d0f98>] __driver_attach+0x49/0x7f
                           [<c02d03e3>] bus_for_each_dev+0x36/0x58
                           [<c02d0c97>] driver_attach+0x16/0x18
                           [<c02d0f4f>] __driver_attach+0x0/0x7f
                           [<c02d06da>] bus_add_driver+0x6d/0x18d
                           [<c0258065>] __pci_register_driver+0x53/0x7f
                           [<c05be5de>] e1000_init_module+0x45/0x7c
                           [<c05a9499>] kernel_init+0x150/0x2b7
                           [<c05a9349>] kernel_init+0x0/0x2b7
                           [<c05a9349>] kernel_init+0x0/0x2b7
                           [<c0104baf>] kernel_thread_helper+0x7/0x10
                           [<ffffffff>] 0xffffffff
     in-softirq-W at:
                           [<c011d1c0>] __wake_up_common+0x32/0x5c
                           [<c0142816>] __lock_acquire+0x440/0xc07
                           [<c043af0d>] _spin_unlock_irqrestore+0x40/0x58
                           [<c0143056>] lock_acquire+0x79/0x93
                           [<c03ac41f>] linkwatch_add_event+0xd/0x2c
                           [<c02dfee1>] e1000_watchdog+0x0/0x5c9
                           [<c043abc7>] _spin_lock_irqsave+0x3f/0x6c
                           [<c03ac41f>] linkwatch_add_event+0xd/0x2c
                           [<c03ac41f>] linkwatch_add_event+0xd/0x2c
                           [<c03ac4e3>] linkwatch_fire_event+0x25/0x37
                           [<c03af46a>] netif_carrier_on+0x16/0x27
                           [<c02e0136>] e1000_watchdog+0x255/0x5c9
                           [<c02dfee1>] e1000_watchdog+0x0/0x5c9
                           [<c012e02e>] run_timer_softirq+0xfa/0x15d
                           [<c012a982>] __do_softirq+0x56/0xdb
                           [<c0141f7d>] trace_hardirqs_on+0x10c/0x14c
                           [<c012a994>] __do_softirq+0x68/0xdb
                           [<c012aa3d>] do_softirq+0x36/0x51
                           [<c012abe3>] irq_exit+0x43/0x4e
                           [<c0114122>] smp_apic_timer_interrupt+0x74/0x80
                           [<c0104a01>] apic_timer_interrupt+0x29/0x38
                           [<c0104a0b>] apic_timer_interrupt+0x33/0x38
                           [<c01600d8>] sys_swapon+0x254/0x9aa
                           [<c01021a6>] mwait_idle_with_hints+0x3b/0x3f
                           [<c0102447>] mwait_idle+0x0/0xf
                           [<c0102581>] cpu_idle+0x99/0xc6
                           [<c05a98c7>] start_kernel+0x2c7/0x2cf
                           [<c05a90e0>] unknown_bootoption+0x0/0x195
                           [<ffffffff>] 0xffffffff
   }
   ... key      at: [<c058a194>] lweventlist_lock+0x14/0x40
  ... acquired at:
    [<c0142df3>] __lock_acquire+0xa1d/0xc07
    [<c03ac41f>] linkwatch_add_event+0xd/0x2c
    [<c0142f95>] __lock_acquire+0xbbf/0xc07
    [<c0143056>] lock_acquire+0x79/0x93
    [<c03ac41f>] linkwatch_add_event+0xd/0x2c
    [<c043abc7>] _spin_lock_irqsave+0x3f/0x6c
    [<c03ac41f>] linkwatch_add_event+0xd/0x2c
    [<c03ac41f>] linkwatch_add_event+0xd/0x2c
    [<c03ac4e3>] linkwatch_fire_event+0x25/0x37
    [<c03af46a>] netif_carrier_on+0x16/0x27
    [<c02ede0c>] bond_set_carrier+0x31/0x55
    [<c02ee998>] bond_select_active_slave+0x9e/0xd3
    [<c02ed20b>] bond_compute_features+0x45/0x84
    [<c02ef9b6>] bond_enslave+0x6a7/0x884
    [<c043999c>] mutex_lock_nested+0x2a4/0x2ac
    [<c02f61af>] bonding_store_slaves+0x1ae/0x2fb
    [<c02f6001>] bonding_store_slaves+0x0/0x2fb
    [<c02ce8b7>] dev_attr_store+0x27/0x2c
    [<c019bda1>] sysfs_write_file+0xad/0xe0
    [<c019bcf4>] sysfs_write_file+0x0/0xe0
    [<c0168e5c>] vfs_write+0x8a/0x10c
    [<c0118562>] do_page_fault+0x0/0x54a
    [<c01693e1>] sys_write+0x41/0x67
    [<c0103e92>] sysenter_past_esp+0x5f/0xa5
    [<ffffffff>] 0xffffffff


stack backtrace:
Pid: 9, comm: events/0 Not tainted 2.6.24-rc7 #1
  [<c0140b2c>] print_irq_inversion_bug+0x108/0x112
  [<c0141911>] check_usage_forwards+0x3c/0x41
  [<c0141afd>] mark_lock+0x1e7/0x451
  [<c0142816>] __lock_acquire+0x440/0xc07
  [<c0103f7b>] restore_nocheck+0x12/0x15
  [<c0143056>] lock_acquire+0x79/0x93
  [<c041258e>] mld_ifc_timer_expire+0x130/0x1fb
  [<c041245e>] mld_ifc_timer_expire+0x0/0x1fb
  [<c043a8aa>] _spin_lock_bh+0x3b/0x64
  [<c041258e>] mld_ifc_timer_expire+0x130/0x1fb
  [<c041258e>] mld_ifc_timer_expire+0x130/0x1fb
  [<c041245e>] mld_ifc_timer_expire+0x0/0x1fb
  [<c0141f7d>] trace_hardirqs_on+0x10c/0x14c
  [<c041245e>] mld_ifc_timer_expire+0x0/0x1fb
  [<c012e02e>] run_timer_softirq+0xfa/0x15d
  [<c012a982>] __do_softirq+0x56/0xdb
  [<c0141f7d>] trace_hardirqs_on+0x10c/0x14c
  [<c012a994>] __do_softirq+0x68/0xdb
  [<c012aa3d>] do_softirq+0x36/0x51
  [<c012af26>] local_bh_enable_ip+0xad/0xed
  [<c03bfa37>] rt_run_flush+0x64/0x8b
  [<c03e9bbe>] fib_netdev_event+0x61/0x65
  [<c013ac34>] notifier_call_chain+0x2a/0x52
  [<c013ac7e>] raw_notifier_call_chain+0x17/0x1a
  [<c03a340d>] netdev_state_change+0x18/0x29
  [<c03ac645>] __linkwatch_run_queue+0x150/0x17e
  [<c03ac690>] linkwatch_event+0x1d/0x22
  [<c0133d87>] run_workqueue+0xdb/0x1b6
  [<c0133d33>] run_workqueue+0x87/0x1b6
  [<c03ac673>] linkwatch_event+0x0/0x22
  [<c01347a7>] worker_thread+0x0/0x85
  [<c0134820>] worker_thread+0x79/0x85
  [<c0137255>] autoremove_wake_function+0x0/0x35
  [<c013719e>] kthread+0x38/0x5e
  [<c0137166>] kthread+0x0/0x5e
  [<c0104baf>] kernel_thread_helper+0x7/0x10
  =======================
bonding: bond0: enslaving eth1 as a backup interface with a down link.
bond0: no IPv6 routers present



Best regards,

 				Krzysztof Olędzki

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/3] bonding: 3 fixes for 2.6.24
  2008-01-08 18:50 ` [PATCH 0/3] bonding: 3 fixes for 2.6.24 Krzysztof Oledzki
@ 2008-01-08 19:17   ` Andy Gospodarek
  2008-01-08 20:28     ` Jay Vosburgh
  2008-01-09  6:08     ` Herbert Xu
  2008-01-08 19:30   ` Jay Vosburgh
  1 sibling, 2 replies; 30+ messages in thread
From: Andy Gospodarek @ 2008-01-08 19:17 UTC (permalink / raw)
  To: Krzysztof Oledzki
  Cc: Jay Vosburgh, netdev, Jeff Garzik, David Miller, Andy Gospodarek

On Tue, Jan 08, 2008 at 07:50:22PM +0100, Krzysztof Oledzki wrote:
> 
> 
> On Mon, 7 Jan 2008, Jay Vosburgh wrote:
> 
> >	Following are three fixes to fix locking problems and
> >silence locking-related warnings in the current 2.6.24-rc.
> >
> >	patch 1: fix locking in sysfs primary/active selection
> >
> >	Call core network functions with expected locks to
> >eliminate potential deadlock and silence warnings.
> >
> >	patch 2: fix ASSERT_RTNL that produces spurious warnings
> >
> >	Relocate ASSERT_RTNL to remove a false warning; after patch,
> >ASSERT is located in code that holds only RTNL (additional locks were
> >causing the ASSERT to trip)
> >
> >	patch 3: fix locking during alb failover and slave removal
> >
> >	Fix all call paths into alb_fasten_mac_swap to hold only RTNL.
> >Eliminates deadlock and silences warnings.
> >
> >	Patches are against the current netdev-2.6#upstream branch.
> >
> >	Please apply for 2.6.24.
> 
> 2.6.24-rc7 + patches #1, #2, #3:
> 
> bonding: bond0: setting mode to active-backup (1).
> bonding: bond0: Setting MII monitoring interval to 100.
> ADDRCONF(NETDEV_UP): bond0: link is not ready
> bonding: bond0: Adding slave eth0.
> e1000: eth0: e1000_watchdog: NIC Link is Up 1000 Mbps Full Duplex, Flow 
> Control: RX/TX
> bonding: bond0: making interface eth0 the new active one.
> bonding: bond0: first active interface up!
> bonding: bond0: enslaving eth0 as an active interface with an up link.
> bonding: bond0: Adding slave eth1.
> ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
> 
> =========================================================
> [ INFO: possible irq lock inversion dependency detected ]
> 2.6.24-rc7 #1
> ---------------------------------------------------------
> events/0/9 just changed the state of lock:
>  (&mc->mca_lock){-+..}, at: [<c041258e>] mld_ifc_timer_expire+0x130/0x1fb
> but this lock took another, soft-read-irq-unsafe lock in the past:
>  (&bond->lock){-.--}
> 
> and interrupts could create inverse lock ordering between them.
> 
> 

Jay's patches will not fix this issue.  I think something like this did
it for me, but as I mentioned to Jay in the last thread, I'm not
convinced it doesn't violate some of the locking expectations we have.

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 423298c..3c6619a 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -3915,7 +3915,7 @@ static void bond_set_multicast_list(struct net_device *bond_dev)
 	struct bonding *bond = bond_dev->priv;
 	struct dev_mc_list *dmi;
 
-	write_lock_bh(&bond->lock);
+	read_lock(&bond->lock);
 
 	/*
 	 * Do promisc before checking multicast_mode
@@ -3957,7 +3957,7 @@ static void bond_set_multicast_list(struct net_device *bond_dev)
 	bond_mc_list_destroy(bond);
 	bond_mc_list_copy(bond_dev->mc_list, bond, GFP_ATOMIC);
 
-	write_unlock_bh(&bond->lock);
+	read_unlock(&bond->lock);
 }
 
 /*



^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/3] bonding: 3 fixes for 2.6.24
  2008-01-08 18:50 ` [PATCH 0/3] bonding: 3 fixes for 2.6.24 Krzysztof Oledzki
  2008-01-08 19:17   ` Andy Gospodarek
@ 2008-01-08 19:30   ` Jay Vosburgh
  2008-01-09  6:35     ` Krzysztof Oledzki
  1 sibling, 1 reply; 30+ messages in thread
From: Jay Vosburgh @ 2008-01-08 19:30 UTC (permalink / raw)
  To: Krzysztof Oledzki; +Cc: netdev, Jeff Garzik, David Miller, Andy Gospodarek

Krzysztof Oledzki <olel@ans.pl> wrote:

>On Mon, 7 Jan 2008, Jay Vosburgh wrote:
>
>> 	Following are three fixes to fix locking problems and
>> silence locking-related warnings in the current 2.6.24-rc.
>>
>> 	patch 1: fix locking in sysfs primary/active selection
>>
>> 	Call core network functions with expected locks to
>> eliminate potential deadlock and silence warnings.
>>
>> 	patch 2: fix ASSERT_RTNL that produces spurious warnings
>>
>> 	Relocate ASSERT_RTNL to remove a false warning; after patch,
>> ASSERT is located in code that holds only RTNL (additional locks were
>> causing the ASSERT to trip)
>>
>> 	patch 3: fix locking during alb failover and slave removal
>>
>> 	Fix all call paths into alb_fasten_mac_swap to hold only RTNL.
>> Eliminates deadlock and silences warnings.
>>
>> 	Patches are against the current netdev-2.6#upstream branch.
>>
>> 	Please apply for 2.6.24.
>
>2.6.24-rc7 + patches #1, #2, #3:
>
>bonding: bond0: setting mode to active-backup (1).
>bonding: bond0: Setting MII monitoring interval to 100.
>ADDRCONF(NETDEV_UP): bond0: link is not ready
>bonding: bond0: Adding slave eth0.
>e1000: eth0: e1000_watchdog: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
>bonding: bond0: making interface eth0 the new active one.
>bonding: bond0: first active interface up!
>bonding: bond0: enslaving eth0 as an active interface with an up link.
>bonding: bond0: Adding slave eth1.
>ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
>
>=========================================================
>[ INFO: possible irq lock inversion dependency detected ]
>2.6.24-rc7 #1
>---------------------------------------------------------
>events/0/9 just changed the state of lock:
> (&mc->mca_lock){-+..}, at: [<c041258e>] mld_ifc_timer_expire+0x130/0x1fb
>but this lock took another, soft-read-irq-unsafe lock in the past:
> (&bond->lock){-.--}
>
>and interrupts could create inverse lock ordering between them.

	Just to be clear: the patch set I posted yesterday was not
intended to resolve the lockdep problem; I haven't studied that one yet.

	-J

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/3] bonding: 3 fixes for 2.6.24
  2008-01-08 19:17   ` Andy Gospodarek
@ 2008-01-08 20:28     ` Jay Vosburgh
  2008-01-09  6:08     ` Herbert Xu
  1 sibling, 0 replies; 30+ messages in thread
From: Jay Vosburgh @ 2008-01-08 20:28 UTC (permalink / raw)
  To: Andy Gospodarek; +Cc: Krzysztof Oledzki, netdev, Jeff Garzik, David Miller

Andy Gospodarek <andy@greyhouse.net> wrote:
[...]
>Jay's patches will not fix this issue.  I think something like this did
>it for me, but as I mentioned to Jay in the last thread, I'm not
>convinced it doesn't violate some of the locking expectations we have.
>
>diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>index 423298c..3c6619a 100644
>--- a/drivers/net/bonding/bond_main.c
>+++ b/drivers/net/bonding/bond_main.c
>@@ -3915,7 +3915,7 @@ static void bond_set_multicast_list(struct net_device *bond_dev)
> 	struct bonding *bond = bond_dev->priv;
> 	struct dev_mc_list *dmi;
>
>-	write_lock_bh(&bond->lock);
>+	read_lock(&bond->lock);
>
> 	/*
> 	 * Do promisc before checking multicast_mode
>@@ -3957,7 +3957,7 @@ static void bond_set_multicast_list(struct net_device *bond_dev)
> 	bond_mc_list_destroy(bond);
> 	bond_mc_list_copy(bond_dev->mc_list, bond, GFP_ATOMIC);
>
>-	write_unlock_bh(&bond->lock);
>+	read_unlock(&bond->lock);
> }
>
> /*

	Actually, I think we might be good here with no locks at all, as
it appears that all of the accesses to and manipulations of the
bond->mc_list are protected under RTNL.  I haven't checked this 100%,
but it looks that way to me after 20 minutes of poking around.  I'm
pretty sure that bonding doesn't internally mess with the mc_lists
without RTNL, it's the outside callers that I'm not entirely sure of.

	I delve into "no locks" because bond_set_multicast_list should
do a bunch of things with no extra locks beyond RTNL (all of the calls
to bond_set_promisc, and _allmulti), so simply removing the acquisition
of bond->lock would help there, too.  I don't think we'll go down the
promisc or allmulti paths when called from ipv6 (which holds extra locks
in addition to RTNL) because those (apparently) won't alter the
IFF_PROMISC or IFF_ALLMULTI flags.

	-J

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/3] bonding: 3 fixes for 2.6.24
  2008-01-08 19:17   ` Andy Gospodarek
  2008-01-08 20:28     ` Jay Vosburgh
@ 2008-01-09  6:08     ` Herbert Xu
  1 sibling, 0 replies; 30+ messages in thread
From: Herbert Xu @ 2008-01-09  6:08 UTC (permalink / raw)
  To: Andy Gospodarek; +Cc: olel, fubar, netdev, jgarzik, davem, andy

Andy Gospodarek <andy@greyhouse.net> wrote:
> 
> Jay's patches will not fix this issue.  I think something like this did
> it for me, but as I mentioned to Jay in the last thread, I'm not
> convinced it doesn't violate some of the locking expectations we have.
> 
> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> index 423298c..3c6619a 100644
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -3915,7 +3915,7 @@ static void bond_set_multicast_list(struct net_device *bond_dev)
>        struct bonding *bond = bond_dev->priv;
>        struct dev_mc_list *dmi;
> 
> -       write_lock_bh(&bond->lock);
> +       read_lock(&bond->lock);

This is wrong.  As I said before, the correct fix (until set_multicast
no longer gets called from BH context) is to change all you process
context read_lock's to read_lock_bh.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/3] bonding: 3 fixes for 2.6.24
  2008-01-08 19:30   ` Jay Vosburgh
@ 2008-01-09  6:35     ` Krzysztof Oledzki
  2008-01-09  7:58       ` Jay Vosburgh
  0 siblings, 1 reply; 30+ messages in thread
From: Krzysztof Oledzki @ 2008-01-09  6:35 UTC (permalink / raw)
  To: Jay Vosburgh; +Cc: netdev, Jeff Garzik, David Miller, Andy Gospodarek

[-- Attachment #1: Type: TEXT/PLAIN, Size: 2403 bytes --]



On Tue, 8 Jan 2008, Jay Vosburgh wrote:

> Krzysztof Oledzki <olel@ans.pl> wrote:
>
>> On Mon, 7 Jan 2008, Jay Vosburgh wrote:
>>
>>> 	Following are three fixes to fix locking problems and
>>> silence locking-related warnings in the current 2.6.24-rc.
>>>
>>> 	patch 1: fix locking in sysfs primary/active selection
>>>
>>> 	Call core network functions with expected locks to
>>> eliminate potential deadlock and silence warnings.
>>>
>>> 	patch 2: fix ASSERT_RTNL that produces spurious warnings
>>>
>>> 	Relocate ASSERT_RTNL to remove a false warning; after patch,
>>> ASSERT is located in code that holds only RTNL (additional locks were
>>> causing the ASSERT to trip)
>>>
>>> 	patch 3: fix locking during alb failover and slave removal
>>>
>>> 	Fix all call paths into alb_fasten_mac_swap to hold only RTNL.
>>> Eliminates deadlock and silences warnings.
>>>
>>> 	Patches are against the current netdev-2.6#upstream branch.
>>>
>>> 	Please apply for 2.6.24.
>>
>> 2.6.24-rc7 + patches #1, #2, #3:
>>
>> bonding: bond0: setting mode to active-backup (1).
>> bonding: bond0: Setting MII monitoring interval to 100.
>> ADDRCONF(NETDEV_UP): bond0: link is not ready
>> bonding: bond0: Adding slave eth0.
>> e1000: eth0: e1000_watchdog: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
>> bonding: bond0: making interface eth0 the new active one.
>> bonding: bond0: first active interface up!
>> bonding: bond0: enslaving eth0 as an active interface with an up link.
>> bonding: bond0: Adding slave eth1.
>> ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
>>
>> =========================================================
>> [ INFO: possible irq lock inversion dependency detected ]
>> 2.6.24-rc7 #1
>> ---------------------------------------------------------
>> events/0/9 just changed the state of lock:
>> (&mc->mca_lock){-+..}, at: [<c041258e>] mld_ifc_timer_expire+0x130/0x1fb
>> but this lock took another, soft-read-irq-unsafe lock in the past:
>> (&bond->lock){-.--}
>>
>> and interrupts could create inverse lock ordering between them.
>
> 	Just to be clear: the patch set I posted yesterday was not
> intended to resolve the lockdep problem; I haven't studied that one yet.

Fine. Just let you know that someone test your patches and everything 
works, except mentioned problem.

Best regards,

 				Krzysztof Olędzki

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/3] bonding: 3 fixes for 2.6.24
  2008-01-09  6:35     ` Krzysztof Oledzki
@ 2008-01-09  7:58       ` Jay Vosburgh
  2008-01-09  9:36         ` Krzysztof Oledzki
  2008-01-09 15:27         ` Andy Gospodarek
  0 siblings, 2 replies; 30+ messages in thread
From: Jay Vosburgh @ 2008-01-09  7:58 UTC (permalink / raw)
  To: Krzysztof Oledzki
  Cc: netdev, Jeff Garzik, David Miller, Andy Gospodarek, Herbert Xu

Krzysztof Oledzki <olel@ans.pl> wrote:

>Fine. Just let you know that someone test your patches and everything
>works, except mentioned problem.

	And I appreciate it; I just wanted to make sure our many fans
following along at home didn't misunderstand.

	Could you let me know if the patch below make the lockdep
warning go away?  This applies on top of the previous three, although it
should be trivial to do by hand.

	I'm still checking to make sure this is safe with regard to
mutexing the bonding structures, but it would be good to know if it
eliminates the warning.

	-J

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com


diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 77d004d..1baaadc 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -3937,8 +3937,6 @@ static void bond_set_multicast_list(struct net_device *bond_dev)
 	struct bonding *bond = bond_dev->priv;
 	struct dev_mc_list *dmi;
 
-	write_lock_bh(&bond->lock);
-
 	/*
 	 * Do promisc before checking multicast_mode
 	 */
@@ -3978,8 +3976,6 @@ static void bond_set_multicast_list(struct net_device *bond_dev)
 	/* save master's multicast list */
 	bond_mc_list_destroy(bond);
 	bond_mc_list_copy(bond_dev->mc_list, bond, GFP_ATOMIC);
-
-	write_unlock_bh(&bond->lock);
 }
 
 /*

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/3] bonding: 3 fixes for 2.6.24
  2008-01-09  7:58       ` Jay Vosburgh
@ 2008-01-09  9:36         ` Krzysztof Oledzki
  2008-01-09 15:27         ` Andy Gospodarek
  1 sibling, 0 replies; 30+ messages in thread
From: Krzysztof Oledzki @ 2008-01-09  9:36 UTC (permalink / raw)
  To: Jay Vosburgh
  Cc: netdev, Jeff Garzik, David Miller, Andy Gospodarek, Herbert Xu

[-- Attachment #1: Type: TEXT/PLAIN, Size: 800 bytes --]



On Tue, 8 Jan 2008, Jay Vosburgh wrote:

> Krzysztof Oledzki <olel@ans.pl> wrote:
>
>> Fine. Just let you know that someone test your patches and everything
>> works, except mentioned problem.
>
> 	And I appreciate it; I just wanted to make sure our many fans
> following along at home didn't misunderstand.
>
> 	Could you let me know if the patch below make the lockdep
> warning go away?  This applies on top of the previous three, although it
> should be trivial to do by hand.
>
> 	I'm still checking to make sure this is safe with regard to
> mutexing the bonding structures, but it would be good to know if it
> eliminates the warning.

I can confirm that the warning went away.

Tested-by: Krzysztof Piotr Oledzki <ole@ans.pl>

Best regards,

 				Krzysztof Olędzki

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/3] bonding: 3 fixes for 2.6.24
  2008-01-09  7:58       ` Jay Vosburgh
  2008-01-09  9:36         ` Krzysztof Oledzki
@ 2008-01-09 15:27         ` Andy Gospodarek
  2008-01-09 17:54           ` Jay Vosburgh
  1 sibling, 1 reply; 30+ messages in thread
From: Andy Gospodarek @ 2008-01-09 15:27 UTC (permalink / raw)
  To: Jay Vosburgh
  Cc: Krzysztof Oledzki, netdev, Jeff Garzik, David Miller,
	Andy Gospodarek, Herbert Xu

On Tue, Jan 08, 2008 at 11:58:34PM -0800, Jay Vosburgh wrote:
> Krzysztof Oledzki <olel@ans.pl> wrote:
> 
> >Fine. Just let you know that someone test your patches and everything
> >works, except mentioned problem.
> 
> 	And I appreciate it; I just wanted to make sure our many fans
> following along at home didn't misunderstand.
> 
> 	Could you let me know if the patch below make the lockdep
> warning go away?  This applies on top of the previous three, although it
> should be trivial to do by hand.
> 
> 	I'm still checking to make sure this is safe with regard to
> mutexing the bonding structures, but it would be good to know if it
> eliminates the warning.
> 
> 	-J
> 

Jay,

My initial concern was that a slave device could disappear out from
under us, but it seems like this certainly isn't the case since all
calls to bond_release are protected by rtnl-locks, so I think you are
correct that we are safe.  I'll test this on my setup here and let you
know if I see any problems.

-andy




^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/3] bonding: 3 fixes for 2.6.24
  2008-01-09 15:27         ` Andy Gospodarek
@ 2008-01-09 17:54           ` Jay Vosburgh
  2008-01-09 20:17             ` Andy Gospodarek
  0 siblings, 1 reply; 30+ messages in thread
From: Jay Vosburgh @ 2008-01-09 17:54 UTC (permalink / raw)
  To: Andy Gospodarek
  Cc: Krzysztof Oledzki, netdev, Jeff Garzik, David Miller, Herbert Xu

Andy Gospodarek <andy@greyhouse.net> wrote:
[...]
>My initial concern was that a slave device could disappear out from
>under us, but it seems like this certainly isn't the case since all
>calls to bond_release are protected by rtnl-locks, so I think you are
>correct that we are safe.  I'll test this on my setup here and let you
>know if I see any problems.

	Yep, all entries into enslave or remove come in with RTNL, so if
we have RTNL there then slaves can't vanish.

	On further inspection, I don't think it's safe to simply drop
the locks in bond_set_multicast_list, I'm seeing a couple of cases that
could be troublesome:

	bond_set_promiscuity and bond_set_allmulti both reference
curr_active_slave, which isn't protected from change by RTNL, so that
could conflict with a change_active_slave calling bond_mc_swap (which is
also holding the wrong locks for dev_set_promisc/allmulti).

	It also looks like there are paths (igmp6 for one) into
dev_mc_add that just hold a bunch of regular locks, and not RTNL, so
those wouldn't be safe from having slaves vanish due to concurrent
deslavement.

	Looks like read_lock_bh for bond-lock and curr_slave_lock is
needed in bond_set_multicast_list, and some dropping of locks is needed
inside bond_set_promisc/allmulti.  Methinks that without any locks,
bond_mc_add/delete could race with either a change of active slave or a
de-enslavement of the active slave.

	I'm wondering if this is worth trying to make perfect for 2.6.24
(and maybe making things worse), and, instead, just do this:

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 77d004d..8b9e33a 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -3937,7 +3937,7 @@ static void bond_set_multicast_list(struct net_device *bond_dev)
 	struct bonding *bond = bond_dev->priv;
 	struct dev_mc_list *dmi;
 
-	write_lock_bh(&bond->lock);
+	read_lock_bh(&bond->lock);
 
 	/*
 	 * Do promisc before checking multicast_mode
@@ -3979,7 +3979,7 @@ static void bond_set_multicast_list(struct net_device *bond_dev)
 	bond_mc_list_destroy(bond);
 	bond_mc_list_copy(bond_dev->mc_list, bond, GFP_ATOMIC);
 
-	write_unlock_bh(&bond->lock);
+	read_unlock_bh(&bond->lock);
 }
 
 /*


	This should silence the lockdep (if I'm understanding what
everybody's saying), and keep the change set to a minimum.  This might
not even be worth pushing for 2.6.24; I'm not exactly sure how difficult
the lockdep problem would be to trigger.

	The other stuff I mention above can be dealt with later; they're
very low-probability races that would be pretty difficult to hit even on
purpose.

	Thoughts?

	-J

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/3] bonding: 3 fixes for 2.6.24
  2008-01-09 17:54           ` Jay Vosburgh
@ 2008-01-09 20:17             ` Andy Gospodarek
  2008-01-09 22:05               ` Herbert Xu
  2008-01-12 10:53               ` Krzysztof Oledzki
  0 siblings, 2 replies; 30+ messages in thread
From: Andy Gospodarek @ 2008-01-09 20:17 UTC (permalink / raw)
  To: Jay Vosburgh
  Cc: Andy Gospodarek, Krzysztof Oledzki, netdev, Jeff Garzik,
	David Miller, Herbert Xu

On Wed, Jan 09, 2008 at 09:54:56AM -0800, Jay Vosburgh wrote:
> Andy Gospodarek <andy@greyhouse.net> wrote:
> [...]
> >My initial concern was that a slave device could disappear out from
> >under us, but it seems like this certainly isn't the case since all
> >calls to bond_release are protected by rtnl-locks, so I think you are
> >correct that we are safe.  I'll test this on my setup here and let you
> >know if I see any problems.
> 
> 	Yep, all entries into enslave or remove come in with RTNL, so if
> we have RTNL there then slaves can't vanish.
> 
> 	On further inspection, I don't think it's safe to simply drop
> the locks in bond_set_multicast_list, I'm seeing a couple of cases that
> could be troublesome:
> 
> 	bond_set_promiscuity and bond_set_allmulti both reference
> curr_active_slave, which isn't protected from change by RTNL, so that
> could conflict with a change_active_slave calling bond_mc_swap (which is
> also holding the wrong locks for dev_set_promisc/allmulti).
> 
> 	It also looks like there are paths (igmp6 for one) into
> dev_mc_add that just hold a bunch of regular locks, and not RTNL, so
> those wouldn't be safe from having slaves vanish due to concurrent
> deslavement.

Eeeek!  I didn't realize that rtnl wasn't held for all those calls.  If
that's the case we can't drop all the locks.

> 	Looks like read_lock_bh for bond-lock and curr_slave_lock is
> needed in bond_set_multicast_list, and some dropping of locks is needed
> inside bond_set_promisc/allmulti.  Methinks that without any locks,
> bond_mc_add/delete could race with either a change of active slave or a
> de-enslavement of the active slave.
 
Agreed.  And despite Herbert's opinion that this isn't the correct fix,
I think this will work fine.  This is one of the cases where we can take
a write_lock(bond->lock) in softirq context, so we need to drop that (or
make sure all the read_lock's are read_lock_bh's).  The latter isn't
really an option since having a majority of the bonding code run in
softirq context was what we are trying to avoid with the workqueue
conversion.

> 	I'm wondering if this is worth trying to make perfect for 2.6.24
> (and maybe making things worse), and, instead, just do this:
> 
> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> index 77d004d..8b9e33a 100644
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -3937,7 +3937,7 @@ static void bond_set_multicast_list(struct net_device *bond_dev)
>  	struct bonding *bond = bond_dev->priv;
>  	struct dev_mc_list *dmi;
>  
> -	write_lock_bh(&bond->lock);
> +	read_lock_bh(&bond->lock);
>  
>  	/*
>  	 * Do promisc before checking multicast_mode
> @@ -3979,7 +3979,7 @@ static void bond_set_multicast_list(struct net_device *bond_dev)
>  	bond_mc_list_destroy(bond);
>  	bond_mc_list_copy(bond_dev->mc_list, bond, GFP_ATOMIC);
>  
> -	write_unlock_bh(&bond->lock);
> +	read_unlock_bh(&bond->lock);
>  }
>  
>  /*
> 
> 
> 	This should silence the lockdep (if I'm understanding what
> everybody's saying), and keep the change set to a minimum.  This might

The lockdep problem is easy to trigger.  The lockdep code does a good
job of noticing problems quickly regardless of how easy the deadlocks
are to create.

> not even be worth pushing for 2.6.24; I'm not exactly sure how difficult
> the lockdep problem would be to trigger.
> 

I'd like to see it go in there (for correct-ness) and to avoid hearing
about these lockdep issues for the next few months until it makes it
into 2.4.25.

> 	The other stuff I mention above can be dealt with later; they're
> very low-probability races that would be pretty difficult to hit even on
> purpose.
> 
> 	Thoughts?
> 
> 	-J
> 
> ---
> 	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/3] bonding: 3 fixes for 2.6.24
  2008-01-09 20:17             ` Andy Gospodarek
@ 2008-01-09 22:05               ` Herbert Xu
  2008-01-09 23:19                 ` Jay Vosburgh
  2008-01-12 10:53               ` Krzysztof Oledzki
  1 sibling, 1 reply; 30+ messages in thread
From: Herbert Xu @ 2008-01-09 22:05 UTC (permalink / raw)
  To: Andy Gospodarek
  Cc: Jay Vosburgh, Krzysztof Oledzki, netdev, Jeff Garzik,
	David Miller

On Wed, Jan 09, 2008 at 03:17:09PM -0500, Andy Gospodarek wrote:
>
> Agreed.  And despite Herbert's opinion that this isn't the correct fix,
> I think this will work fine.  This is one of the cases where we can take
> a write_lock(bond->lock) in softirq context, so we need to drop that (or
> make sure all the read_lock's are read_lock_bh's).  The latter isn't
> really an option since having a majority of the bonding code run in
> softirq context was what we are trying to avoid with the workqueue
> conversion.

No that's not the point.  The point is to move the majority of the code
into process context so that you can take the RTNL.  Once you have taken
the RTNL you can disable BH all you want and I don't care one bit.

In any case, fixing a known dead-lock is important.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/3] bonding: 3 fixes for 2.6.24
  2008-01-09 22:05               ` Herbert Xu
@ 2008-01-09 23:19                 ` Jay Vosburgh
  2008-01-10  0:58                   ` Herbert Xu
  0 siblings, 1 reply; 30+ messages in thread
From: Jay Vosburgh @ 2008-01-09 23:19 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Andy Gospodarek, Krzysztof Oledzki, netdev, Jeff Garzik,
	David Miller

Herbert Xu <herbert@gondor.apana.org.au> wrote:

>On Wed, Jan 09, 2008 at 03:17:09PM -0500, Andy Gospodarek wrote:
>>
>> Agreed.  And despite Herbert's opinion that this isn't the correct fix,
>> I think this will work fine.  This is one of the cases where we can take
>> a write_lock(bond->lock) in softirq context, so we need to drop that (or
>> make sure all the read_lock's are read_lock_bh's).  The latter isn't
>> really an option since having a majority of the bonding code run in
>> softirq context was what we are trying to avoid with the workqueue
>> conversion.
>
>No that's not the point.  The point is to move the majority of the code
>into process context so that you can take the RTNL.  Once you have taken
>the RTNL you can disable BH all you want and I don't care one bit.

	I'm not sure how we could move more code into a process context;
much of the bonding driver is at the mercy of its callers, as in this
case.  The monitoring stuff and enslave / deslave is all in a process
context now (workqueue).  The transmit processing functions, for
example, can't be assumed to be in any particular context as they're
called by dev_queue_xmit.

	The function in question here is the dev->set_multicast_list
method function, which is sometimes called with RTNL, and sometimes with
other random locks, but seems to always be called with netif_tx_lock_bh.
Then, bonding's bond_set_multicast_list calls dev_set_promiscuity (on
slaves), which I believe is an "RTNL only" function (which is probably
another discussion).

>In any case, fixing a known dead-lock is important.

	Agreed.

	I'm now thinking for just the topic at hand (the previously
posted lockdep report), something like this will resolve it:

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 77d004d..b7ac10b 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -3937,8 +3937,6 @@ static void bond_set_multicast_list(struct net_device *bond_dev)
 	struct bonding *bond = bond_dev->priv;
 	struct dev_mc_list *dmi;
 
-	write_lock_bh(&bond->lock);
-
 	/*
 	 * Do promisc before checking multicast_mode
 	 */
@@ -3959,6 +3957,8 @@ static void bond_set_multicast_list(struct net_device *bond_dev)
 		bond_set_allmulti(bond, -1);
 	}
 
+	read_lock_bh(&bond->lock);
+
 	bond->flags = bond_dev->flags;
 
 	/* looking for addresses to add to slaves' mc list */
@@ -3979,7 +3979,7 @@ static void bond_set_multicast_list(struct net_device *bond_dev)
 	bond_mc_list_destroy(bond);
 	bond_mc_list_copy(bond_dev->mc_list, bond, GFP_ATOMIC);
 
-	write_unlock_bh(&bond->lock);
+	read_unlock_bh(&bond->lock);
 }
 
 /*


	This (a) converts the write_lock_bh to a read_lock_bh, and moves
it down to not cover the promisc/allmulti code (which is protected by
RTNL).  I'm not sure that this is really a signficant alteration, but
it's a bit closer to "correct."

	-J

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/3] bonding: 3 fixes for 2.6.24
  2008-01-09 23:19                 ` Jay Vosburgh
@ 2008-01-10  0:58                   ` Herbert Xu
  2008-01-10 14:51                     ` Andy Gospodarek
  0 siblings, 1 reply; 30+ messages in thread
From: Herbert Xu @ 2008-01-10  0:58 UTC (permalink / raw)
  To: Jay Vosburgh
  Cc: Andy Gospodarek, Krzysztof Oledzki, netdev, Jeff Garzik,
	David Miller

On Wed, Jan 09, 2008 at 03:19:10PM -0800, Jay Vosburgh wrote:
>
> >No that's not the point.  The point is to move the majority of the code
> >into process context so that you can take the RTNL.  Once you have taken
> >the RTNL you can disable BH all you want and I don't care one bit.
> 
> 	I'm not sure how we could move more code into a process context;
> much of the bonding driver is at the mercy of its callers, as in this
> case.  The monitoring stuff and enslave / deslave is all in a process
> context now (workqueue).  The transmit processing functions, for
> example, can't be assumed to be in any particular context as they're
> called by dev_queue_xmit.

No I'm not calling for you to move any more code into process context.
I was replying to the comment that changing the read_lock calls in
process context to read_lock_bh somehow undoes the benefit of moving
softirq code into process context.  It does not since the point of the
move is to be able to take the RTNL, which you can still do as long as
you do it before you disable BH.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/3] bonding: 3 fixes for 2.6.24
  2008-01-10  0:58                   ` Herbert Xu
@ 2008-01-10 14:51                     ` Andy Gospodarek
  2008-01-10 20:36                       ` Herbert Xu
  2008-01-10 20:45                       ` Jay Vosburgh
  0 siblings, 2 replies; 30+ messages in thread
From: Andy Gospodarek @ 2008-01-10 14:51 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Jay Vosburgh, Andy Gospodarek, Krzysztof Oledzki, netdev,
	Jeff Garzik, David Miller

On Thu, Jan 10, 2008 at 11:58:09AM +1100, Herbert Xu wrote:
> On Wed, Jan 09, 2008 at 03:19:10PM -0800, Jay Vosburgh wrote:
> >
> > >No that's not the point.  The point is to move the majority of the code
> > >into process context so that you can take the RTNL.  Once you have taken
> > >the RTNL you can disable BH all you want and I don't care one bit.
> > 
> > 	I'm not sure how we could move more code into a process context;
> > much of the bonding driver is at the mercy of its callers, as in this
> > case.  The monitoring stuff and enslave / deslave is all in a process
> > context now (workqueue).  The transmit processing functions, for
> > example, can't be assumed to be in any particular context as they're
> > called by dev_queue_xmit.
> 
> No I'm not calling for you to move any more code into process context.
> I was replying to the comment that changing the read_lock calls in
> process context to read_lock_bh somehow undoes the benefit of moving
> softirq code into process context.  It does not since the point of the
> move is to be able to take the RTNL, which you can still do as long as
> you do it before you disable BH.
> 

That wasn't the only purpose, Herbert.  Making sure that calls to
dev_set_mac_address were called from process context was important at
the time of the coding as well since at least the tg3 driver took locks
that could not be taken reliably in soft-irq context.  Michael Chan
fixed this here:

commit 986e0aeb9ae09127b401c3baa66f15b7a31f354c
Author: Michael Chan <mchan@broadcom.com>
Date:   Sat May 5 12:10:20 2007 -0700

    [TG3]: Remove reset during MAC address changes.

so if wasn't as much of an issue after that, but moving as much of the
code to process context was important for that as well (hence the move
to not continue to try to not use bh-locks everywhere).



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/3] bonding: 3 fixes for 2.6.24
  2008-01-10 14:51                     ` Andy Gospodarek
@ 2008-01-10 20:36                       ` Herbert Xu
  2008-01-10 20:50                         ` Jay Vosburgh
  2008-01-10 20:45                       ` Jay Vosburgh
  1 sibling, 1 reply; 30+ messages in thread
From: Herbert Xu @ 2008-01-10 20:36 UTC (permalink / raw)
  To: Andy Gospodarek
  Cc: Jay Vosburgh, Krzysztof Oledzki, netdev, Jeff Garzik,
	David Miller

On Thu, Jan 10, 2008 at 09:51:44AM -0500, Andy Gospodarek wrote:
>
> That wasn't the only purpose, Herbert.  Making sure that calls to
> dev_set_mac_address were called from process context was important at
> the time of the coding as well since at least the tg3 driver took locks
> that could not be taken reliably in soft-irq context.  Michael Chan
> fixed this here:

Sure, but where do you call that function while holding the bond lock?

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/3] bonding: 3 fixes for 2.6.24
  2008-01-10 14:51                     ` Andy Gospodarek
  2008-01-10 20:36                       ` Herbert Xu
@ 2008-01-10 20:45                       ` Jay Vosburgh
  1 sibling, 0 replies; 30+ messages in thread
From: Jay Vosburgh @ 2008-01-10 20:45 UTC (permalink / raw)
  To: Andy Gospodarek
  Cc: Herbert Xu, Krzysztof Oledzki, netdev, Jeff Garzik, David Miller

Andy Gospodarek <andy@greyhouse.net> wrote:
[...]
>That wasn't the only purpose, Herbert.  Making sure that calls to
>dev_set_mac_address were called from process context was important at
>the time of the coding as well since at least the tg3 driver took locks
>that could not be taken reliably in soft-irq context.  Michael Chan
>fixed this here:
>
>commit 986e0aeb9ae09127b401c3baa66f15b7a31f354c
>Author: Michael Chan <mchan@broadcom.com>
>Date:   Sat May 5 12:10:20 2007 -0700
>
>    [TG3]: Remove reset during MAC address changes.
>
>so if wasn't as much of an issue after that, but moving as much of the
>code to process context was important for that as well (hence the move
>to not continue to try to not use bh-locks everywhere).

	Well, not for tg3 perhaps, but other network device drivers do
the same thing (if memory serves, any USB ethernet adapter will have
issues there).  Also, I believe the netlink notifier callback,
rtnetlink_event, which every dev_set_whatever calls, does a
possibly-sleeping memory allocation (rtmsg_ifinfo -> nlmsg_new ->
alloc_skb(GFP_KERNEL)); so we don't really want to hold extra locks for
that, either.

	-J

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/3] bonding: 3 fixes for 2.6.24
  2008-01-10 20:36                       ` Herbert Xu
@ 2008-01-10 20:50                         ` Jay Vosburgh
  2008-01-10 21:03                           ` Andy Gospodarek
  0 siblings, 1 reply; 30+ messages in thread
From: Jay Vosburgh @ 2008-01-10 20:50 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Andy Gospodarek, Krzysztof Oledzki, netdev, Jeff Garzik,
	David Miller

Herbert Xu <herbert@gondor.apana.org.au> wrote:

>On Thu, Jan 10, 2008 at 09:51:44AM -0500, Andy Gospodarek wrote:
>>
>> That wasn't the only purpose, Herbert.  Making sure that calls to
>> dev_set_mac_address were called from process context was important at
>> the time of the coding as well since at least the tg3 driver took locks
>> that could not be taken reliably in soft-irq context.  Michael Chan
>> fixed this here:
>
>Sure, but where do you call that function while holding the bond lock?

	If I recall correctly, the problem was that tg3, et al, did
things that might sleep, and bonding was calling from a timer context,
which couldn't sleep.  It wasn't about the lock.

	-J

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/3] bonding: 3 fixes for 2.6.24
  2008-01-10 20:50                         ` Jay Vosburgh
@ 2008-01-10 21:03                           ` Andy Gospodarek
  2008-01-10 21:05                             ` Herbert Xu
  0 siblings, 1 reply; 30+ messages in thread
From: Andy Gospodarek @ 2008-01-10 21:03 UTC (permalink / raw)
  To: Jay Vosburgh
  Cc: Herbert Xu, Andy Gospodarek, Krzysztof Oledzki, netdev,
	Jeff Garzik, David Miller

On Thu, Jan 10, 2008 at 12:50:46PM -0800, Jay Vosburgh wrote:
> Herbert Xu <herbert@gondor.apana.org.au> wrote:
> 
> >On Thu, Jan 10, 2008 at 09:51:44AM -0500, Andy Gospodarek wrote:
> >>
> >> That wasn't the only purpose, Herbert.  Making sure that calls to
> >> dev_set_mac_address were called from process context was important at
> >> the time of the coding as well since at least the tg3 driver took locks
> >> that could not be taken reliably in soft-irq context.  Michael Chan
> >> fixed this here:
> >
> >Sure, but where do you call that function while holding the bond lock?
> 
> 	If I recall correctly, the problem was that tg3, et al, did
> things that might sleep, and bonding was calling from a timer context,
> which couldn't sleep.  It wasn't about the lock.
> 

Exactly, I was just about to post the same.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/3] bonding: 3 fixes for 2.6.24
  2008-01-10 21:03                           ` Andy Gospodarek
@ 2008-01-10 21:05                             ` Herbert Xu
  2008-01-11  1:06                               ` Jay Vosburgh
  0 siblings, 1 reply; 30+ messages in thread
From: Herbert Xu @ 2008-01-10 21:05 UTC (permalink / raw)
  To: Andy Gospodarek
  Cc: Jay Vosburgh, Krzysztof Oledzki, netdev, Jeff Garzik,
	David Miller

On Thu, Jan 10, 2008 at 04:03:53PM -0500, Andy Gospodarek wrote:
>
> > >Sure, but where do you call that function while holding the bond lock?
> > 
> > 	If I recall correctly, the problem was that tg3, et al, did
> > things that might sleep, and bonding was calling from a timer context,
> > which couldn't sleep.  It wasn't about the lock.
> 
> Exactly, I was just about to post the same.

In other words, changing read_lock on bond->lock to read_lock_bh doesn't
affect this one bit.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/3] bonding: 3 fixes for 2.6.24
  2008-01-10 21:05                             ` Herbert Xu
@ 2008-01-11  1:06                               ` Jay Vosburgh
  2008-01-11  4:55                                 ` Herbert Xu
  0 siblings, 1 reply; 30+ messages in thread
From: Jay Vosburgh @ 2008-01-11  1:06 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Andy Gospodarek, Krzysztof Oledzki, netdev, Jeff Garzik,
	David Miller

Herbert Xu <herbert@gondor.apana.org.au> wrote:

>On Thu, Jan 10, 2008 at 04:03:53PM -0500, Andy Gospodarek wrote:
>>
>> > >Sure, but where do you call that function while holding the bond lock?
>> > 
>> > 	If I recall correctly, the problem was that tg3, et al, did
>> > things that might sleep, and bonding was calling from a timer context,
>> > which couldn't sleep.  It wasn't about the lock.
>> 
>> Exactly, I was just about to post the same.
>
>In other words, changing read_lock on bond->lock to read_lock_bh doesn't
>affect this one bit.

	For the case of the bond_set_multicast_list function, changing
the existing write_lock to a read_lock_bh doesn't affect any calls to
dev_set_mac_address (since it isn't called from there) and, apparently,
also resolves the lockdep warning.  I'm still trying to get lockdep to
generate the warning for me locally; I'm not sure which magic thing I'm
missing.

	-J

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/3] bonding: 3 fixes for 2.6.24
  2008-01-11  1:06                               ` Jay Vosburgh
@ 2008-01-11  4:55                                 ` Herbert Xu
  0 siblings, 0 replies; 30+ messages in thread
From: Herbert Xu @ 2008-01-11  4:55 UTC (permalink / raw)
  To: Jay Vosburgh; +Cc: herbert, andy, olel, netdev, jgarzik, davem

Jay Vosburgh <fubar@us.ibm.com> wrote:
>
>        For the case of the bond_set_multicast_list function, changing
> the existing write_lock to a read_lock_bh doesn't affect any calls to

Right that should also resolve the lockdep issue.  So as long as you
guys are sure that this is safe with respect to the rest of the bonding
driver it's fine by me.

BTW you only need read_lock since set_multicast_list always gets called
with BH off.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/3] bonding: 3 fixes for 2.6.24
  2008-01-09 20:17             ` Andy Gospodarek
  2008-01-09 22:05               ` Herbert Xu
@ 2008-01-12 10:53               ` Krzysztof Oledzki
  2008-01-12 17:56                 ` Jay Vosburgh
  1 sibling, 1 reply; 30+ messages in thread
From: Krzysztof Oledzki @ 2008-01-12 10:53 UTC (permalink / raw)
  To: Andy Gospodarek
  Cc: Jay Vosburgh, netdev, Jeff Garzik, David Miller, Herbert Xu

[-- Attachment #1: Type: TEXT/PLAIN, Size: 947 bytes --]



On Wed, 9 Jan 2008, Andy Gospodarek wrote:
> On Wed, Jan 09, 2008 at 09:54:56AM -0800, Jay Vosburgh wrote:
<CUT>
>> 	This should silence the lockdep (if I'm understanding what
>> everybody's saying), and keep the change set to a minimum.  This might
>
> The lockdep problem is easy to trigger.  The lockdep code does a good
> job of noticing problems quickly regardless of how easy the deadlocks
> are to create.

Exactly. All I need to do is to reboot my server, I have 100% probability 
to get the warning.

>> not even be worth pushing for 2.6.24; I'm not exactly sure how difficult
>> the lockdep problem would be to trigger.
>>
>
> I'd like to see it go in there (for correct-ness) and to avoid hearing
> about these lockdep issues for the next few months until it makes it
> into 2.4.25.

Right. So, what is the final patch? I would like to test it if that's 
possible. ;)

Best regards,

 				Krzysztof Olędzki

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/3] bonding: 3 fixes for 2.6.24
  2008-01-12 10:53               ` Krzysztof Oledzki
@ 2008-01-12 17:56                 ` Jay Vosburgh
  2008-01-13  0:19                   ` Herbert Xu
  2008-01-14 22:15                   ` Krzysztof Oledzki
  0 siblings, 2 replies; 30+ messages in thread
From: Jay Vosburgh @ 2008-01-12 17:56 UTC (permalink / raw)
  To: Krzysztof Oledzki
  Cc: Andy Gospodarek, netdev, Jeff Garzik, David Miller, Herbert Xu

Krzysztof Oledzki <olel@ans.pl> wrote:
[...]
>Exactly. All I need to do is to reboot my server, I have 100% probability
>to get the warning.

	I wish it were that easy for me; I'm not sure what magic thing
you've got on your server or network that I don't, but I haven't been
able to make this lockdep warning happen at all.

>Right. So, what is the final patch? I would like to test it if that's
>possible. ;)

	Can you test the following and let me know if it triggers the
warning?  I believe this is the minimum locking needed, and based on
input from Herbert, we shouldn't need to hold the lock at _bh.  If this
one works, and nobody sees any other issues with it, then it's the final
patch for this lockdep problem.  I'll add some deep, meaningful comments
to explain the locking a bit (i.e., we're called with rtnl for the
allmulti and promisc cases, so we're ok there without additional locks,
but the later code could be called from anywhere, so it needs locks to
prevent the slave list from changing, but the mc_lists themselves are
covered by the netif_tx_lock that all callers will hold), but this would
be the actual code change.

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 77d004d..6906dbc 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -3937,8 +3937,6 @@ static void bond_set_multicast_list(struct net_device *bond_dev)
 	struct bonding *bond = bond_dev->priv;
 	struct dev_mc_list *dmi;
 
-	write_lock_bh(&bond->lock);
-
 	/*
 	 * Do promisc before checking multicast_mode
 	 */
@@ -3959,6 +3957,8 @@ static void bond_set_multicast_list(struct net_device *bond_dev)
 		bond_set_allmulti(bond, -1);
 	}
 
+	read_lock(&bond->lock);
+
 	bond->flags = bond_dev->flags;
 
 	/* looking for addresses to add to slaves' mc list */
@@ -3979,7 +3979,7 @@ static void bond_set_multicast_list(struct net_device *bond_dev)
 	bond_mc_list_destroy(bond);
 	bond_mc_list_copy(bond_dev->mc_list, bond, GFP_ATOMIC);
 
-	write_unlock_bh(&bond->lock);
+	read_unlock(&bond->lock);
 }
 
 /*


	-J

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/3] bonding: 3 fixes for 2.6.24
  2008-01-12 17:56                 ` Jay Vosburgh
@ 2008-01-13  0:19                   ` Herbert Xu
  2008-01-14 22:15                   ` Krzysztof Oledzki
  1 sibling, 0 replies; 30+ messages in thread
From: Herbert Xu @ 2008-01-13  0:19 UTC (permalink / raw)
  To: Jay Vosburgh; +Cc: olel, andy, netdev, jgarzik, davem, herbert

Jay Vosburgh <fubar@us.ibm.com> wrote:
> 
>        Can you test the following and let me know if it triggers the
> warning?  I believe this is the minimum locking needed, and based on
> input from Herbert, we shouldn't need to hold the lock at _bh.  If this
> one works, and nobody sees any other issues with it, then it's the final
> patch for this lockdep problem.  I'll add some deep, meaningful comments
> to explain the locking a bit (i.e., we're called with rtnl for the
> allmulti and promisc cases, so we're ok there without additional locks,
> but the later code could be called from anywhere, so it needs locks to
> prevent the slave list from changing, but the mc_lists themselves are
> covered by the netif_tx_lock that all callers will hold), but this would
> be the actual code change.

I just had a look at the bonding code and while I didn't find anything
wrong with the change of the write lock to a read lock, the mc_list
itself does not seem to have adequete protection.  In particular, there
doesn't seem to be anything protecting the walking of mc_list in
bond_enslave.

This could be a problem if we have an IGMP6 event triggering the change
in the master's mc_list.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 0/3] bonding: 3 fixes for 2.6.24
  2008-01-12 17:56                 ` Jay Vosburgh
  2008-01-13  0:19                   ` Herbert Xu
@ 2008-01-14 22:15                   ` Krzysztof Oledzki
  1 sibling, 0 replies; 30+ messages in thread
From: Krzysztof Oledzki @ 2008-01-14 22:15 UTC (permalink / raw)
  To: Jay Vosburgh
  Cc: Andy Gospodarek, netdev, Jeff Garzik, David Miller, Herbert Xu

[-- Attachment #1: Type: TEXT/PLAIN, Size: 2394 bytes --]



On Sat, 12 Jan 2008, Jay Vosburgh wrote:

> Krzysztof Oledzki <olel@ans.pl> wrote:
> [...]
>> Exactly. All I need to do is to reboot my server, I have 100% probability
>> to get the warning.
>
> 	I wish it were that easy for me; I'm not sure what magic thing
> you've got on your server or network that I don't, but I haven't been
> able to make this lockdep warning happen at all.
>
>> Right. So, what is the final patch? I would like to test it if that's
>> possible. ;)
>
> 	Can you test the following and let me know if it triggers the
> warning?  I believe this is the minimum locking needed, and based on
> input from Herbert, we shouldn't need to hold the lock at _bh.  If this
> one works, and nobody sees any other issues with it, then it's the final
> patch for this lockdep problem.  I'll add some deep, meaningful comments
> to explain the locking a bit (i.e., we're called with rtnl for the
> allmulti and promisc cases, so we're ok there without additional locks,
> but the later code could be called from anywhere, so it needs locks to
> prevent the slave list from changing, but the mc_lists themselves are
> covered by the netif_tx_lock that all callers will hold), but this would
> be the actual code change.
>
> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> index 77d004d..6906dbc 100644
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -3937,8 +3937,6 @@ static void bond_set_multicast_list(struct net_device *bond_dev)
> 	struct bonding *bond = bond_dev->priv;
> 	struct dev_mc_list *dmi;
>
> -	write_lock_bh(&bond->lock);
> -
> 	/*
> 	 * Do promisc before checking multicast_mode
> 	 */
> @@ -3959,6 +3957,8 @@ static void bond_set_multicast_list(struct net_device *bond_dev)
> 		bond_set_allmulti(bond, -1);
> 	}
>
> +	read_lock(&bond->lock);
> +
> 	bond->flags = bond_dev->flags;
>
> 	/* looking for addresses to add to slaves' mc list */
> @@ -3979,7 +3979,7 @@ static void bond_set_multicast_list(struct net_device *bond_dev)
> 	bond_mc_list_destroy(bond);
> 	bond_mc_list_copy(bond_dev->mc_list, bond, GFP_ATOMIC);
>
> -	write_unlock_bh(&bond->lock);
> +	read_unlock(&bond->lock);
> }
>
> /*
>
>

I can confirm that the warning went away.

Tested-by: Krzysztof Piotr Oledzki <ole@ans.pl>

Best regards,

 				Krzysztof Olędzki

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2008-01-14 22:22 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-01-08  1:56 [PATCH 0/3] bonding: 3 fixes for 2.6.24 Jay Vosburgh
2008-01-08  1:56 ` [PATCH 1/3] bonding: fix locking in sysfs primary/active selection Jay Vosburgh
2008-01-08  1:56   ` [PATCH 2/3] bonding: fix ASSERT_RTNL that produces spurious warnings Jay Vosburgh
2008-01-08  1:57     ` [PATCH 3/3] bonding: fix locking during alb failover and slave removal Jay Vosburgh
2008-01-08 18:50 ` [PATCH 0/3] bonding: 3 fixes for 2.6.24 Krzysztof Oledzki
2008-01-08 19:17   ` Andy Gospodarek
2008-01-08 20:28     ` Jay Vosburgh
2008-01-09  6:08     ` Herbert Xu
2008-01-08 19:30   ` Jay Vosburgh
2008-01-09  6:35     ` Krzysztof Oledzki
2008-01-09  7:58       ` Jay Vosburgh
2008-01-09  9:36         ` Krzysztof Oledzki
2008-01-09 15:27         ` Andy Gospodarek
2008-01-09 17:54           ` Jay Vosburgh
2008-01-09 20:17             ` Andy Gospodarek
2008-01-09 22:05               ` Herbert Xu
2008-01-09 23:19                 ` Jay Vosburgh
2008-01-10  0:58                   ` Herbert Xu
2008-01-10 14:51                     ` Andy Gospodarek
2008-01-10 20:36                       ` Herbert Xu
2008-01-10 20:50                         ` Jay Vosburgh
2008-01-10 21:03                           ` Andy Gospodarek
2008-01-10 21:05                             ` Herbert Xu
2008-01-11  1:06                               ` Jay Vosburgh
2008-01-11  4:55                                 ` Herbert Xu
2008-01-10 20:45                       ` Jay Vosburgh
2008-01-12 10:53               ` Krzysztof Oledzki
2008-01-12 17:56                 ` Jay Vosburgh
2008-01-13  0:19                   ` Herbert Xu
2008-01-14 22:15                   ` Krzysztof Oledzki

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).