* BUG: bad unlock balance detected! e1000e
@ 2008-12-08 11:24 Zdenek Kabelac
2008-12-09 11:03 ` Frederik Deweerdt
0 siblings, 1 reply; 6+ messages in thread
From: Zdenek Kabelac @ 2008-12-08 11:24 UTC (permalink / raw)
To: Linux Kernel Mailing List
Hi
During occasional scan of message log - I've found out this BUG which
happened on Dec3 with the -rc7 from that day.
(So if it's now fixed in current git feel free to ignore :))
My machine T61 - C2D, 2GB, 64bit kernel - message appeared during
shutdown and was actually not noticed by me...
NetworkManager: <WARN> nm_signal_handler(): Caught signal 15,
shutting down normally.
NetworkManager: <info> (eth0): now unmanaged
NetworkManager: <info> (eth0): device state change: 3 -> 1
NetworkManager: <info> (eth0): cleaning up...
NetworkManager: <info> (eth0): taking down device.
=====================================
[ BUG: bad unlock balance detected! ]
-------------------------------------
NetworkManager/2058 is trying to release lock (nvm_mutex) at:
[<ffffffff8052acb9>] mutex_unlock+0x9/0x10
but there are no more locks to release!
other info that might help us debug this:
1 lock held by NetworkManager/2058:
#0: (rtnl_mutex){--..}, at: [<ffffffff804b09ba>] rtnetlink_rcv+0x1a/0x40
stack backtrace:
Pid: 2058, comm: NetworkManager Not tainted 2.6.28-rc7 #90
Call Trace:
[<ffffffff8026bf6b>] print_unlock_inbalance_bug+0xfb/0x120
[<ffffffff8026db76>] ? mark_held_locks+0x56/0xa0
[<ffffffff802500ba>] ? try_to_del_timer_sync+0x5a/0x70
[<ffffffff8052ce23>] ? _spin_unlock_irqrestore+0x43/0x70
[<ffffffff8026f468>] lock_release_non_nested+0x1a8/0x2a0
[<ffffffff802500ba>] ? try_to_del_timer_sync+0x5a/0x70
[<ffffffff8052acb9>] ? mutex_unlock+0x9/0x10
[<ffffffff8026f61b>] lock_release+0xbb/0x200
[<ffffffff8052abc6>] __mutex_unlock_slowpath+0x86/0x170
[<ffffffff8052acb9>] mutex_unlock+0x9/0x10
[<ffffffffa019cf45>] e1000_release_swflag_ich8lan+0x35/0x40 [e1000e]
[<ffffffffa019de19>] e1000_reset_hw_ich8lan+0xa9/0x130 [e1000e]
[<ffffffffa01ab140>] e1000e_reset+0xf0/0x260 [e1000e]
[<ffffffff802500ea>] ? del_timer_sync+0x1a/0x30
[<ffffffffa01ab4d8>] e1000e_down+0x178/0x180 [e1000e]
[<ffffffffa01ad0b6>] e1000_close+0x26/0xd0 [e1000e]
[<ffffffff804a59b6>] dev_close+0x76/0xc0
[<ffffffff804a7bd6>] dev_change_flags+0x96/0x1e0
[<ffffffff804b1b4c>] do_setlink+0x2ac/0x440
[<ffffffff8052cd46>] ? _read_unlock+0x26/0x30
[<ffffffff804b1ded>] rtnl_setlink+0x10d/0x150
[<ffffffff8052aeb2>] ? mutex_lock_nested+0x1f2/0x300
[<ffffffff804b09ba>] ? rtnetlink_rcv+0x1a/0x40
[<ffffffff804b0b6d>] rtnetlink_rcv_msg+0x18d/0x240
[<ffffffff804b09e0>] ? rtnetlink_rcv_msg+0x0/0x240
[<ffffffff804bba29>] netlink_rcv_skb+0x89/0xb0
[<ffffffff804b09c9>] rtnetlink_rcv+0x29/0x40
[<ffffffff804bb74d>] netlink_unicast+0x2bd/0x2d0
[<ffffffff8049e3ee>] ? __alloc_skb+0x6e/0x150
[<ffffffff804bc6c4>] netlink_sendmsg+0x204/0x2f0
[<ffffffff80499ed8>] ? sock_def_readable+0x68/0x70
[<ffffffff80498077>] sock_sendmsg+0x107/0x130
[<ffffffff8025c690>] ? autoremove_wake_function+0x0/0x40
[<ffffffff80213853>] ? native_sched_clock+0x13/0x60
[<ffffffff802d60e6>] ? fget_light+0x106/0x110
[<ffffffff80497987>] ? move_addr_to_kernel+0x57/0x60
[<ffffffff804a087f>] ? verify_iovec+0x3f/0xe0
[<ffffffff80498229>] sys_sendmsg+0x189/0x320
[<ffffffff8049851d>] ? sys_sendto+0xfd/0x120
[<ffffffff802e6650>] ? d_free+0x50/0x60
[<ffffffff802d6691>] ? __fput+0x171/0x1e0
[<ffffffff8026dda3>] ? trace_hardirqs_on_caller+0x133/0x190
[<ffffffff8028e02f>] ? audit_syscall_entry+0x15f/0x190
[<ffffffff8052c906>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[<ffffffff8020c57b>] system_call_fastpath+0x16/0x1b
^ permalink raw reply [flat|nested] 6+ messages in thread* Re: BUG: bad unlock balance detected! e1000e 2008-12-08 11:24 BUG: bad unlock balance detected! e1000e Zdenek Kabelac @ 2008-12-09 11:03 ` Frederik Deweerdt 2008-12-09 23:08 ` Andrew Morton 0 siblings, 1 reply; 6+ messages in thread From: Frederik Deweerdt @ 2008-12-09 11:03 UTC (permalink / raw) To: Zdenek Kabelac; +Cc: Linux Kernel Mailing List, tglx Hello Zdenek, This could be due to 717d438d1fde94decef874b9808379d1f4523453 "e1000e: debug contention on NVM SWFLAG" Error handling is missing from e1000_reset_hw_ich8lan so it may happen that we don't acquire the nvm_mutex if the card times out. Adding Thomas to CC. Regards, Frederik It some error checking is missing in e1000e: debug contention on NVM SWFLAG On Mon, Dec 08, 2008 at 12:24:09PM +0100, Zdenek Kabelac wrote: > Hi > > During occasional scan of message log - I've found out this BUG which > happened on Dec3 with the -rc7 from that day. > (So if it's now fixed in current git feel free to ignore :)) > > My machine T61 - C2D, 2GB, 64bit kernel - message appeared during > shutdown and was actually not noticed by me... > > > NetworkManager: <WARN> nm_signal_handler(): Caught signal 15, > shutting down normally. > NetworkManager: <info> (eth0): now unmanaged > NetworkManager: <info> (eth0): device state change: 3 -> 1 > NetworkManager: <info> (eth0): cleaning up... > NetworkManager: <info> (eth0): taking down device. > > ===================================== > [ BUG: bad unlock balance detected! ] > ------------------------------------- > NetworkManager/2058 is trying to release lock (nvm_mutex) at: > [<ffffffff8052acb9>] mutex_unlock+0x9/0x10 > but there are no more locks to release! > > other info that might help us debug this: > 1 lock held by NetworkManager/2058: > #0: (rtnl_mutex){--..}, at: [<ffffffff804b09ba>] rtnetlink_rcv+0x1a/0x40 > > stack backtrace: > Pid: 2058, comm: NetworkManager Not tainted 2.6.28-rc7 #90 > Call Trace: > [<ffffffff8026bf6b>] print_unlock_inbalance_bug+0xfb/0x120 > [<ffffffff8026db76>] ? mark_held_locks+0x56/0xa0 > [<ffffffff802500ba>] ? try_to_del_timer_sync+0x5a/0x70 > [<ffffffff8052ce23>] ? _spin_unlock_irqrestore+0x43/0x70 > [<ffffffff8026f468>] lock_release_non_nested+0x1a8/0x2a0 > [<ffffffff802500ba>] ? try_to_del_timer_sync+0x5a/0x70 > [<ffffffff8052acb9>] ? mutex_unlock+0x9/0x10 > [<ffffffff8026f61b>] lock_release+0xbb/0x200 > [<ffffffff8052abc6>] __mutex_unlock_slowpath+0x86/0x170 > [<ffffffff8052acb9>] mutex_unlock+0x9/0x10 > [<ffffffffa019cf45>] e1000_release_swflag_ich8lan+0x35/0x40 [e1000e] > [<ffffffffa019de19>] e1000_reset_hw_ich8lan+0xa9/0x130 [e1000e] > [<ffffffffa01ab140>] e1000e_reset+0xf0/0x260 [e1000e] > [<ffffffff802500ea>] ? del_timer_sync+0x1a/0x30 > [<ffffffffa01ab4d8>] e1000e_down+0x178/0x180 [e1000e] > [<ffffffffa01ad0b6>] e1000_close+0x26/0xd0 [e1000e] > [<ffffffff804a59b6>] dev_close+0x76/0xc0 > [<ffffffff804a7bd6>] dev_change_flags+0x96/0x1e0 > [<ffffffff804b1b4c>] do_setlink+0x2ac/0x440 > [<ffffffff8052cd46>] ? _read_unlock+0x26/0x30 > [<ffffffff804b1ded>] rtnl_setlink+0x10d/0x150 > [<ffffffff8052aeb2>] ? mutex_lock_nested+0x1f2/0x300 > [<ffffffff804b09ba>] ? rtnetlink_rcv+0x1a/0x40 > [<ffffffff804b0b6d>] rtnetlink_rcv_msg+0x18d/0x240 > [<ffffffff804b09e0>] ? rtnetlink_rcv_msg+0x0/0x240 > [<ffffffff804bba29>] netlink_rcv_skb+0x89/0xb0 > [<ffffffff804b09c9>] rtnetlink_rcv+0x29/0x40 > [<ffffffff804bb74d>] netlink_unicast+0x2bd/0x2d0 > [<ffffffff8049e3ee>] ? __alloc_skb+0x6e/0x150 > [<ffffffff804bc6c4>] netlink_sendmsg+0x204/0x2f0 > [<ffffffff80499ed8>] ? sock_def_readable+0x68/0x70 > [<ffffffff80498077>] sock_sendmsg+0x107/0x130 > [<ffffffff8025c690>] ? autoremove_wake_function+0x0/0x40 > [<ffffffff80213853>] ? native_sched_clock+0x13/0x60 > [<ffffffff802d60e6>] ? fget_light+0x106/0x110 > [<ffffffff80497987>] ? move_addr_to_kernel+0x57/0x60 > [<ffffffff804a087f>] ? verify_iovec+0x3f/0xe0 > [<ffffffff80498229>] sys_sendmsg+0x189/0x320 > [<ffffffff8049851d>] ? sys_sendto+0xfd/0x120 > [<ffffffff802e6650>] ? d_free+0x50/0x60 > [<ffffffff802d6691>] ? __fput+0x171/0x1e0 > [<ffffffff8026dda3>] ? trace_hardirqs_on_caller+0x133/0x190 > [<ffffffff8028e02f>] ? audit_syscall_entry+0x15f/0x190 > [<ffffffff8052c906>] ? trace_hardirqs_on_thunk+0x3a/0x3f > [<ffffffff8020c57b>] system_call_fastpath+0x16/0x1b > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: BUG: bad unlock balance detected! e1000e 2008-12-09 11:03 ` Frederik Deweerdt @ 2008-12-09 23:08 ` Andrew Morton 2008-12-09 23:43 ` Frederik Deweerdt 0 siblings, 1 reply; 6+ messages in thread From: Andrew Morton @ 2008-12-09 23:08 UTC (permalink / raw) To: Frederik Deweerdt Cc: zdenek.kabelac, linux-kernel, tglx, netdev, Jesse Brandeburg, David S. Miller, stable On Tue, 9 Dec 2008 12:03:37 +0100 Frederik Deweerdt <frederik.deweerdt@xprog.eu> wrote: > It some error checking is missing in e1000e: debug contention on NVM > SWFLAG > On Mon, Dec 08, 2008 at 12:24:09PM +0100, Zdenek Kabelac wrote: > > Hi > > > > During occasional scan of message log - I've found out this BUG which > > happened on Dec3 with the -rc7 from that day. > > (So if it's now fixed in current git feel free to ignore :)) > > > > My machine T61 - C2D, 2GB, 64bit kernel - message appeared during > > shutdown and was actually not noticed by me... > > > > > > NetworkManager: <WARN> nm_signal_handler(): Caught signal 15, > > shutting down normally. > > NetworkManager: <info> (eth0): now unmanaged > > NetworkManager: <info> (eth0): device state change: 3 -> 1 > > NetworkManager: <info> (eth0): cleaning up... > > NetworkManager: <info> (eth0): taking down device. > > > > ===================================== > > [ BUG: bad unlock balance detected! ] > > ------------------------------------- (top-posting repaired. Please don't do that!!!). > Hello Zdenek, > > This could be due to 717d438d1fde94decef874b9808379d1f4523453 > "e1000e: debug contention on NVM SWFLAG" > Error handling is missing from e1000_reset_hw_ich8lan so it may happen > that we don't acquire the nvm_mutex if the card times out. > > Adding Thomas to CC. yup. 2.6.27 needs fixing also. Like this? From: Andrew Morton <akpm@linux-foundation.org> ===================================== [ BUG: bad unlock balance detected! ] ------------------------------------- A regression added by 717d438d1fde94decef874b9808379d1f4523453 ("e1000e: debug contention on NVM SWFLAG"). Reported-by: "Zdenek Kabelac" <zdenek.kabelac@gmail.com> Cc: Frederik Deweerdt <frederik.deweerdt@xprog.eu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> --- drivers/net/e1000e/ich8lan.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff -puN drivers/net/e1000e/ich8lan.c~drivers-net-e1000e-ich8lanc-fix-locking drivers/net/e1000e/ich8lan.c --- a/drivers/net/e1000e/ich8lan.c~drivers-net-e1000e-ich8lanc-fix-locking +++ a/drivers/net/e1000e/ich8lan.c @@ -400,6 +400,7 @@ static s32 e1000_acquire_swflag_ich8lan( { u32 extcnf_ctrl; u32 timeout = PHY_CFG_TIMEOUT; + s32 ret = 0; might_sleep(); @@ -427,11 +428,11 @@ static s32 e1000_acquire_swflag_ich8lan( extcnf_ctrl &= ~E1000_EXTCNF_CTRL_SWFLAG; ew32(EXTCNF_CTRL, extcnf_ctrl); nvm_owner = -1; - mutex_unlock(&nvm_mutex); - return -E1000_ERR_CONFIG; + ret = -E1000_ERR_CONFIG; } - return 0; + mutex_unlock(&nvm_mutex); + return ret; } /** _ ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: BUG: bad unlock balance detected! e1000e 2008-12-09 23:08 ` Andrew Morton @ 2008-12-09 23:43 ` Frederik Deweerdt 2008-12-09 23:56 ` Andrew Morton 0 siblings, 1 reply; 6+ messages in thread From: Frederik Deweerdt @ 2008-12-09 23:43 UTC (permalink / raw) To: Andrew Morton Cc: zdenek.kabelac, linux-kernel, tglx, netdev, Jesse Brandeburg, David S. Miller, stable On Tue, Dec 09, 2008 at 03:08:01PM -0800, Andrew Morton wrote: > On Tue, 9 Dec 2008 12:03:37 +0100 > Frederik Deweerdt <frederik.deweerdt@xprog.eu> wrote: > > > It some error checking is missing in e1000e: debug contention on NVM > > SWFLAG > > On Mon, Dec 08, 2008 at 12:24:09PM +0100, Zdenek Kabelac wrote: > > > Hi > > > > > > During occasional scan of message log - I've found out this BUG which > > > happened on Dec3 with the -rc7 from that day. > > > (So if it's now fixed in current git feel free to ignore :)) > > > > > > My machine T61 - C2D, 2GB, 64bit kernel - message appeared during > > > shutdown and was actually not noticed by me... > > > > > > > > > NetworkManager: <WARN> nm_signal_handler(): Caught signal 15, > > > shutting down normally. > > > NetworkManager: <info> (eth0): now unmanaged > > > NetworkManager: <info> (eth0): device state change: 3 -> 1 > > > NetworkManager: <info> (eth0): cleaning up... > > > NetworkManager: <info> (eth0): taking down device. > > > > > > ===================================== > > > [ BUG: bad unlock balance detected! ] > > > ------------------------------------- > > (top-posting repaired. Please don't do that!!!). Yep, sorry. > > > Hello Zdenek, > > > > This could be due to 717d438d1fde94decef874b9808379d1f4523453 > > "e1000e: debug contention on NVM SWFLAG" > > Error handling is missing from e1000_reset_hw_ich8lan so it may happen > > that we don't acquire the nvm_mutex if the card times out. > > > > Adding Thomas to CC. > > yup. 2.6.27 needs fixing also. > > Like this? I don't think so, e1000_acquire_swflag_ich8lan() locks and e1000_release_swflag_ich8lan() unlocks. I think it is more along the lines of: diff --git a/drivers/net/e1000e/ich8lan.c b/drivers/net/e1000e/ich8lan.c index 523b971..f971b83 100644 --- a/drivers/net/e1000e/ich8lan.c +++ b/drivers/net/e1000e/ich8lan.c @@ -1892,7 +1892,13 @@ static s32 e1000_reset_hw_ich8lan(struct e1000_hw *hw) */ ctrl |= E1000_CTRL_PHY_RST; } + ret_val = e1000_acquire_swflag_ich8lan(hw); + if (ret_val) { + hw_dbg(hw, "Failed to acquire NVM swflag"); + return ret_val; + } + hw_dbg(hw, "Issuing a global reset to ich8lan"); ew32(CTRL, (ctrl | E1000_CTRL_RST)); msleep(20); But I'm not sure we should cancel the ongoing reset if the card times out... Regards, Frederik > > From: Andrew Morton <akpm@linux-foundation.org> > > ===================================== > [ BUG: bad unlock balance detected! ] > ------------------------------------- > > A regression added by 717d438d1fde94decef874b9808379d1f4523453 ("e1000e: > debug contention on NVM SWFLAG"). > > Reported-by: "Zdenek Kabelac" <zdenek.kabelac@gmail.com> > Cc: Frederik Deweerdt <frederik.deweerdt@xprog.eu> > Cc: Thomas Gleixner <tglx@linutronix.de> > Cc: Jesse Brandeburg <jesse.brandeburg@intel.com> > Cc: "David S. Miller" <davem@davemloft.net> > Cc: <stable@kernel.org> > Signed-off-by: Andrew Morton <akpm@linux-foundation.org> > --- > > drivers/net/e1000e/ich8lan.c | 7 ++++--- > 1 file changed, 4 insertions(+), 3 deletions(-) > > diff -puN drivers/net/e1000e/ich8lan.c~drivers-net-e1000e-ich8lanc-fix-locking drivers/net/e1000e/ich8lan.c > --- a/drivers/net/e1000e/ich8lan.c~drivers-net-e1000e-ich8lanc-fix-locking > +++ a/drivers/net/e1000e/ich8lan.c > @@ -400,6 +400,7 @@ static s32 e1000_acquire_swflag_ich8lan( > { > u32 extcnf_ctrl; > u32 timeout = PHY_CFG_TIMEOUT; > + s32 ret = 0; > > might_sleep(); > > @@ -427,11 +428,11 @@ static s32 e1000_acquire_swflag_ich8lan( > extcnf_ctrl &= ~E1000_EXTCNF_CTRL_SWFLAG; > ew32(EXTCNF_CTRL, extcnf_ctrl); > nvm_owner = -1; > - mutex_unlock(&nvm_mutex); > - return -E1000_ERR_CONFIG; > + ret = -E1000_ERR_CONFIG; > } > > - return 0; > + mutex_unlock(&nvm_mutex); > + return ret; > } > > /** > _ > ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: BUG: bad unlock balance detected! e1000e 2008-12-09 23:43 ` Frederik Deweerdt @ 2008-12-09 23:56 ` Andrew Morton 2008-12-11 0:37 ` [E1000-devel] " Jeff Kirsher 0 siblings, 1 reply; 6+ messages in thread From: Andrew Morton @ 2008-12-09 23:56 UTC (permalink / raw) To: Frederik Deweerdt Cc: zdenek.kabelac, linux-kernel, tglx, netdev, jesse.brandeburg, davem, stable, e1000-devel On Wed, 10 Dec 2008 00:43:46 +0100 Frederik Deweerdt <frederik.deweerdt@xprog.eu> wrote: > On Tue, Dec 09, 2008 at 03:08:01PM -0800, Andrew Morton wrote: > > On Tue, 9 Dec 2008 12:03:37 +0100 > > Frederik Deweerdt <frederik.deweerdt@xprog.eu> wrote: > > > > > It some error checking is missing in e1000e: debug contention on NVM > > > SWFLAG > > > On Mon, Dec 08, 2008 at 12:24:09PM +0100, Zdenek Kabelac wrote: > > > > Hi > > > > > > > > During occasional scan of message log - I've found out this BUG which > > > > happened on Dec3 with the -rc7 from that day. > > > > (So if it's now fixed in current git feel free to ignore :)) > > > > > > > > My machine T61 - C2D, 2GB, 64bit kernel - message appeared during > > > > shutdown and was actually not noticed by me... > > > > > > > > > > > > NetworkManager: <WARN> nm_signal_handler(): Caught signal 15, > > > > shutting down normally. > > > > NetworkManager: <info> (eth0): now unmanaged > > > > NetworkManager: <info> (eth0): device state change: 3 -> 1 > > > > NetworkManager: <info> (eth0): cleaning up... > > > > NetworkManager: <info> (eth0): taking down device. > > > > > > > > ===================================== > > > > [ BUG: bad unlock balance detected! ] > > > > ------------------------------------- > > > > (top-posting repaired. Please don't do that!!!). > Yep, sorry. > > > > > Hello Zdenek, > > > > > > This could be due to 717d438d1fde94decef874b9808379d1f4523453 > > > "e1000e: debug contention on NVM SWFLAG" > > > Error handling is missing from e1000_reset_hw_ich8lan so it may happen > > > that we don't acquire the nvm_mutex if the card times out. > > > > > > Adding Thomas to CC. > > > > yup. 2.6.27 needs fixing also. > > > > Like this? > I don't think so, e1000_acquire_swflag_ich8lan() locks and > e1000_release_swflag_ich8lan() unlocks. urgh, OK, I made the mistake of reading the comments. > I think it is more along the > lines of: > > > diff --git a/drivers/net/e1000e/ich8lan.c b/drivers/net/e1000e/ich8lan.c > index 523b971..f971b83 100644 > --- a/drivers/net/e1000e/ich8lan.c > +++ b/drivers/net/e1000e/ich8lan.c > @@ -1892,7 +1892,13 @@ static s32 e1000_reset_hw_ich8lan(struct e1000_hw *hw) > */ > ctrl |= E1000_CTRL_PHY_RST; > } > + > ret_val = e1000_acquire_swflag_ich8lan(hw); > + if (ret_val) { > + hw_dbg(hw, "Failed to acquire NVM swflag"); > + return ret_val; > + } > + > hw_dbg(hw, "Issuing a global reset to ich8lan"); > ew32(CTRL, (ctrl | E1000_CTRL_RST)); > msleep(20); > > > But I'm not sure we should cancel the ongoing reset if the card times > out... > Yes, something like that. Or something like --- a/drivers/net/e1000e/ich8lan.c~a +++ a/drivers/net/e1000e/ich8lan.c @@ -1940,12 +1940,14 @@ static s32 e1000_reset_hw_ich8lan(struct ctrl |= E1000_CTRL_PHY_RST; } ret_val = e1000_acquire_swflag_ich8lan(hw); - hw_dbg(hw, "Issuing a global reset to ich8lan\n"); - ew32(CTRL, (ctrl | E1000_CTRL_RST)); - msleep(20); + if (!ret_val) { + hw_dbg(hw, "Issuing a global reset to ich8lan\n"); + ew32(CTRL, (ctrl | E1000_CTRL_RST)); + msleep(20); - /* release the swflag because it is not reset by hardware reset */ - e1000_release_swflag_ich8lan(hw); + /* release the swflag because it is not reset by hardware reset */ + e1000_release_swflag_ich8lan(hw); + } ret_val = e1000e_get_auto_rd_done(hw); if (ret_val) { _ Dunno. It's e1000-developer-summoning-dance time. ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [E1000-devel] BUG: bad unlock balance detected! e1000e 2008-12-09 23:56 ` Andrew Morton @ 2008-12-11 0:37 ` Jeff Kirsher 0 siblings, 0 replies; 6+ messages in thread From: Jeff Kirsher @ 2008-12-11 0:37 UTC (permalink / raw) To: Andrew Morton Cc: Frederik Deweerdt, e1000-devel, netdev, jesse.brandeburg, linux-kernel, stable, tglx, zdenek.kabelac, davem On Tue, Dec 9, 2008 at 3:56 PM, Andrew Morton <akpm@linux-foundation.org> wrote: > On Wed, 10 Dec 2008 00:43:46 +0100 > Frederik Deweerdt <frederik.deweerdt@xprog.eu> wrote: > >> On Tue, Dec 09, 2008 at 03:08:01PM -0800, Andrew Morton wrote: >> > On Tue, 9 Dec 2008 12:03:37 +0100 >> > Frederik Deweerdt <frederik.deweerdt@xprog.eu> wrote: >> > >> > > It some error checking is missing in e1000e: debug contention on NVM >> > > SWFLAG >> > > On Mon, Dec 08, 2008 at 12:24:09PM +0100, Zdenek Kabelac wrote: >> > > > Hi >> > > > >> > > > During occasional scan of message log - I've found out this BUG which >> > > > happened on Dec3 with the -rc7 from that day. >> > > > (So if it's now fixed in current git feel free to ignore :)) >> > > > >> > > > My machine T61 - C2D, 2GB, 64bit kernel - message appeared during >> > > > shutdown and was actually not noticed by me... >> > > > >> > > > >> > > > NetworkManager: <WARN> nm_signal_handler(): Caught signal 15, >> > > > shutting down normally. >> > > > NetworkManager: <info> (eth0): now unmanaged >> > > > NetworkManager: <info> (eth0): device state change: 3 -> 1 >> > > > NetworkManager: <info> (eth0): cleaning up... >> > > > NetworkManager: <info> (eth0): taking down device. >> > > > >> > > > ===================================== >> > > > [ BUG: bad unlock balance detected! ] >> > > > ------------------------------------- >> > >> > (top-posting repaired. Please don't do that!!!). >> Yep, sorry. >> > >> > > Hello Zdenek, >> > > >> > > This could be due to 717d438d1fde94decef874b9808379d1f4523453 >> > > "e1000e: debug contention on NVM SWFLAG" >> > > Error handling is missing from e1000_reset_hw_ich8lan so it may happen >> > > that we don't acquire the nvm_mutex if the card times out. >> > > >> > > Adding Thomas to CC. >> > >> > yup. 2.6.27 needs fixing also. >> > >> > Like this? >> I don't think so, e1000_acquire_swflag_ich8lan() locks and >> e1000_release_swflag_ich8lan() unlocks. > > urgh, OK, I made the mistake of reading the comments. > >> I think it is more along the >> lines of: >> >> >> diff --git a/drivers/net/e1000e/ich8lan.c b/drivers/net/e1000e/ich8lan.c >> index 523b971..f971b83 100644 >> --- a/drivers/net/e1000e/ich8lan.c >> +++ b/drivers/net/e1000e/ich8lan.c >> @@ -1892,7 +1892,13 @@ static s32 e1000_reset_hw_ich8lan(struct e1000_hw *hw) >> */ >> ctrl |= E1000_CTRL_PHY_RST; >> } >> + >> ret_val = e1000_acquire_swflag_ich8lan(hw); >> + if (ret_val) { >> + hw_dbg(hw, "Failed to acquire NVM swflag"); >> + return ret_val; >> + } >> + >> hw_dbg(hw, "Issuing a global reset to ich8lan"); >> ew32(CTRL, (ctrl | E1000_CTRL_RST)); >> msleep(20); >> >> >> But I'm not sure we should cancel the ongoing reset if the card times >> out... >> > > Yes, something like that. Or something like > > --- a/drivers/net/e1000e/ich8lan.c~a > +++ a/drivers/net/e1000e/ich8lan.c > @@ -1940,12 +1940,14 @@ static s32 e1000_reset_hw_ich8lan(struct > ctrl |= E1000_CTRL_PHY_RST; > } > ret_val = e1000_acquire_swflag_ich8lan(hw); > - hw_dbg(hw, "Issuing a global reset to ich8lan\n"); > - ew32(CTRL, (ctrl | E1000_CTRL_RST)); > - msleep(20); > + if (!ret_val) { > + hw_dbg(hw, "Issuing a global reset to ich8lan\n"); > + ew32(CTRL, (ctrl | E1000_CTRL_RST)); > + msleep(20); > > - /* release the swflag because it is not reset by hardware reset */ > - e1000_release_swflag_ich8lan(hw); > + /* release the swflag because it is not reset by hardware reset */ > + e1000_release_swflag_ich8lan(hw); > + } > > ret_val = e1000e_get_auto_rd_done(hw); > if (ret_val) { > _ > > > Dunno. It's e1000-developer-summoning-dance time. > Actually, if we time out trying to acquire the swflag, we still want to reset the part because we are most likely in an unrecoverable state. So I would suggest the following --- a/drivers/net/e1000e/ich8lan.c~a +++ a/drivers/net/e1000e/ich8lan.c @@ -1940,9 +1940,10 @@ static s32 e1000_reset_hw_ich8lan(struct ctrl |= E1000_CTRL_PHY_RST; } ret_val = e1000_acquire_swflag_ich8lan(hw); hw_dbg(hw, "Issuing a global reset to ich8lan\n"); ew32(CTRL, (ctrl | E1000_CTRL_RST)); msleep(20); + if (!ret_val) { - - /* release the swflag because it is not reset by hardware reset */ - e1000_release_swflag_ich8lan(hw); + /* release the swflag because it is not reset by hardware reset */ + e1000_release_swflag_ich8lan(hw); + } Of course, we will want to add a comment to the fact that we still want to reset the part, even if we have not acquired the lock because we are in an unrecoverable state. I can provide a patch in a few minutes. -- Cheers, Jeff ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2008-12-11 0:38 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2008-12-08 11:24 BUG: bad unlock balance detected! e1000e Zdenek Kabelac 2008-12-09 11:03 ` Frederik Deweerdt 2008-12-09 23:08 ` Andrew Morton 2008-12-09 23:43 ` Frederik Deweerdt 2008-12-09 23:56 ` Andrew Morton 2008-12-11 0:37 ` [E1000-devel] " Jeff Kirsher
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox