* [PATCH] b43: Fix locking problem when stopping rfkill polling
@ 2009-10-07 15:06 Larry Finger
2009-10-07 19:01 ` John W. Linville
0 siblings, 1 reply; 9+ messages in thread
From: Larry Finger @ 2009-10-07 15:06 UTC (permalink / raw)
To: John W Linville; +Cc: linux-wireless
In commit 26e5ab35b4c7b1d4cb487a11084520aed9a8d05e entitled "b43: Fix PPC
crash in rfkill polling on unload", the call to stop polling should not have
been placed inside the wl->mutex. The result was incorrect locking messages.
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
---
John,
I had not intended for the previous patch to be applied as I was waiting for
the Bugzilla OP to test. He promised to do that today. In any case, that patch
introduced a locking problem that needs to be fixed.
Why do the one-liners cause so many problems?
Larry
---
Index: wireless-testing/drivers/net/wireless/b43/main.c
===================================================================
--- wireless-testing.orig/drivers/net/wireless/b43/main.c
+++ wireless-testing/drivers/net/wireless/b43/main.c
@@ -4501,8 +4501,8 @@ static void b43_op_stop(struct ieee80211
cancel_work_sync(&(wl->beacon_update_trigger));
- mutex_lock(&wl->mutex);
wiphy_rfkill_stop_polling(hw->wiphy);
+ mutex_lock(&wl->mutex);
if (b43_status(dev) >= B43_STAT_STARTED) {
dev = b43_wireless_core_stop(dev);
if (!dev)
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] b43: Fix locking problem when stopping rfkill polling
2009-10-07 15:06 [PATCH] b43: Fix locking problem when stopping rfkill polling Larry Finger
@ 2009-10-07 19:01 ` John W. Linville
2009-10-07 19:28 ` Larry Finger
2009-10-08 20:23 ` Larry Finger
0 siblings, 2 replies; 9+ messages in thread
From: John W. Linville @ 2009-10-07 19:01 UTC (permalink / raw)
To: Larry Finger; +Cc: linux-wireless
On Wed, Oct 07, 2009 at 10:06:05AM -0500, Larry Finger wrote:
> In commit 26e5ab35b4c7b1d4cb487a11084520aed9a8d05e entitled "b43: Fix PPC
> crash in rfkill polling on unload", the call to stop polling should not have
> been placed inside the wl->mutex. The result was incorrect locking messages.
>
> Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
> ---
>
> John,
>
> I had not intended for the previous patch to be applied as I was waiting for
> the Bugzilla OP to test. He promised to do that today. In any case, that patch
> introduced a locking problem that needs to be fixed.
>
> Why do the one-liners cause so many problems?
>
> Larry
> ---
>
> Index: wireless-testing/drivers/net/wireless/b43/main.c
> ===================================================================
> --- wireless-testing.orig/drivers/net/wireless/b43/main.c
> +++ wireless-testing/drivers/net/wireless/b43/main.c
> @@ -4501,8 +4501,8 @@ static void b43_op_stop(struct ieee80211
>
> cancel_work_sync(&(wl->beacon_update_trigger));
>
> - mutex_lock(&wl->mutex);
> wiphy_rfkill_stop_polling(hw->wiphy);
> + mutex_lock(&wl->mutex);
> if (b43_status(dev) >= B43_STAT_STARTED) {
> dev = b43_wireless_core_stop(dev);
> if (!dev)
OK, but why do we start polling under the lock but stop polling without
the lock? Should we start polling without holding the lock too?
John
--
John W. Linville Someday the world will need a hero, and you
linville@tuxdriver.com might be all we have. Be ready.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] b43: Fix locking problem when stopping rfkill polling
2009-10-07 19:01 ` John W. Linville
@ 2009-10-07 19:28 ` Larry Finger
2009-10-07 19:46 ` Michael Buesch
2009-10-07 22:36 ` Johannes Berg
2009-10-08 20:23 ` Larry Finger
1 sibling, 2 replies; 9+ messages in thread
From: Larry Finger @ 2009-10-07 19:28 UTC (permalink / raw)
To: John W. Linville; +Cc: linux-wireless
John W. Linville wrote:
> On Wed, Oct 07, 2009 at 10:06:05AM -0500, Larry Finger wrote:
>> In commit 26e5ab35b4c7b1d4cb487a11084520aed9a8d05e entitled "b43: Fix PPC
>> crash in rfkill polling on unload", the call to stop polling should not have
>> been placed inside the wl->mutex. The result was incorrect locking messages.
>>
>> Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
>> ---
>>
>> John,
>>
>> I had not intended for the previous patch to be applied as I was waiting for
>> the Bugzilla OP to test. He promised to do that today. In any case, that patch
>> introduced a locking problem that needs to be fixed.
>>
>> Why do the one-liners cause so many problems?
>>
>> Larry
>> ---
>>
>> Index: wireless-testing/drivers/net/wireless/b43/main.c
>> ===================================================================
>> --- wireless-testing.orig/drivers/net/wireless/b43/main.c
>> +++ wireless-testing/drivers/net/wireless/b43/main.c
>> @@ -4501,8 +4501,8 @@ static void b43_op_stop(struct ieee80211
>>
>> cancel_work_sync(&(wl->beacon_update_trigger));
>>
>> - mutex_lock(&wl->mutex);
>> wiphy_rfkill_stop_polling(hw->wiphy);
>> + mutex_lock(&wl->mutex);
>> if (b43_status(dev) >= B43_STAT_STARTED) {
>> dev = b43_wireless_core_stop(dev);
>> if (!dev)
>
> OK, but why do we start polling under the lock but stop polling without
> the lock? Should we start polling without holding the lock too?
I'll test that, but I suspect it doesn't matter. Of course, the reason
I put the stop under the lock was for symmetry, but then I got the
following when shutting down:
b43-phy0 debug: Removing Interface type 2
=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.32-rc3-wl #225
-------------------------------------------------------
modprobe/25391 is trying to acquire lock:
(&(&rfkill->poll_work)->work){+.+...}, at: [<ffffffff81054a7f>]
__cancel_work_timer+0xd9/0x224
but task is already holding lock:
(&wl->mutex){+.+.+.}, at: [<ffffffffa02ff3d0>] b43_op_stop+0x30/0x7f
[b43]
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&wl->mutex){+.+.+.}:
[<ffffffff81069790>] __lock_acquire+0x140e/0x174d
[<ffffffff81069b8b>] lock_acquire+0xbc/0xd9
[<ffffffff8128d420>] mutex_lock_nested+0x58/0x29c
[<ffffffffa03150ea>] b43_rfkill_poll+0x3a/0xfc [b43]
[<ffffffffa02c2f33>] ieee80211_rfkill_poll+0x26/0x28 [mac80211]
[<ffffffffa027c028>] cfg80211_rfkill_poll+0x14/0x16 [cfg80211]
[<ffffffffa0271081>] rfkill_poll+0x23/0x3d [rfkill]
[<ffffffff81054224>] worker_thread+0x22c/0x332
[<ffffffff81057fd8>] kthread+0x7d/0x85
[<ffffffff8100caba>] child_rip+0xa/0x20
Moving the stop ooutside the lock cured the problem.
Larry
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] b43: Fix locking problem when stopping rfkill polling
2009-10-07 19:28 ` Larry Finger
@ 2009-10-07 19:46 ` Michael Buesch
2009-10-07 22:36 ` Johannes Berg
1 sibling, 0 replies; 9+ messages in thread
From: Michael Buesch @ 2009-10-07 19:46 UTC (permalink / raw)
To: Larry Finger; +Cc: John W. Linville, linux-wireless
On Wednesday 07 October 2009 21:28:40 Larry Finger wrote:
> John W. Linville wrote:
> > On Wed, Oct 07, 2009 at 10:06:05AM -0500, Larry Finger wrote:
> >> In commit 26e5ab35b4c7b1d4cb487a11084520aed9a8d05e entitled "b43: Fix PPC
> >> crash in rfkill polling on unload", the call to stop polling should not have
> >> been placed inside the wl->mutex. The result was incorrect locking messages.
> >>
> >> Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
> >> ---
> >>
> >> John,
> >>
> >> I had not intended for the previous patch to be applied as I was waiting for
> >> the Bugzilla OP to test. He promised to do that today. In any case, that patch
> >> introduced a locking problem that needs to be fixed.
> >>
> >> Why do the one-liners cause so many problems?
> >>
> >> Larry
> >> ---
> >>
> >> Index: wireless-testing/drivers/net/wireless/b43/main.c
> >> ===================================================================
> >> --- wireless-testing.orig/drivers/net/wireless/b43/main.c
> >> +++ wireless-testing/drivers/net/wireless/b43/main.c
> >> @@ -4501,8 +4501,8 @@ static void b43_op_stop(struct ieee80211
> >>
> >> cancel_work_sync(&(wl->beacon_update_trigger));
> >>
> >> - mutex_lock(&wl->mutex);
> >> wiphy_rfkill_stop_polling(hw->wiphy);
> >> + mutex_lock(&wl->mutex);
> >> if (b43_status(dev) >= B43_STAT_STARTED) {
> >> dev = b43_wireless_core_stop(dev);
> >> if (!dev)
> >
> > OK, but why do we start polling under the lock but stop polling without
> > the lock? Should we start polling without holding the lock too?
>
> I'll test that, but I suspect it doesn't matter. Of course, the reason
> I put the stop under the lock was for symmetry, but then I got the
> following when shutting down:
>
> b43-phy0 debug: Removing Interface type 2
>
> =======================================================
> [ INFO: possible circular locking dependency detected ]
> 2.6.32-rc3-wl #225
> -------------------------------------------------------
> modprobe/25391 is trying to acquire lock:
> (&(&rfkill->poll_work)->work){+.+...}, at: [<ffffffff81054a7f>]
> __cancel_work_timer+0xd9/0x224
>
> but task is already holding lock:
> (&wl->mutex){+.+.+.}, at: [<ffffffffa02ff3d0>] b43_op_stop+0x30/0x7f
> [b43]
>
> which lock already depends on the new lock.
>
>
> the existing dependency chain (in reverse order) is:
>
> -> #1 (&wl->mutex){+.+.+.}:
> [<ffffffff81069790>] __lock_acquire+0x140e/0x174d
> [<ffffffff81069b8b>] lock_acquire+0xbc/0xd9
> [<ffffffff8128d420>] mutex_lock_nested+0x58/0x29c
> [<ffffffffa03150ea>] b43_rfkill_poll+0x3a/0xfc [b43]
> [<ffffffffa02c2f33>] ieee80211_rfkill_poll+0x26/0x28 [mac80211]
> [<ffffffffa027c028>] cfg80211_rfkill_poll+0x14/0x16 [cfg80211]
> [<ffffffffa0271081>] rfkill_poll+0x23/0x3d [rfkill]
> [<ffffffff81054224>] worker_thread+0x22c/0x332
> [<ffffffff81057fd8>] kthread+0x7d/0x85
> [<ffffffff8100caba>] child_rip+0xa/0x20
>
> Moving the stop ooutside the lock cured the problem.
>
Just move it right after the existing cancel_work_sync() call
--
Greetings, Michael.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] b43: Fix locking problem when stopping rfkill polling
2009-10-07 19:28 ` Larry Finger
2009-10-07 19:46 ` Michael Buesch
@ 2009-10-07 22:36 ` Johannes Berg
2009-10-07 22:43 ` Larry Finger
1 sibling, 1 reply; 9+ messages in thread
From: Johannes Berg @ 2009-10-07 22:36 UTC (permalink / raw)
To: Larry Finger; +Cc: John W. Linville, linux-wireless
[-- Attachment #1: Type: text/plain, Size: 965 bytes --]
On Wed, 2009-10-07 at 14:28 -0500, Larry Finger wrote:
> > OK, but why do we start polling under the lock but stop polling without
> > the lock? Should we start polling without holding the lock too?
>
> I'll test that, but I suspect it doesn't matter. Of course, the reason
> I put the stop under the lock was for symmetry, but then I got the
> following when shutting down:
>
> b43-phy0 debug: Removing Interface type 2
>
> =======================================================
> [ INFO: possible circular locking dependency detected ]
> 2.6.32-rc3-wl #225
> -------------------------------------------------------
> modprobe/25391 is trying to acquire lock:
> (&(&rfkill->poll_work)->work){+.+...}, at: [<ffffffff81054a7f>]
> __cancel_work_timer+0xd9/0x224
This is because when stopping polling we need to cancel the work and
sync on it, but when starting it's completely async so starting can be
in any context.
johannes
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 801 bytes --]
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] b43: Fix locking problem when stopping rfkill polling
2009-10-07 22:36 ` Johannes Berg
@ 2009-10-07 22:43 ` Larry Finger
2009-10-07 23:08 ` Michael Buesch
0 siblings, 1 reply; 9+ messages in thread
From: Larry Finger @ 2009-10-07 22:43 UTC (permalink / raw)
To: Johannes Berg; +Cc: John W. Linville, linux-wireless
Johannes Berg wrote:
> This is because when stopping polling we need to cancel the work and
> sync on it, but when starting it's completely async so starting can be
> in any context.
Does that mean that start could be moved outside the mutex_lock so
that the start and stop would look symmetrical in b43?
Larry
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] b43: Fix locking problem when stopping rfkill polling
2009-10-07 22:43 ` Larry Finger
@ 2009-10-07 23:08 ` Michael Buesch
0 siblings, 0 replies; 9+ messages in thread
From: Michael Buesch @ 2009-10-07 23:08 UTC (permalink / raw)
To: Larry Finger; +Cc: Johannes Berg, John W. Linville, linux-wireless
On Thursday 08 October 2009 00:43:36 Larry Finger wrote:
> Johannes Berg wrote:
> > This is because when stopping polling we need to cancel the work and
> > sync on it, but when starting it's completely async so starting can be
> > in any context.
>
> Does that mean that start could be moved outside the mutex_lock so
> that the start and stop would look symmetrical in b43?
I think it's already complicated enough what's safe in b43 to do
outside of the locks. Let's not complicate it further. The simple rule is:
Do it locked. The only exception really are syncing functions which sync
for code holding the same lock.
--
Greetings, Michael.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] b43: Fix locking problem when stopping rfkill polling
2009-10-07 19:01 ` John W. Linville
2009-10-07 19:28 ` Larry Finger
@ 2009-10-08 20:23 ` Larry Finger
2009-10-08 22:31 ` John W. Linville
1 sibling, 1 reply; 9+ messages in thread
From: Larry Finger @ 2009-10-08 20:23 UTC (permalink / raw)
To: John W. Linville; +Cc: linux-wireless, Greg Kroah-Hartman
On 10/07/2009 02:01 PM, John W. Linville wrote:
> OK, but why do we start polling under the lock but stop polling without
> the lock? Should we start polling without holding the lock too?
I see that this patch was committed. I hope that you will be sending
it upstream for 2.6.32. The correct version has been sent to stable
for inclusion in 2.6.31, but as usual, a mainline commit ID will be
needed.
Larry
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] b43: Fix locking problem when stopping rfkill polling
2009-10-08 20:23 ` Larry Finger
@ 2009-10-08 22:31 ` John W. Linville
0 siblings, 0 replies; 9+ messages in thread
From: John W. Linville @ 2009-10-08 22:31 UTC (permalink / raw)
To: Larry Finger; +Cc: linux-wireless, Greg Kroah-Hartman
On Thu, Oct 08, 2009 at 03:23:36PM -0500, Larry Finger wrote:
> On 10/07/2009 02:01 PM, John W. Linville wrote:
>> OK, but why do we start polling under the lock but stop polling without
>> the lock? Should we start polling without holding the lock too?
>
> I see that this patch was committed. I hope that you will be sending it
> upstream for 2.6.32. The correct version has been sent to stable for
> inclusion in 2.6.31, but as usual, a mainline commit ID will be needed.
I've folded this into "b43: Fix PPC crash in rfkill polling on unload"
for sending upstream.
Thanks!
John
--
John W. Linville Someday the world will need a hero, and you
linville@tuxdriver.com might be all we have. Be ready.
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2009-10-08 22:46 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-10-07 15:06 [PATCH] b43: Fix locking problem when stopping rfkill polling Larry Finger
2009-10-07 19:01 ` John W. Linville
2009-10-07 19:28 ` Larry Finger
2009-10-07 19:46 ` Michael Buesch
2009-10-07 22:36 ` Johannes Berg
2009-10-07 22:43 ` Larry Finger
2009-10-07 23:08 ` Michael Buesch
2009-10-08 20:23 ` Larry Finger
2009-10-08 22:31 ` John W. Linville
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).