linux-wireless.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* iwl3945: Don't queue rfkill_poll work when module is exiting
@ 2009-03-24 15:44 TJ
  2009-03-24 17:51 ` Helmut Schaa
  0 siblings, 1 reply; 7+ messages in thread
From: TJ @ 2009-03-24 15:44 UTC (permalink / raw)
  To: Reinette Chatre; +Cc: linux-wireless

Tim Gardner suggested I forward this as a possible stable-release
update. I found the problem in our current Ubuntu Jaunty tree - iwlwifi
version 1.2.26k. What follows is a copy of the patch for Jaunty
(2.6.28).

----------
Bug: #345710

When the wireless interface is active and the iwl3945 module is unloaded the
call to ieee80211_unregister_hw() would call iwl3945_mac_stop() which would
restart the delayed workqueue for rfkill_poll. That workqueue had already been
cancelled so when the next work item was run (2 seconds later) the system would
suffer a hard lock-up because the module had been unloaded by then.

This patch implements STATUS_EXIT_PENDING checks in places where the rfkill_poll
work is scheduled, and moves the final workqueue cancellation to occur after the
call to ieee80211_unregister_hw().

Bug discovered, experienced and fix tested on my PC.

Signed-off-by: TJ <ubuntu@tjworld.net>
---
 drivers/net/wireless/iwlwifi/iwl3945-base.c |   16 +++++++++++++---
 1 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/drivers/net/wireless/iwlwifi/iwl3945-base.c b/drivers/net/wireless/iwlwifi/iwl3945-base.c
index bb92db2..acaf038 100644
--- a/drivers/net/wireless/iwlwifi/iwl3945-base.c
+++ b/drivers/net/wireless/iwlwifi/iwl3945-base.c
@@ -6062,7 +6062,9 @@ static void iwl3945_rfkill_poll(struct work_struct *data)
 	if (test_bit(STATUS_RF_KILL_HW, &status) != test_bit(STATUS_RF_KILL_HW, &priv->status))
 	   queue_work(priv->workqueue, &priv->rf_kill);
 
-	queue_delayed_work(priv->workqueue, &priv->rfkill_poll,
+	/* only queue if module isn't exiting */
+	if (! test_bit(STATUS_EXIT_PENDING, &priv->status))
+		queue_delayed_work(priv->workqueue, &priv->rfkill_poll,
 						   round_jiffies_relative(2 * HZ));
 
 }
@@ -6588,7 +6590,10 @@ static void iwl3945_mac_stop(struct ieee80211_hw *hw)
 	flush_workqueue(priv->workqueue);
 
 	/* start polling the killswitch state again */
-	queue_delayed_work(priv->workqueue, &priv->rfkill_poll,
+
+	/* only queue if module isn't exiting */
+	if (! test_bit(STATUS_EXIT_PENDING, &priv->status))
+		queue_delayed_work(priv->workqueue, &priv->rfkill_poll,
 						   round_jiffies_relative(2 * HZ));
 
 	IWL_DEBUG_MAC80211("leave\n");
@@ -8166,7 +8171,6 @@ static void __devexit iwl3945_pci_remove(struct pci_dev *pdev)
 	sysfs_remove_group(&pdev->dev.kobj, &iwl3945_attribute_group);
 
 	iwl3945_rfkill_unregister(priv);
-	cancel_delayed_work(&priv->rfkill_poll);
 	iwl3945_dealloc_ucode_pci(priv);
 
 	if (priv->rxq.bd)
@@ -8182,6 +8186,12 @@ static void __devexit iwl3945_pci_remove(struct pci_dev *pdev)
 	/*netif_stop_queue(dev); */
 	flush_workqueue(priv->workqueue);
 
+	/* ieee80211_unregister_hw calls iwl3945_mac_stop which used to restart the rfkill
+	   polling. Although that now checks STATUS_EXIT_PENDING do cancel and wait for any
+	   pending work to complete */
+	cancel_delayed_work(&priv->rfkill_poll);
+	cancel_work_sync(&priv->rfkill_poll);
+
 	/* ieee80211_unregister_hw calls iwl3945_mac_stop, which flushes
 	 * priv->workqueue... so we can't take down the workqueue
 	 * until now... */
-- 
1.6.0.4


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: iwl3945: Don't queue rfkill_poll work when module is exiting
  2009-03-24 15:44 iwl3945: Don't queue rfkill_poll work when module is exiting TJ
@ 2009-03-24 17:51 ` Helmut Schaa
  2009-03-24 21:22   ` TJ
  0 siblings, 1 reply; 7+ messages in thread
From: Helmut Schaa @ 2009-03-24 17:51 UTC (permalink / raw)
  To: TJ; +Cc: Reinette Chatre, linux-wireless

Am Dienstag, 24. M=E4rz 2009 schrieb TJ:
> Tim Gardner suggested I forward this as a possible stable-release
> update. I found the problem in our current Ubuntu Jaunty tree - iwlwi=
fi
> version 1.2.26k. What follows is a copy of the patch for Jaunty
> (2.6.28).

The patch that introduced the rfkill polling did not hit mainline yet. =
It
should pop up in 2.6.30, hence this patch is not needed for any stable
series.

I wasn't able to reproduce the issue by unloading iwl3945 here. Any hin=
ts
on how to reproduce this issue?

Thanks,
Helmut

> ----------
> Bug: #345710
>=20
> When the wireless interface is active and the iwl3945 module is unloa=
ded the
> call to ieee80211_unregister_hw() would call iwl3945_mac_stop() which=
 would
> restart the delayed workqueue for rfkill_poll. That workqueue had alr=
eady been
> cancelled so when the next work item was run (2 seconds later) the sy=
stem would
> suffer a hard lock-up because the module had been unloaded by then.
>=20
> This patch implements STATUS_EXIT_PENDING checks in places where the =
rfkill_poll
> work is scheduled, and moves the final workqueue cancellation to occu=
r after the
> call to ieee80211_unregister_hw().
>=20
> Bug discovered, experienced and fix tested on my PC.
>=20
> Signed-off-by: TJ <ubuntu@tjworld.net>
> ---
>  drivers/net/wireless/iwlwifi/iwl3945-base.c |   16 +++++++++++++---
>  1 files changed, 13 insertions(+), 3 deletions(-)
>=20
> diff --git a/drivers/net/wireless/iwlwifi/iwl3945-base.c b/drivers/ne=
t/wireless/iwlwifi/iwl3945-base.c
> index bb92db2..acaf038 100644
> --- a/drivers/net/wireless/iwlwifi/iwl3945-base.c
> +++ b/drivers/net/wireless/iwlwifi/iwl3945-base.c
> @@ -6062,7 +6062,9 @@ static void iwl3945_rfkill_poll(struct work_str=
uct *data)
>  	if (test_bit(STATUS_RF_KILL_HW, &status) !=3D test_bit(STATUS_RF_KI=
LL_HW, &priv->status))
>  	   queue_work(priv->workqueue, &priv->rf_kill);
> =20
> -	queue_delayed_work(priv->workqueue, &priv->rfkill_poll,
> +	/* only queue if module isn't exiting */
> +	if (! test_bit(STATUS_EXIT_PENDING, &priv->status))
> +		queue_delayed_work(priv->workqueue, &priv->rfkill_poll,
>  						   round_jiffies_relative(2 * HZ));
> =20
>  }
> @@ -6588,7 +6590,10 @@ static void iwl3945_mac_stop(struct ieee80211_=
hw *hw)
>  	flush_workqueue(priv->workqueue);
> =20
>  	/* start polling the killswitch state again */
> -	queue_delayed_work(priv->workqueue, &priv->rfkill_poll,
> +
> +	/* only queue if module isn't exiting */
> +	if (! test_bit(STATUS_EXIT_PENDING, &priv->status))
> +		queue_delayed_work(priv->workqueue, &priv->rfkill_poll,
>  						   round_jiffies_relative(2 * HZ));
> =20
>  	IWL_DEBUG_MAC80211("leave\n");
> @@ -8166,7 +8171,6 @@ static void __devexit iwl3945_pci_remove(struct=
 pci_dev *pdev)
>  	sysfs_remove_group(&pdev->dev.kobj, &iwl3945_attribute_group);
> =20
>  	iwl3945_rfkill_unregister(priv);
> -	cancel_delayed_work(&priv->rfkill_poll);
>  	iwl3945_dealloc_ucode_pci(priv);
> =20
>  	if (priv->rxq.bd)
> @@ -8182,6 +8186,12 @@ static void __devexit iwl3945_pci_remove(struc=
t pci_dev *pdev)
>  	/*netif_stop_queue(dev); */
>  	flush_workqueue(priv->workqueue);
> =20
> +	/* ieee80211_unregister_hw calls iwl3945_mac_stop which used to res=
tart the rfkill
> +	   polling. Although that now checks STATUS_EXIT_PENDING do cancel =
and wait for any
> +	   pending work to complete */
> +	cancel_delayed_work(&priv->rfkill_poll);
> +	cancel_work_sync(&priv->rfkill_poll);
> +
>  	/* ieee80211_unregister_hw calls iwl3945_mac_stop, which flushes
>  	 * priv->workqueue... so we can't take down the workqueue
>  	 * until now... */


--
To unsubscribe from this list: send the line "unsubscribe linux-wireles=
s" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: iwl3945: Don't queue rfkill_poll work when module is exiting
  2009-03-24 17:51 ` Helmut Schaa
@ 2009-03-24 21:22   ` TJ
  2009-03-25  3:41     ` Huaxu Wan
  0 siblings, 1 reply; 7+ messages in thread
From: TJ @ 2009-03-24 21:22 UTC (permalink / raw)
  To: Helmut Schaa; +Cc: Reinette Chatre, linux-wireless

On Tue, 2009-03-24 at 18:51 +0100, Helmut Schaa wrote:
> Am Dienstag, 24. M=C3=A4rz 2009 schrieb TJ:
> > Tim Gardner suggested I forward this as a possible stable-release
> > update. I found the problem in our current Ubuntu Jaunty tree - iwl=
wifi
> > version 1.2.26k. What follows is a copy of the patch for Jaunty
> > (2.6.28).
>=20
> The patch that introduced the rfkill polling did not hit mainline yet=
=2E It
> should pop up in 2.6.30, hence this patch is not needed for any stabl=
e
> series.
>=20
> I wasn't able to reproduce the issue by unloading iwl3945 here. Any h=
ints
> on how to reproduce this issue?

lspci -nn -s 06:00.0
06:00.0 Network controller [0280]: Intel Corporation PRO/Wireless 3945A=
BG [Golan] Network Connection [8086:4222] (rev 02)

With the interface active and in use do:

sudo modprobe -r iwl3945

Approximately 2 seconds later the PC locks solid and requires a
power-off button restart.

Confirmed by another of our kernel team:

https://launchpad.net/bugs/345710

--
To unsubscribe from this list: send the line "unsubscribe linux-wireles=
s" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: iwl3945: Don't queue rfkill_poll work when module is exiting
  2009-03-24 21:22   ` TJ
@ 2009-03-25  3:41     ` Huaxu Wan
  2009-03-25  4:31       ` TJ
  2009-04-02 22:36       ` reinette chatre
  0 siblings, 2 replies; 7+ messages in thread
From: Huaxu Wan @ 2009-03-25  3:41 UTC (permalink / raw)
  To: TJ; +Cc: Helmut Schaa, Reinette Chatre, linux-wireless

Hi,

This issue is confirmed here too. But during compile, there is warning =
about
cancel_work_sync(), cancel_delayed_work_sync() should be used instead.

And, without the modification in iwl3945_rfkill_poll() and=20
iwl3945_mac_stop(), just with one line cancel_delayed_work_sync(), this=
 issue
is also fixed in my testing.

Best regards
Huaxu

PS: The following is the log when iwl3945 was unloaded before patched.

root@Januty-devel:~# modprobe -r iwl3945
[  164.840610] iwl3945: U iwl3945_pci_remove *** UNLOAD DRIVER ***
[  164.846515] iwl3945: U __iwl3945_down iwl3945 is going down
[  164.852315] iwl3945: U iwl3945_hw_nic_stop_master stop master
[  164.858634] iwl3945: U iwl3945_clear_free_frames 0 frames on pre-all=
ocated heap on clear.
[  164.917177] iwl3945: U __iwl3945_down iwl3945 is going down
[  164.922735] iwl3945: U iwl3945_clear_free_frames 0 frames on pre-all=
ocated heap on clear.
[  166.804090] BUG: unable to handle kernel paging request at 000077ff8=
b64b8c7
[  166.808082] IP: [<ffffffff80262b62>] delayed_work_timer_fn+0x32/0x40
[  166.808082] PGD 0=20
[  166.808082] Oops: 0000 [#1] SMP=20
[  166.808082] last sysfs file: /sys/module/cfg80211/refcnt
[  166.808082] Dumping ftrace buffer:
[  166.808082]    (ftrace buffer empty)
[  166.808082] CPU 1=20
[  166.808082] Modules linked in: lp parport arc4 ecb psmouse iTCO_wdt =
video serio_raw pcspkr usbhid intel_agp iTCO_vendor_support output ehci=
_hcd uhci_hcd ]
[  166.808082] Pid: 0, comm: swapper Not tainted 2.6.28.8 #1
[  166.808082] RIP: 0010:[<ffffffff80262b62>]  [<ffffffff80262b62>] del=
ayed_work_timer_fn+0x32/0x40
[  166.808082] RSP: 0018:ffff88007b62be80  EFLAGS: 00010246
[  166.808082] RAX: 000077ff8b64b8bf RBX: ffff8800738df7c8 RCX: ffff880=
0738df7c8
[  166.808082] RDX: 0000000000000001 RSI: ffff8800738df7c8 RDI: ffff880=
0738df7c8
[  166.808082] RBP: ffff88007b62be80 R08: ffff8800738df820 R09: 0000000=
000000000
[  166.808082] R10: ffff88007b625e18 R11: 0000000000000000 R12: 0000000=
000000100
[  166.808082] R13: ffff88007b614000 R14: ffff88007b62beb0 R15: fffffff=
f80262b30
[  166.808082] FS:  0000000000000000(0000) GS:ffff88007bc02c00(0000) kn=
lGS:0000000000000000
[  166.808082] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
[  166.808082] CR2: 000077ff8b64b8c7 CR3: 0000000000201000 CR4: 0000000=
0000006a0
[  166.808082] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000=
000000000
[  166.808082] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000=
000000400
[  166.808082] Process swapper (pid: 0, threadinfo ffff88007b624000, ta=
sk ffff88007b61c320)
[  166.808082] Stack:
[  166.808082]  ffff88007b62bef0 ffffffff80259e39 ffff88007b615c18 ffff=
88007b615818
[  166.808082]  ffff88007b615418 ffff88007b615018 ffff88007b62beb0 ffff=
88007b62beb0
[  166.808082]  ffffffff80271efd 0000000000000008 0000000000000001 0000=
000000000100
[  166.808082] Call Trace:
[  166.808082]  <IRQ> <0> [<ffffffff80259e39>] run_timer_softirq+0x179/=
0x260
[  166.808082]  [<ffffffff80271efd>] ? tick_handle_oneshot_broadcast+0x=
ed/0x100
[  166.808082]  [<ffffffff80254cfc>] __do_softirq+0x9c/0x170
[  166.808082]  [<ffffffff80213cac>] call_softirq+0x1c/0x30
[  166.808082]  [<ffffffff80214f0d>] do_softirq+0x5d/0xa0
[  166.808082]  [<ffffffff80254a7d>] irq_exit+0x8d/0xa0
[  166.808082]  [<ffffffff802151d5>] do_IRQ+0xc5/0x110
[  166.808082]  [<ffffffff80212b13>] ret_from_intr+0x0/0x29
[  166.808082]  <EOI> <0> [<ffffffff8046d906>] ? acpi_idle_enter_simple=
+0x166/0x1a4
[  166.808082]  [<ffffffff8046d8fe>] ? acpi_idle_enter_simple+0x15e/0x1=
a4
[  166.808082]  [<ffffffff805745a5>] ? cpuidle_idle_call+0xa5/0x100
[  166.808082]  [<ffffffff80210e85>] ? cpu_idle+0x65/0xc0
[  166.808082]  [<ffffffff8067e6fd>] ? start_secondary+0x168/0x1bb
[  166.808082] Code: 65 8b 14 25 24 00 00 00 48 89 e5 48 83 e0 fc 48 8b=
 40 38 8b 70 20 48 8b 00 85 f6 0f 45 15 ff c9 72 00 48 f7 d0 48 89 ce 4=
8 63 d2 <48> 8=20
[  166.808082] RIP  [<ffffffff80262b62>] delayed_work_timer_fn+0x32/0x4=
0
[  166.808082]  RSP <ffff88007b62be80>
[  166.808082] CR2: 000077ff8b64b8c7
[  167.086916] Kernel panic - not syncing: Fatal exception in interrupt


On 21:22 Tue 24 Mar, TJ wrote:
> On Tue, 2009-03-24 at 18:51 +0100, Helmut Schaa wrote:
> > Am Dienstag, 24. M=E4rz 2009 schrieb TJ:
> > > Tim Gardner suggested I forward this as a possible stable-release
> > > update. I found the problem in our current Ubuntu Jaunty tree - i=
wlwifi
> > > version 1.2.26k. What follows is a copy of the patch for Jaunty
> > > (2.6.28).
> >=20
> > The patch that introduced the rfkill polling did not hit mainline y=
et. It
> > should pop up in 2.6.30, hence this patch is not needed for any sta=
ble
> > series.
> >=20
> > I wasn't able to reproduce the issue by unloading iwl3945 here. Any=
 hints
> > on how to reproduce this issue?
>=20
> lspci -nn -s 06:00.0
> 06:00.0 Network controller [0280]: Intel Corporation PRO/Wireless 394=
5ABG [Golan] Network Connection [8086:4222] (rev 02)
>=20
> With the interface active and in use do:
>=20
> sudo modprobe -r iwl3945
>=20
> Approximately 2 seconds later the PC locks solid and requires a
> power-off button restart.
>=20
> Confirmed by another of our kernel team:
>=20
> https://launchpad.net/bugs/345710
>=20
> --
> To unsubscribe from this list: send the line "unsubscribe linux-wirel=
ess" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-wireles=
s" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: iwl3945: Don't queue rfkill_poll work when module is exiting
  2009-03-25  3:41     ` Huaxu Wan
@ 2009-03-25  4:31       ` TJ
  2009-03-26 19:39         ` reinette chatre
  2009-04-02 22:36       ` reinette chatre
  1 sibling, 1 reply; 7+ messages in thread
From: TJ @ 2009-03-25  4:31 UTC (permalink / raw)
  To: Huaxu Wan; +Cc: Helmut Schaa, Reinette Chatre, linux-wireless

On Wed, 2009-03-25 at 11:41 +0800, Huaxu Wan wrote:
> Hi,
> 
> This issue is confirmed here too. But during compile, there is warning about
> cancel_work_sync(), cancel_delayed_work_sync() should be used instead.
> 
> And, without the modification in iwl3945_rfkill_poll() and 
> iwl3945_mac_stop(), just with one line cancel_delayed_work_sync(), this issue
> is also fixed in my testing.

Originally I did try adding just cancel_delayed_work_sync() although I
put it immediately following:

        iwl3945_rfkill_unregister(priv);
        cancel_delayed_work(&priv->rfkill_poll);

Wondered why that didn't work and hunted down the unnecessary calls to
queue_delayed_work() that were causing the issue and made them
conditional on STATUS_EXIT_PENDING.

Then I decided to be absolutely sure and moved cancel_delayed_work()
further down and for some reason mis-used cancel_work_sync() instead of
cancel_delayed_work_sync() - not paying attention I guess!




^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: iwl3945: Don't queue rfkill_poll work when module is exiting
  2009-03-25  4:31       ` TJ
@ 2009-03-26 19:39         ` reinette chatre
  0 siblings, 0 replies; 7+ messages in thread
From: reinette chatre @ 2009-03-26 19:39 UTC (permalink / raw)
  To: TJ; +Cc: Huaxu Wan, Helmut Schaa, linux-wireless@vger.kernel.org

On Tue, 2009-03-24 at 21:31 -0700, TJ wrote:
> On Wed, 2009-03-25 at 11:41 +0800, Huaxu Wan wrote:
> > Hi,
> > 
> > This issue is confirmed here too. But during compile, there is warning about
> > cancel_work_sync(), cancel_delayed_work_sync() should be used instead.
> > 
> > And, without the modification in iwl3945_rfkill_poll() and 
> > iwl3945_mac_stop(), just with one line cancel_delayed_work_sync(), this issue
> > is also fixed in my testing.
> 
> Originally I did try adding just cancel_delayed_work_sync() although I
> put it immediately following:
> 
>         iwl3945_rfkill_unregister(priv);
>         cancel_delayed_work(&priv->rfkill_poll);
> 
> Wondered why that didn't work and hunted down the unnecessary calls to
> queue_delayed_work() that were causing the issue and made them
> conditional on STATUS_EXIT_PENDING.
> 
> Then I decided to be absolutely sure and moved cancel_delayed_work()
> further down and for some reason mis-used cancel_work_sync() instead of
> cancel_delayed_work_sync() - not paying attention I guess!

Sounds like this can be done in one line ... could you please resubmit?

Thanks

Reinette


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: iwl3945: Don't queue rfkill_poll work when module is exiting
  2009-03-25  3:41     ` Huaxu Wan
  2009-03-25  4:31       ` TJ
@ 2009-04-02 22:36       ` reinette chatre
  1 sibling, 0 replies; 7+ messages in thread
From: reinette chatre @ 2009-04-02 22:36 UTC (permalink / raw)
  To: Huaxu Wan; +Cc: TJ, Helmut Schaa, linux-wireless@vger.kernel.org

Hi Huaxu,

On Tue, 2009-03-24 at 20:41 -0700, Huaxu Wan wrote:
> Hi,
> 
> This issue is confirmed here too. But during compile, there is warning about
> cancel_work_sync(), cancel_delayed_work_sync() should be used instead.
> 
> And, without the modification in iwl3945_rfkill_poll() and 
> iwl3945_mac_stop(), just with one line cancel_delayed_work_sync(), this issue
> is also fixed in my testing.

TJ disappeared. We do need this fixed, would you like to submit your
patch instead?

Thanks

Reinette



^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2009-04-02 22:30 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-03-24 15:44 iwl3945: Don't queue rfkill_poll work when module is exiting TJ
2009-03-24 17:51 ` Helmut Schaa
2009-03-24 21:22   ` TJ
2009-03-25  3:41     ` Huaxu Wan
2009-03-25  4:31       ` TJ
2009-03-26 19:39         ` reinette chatre
2009-04-02 22:36       ` reinette chatre

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).