linux-wireless.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* EEE PC hangs when booting off battery
       [not found]         ` <49EF0ABD.2080801@tuffmail.co.uk>
@ 2009-04-26 11:34           ` Alan Jenkins
  2009-04-28  9:19             ` [PATCH] [RFC] " Alan Jenkins
  0 siblings, 1 reply; 9+ messages in thread
From: Alan Jenkins @ 2009-04-26 11:34 UTC (permalink / raw)
  To: linux-wireless@vger.kernel.org
  Cc: Arjan van de Ven, linux acpi, linux-kernel, Kernel Testers List,
	Venkatesh Pallipadi, Bjorn Helgaas

Alan Jenkins wrote:
> Alan Jenkins wrote:
>   
>> Bjorn Helgaas wrote:
>>   
>>     
>>> On Tuesday 14 April 2009 09:17:28 am Arjan van de Ven wrote:
>>>   
>>>     
>>>       
>>>> On Tue, 14 Apr 2009 08:59:01 -0600
>>>> Bjorn Helgaas <bjorn.helgaas@hp.com> wrote:
>>>>
>>>>     
>>>>       
>>>>         
>>>>> I can't help with the real problem of why the asynchronous battery
>>>>> init causes the hang.
>>>>>       
>>>>>         
>>>>>           
>>>> that got fixed already for the module case.
>>>>     
>>>>       
>>>>         
>>> But apparently still broken for the builtin case?  I think Alan is
>>> running pretty new bits -- he said "latest git" on April 11.
>>>   
>>>     
>>>       
>> It's now fixed, in 2.6.30-rc2.  My battery is modular btw.  I suspect
>>
>> 5d38258ec026921a7b266f4047ebeaa75db358e5 "ACPI battery: fix async boot
>> oops" [removal of __init]
>>
>> was not sufficient to fix my problem, but it was solved by the "real" fix,
>>
>> d6de2c80e9d758d2e36c21699117db6178c0f517 "async: Fix module loading
>> async-work regression" [module loading waits on async work]
>>
>>
>> I would argue there's still a question over why the asynchronous battery
>> init (_with_ the oops fix) should cause a hang in the idle routine.  But
>> since the regression is fixed, it's not exactly an urgent question.
>>   
>>     
>
> Ugh.  Recently I tried building the battery driver into the kernel, to
> benefit from the async work.  Later, I tried booting from the battery
> and it hung again.
>
> This time, the kernel did not even respond to SysRq.  I tried
> nmi_watchdog=1 and waiting 2 minutes, but the watchdog didn't trigger
> either.  As before, it doesn't happen with acpi=off.
>
> I checked that this still happened in todays rc3, and it doesn't happen
> if I revert
>
> 0f66af530116e9f4dd97f328d91718b56a6fc5a4 "ACPI: battery: asynchronous init"
>
>   

It looks like my hang is caused by linkwatch_event() deadlocking on
rtnl_lock().  I can't see any direct connection to asynchronous battery
init, so perhaps that is just revealing a bug by changing the timing.

It appears I wasn't patient enough for hung task detection.  If I leave
it long enough, I see:

<scrolled off top of screen>
? kobject_uevent_env
? kobject_uevent_env
__mutex_lock_slowpath
mutex_lock
rtnl_lock
linkwatch_event
worker_thread
? linkwatch_event
? autoremove_wake_function
? worker_thread
kthread
kernel_thread_helper


INFO: task modprobe:485 blocked for more than 120 seconds
Call trace:
? __atomic_notifier_call_chain
schedule
schedule_timeout
? notify_update
? do_con_write
? __wake_up
wait_for_common
? default_wake_function
wait_for_completion
flush_cpu_workqueue
? wq_barrier_func
flush_workqueue
flush_scheduled_work
tty_ldisc_release
? tty_fasyc
tty_release_dev
? free_pgtables
tty_release
__fput
filp_close
sys_close
syscall_call
? __send_remote_softirq
? usecs_to_jiffies


I then seem to get another repetition of the second calltrace, followed
by a new one

INFO: task swapper:1 blocked for more than 120 seconds
Call trace:
schedule
schedule_timeout
? __wake_up_common
? wake_up
wait_for_common
wait_for_completion
call_usermodehelper_exec
__request_module
crypto_larval_lookup
? extract_entropy
crypto_alg_mod_lookup
crypto_alloc_base
ieee80211_wep_init
ieee80211_register_hw
? ath5k_hw_set_bss
ath5k+pci_probe
local_pci_probe
pci_device_probe
driver_probe_device
__driver_attach
bus_for_each_dev
driver_attach
? __driver_attach
buad_add_driver
driver_register
? ktime_get_ts
__pci_register_driver
init_ath5k_pci
_stext
? init_ath5k_pci
? proc_create_data
? register_ieq_proc
kernel_init
? kernel_init
kernel_thread_helper




The hang happens at this point:

[    0.967588] scsi 1:0:0:0: Direct-Access     ATA      SILICONMOTION SM
n/a  PQ: 0 ANSI: 5
[    0.968049] calling  4_sd_probe_async+0x0/0x225 @ 323
[    0.968313] initcall 3_async_port_probe+0x0/0x95 returned 0 after
343051 usecs
<mark> (see below).
[    0.968786] sd 1:0:0:0: [sda] 7815024 512-byte hardware sectors:
(4.00 GB/3.72 GiB)
[    0.968964] sd 1:0:0:0: [sda] Write Protect is off
[    0.969062] sd 1:0:0:0: [sda] Mode Sense: 00 3a 00 00
[    0.969132] sd 1:0:0:0: [sda] Write cache: disabled, read cache:
enabled, doesn't support DPO or FUA
[    0.969543]  sda: sda1 sda2
[    0.970965] sd 1:0:0:0: [sda] Attached SCSI disk
[    0.971073] initcall 4_sd_probe_async+0x0/0x225 returned 0 after 2849
usecs

On a successful boot, the next lines are

[    0.971188] async_continuing @ 1 after 2483 usec
[    0.971305] Freeing unused kernel memory: 256k freed
[    1.071724] calling  ata_generic_init+0x0/0x19 [ata_generic] @ 574
[    1.073798] initcall ata_generic_init+0x0/0x19 [ata_generic] returned
0 after 144 usecs
[    1.183372] Clocksource tsc unstable (delta = -128600689 ns)
[    2.035932] EXT4-fs: delayed allocation enabled

Also, on a successful boot, I see these additional lines at the point
<mark> above.

[    0.968461] async_continuing @ 1 after 76663 usec
[    0.968556] async_waiting @ 1

In fact, when the hang happens I can see no "async_waiting @ 1" on my
50-line screen.  Which makes sense if  the kernel init process is hung
in init_athk_pci().


Thanks
Alan

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH] [RFC] EEE PC hangs when booting off battery
  2009-04-26 11:34           ` EEE PC hangs when booting off battery Alan Jenkins
@ 2009-04-28  9:19             ` Alan Jenkins
  2009-04-28  9:58               ` Johannes Berg
  0 siblings, 1 reply; 9+ messages in thread
From: Alan Jenkins @ 2009-04-28  9:19 UTC (permalink / raw)
  To: linux-wireless@vger.kernel.org
  Cc: Arjan van de Ven, linux-kernel, Kernel Testers List

I found a regression where my EEE hangs at boot time, if the battery is
present.

I'm confident it's a regression because it disappears if I revert
Arjan's asynchronous battery initialisation.  However, the evidence
points to a deadlock in the wireless stack which has simply been
uncovered by timing changes.

If I leave the system long enough, I get a series of hung task
warnings.  They suggest the following deadlock:

- ieee80211_wep_init(), which is called with rtnl_lock() held, is
blocked in request_module() [waiting for modprobe to load a crypto module].
- modprobe is blocked in a call to flush_workqueue(), caused by closing
a TTY.
- worker_thread is blocked because the workqueue item linkwatch_event()
is blocked on rtnl_lock.


I've hacked up a test patch to move wep_init() outside of rtnl_lock, and
it solved the problem.  My one caveat is that it would probably be
cleaner to move it after rtnl_unlock(), instead of before rtnl_lock(). 
I just wasn't 100% sure if that would be safe.  Here's the patch:

---8<---

diff --git a/net/mac80211/main.c b/net/mac80211/main.c
index fbcbed6..fffa7f9 100644
--- a/net/mac80211/main.c
+++ b/net/mac80211/main.c
@@ -909,6 +909,13 @@ int ieee80211_register_hw(struct ieee80211_hw *hw)
 	if (result < 0)
 		goto fail_sta_info;
 
+	result = ieee80211_wep_init(local);
+	if (result < 0) {
+		printk(KERN_DEBUG "%s: Failed to initialize wep: %d\n",
+		       wiphy_name(local->hw.wiphy), result);
+		goto fail_wep;
+	}
+
 	rtnl_lock();
 	result = dev_alloc_name(local->mdev, local->mdev->name);
 	if (result < 0)
@@ -930,14 +937,6 @@ int ieee80211_register_hw(struct ieee80211_hw *hw)
 		goto fail_rate;
 	}
 
-	result = ieee80211_wep_init(local);
-
-	if (result < 0) {
-		printk(KERN_DEBUG "%s: Failed to initialize wep: %d\n",
-		       wiphy_name(local->hw.wiphy), result);
-		goto fail_wep;
-	}
-
 	/* add one default STA interface if supported */
 	if (local->hw.wiphy->interface_modes & BIT(NL80211_IFTYPE_STATION)) {
 		result = ieee80211_if_add(local, "wlan%d", NULL,
@@ -967,13 +966,12 @@ int ieee80211_register_hw(struct ieee80211_hw *hw)
 
 	return 0;
 
-fail_wep:
-	rate_control_deinitialize(local);
 fail_rate:
 	unregister_netdevice(local->mdev);
 	local->mdev = NULL;
 fail_dev:
 	rtnl_unlock();
+fail_wep:
 	sta_info_stop(local);
 fail_sta_info:
 	debugfs_hw_del(local);



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] [RFC] EEE PC hangs when booting off battery
  2009-04-28  9:19             ` [PATCH] [RFC] " Alan Jenkins
@ 2009-04-28  9:58               ` Johannes Berg
  2009-04-28 10:27                 ` Alan Jenkins
  0 siblings, 1 reply; 9+ messages in thread
From: Johannes Berg @ 2009-04-28  9:58 UTC (permalink / raw)
  To: Alan Jenkins
  Cc: linux-wireless@vger.kernel.org, Arjan van de Ven, linux-kernel,
	Kernel Testers List

[-- Attachment #1: Type: text/plain, Size: 1400 bytes --]

It seems the only reason lockdep doesn't warn is the detour to userspace
(modprobe) and the waiting for a userspace task (waiting isn't lockdep
annotated and I don't think it can be)

It seems you cannot load a module while under _any_ lock, ever, since
userspace might turn around and do something that requires that lock? I
think we should probably add a warning to the code that waits for the
userspace task:

	debug_check_no_locks_held(current);

Or not? Some purely internal locks _might_ be ok... but how could you
verify that?

> - ieee80211_wep_init(), which is called with rtnl_lock() held, is
> blocked in request_module() [waiting for modprobe to load a crypto module].

Right.

> - modprobe is blocked in a call to flush_workqueue(), caused by closing
> a TTY.

Can you point out where? I can't find that.

> - worker_thread is blocked because the workqueue item linkwatch_event()
> is blocked on rtnl_lock.

This I know about.

> I've hacked up a test patch to move wep_init() outside of rtnl_lock, and
> it solved the problem.  My one caveat is that it would probably be
> cleaner to move it after rtnl_unlock(), instead of before rtnl_lock(). 
> I just wasn't 100% sure if that would be safe.  Here's the patch:

That doesn't seem relevant, this just does some initialisation. However,
you definitely missed adding a call to wep_free().

johannes

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] [RFC] EEE PC hangs when booting off battery
  2009-04-28  9:58               ` Johannes Berg
@ 2009-04-28 10:27                 ` Alan Jenkins
  2009-04-28 10:35                   ` Johannes Berg
  2009-04-29 11:14                   ` Alan Jenkins
  0 siblings, 2 replies; 9+ messages in thread
From: Alan Jenkins @ 2009-04-28 10:27 UTC (permalink / raw)
  To: Johannes Berg
  Cc: linux-wireless@vger.kernel.org, Arjan van de Ven, linux-kernel,
	Kernel Testers List

Johannes Berg wrote:
> It seems the only reason lockdep doesn't warn is the detour to userspace
> (modprobe) and the waiting for a userspace task (waiting isn't lockdep
> annotated and I don't think it can be)
>
> It seems you cannot load a module while under _any_ lock, ever, since
> userspace might turn around and do something that requires that lock? I
> think we should probably add a warning to the code that waits for the
> userspace task:
>
> 	debug_check_no_locks_held(current);
>
> Or not? Some purely internal locks _might_ be ok... but how could you
> verify that?
>
>   
>> - ieee80211_wep_init(), which is called with rtnl_lock() held, is
>> blocked in request_module() [waiting for modprobe to load a crypto module].
>>     
>
> Right.
>
>   
>> - modprobe is blocked in a call to flush_workqueue(), caused by closing
>> a TTY.
>>     
>
> Can you point out where? I can't find that.
>   

I posted the calltraces earlier at
<http://marc.info/?l=linux-wireless&m=124074566523506&w=2>.

wait_for_completion
flush_cpu_workqueue
? wq_barrier_func
flush_workqueue
flush_scheduled_work
tty_ldisc_release
? tty_fasyc
tty_release_dev
? free_pgtables
tty_release
__fput
filp_close
sys_close
syscall_call


>> - worker_thread is blocked because the workqueue item linkwatch_event()
>> is blocked on rtnl_lock.
>>     
>
> This I know about.
>
>   
>> I've hacked up a test patch to move wep_init() outside of rtnl_lock, and
>> it solved the problem.  My one caveat is that it would probably be
>> cleaner to move it after rtnl_unlock(), instead of before rtnl_lock(). 
>> I just wasn't 100% sure if that would be safe.  Here's the patch:
>>     
>
> That doesn't seem relevant, this just does some initialisation. However,
> you definitely missed adding a call to wep_free().
>   

Hah, I should have realized something was wrong when I noticed I was
removing more lines that I added. 

The crypto init does cause the module load:

wait_for_completion
call_usermodehelper_exec
__request_module
crypto_larval_lookup
? extract_entropy
crypto_alg_mod_lookup
crypto_alloc_base
ieee80211_wep_init
ieee80211_register_hw
? ath5k_hw_set_bss
ath5k+pci_probe
local_pci_probe
pci_device_probe
driver_probe_device
__driver_attach
bus_for_each_dev
driver_attach
? __driver_attach
buad_add_driver
driver_register
? ktime_get_ts
__pci_register_driver
init_ath5k_pci
_stext
? init_ath5k_pci
? proc_create_data
? register_ieq_proc
kernel_init


Thanks
Alan


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] [RFC] EEE PC hangs when booting off battery
  2009-04-28 10:27                 ` Alan Jenkins
@ 2009-04-28 10:35                   ` Johannes Berg
  2009-04-29 11:14                   ` Alan Jenkins
  1 sibling, 0 replies; 9+ messages in thread
From: Johannes Berg @ 2009-04-28 10:35 UTC (permalink / raw)
  To: Alan Jenkins
  Cc: linux-wireless@vger.kernel.org, Arjan van de Ven, linux-kernel,
	Kernel Testers List

[-- Attachment #1: Type: text/plain, Size: 411 bytes --]

On Tue, 2009-04-28 at 11:27 +0100, Alan Jenkins wrote:

> > Can you point out where? I can't find that.

> 
> I posted the calltraces earlier at
> <http://marc.info/?l=linux-wireless&m=124074566523506&w=2>.
> 
> wait_for_completion
> flush_cpu_workqueue
> ? wq_barrier_func
> flush_workqueue
> flush_scheduled_work
> tty_ldisc_release

Ah, I was just grepping for the wrong thing :)

johannes

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] [RFC] EEE PC hangs when booting off battery
  2009-04-28 10:27                 ` Alan Jenkins
  2009-04-28 10:35                   ` Johannes Berg
@ 2009-04-29 11:14                   ` Alan Jenkins
  2009-04-29 11:20                     ` Johannes Berg
  2009-05-15 21:48                     ` Rafael J. Wysocki
  1 sibling, 2 replies; 9+ messages in thread
From: Alan Jenkins @ 2009-04-29 11:14 UTC (permalink / raw)
  To: John Linville
  Cc: Johannes Berg, linux-wireless@vger.kernel.org, Arjan van de Ven,
	linux-kernel, Kernel Testers List

Alan Jenkins wrote:
> Johannes Berg wrote:
>   
>> That doesn't seem relevant, this just does some initialisation. However,
>> you definitely missed adding a call to wep_free().
>>   
>>     
>
> Hah, I should have realized something was wrong when I noticed I was
> removing more lines that I added. 
>
> The crypto init does cause the module load:
>
> wait_for_completion
> call_usermodehelper_exec
> __request_module
> crypto_larval_lookup
> ? extract_entropy
> crypto_alg_mod_lookup
> crypto_alloc_base
> ieee80211_wep_init
> ieee80211_register_hw
>   

Here's a corrected patch complete with changelog.  If there are no other
problems with it, can you please  apply this for 2.6.30 to keep my EeePC
regression-free?

Thanks
Alan

------>
>From c5e9dc036247e70956d1a28e8850c3810385dda0 Mon Sep 17 00:00:00 2001
From: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
Date: Wed, 29 Apr 2009 11:41:24 +0100
Subject: [PATCH] mac80211: fix modprobe deadlock by not calling wep_init under rtnl_lock

 - ieee80211_wep_init(), which is called with rtnl_lock held, blocks in
   request_module() [waiting for modprobe to load a crypto module].

 - modprobe blocks in a call to flush_workqueue(), when it closes a TTY
   [presumably when it exits].

 - The workqueue item linkwatch_event() blocks on rtnl_lock.

There's no reason for wep_init() to be called with rtnl_lock held, so
just move it outside the critical section.

Signed-off-by: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
---
 net/mac80211/main.c |   19 +++++++++----------
 1 files changed, 9 insertions(+), 10 deletions(-)

diff --git a/net/mac80211/main.c b/net/mac80211/main.c
index fbcbed6..00968c2 100644
--- a/net/mac80211/main.c
+++ b/net/mac80211/main.c
@@ -909,6 +909,13 @@ int ieee80211_register_hw(struct ieee80211_hw *hw)
 	if (result < 0)
 		goto fail_sta_info;
 
+	result = ieee80211_wep_init(local);
+	if (result < 0) {
+		printk(KERN_DEBUG "%s: Failed to initialize wep: %d\n",
+		       wiphy_name(local->hw.wiphy), result);
+		goto fail_wep;
+	}
+
 	rtnl_lock();
 	result = dev_alloc_name(local->mdev, local->mdev->name);
 	if (result < 0)
@@ -930,14 +937,6 @@ int ieee80211_register_hw(struct ieee80211_hw *hw)
 		goto fail_rate;
 	}
 
-	result = ieee80211_wep_init(local);
-
-	if (result < 0) {
-		printk(KERN_DEBUG "%s: Failed to initialize wep: %d\n",
-		       wiphy_name(local->hw.wiphy), result);
-		goto fail_wep;
-	}
-
 	/* add one default STA interface if supported */
 	if (local->hw.wiphy->interface_modes & BIT(NL80211_IFTYPE_STATION)) {
 		result = ieee80211_if_add(local, "wlan%d", NULL,
@@ -967,13 +966,13 @@ int ieee80211_register_hw(struct ieee80211_hw *hw)
 
 	return 0;
 
-fail_wep:
-	rate_control_deinitialize(local);
 fail_rate:
 	unregister_netdevice(local->mdev);
 	local->mdev = NULL;
 fail_dev:
 	rtnl_unlock();
+	ieee80211_wep_free(local);
+fail_wep:
 	sta_info_stop(local);
 fail_sta_info:
 	debugfs_hw_del(local);
-- 
1.5.4.3





^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] [RFC] EEE PC hangs when booting off battery
  2009-04-29 11:14                   ` Alan Jenkins
@ 2009-04-29 11:20                     ` Johannes Berg
  2009-05-15 21:48                     ` Rafael J. Wysocki
  1 sibling, 0 replies; 9+ messages in thread
From: Johannes Berg @ 2009-04-29 11:20 UTC (permalink / raw)
  To: Alan Jenkins
  Cc: John Linville, linux-wireless@vger.kernel.org, Arjan van de Ven,
	linux-kernel, Kernel Testers List

[-- Attachment #1: Type: text/plain, Size: 2517 bytes --]

On Wed, 2009-04-29 at 12:14 +0100, Alan Jenkins wrote:

> From c5e9dc036247e70956d1a28e8850c3810385dda0 Mon Sep 17 00:00:00 2001
> From: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
> Date: Wed, 29 Apr 2009 11:41:24 +0100
> Subject: [PATCH] mac80211: fix modprobe deadlock by not calling wep_init under rtnl_lock
> 
>  - ieee80211_wep_init(), which is called with rtnl_lock held, blocks in
>    request_module() [waiting for modprobe to load a crypto module].
> 
>  - modprobe blocks in a call to flush_workqueue(), when it closes a TTY
>    [presumably when it exits].
> 
>  - The workqueue item linkwatch_event() blocks on rtnl_lock.
> 
> There's no reason for wep_init() to be called with rtnl_lock held, so
> just move it outside the critical section.

Looks correct to me.

> Signed-off-by: Alan Jenkins <alan-jenkins@tuffmail.co.uk>

Reviewed-by: Johannes Berg <johannes@sipsolutions.net>

> ---
>  net/mac80211/main.c |   19 +++++++++----------
>  1 files changed, 9 insertions(+), 10 deletions(-)
> 
> diff --git a/net/mac80211/main.c b/net/mac80211/main.c
> index fbcbed6..00968c2 100644
> --- a/net/mac80211/main.c
> +++ b/net/mac80211/main.c
> @@ -909,6 +909,13 @@ int ieee80211_register_hw(struct ieee80211_hw *hw)
>  	if (result < 0)
>  		goto fail_sta_info;
>  
> +	result = ieee80211_wep_init(local);
> +	if (result < 0) {
> +		printk(KERN_DEBUG "%s: Failed to initialize wep: %d\n",
> +		       wiphy_name(local->hw.wiphy), result);
> +		goto fail_wep;
> +	}
> +
>  	rtnl_lock();
>  	result = dev_alloc_name(local->mdev, local->mdev->name);
>  	if (result < 0)
> @@ -930,14 +937,6 @@ int ieee80211_register_hw(struct ieee80211_hw *hw)
>  		goto fail_rate;
>  	}
>  
> -	result = ieee80211_wep_init(local);
> -
> -	if (result < 0) {
> -		printk(KERN_DEBUG "%s: Failed to initialize wep: %d\n",
> -		       wiphy_name(local->hw.wiphy), result);
> -		goto fail_wep;
> -	}
> -
>  	/* add one default STA interface if supported */
>  	if (local->hw.wiphy->interface_modes & BIT(NL80211_IFTYPE_STATION)) {
>  		result = ieee80211_if_add(local, "wlan%d", NULL,
> @@ -967,13 +966,13 @@ int ieee80211_register_hw(struct ieee80211_hw *hw)
>  
>  	return 0;
>  
> -fail_wep:
> -	rate_control_deinitialize(local);
>  fail_rate:
>  	unregister_netdevice(local->mdev);
>  	local->mdev = NULL;
>  fail_dev:
>  	rtnl_unlock();
> +	ieee80211_wep_free(local);
> +fail_wep:
>  	sta_info_stop(local);
>  fail_sta_info:
>  	debugfs_hw_del(local);

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] [RFC] EEE PC hangs when booting off battery
  2009-04-29 11:14                   ` Alan Jenkins
  2009-04-29 11:20                     ` Johannes Berg
@ 2009-05-15 21:48                     ` Rafael J. Wysocki
  2009-05-16  8:39                       ` Alan Jenkins
  1 sibling, 1 reply; 9+ messages in thread
From: Rafael J. Wysocki @ 2009-05-15 21:48 UTC (permalink / raw)
  To: Alan Jenkins
  Cc: John Linville, Johannes Berg, linux-wireless@vger.kernel.org,
	Arjan van de Ven, linux-kernel, Kernel Testers List

On Wednesday 29 April 2009, Alan Jenkins wrote:
> Alan Jenkins wrote:
> > Johannes Berg wrote:
> >   
> >> That doesn't seem relevant, this just does some initialisation. However,
> >> you definitely missed adding a call to wep_free().
> >>   
> >>     
> >
> > Hah, I should have realized something was wrong when I noticed I was
> > removing more lines that I added. 
> >
> > The crypto init does cause the module load:
> >
> > wait_for_completion
> > call_usermodehelper_exec
> > __request_module
> > crypto_larval_lookup
> > ? extract_entropy
> > crypto_alg_mod_lookup
> > crypto_alloc_base
> > ieee80211_wep_init
> > ieee80211_register_hw
> >   
> 
> Here's a corrected patch complete with changelog.  If there are no other
> problems with it, can you please  apply this for 2.6.30 to keep my EeePC
> regression-free?
> 
> Thanks
> Alan
> 
> ------>
> From c5e9dc036247e70956d1a28e8850c3810385dda0 Mon Sep 17 00:00:00 2001
> From: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
> Date: Wed, 29 Apr 2009 11:41:24 +0100
> Subject: [PATCH] mac80211: fix modprobe deadlock by not calling wep_init under rtnl_lock
> 
>  - ieee80211_wep_init(), which is called with rtnl_lock held, blocks in
>    request_module() [waiting for modprobe to load a crypto module].
> 
>  - modprobe blocks in a call to flush_workqueue(), when it closes a TTY
>    [presumably when it exits].
> 
>  - The workqueue item linkwatch_event() blocks on rtnl_lock.
> 
> There's no reason for wep_init() to be called with rtnl_lock held, so
> just move it outside the critical section.

Has it been merged already or is it queued up for merging?

Rafael


> Signed-off-by: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
> ---
>  net/mac80211/main.c |   19 +++++++++----------
>  1 files changed, 9 insertions(+), 10 deletions(-)
> 
> diff --git a/net/mac80211/main.c b/net/mac80211/main.c
> index fbcbed6..00968c2 100644
> --- a/net/mac80211/main.c
> +++ b/net/mac80211/main.c
> @@ -909,6 +909,13 @@ int ieee80211_register_hw(struct ieee80211_hw *hw)
>  	if (result < 0)
>  		goto fail_sta_info;
>  
> +	result = ieee80211_wep_init(local);
> +	if (result < 0) {
> +		printk(KERN_DEBUG "%s: Failed to initialize wep: %d\n",
> +		       wiphy_name(local->hw.wiphy), result);
> +		goto fail_wep;
> +	}
> +
>  	rtnl_lock();
>  	result = dev_alloc_name(local->mdev, local->mdev->name);
>  	if (result < 0)
> @@ -930,14 +937,6 @@ int ieee80211_register_hw(struct ieee80211_hw *hw)
>  		goto fail_rate;
>  	}
>  
> -	result = ieee80211_wep_init(local);
> -
> -	if (result < 0) {
> -		printk(KERN_DEBUG "%s: Failed to initialize wep: %d\n",
> -		       wiphy_name(local->hw.wiphy), result);
> -		goto fail_wep;
> -	}
> -
>  	/* add one default STA interface if supported */
>  	if (local->hw.wiphy->interface_modes & BIT(NL80211_IFTYPE_STATION)) {
>  		result = ieee80211_if_add(local, "wlan%d", NULL,
> @@ -967,13 +966,13 @@ int ieee80211_register_hw(struct ieee80211_hw *hw)
>  
>  	return 0;
>  
> -fail_wep:
> -	rate_control_deinitialize(local);
>  fail_rate:
>  	unregister_netdevice(local->mdev);
>  	local->mdev = NULL;
>  fail_dev:
>  	rtnl_unlock();
> +	ieee80211_wep_free(local);
> +fail_wep:
>  	sta_info_stop(local);
>  fail_sta_info:
>  	debugfs_hw_del(local);


-- 
Everyone knows that debugging is twice as hard as writing a program
in the first place.  So if you're as clever as you can be when you write it,
how will you ever debug it? --- Brian Kernighan

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] [RFC] EEE PC hangs when booting off battery
  2009-05-15 21:48                     ` Rafael J. Wysocki
@ 2009-05-16  8:39                       ` Alan Jenkins
  0 siblings, 0 replies; 9+ messages in thread
From: Alan Jenkins @ 2009-05-16  8:39 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: John Linville, Johannes Berg, linux-wireless@vger.kernel.org,
	Arjan van de Ven, linux-kernel, Kernel Testers List

Rafael J. Wysocki wrote:
> On Wednesday 29 April 2009, Alan Jenkins wrote:
>   
>> Alan Jenkins wrote:
>>     
>>> Johannes Berg wrote:
>>>   
>>>       
>>>> That doesn't seem relevant, this just does some initialisation. However,
>>>> you definitely missed adding a call to wep_free().
>>>>   
>>>>     
>>>>         
>>> Hah, I should have realized something was wrong when I noticed I was
>>> removing more lines that I added. 
>>>
>>> The crypto init does cause the module load:
>>>
>>> wait_for_completion
>>> call_usermodehelper_exec
>>> __request_module
>>> crypto_larval_lookup
>>> ? extract_entropy
>>> crypto_alg_mod_lookup
>>> crypto_alloc_base
>>> ieee80211_wep_init
>>> ieee80211_register_hw
>>>   
>>>       
>> Here's a corrected patch complete with changelog.  If there are no other
>> problems with it, can you please  apply this for 2.6.30 to keep my EeePC
>> regression-free?
>>
>> Thanks
>> Alan
>>
>> ------>
>> From c5e9dc036247e70956d1a28e8850c3810385dda0 Mon Sep 17 00:00:00 2001
>> From: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
>> Date: Wed, 29 Apr 2009 11:41:24 +0100
>> Subject: [PATCH] mac80211: fix modprobe deadlock by not calling wep_init under rtnl_lock
>>
>>  - ieee80211_wep_init(), which is called with rtnl_lock held, blocks in
>>    request_module() [waiting for modprobe to load a crypto module].
>>
>>  - modprobe blocks in a call to flush_workqueue(), when it closes a TTY
>>    [presumably when it exits].
>>
>>  - The workqueue item linkwatch_event() blocks on rtnl_lock.
>>
>> There's no reason for wep_init() to be called with rtnl_lock held, so
>> just move it outside the critical section.
>>     
>
> Has it been merged already or is it queued up for merging?
>
> Rafael
>   

It looks like it's been merged as v2.6.30-rc6~2^2~40^2~1.

Thanks for checking up
Alan

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2009-05-16  8:39 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <49E065CF.6040408@tuffmail.co.uk>
     [not found] ` <200904140859.02188.bjorn.helgaas@hp.com>
     [not found]   ` <20090414081728.10de978a@infradead.org>
     [not found]     ` <200904140948.37633.bjorn.helgaas@hp.com>
     [not found]       ` <49E5F01B.2060201@tuffmail.co.uk>
     [not found]         ` <49EF0ABD.2080801@tuffmail.co.uk>
2009-04-26 11:34           ` EEE PC hangs when booting off battery Alan Jenkins
2009-04-28  9:19             ` [PATCH] [RFC] " Alan Jenkins
2009-04-28  9:58               ` Johannes Berg
2009-04-28 10:27                 ` Alan Jenkins
2009-04-28 10:35                   ` Johannes Berg
2009-04-29 11:14                   ` Alan Jenkins
2009-04-29 11:20                     ` Johannes Berg
2009-05-15 21:48                     ` Rafael J. Wysocki
2009-05-16  8:39                       ` Alan Jenkins

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).