public inbox for linux-kernel@vger.kernel.org
* SwSusp to disk doesn't work - Try 2
@ 2007-03-11 18:08 Thomas Meyer
  2007-03-11 18:26 ` Rafael J. Wysocki
  0 siblings, 1 reply; 19+ messages in thread
From: Thomas Meyer @ 2007-03-11 18:08 UTC (permalink / raw)
  To: linux-kernel

Suspend to disk doesn't work on my laptop.

The suspend seems to hang while enabling the non-boot CPUs again.

With platform = "test" and state = "disk" I get this:
"
[cut]
acpi device:02: freeze
video video:00: freeze
acpi device:01: freeze
acpi PNP0C02:00: freeze
pci_root PNP0A08:00: freeze
button PNP0C0E:00: freeze
button PNP0C0C:00: freeze
acpi APP0002:00: freeze
button PNP0C0D:00: freeze
ac ACPI0003:00: freeze
acpi device:00: freeze
processor ACPI0007:01: freeze
processor ACPI0007:00: freeze
button button_power:00: freeze
acpi acpi_system:00: freeze
Disabling non-boot CPUs ...
CPU 1 is now offline
SMP alternatives: switching to UP code
PM: Removing info for No Bus:cpu1
PM: Removing info for No Bus:msr1
CPU1 is down
swsusp debug: Waiting for 5 seconds.
Enabling non-boot CPUs ...

----> Here the process hangs. I found by accident that an ACPI event lets the
process continue (pressing the power button a few times, 2x to 4x).

SMP alternatives: switching to SMP code
Booting processor 1/1 eip 3000
CPU 1 irqstacks, hard=c0389000 soft=c0387000
Initializing CPU#1
Calibrating delay using timer specific routine.. 3663.73 BogoMIPS 
(lpj=6103576)
CPU: After generic identify, caps: bfe9fbff 00100000 00000000 00000000 
0000c1a9 00000000 00000000
monitor/mwait feature present.
CPU: L1 I cache: 32K, L1 D cache: 32K
CPU: L2 cache: 2048K
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 1
CPU: After all inits, caps: bfe9fbff 00100000 00000000 00002940 0000c1a9 
00000000 00000000
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#1.
CPU1: Intel Genuine Intel(R) CPU           T2400  @ 1.83GHz stepping 08
PM: Adding info for No Bus:cpu1
PM: Adding info for No Bus:msr1
CPU1 is up
acpi acpi_system:00: resuming
button button_power:00: resuming
processor ACPI0007:00: resuming
processor ACPI0007:01: resuming
acpi device:00: resuming
ac ACPI0003:00: resuming
button PNP0C0D:00: resuming
acpi APP0002:00: resuming
button PNP0C0C:00: resuming
button PNP0C0E:00: resuming
pci_root PNP0A08:00: resuming
"

Any ideas?

The same is true for disk = "platform".

With kind regards
thomas





* Re: SwSusp to disk doesn't work - Try 2
  2007-03-11 18:08 SwSusp to disk doesn't work - Try 2 Thomas Meyer
@ 2007-03-11 18:26 ` Rafael J. Wysocki
  2007-03-11 18:37   ` Thomas Meyer
  2007-03-11 19:04   ` Milan Broz
  0 siblings, 2 replies; 19+ messages in thread
From: Rafael J. Wysocki @ 2007-03-11 18:26 UTC (permalink / raw)
  To: Thomas Meyer; +Cc: linux-kernel, Pavel Machek

On Sunday, 11 March 2007 19:08, Thomas Meyer wrote:
> Suspend to disk doesn't work on my laptop.
> 
> The suspend seems to hang while enabling the non-boot cpus again.
> 
> with platform = "test" and state = "disk" i get this:
> "
> [cut]
> acpi device:02: freeze
> video video:00: freeze
> acpi device:01: freeze
> acpi PNP0C02:00: freeze
> pci_root PNP0A08:00: freeze
> button PNP0C0E:00: freeze
> button PNP0C0C:00: freeze
> acpi APP0002:00: freeze
> button PNP0C0D:00: freeze
> ac ACPI0003:00: freeze
> acpi device:00: freeze
> processor ACPI0007:01: freeze
> processor ACPI0007:00: freeze
> button button_power:00: freeze
> acpi acpi_system:00: freeze
> Disabling non-boot CPUs ...
> CPU 1 is now offline
> SMP alternatives: switching to UP code
> PM: Removing info for No Bus:cpu1
> PM: Removing info for No Bus:msr1
> CPU1 is down
> swsusp debug: Waiting for 5 seconds.
> Enabling non-boot CPUs ...
> 
> ----> Here the process hangs. But a fortunate coincidence showed me that 
> an acpi event continues the process (pressing the power off button a few 
> times... (2x - 4x)  ).

Hm, interesting.

> SMP alternatives: switching to SMP code
> Booting processor 1/1 eip 3000
> CPU 1 irqstacks, hard=c0389000 soft=c0387000
> Initializing CPU#1
> Calibrating delay using timer specific routine.. 3663.73 BogoMIPS 
> (lpj=6103576)
> CPU: After generic identify, caps: bfe9fbff 00100000 00000000 00000000 
> 0000c1a9 00000000 00000000
> monitor/mwait feature present.
> CPU: L1 I cache: 32K, L1 D cache: 32K
> CPU: L2 cache: 2048K
> CPU: Physical Processor ID: 0
> CPU: Processor Core ID: 1
> CPU: After all inits, caps: bfe9fbff 00100000 00000000 00002940 0000c1a9 
> 00000000 00000000
> Intel machine check architecture supported.
> Intel machine check reporting enabled on CPU#1.
> CPU1: Intel Genuine Intel(R) CPU           T2400  @ 1.83GHz stepping 08
> PM: Adding info for No Bus:cpu1
> PM: Adding info for No Bus:msr1
> CPU1 is up
> acpi acpi_system:00: resuming
> button button_power:00: resuming
> processor ACPI0007:00: resuming
> processor ACPI0007:01: resuming
> acpi device:00: resuming
> ac ACPI0003:00: resuming
> button PNP0C0D:00: resuming
> acpi APP0002:00: resuming
> button PNP0C0C:00: resuming
> button PNP0C0E:00: resuming
> pci_root PNP0A08:00: resuming
> "
> 
> Any ideas?

Could you please put some printk()s in kernel/cpu.c:_cpu_up() to see where
it gets stuck?  I bet one of the notifiers goes to sleep (cpufreq, maybe).

Greetings,
Rafael
-- 
If you don't have the time to read,
you don't have the time or the tools to write.
		- Stephen King


* Re: SwSusp to disk doesn't work - Try 2
  2007-03-11 18:26 ` Rafael J. Wysocki
@ 2007-03-11 18:37   ` Thomas Meyer
  2007-03-11 19:27     ` Rafael J. Wysocki
  2007-03-11 19:04   ` Milan Broz
  1 sibling, 1 reply; 19+ messages in thread
From: Thomas Meyer @ 2007-03-11 18:37 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: linux-kernel, Pavel Machek

Rafael J. Wysocki wrote:
>
> Could you please put some printk()s in kernel/cpu.c:_cpu_up() to see where
> it gets stuck?  I bet one of the notifiers goes to sleep (cpufreq, maybe).
>   
Here we go (OK, I forgot __FUNCTION__ ...):

Mar 11 19:31:33 [kernel] ac ACPI0003:00: freeze
Mar 11 19:31:33 [kernel] acpi device:00: freeze
Mar 11 19:31:33 [kernel] processor ACPI0007:01: freeze
Mar 11 19:31:33 [kernel] processor ACPI0007:00: freeze
Mar 11 19:31:33 [kernel] button button_power:00: freeze
Mar 11 19:31:33 [kernel] acpi acpi_system:00: freeze
Mar 11 19:31:33 [kernel] Disabling non-boot CPUs ...
Mar 11 19:31:33 [kernel] kvm: disabling virtualization on CPU1
Mar 11 19:31:33 [kernel] CPU 1 is now offline
Mar 11 19:31:33 [kernel] SMP alternatives: switching to UP code
Mar 11 19:31:33 [kernel] PM: Removing info for No Bus:cpu1
Mar 11 19:31:33 [kernel] PM: Removing info for No Bus:msr1
Mar 11 19:31:33 [kernel] CPU1 is down
Mar 11 19:31:33 [kernel] swsusp debug: Waiting for 5 seconds.
Mar 11 19:31:33 [kernel] Enabling non-boot CPUs ...
Mar 11 19:31:33 [kernel] <NULL>: before notifier CPU_UP_PREPARE.

Hung here.

Mar 11 19:31:33 [kernel] <NULL>: after notifier CPU_UP_PREPARE.
Mar 11 19:31:33 [kernel] SMP alternatives: switching to SMP code
Mar 11 19:31:33 [kernel] Booting processor 1/1 eip 3000
Mar 11 19:31:33 [kernel] CPU 1 irqstacks, hard=c0388000 soft=c0386000
Mar 11 19:31:33 [kernel] Initializing CPU#1
Mar 11 19:31:33 [kernel] Calibrating delay using timer specific 
routine.. 3663.72 BogoMIPS (lpj=6103555)
Mar 11 19:31:33 [kernel] CPU: After generic identify, caps: bfe9fbff 
00100000 00000000 00000000 0000c1a9 00000000 00000000
Mar 11 19:31:33 [kernel] monitor/mwait feature present.
Mar 11 19:31:33 [kernel] CPU: L1 I cache: 32K, L1 D cache: 32K
Mar 11 19:31:33 [kernel] CPU: L2 cache: 2048K
Mar 11 19:31:33 [kernel] CPU: Physical Processor ID: 0
Mar 11 19:31:33 [kernel] CPU: Processor Core ID: 1
Mar 11 19:31:33 [kernel] CPU: After all inits, caps: bfe9fbff 00100000 
00000000 00002940 0000c1a9 00000000 00000000
Mar 11 19:31:33 [kernel] CPU1: Intel Genuine Intel(R) CPU           
T2400  @ 1.83GHz stepping 08
Mar 11 19:31:33 [kernel] <NULL>: after __cpu_up
Mar 11 19:31:33 [kernel] <NULL>: before notifier CPU_ONLINE.
Mar 11 19:31:33 [kernel] kvm: enabling virtualization on CPU1
Mar 11 19:31:33 [kernel] Switched to high resolution mode on CPU 1
Mar 11 19:31:33 [kernel] PM: Adding info for No Bus:cpu1
Mar 11 19:31:33 [kernel] PM: Adding info for No Bus:msr1
Mar 11 19:31:33 [kernel] <NULL>: after notifier CPU_ONLINE.
Mar 11 19:31:33 [kernel] CPU1 is up
Mar 11 19:31:33 [kernel] acpi acpi_system:00: resuming
Mar 11 19:31:33 [kernel] button button_power:00: resuming
Mar 11 19:31:33 [kernel] processor ACPI0007:00: resuming
Mar 11 19:31:33 [kernel] processor ACPI0007:01: resuming



* Re: SwSusp to disk doesn't work - Try 2
  2007-03-11 18:26 ` Rafael J. Wysocki
  2007-03-11 18:37   ` Thomas Meyer
@ 2007-03-11 19:04   ` Milan Broz
  2007-03-11 19:16     ` Thomas Meyer
  2007-03-11 19:38     ` Rafael J. Wysocki
  1 sibling, 2 replies; 19+ messages in thread
From: Milan Broz @ 2007-03-11 19:04 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Thomas Meyer, linux-kernel, Pavel Machek

Rafael J. Wysocki wrote:
> On Sunday, 11 March 2007 19:08, Thomas Meyer wrote:
>> Suspend to disk doesn't work on my laptop.
>>
>> The suspend seems to hang while enabling the non-boot cpus again.
>>
>> with platform = "test" and state = "disk" i get this:

>> Enabling non-boot CPUs ...
...
> 
> Could you please put some printk()s in kernel/cpu.c:_cpu_up() to see where
> it gets stuck?  I bet one of the notifiers goes to sleep (cpufreq, maybe).

Hi,
I see the same problem with 2.6.21-rc3 and NO_HZ set (tickless).
Short printk trace:

enable_nonboot_cpus
 _cpu_up
  raw_notifier_callchain (CPU_UP_PREPARE)
    ...
    update_sched_domains
     detach_destroy_domains
       [waits here] --> synchronize_sched (==synchronize_rcu)


Milan




* Re: SwSusp to disk doesn't work - Try 2
  2007-03-11 19:04   ` Milan Broz
@ 2007-03-11 19:16     ` Thomas Meyer
  2007-03-11 19:50       ` Rafael J. Wysocki
  2007-03-11 19:38     ` Rafael J. Wysocki
  1 sibling, 1 reply; 19+ messages in thread
From: Thomas Meyer @ 2007-03-11 19:16 UTC (permalink / raw)
  To: Milan Broz; +Cc: Rafael J. Wysocki, linux-kernel, Pavel Machek

Milan Broz wrote:
> Rafael J. Wysocki wrote:
>   
>> On Sunday, 11 March 2007 19:08, Thomas Meyer wrote:
>>     
>>> Suspend to disk doesn't work on my laptop.
>>>
>>> The suspend seems to hang while enabling the non-boot cpus again.
>>>
>>> with platform = "test" and state = "disk" i get this:
>>>       
>
>   
> Hi,
> I see the same problem - 2.6.21-rc3 with NO_HZ set (tickless), 
> short printk trace 
>
> enable_nonboot_cpus
>  _cpu_up
>   raw_notifier_callchain (CPU_UP_PREPARE)
>     ...
>     update_sched_domains
>      detach_destroy_domains
>        [waits here] --> synchronize_sched (==synchronize_rcu)
>
>   

Maybe this helps:

Mar 11 20:03:56 [kernel] PM: Removing info for No Bus:msr1
Mar 11 20:03:56 [kernel] CPU1 is down
Mar 11 20:03:56 [kernel] swsusp debug: Waiting for 5 seconds.
Mar 11 20:03:56 [kernel] Enabling non-boot CPUs ...
Mar 11 20:03:56 [kernel] _cpu_up: before notifier CPU_UP_PREPARE.
Mar 11 20:03:56 [kernel] migration_call: Hi!
Mar 11 20:03:56 [kernel] rcu_cpu_notify: Hi!
Mar 11 20:03:56 [kernel] timer_cpu_notify: Hi!
Mar 11 20:03:56 [kernel] hrtimer_cpu_notify: Hi!
Mar 11 20:03:56 [kernel] cpu_callback: Hi!
Mar 11 20:03:56 [kernel] workqueue_cpu_callback: Hi!
Mar 11 20:03:56 [kernel] topology_cpu_callback: Hi!

Guess what? Hang!

Mar 11 20:03:56 [kernel] _cpu_up: after notifier CPU_UP_PREPARE.
Mar 11 20:03:56 [kernel] SMP alternatives: switching to SMP code
Mar 11 20:03:56 [kernel] Booting processor 1/1 eip 3000
Mar 11 20:03:56 [kernel] CPU 1 irqstacks, hard=c0388000 soft=c0386000
Mar 11 20:03:56 [kernel] Initializing CPU#1


I use NO_HZ, too. Will try without it.

With kind regards
thomas




* Re: SwSusp to disk doesn't work - Try 2
  2007-03-11 18:37   ` Thomas Meyer
@ 2007-03-11 19:27     ` Rafael J. Wysocki
  0 siblings, 0 replies; 19+ messages in thread
From: Rafael J. Wysocki @ 2007-03-11 19:27 UTC (permalink / raw)
  To: Thomas Meyer; +Cc: linux-kernel, Pavel Machek

On Sunday, 11 March 2007 19:37, Thomas Meyer wrote:
> Rafael J. Wysocki wrote:
> >
> > Could you please put some printk()s in kernel/cpu.c:_cpu_up() to see where
> > it gets stuck?  I bet one of the notifiers goes to sleep (cpufreq, maybe).
> >   
> Here we go (ok. i forgot __FUNCTION__ ...):
> 
> Mar 11 19:31:33 [kernel] ac ACPI0003:00: freeze
> Mar 11 19:31:33 [kernel] acpi device:00: freeze
> Mar 11 19:31:33 [kernel] processor ACPI0007:01: freeze
> Mar 11 19:31:33 [kernel] processor ACPI0007:00: freeze
> Mar 11 19:31:33 [kernel] button button_power:00: freeze
> Mar 11 19:31:33 [kernel] acpi acpi_system:00: freeze
> Mar 11 19:31:33 [kernel] Disabling non-boot CPUs ...
> Mar 11 19:31:33 [kernel] kvm: disabling virtualization on CPU1
> Mar 11 19:31:33 [kernel] CPU 1 is now offline
> Mar 11 19:31:33 [kernel] SMP alternatives: switching to UP code
> Mar 11 19:31:33 [kernel] PM: Removing info for No Bus:cpu1
> Mar 11 19:31:33 [kernel] PM: Removing info for No Bus:msr1
> Mar 11 19:31:33 [kernel] CPU1 is down
> Mar 11 19:31:33 [kernel] swsusp debug: Waiting for 5 seconds.
> Mar 11 19:31:33 [kernel] Enabling non-boot CPUs ...
> Mar 11 19:31:33 [kernel] <NULL>: before notifier CPU_UP_PREPARE.
> 
> Hung here.

This means that one of the notifiers had not returned before you pressed
the button.  Now the question is which one (there are many).

I don't know if there's a nicer way to find that out, but I usually hack
kernel/sys.c:notifier_call_chain() to print nb->notifier_call (as a pointer)
before the call is made.  Then I write down the address of the last one
called before the hang/oops and use gdb to check which function it points to.
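
The tracing idea above can be illustrated outside the kernel. What follows is
a minimal userspace sketch, not the actual kernel/sys.c code: the struct is a
simplified stand-in for the kernel's struct notifier_block, and
traced_call_chain(), last_called, first_cb(), second_cb(), demo_trace() and
EVENT_DEMO are hypothetical names chosen for illustration. The point is only
that the callback address is printed before the call, so the last printed
address identifies the callback that never returned.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Simplified userspace stand-in for the kernel's struct notifier_block. */
struct notifier_block {
    int (*notifier_call)(struct notifier_block *nb,
                         unsigned long event, void *data);
    struct notifier_block *next;
};

typedef int (*notifier_fn)(struct notifier_block *, unsigned long, void *);

/* Arbitrary event number used only by this sketch. */
#define EVENT_DEMO 1UL

/* Callback most recently entered; after a hang this is the suspect. */
static notifier_fn last_called;

/* Walk the chain, logging each callback address before invoking it. */
static int traced_call_chain(struct notifier_block *head,
                             unsigned long event, void *data)
{
    int calls = 0;

    for (struct notifier_block *nb = head; nb != NULL; nb = nb->next) {
        assert(nb->notifier_call != NULL);
        last_called = nb->notifier_call;
        /* Log first, so the line is emitted even if the callback hangs. */
        fprintf(stderr, "notifier: calling %p for event %lu\n",
                (void *)(uintptr_t)nb->notifier_call, event);
        nb->notifier_call(nb, event, data);
        calls++;
    }
    return calls;
}

/* Hypothetical callbacks standing in for real CPU notifiers. */
static int first_cb(struct notifier_block *nb, unsigned long event, void *data)
{
    (void)nb; (void)event; (void)data;
    return 0;
}

static int second_cb(struct notifier_block *nb, unsigned long event, void *data)
{
    (void)nb; (void)event; (void)data;
    return 0;
}

/* Build a two-entry chain and run it; returns the number of callbacks run. */
static int demo_trace(void)
{
    struct notifier_block second = { second_cb, NULL };
    struct notifier_block first = { first_cb, &second };

    return traced_call_chain(&first, EVENT_DEMO, NULL);
}
```

In the kernel the equivalent would be a printk() of nb->notifier_call inside
the chain walker; the printed address is then resolved to a function name
with gdb, as described above.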

Greetings,
Rafael


* Re: SwSusp to disk doesn't work - Try 2
  2007-03-11 19:04   ` Milan Broz
  2007-03-11 19:16     ` Thomas Meyer
@ 2007-03-11 19:38     ` Rafael J. Wysocki
  2007-03-11 20:23       ` Milan Broz
  1 sibling, 1 reply; 19+ messages in thread
From: Rafael J. Wysocki @ 2007-03-11 19:38 UTC (permalink / raw)
  To: Milan Broz; +Cc: Thomas Meyer, linux-kernel, Pavel Machek, Thomas Gleixner

On Sunday, 11 March 2007 20:04, Milan Broz wrote:
> Rafael J. Wysocki wrote:
> > On Sunday, 11 March 2007 19:08, Thomas Meyer wrote:
> >> Suspend to disk doesn't work on my laptop.
> >>
> >> The suspend seems to hang while enabling the non-boot cpus again.
> >>
> >> with platform = "test" and state = "disk" i get this:
> 
> >> Enabling non-boot CPUs ...
> ...
> > 
> > Could you please put some printk()s in kernel/cpu.c:_cpu_up() to see where
> > it gets stuck?  I bet one of the notifiers goes to sleep (cpufreq, maybe).
> 
> Hi,
> I see the same problem - 2.6.21-rc3 with NO_HZ set (tickless), 

Ah, NO_HZ.  Thomas Gleixner's address added to the Cc list.

> short printk trace 
> 
> enable_nonboot_cpus
>  _cpu_up
>   raw_notifier_callchain (CPU_UP_PREPARE)
>     ...
>     update_sched_domains
>      detach_destroy_domains
>        [waits here] --> synchronize_sched (==synchronize_rcu)

Well, I think the call to wait_for_completion() does not return, probably
because the task supposed to complete the completion is frozen at this
point.  Can you please try to confirm that it gets stuck on
wait_for_completion() in synchronize_rcu()?
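
To make the hypothesis above concrete, here is a small single-threaded
userspace model, offered as a sketch under stated assumptions rather than as
kernel code. struct completion and complete() here are simplified
re-implementations for illustration only, and wait_bounded(), frozen_task(),
running_task() and demo_wait() are hypothetical names. It shows that a waiter
only gets through if the other side actually runs and signals the completion;
a bounded wait turns a silent hang into a reportable timeout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Simplified userspace model of a completion: a counter of signals. */
struct completion {
    unsigned int done;
};

/* Signal the completion once. */
static void complete(struct completion *x)
{
    x->done++;
}

/*
 * Poll at most max_polls times. run_others() stands in for "let the
 * other task run". Returns true if the completion was signalled,
 * false if the poll budget ran out (the model's version of a hang).
 */
static bool wait_bounded(struct completion *x, unsigned int max_polls,
                         void (*run_others)(struct completion *))
{
    assert(x != NULL && run_others != NULL);

    for (unsigned int i = 0; i < max_polls; i++) {
        run_others(x);
        if (x->done > 0) {
            x->done--;
            return true;
        }
    }
    return false;
}

/* Models a frozen task: it gets "scheduled" but never signals. */
static void frozen_task(struct completion *x)
{
    (void)x;
}

/* Models a healthy task: it signals the completion when it runs. */
static void running_task(struct completion *x)
{
    complete(x);
}

/* Returns a bitmask: bit 0 = frozen case completed, bit 1 = running case. */
static int demo_wait(void)
{
    struct completion c = { 0 };
    bool frozen_ok = wait_bounded(&c, 5, frozen_task);
    bool running_ok = wait_bounded(&c, 5, running_task);

    printf("frozen completer: %s\n", frozen_ok ? "completed" : "timed out");
    printf("running completer: %s\n", running_ok ? "completed" : "timed out");
    return (frozen_ok ? 1 : 0) | (running_ok ? 2 : 0);
}
```

With the frozen completer the wait runs out of polls, while the running one
completes on the first poll. An unbounded wait_for_completion() in the first
situation would simply never return.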

Rafael


* Re: SwSusp to disk doesn't work - Try 2
  2007-03-11 19:16     ` Thomas Meyer
@ 2007-03-11 19:50       ` Rafael J. Wysocki
  2007-03-11 20:04         ` Thomas Meyer
  0 siblings, 1 reply; 19+ messages in thread
From: Rafael J. Wysocki @ 2007-03-11 19:50 UTC (permalink / raw)
  To: Thomas Meyer; +Cc: Milan Broz, linux-kernel, Pavel Machek

On Sunday, 11 March 2007 20:16, Thomas Meyer wrote:
> Milan Broz wrote:
> > Rafael J. Wysocki wrote:
> >   
> >> On Sunday, 11 March 2007 19:08, Thomas Meyer wrote:
> >>     
> >>> Suspend to disk doesn't work on my laptop.
> >>>
> >>> The suspend seems to hang while enabling the non-boot cpus again.
> >>>
> >>> with platform = "test" and state = "disk" i get this:
> >>>       
> >
> >   
> > Hi,
> > I see the same problem - 2.6.21-rc3 with NO_HZ set (tickless), 
> > short printk trace 
> >
> > enable_nonboot_cpus
> >  _cpu_up
> >   raw_notifier_callchain (CPU_UP_PREPARE)
> >     ...
> >     update_sched_domains
> >      detach_destroy_domains
> >        [waits here] --> synchronize_sched (==synchronize_rcu)
> >
> >   
> 
> Maybe this helps:
> 
> Mar 11 20:03:56 [kernel] PM: Removing info for No Bus:msr1
> Mar 11 20:03:56 [kernel] CPU1 is down
> Mar 11 20:03:56 [kernel] swsusp debug: Waiting for 5 seconds.
> Mar 11 20:03:56 [kernel] Enabling non-boot CPUs ...
> Mar 11 20:03:56 [kernel] _cpu_up: before notifier CPU_UP_PREPARE.
> Mar 11 20:03:56 [kernel] migration_call: Hi!
> Mar 11 20:03:56 [kernel] rcu_cpu_notify: Hi!
> Mar 11 20:03:56 [kernel] timer_cpu_notify: Hi!
> Mar 11 20:03:56 [kernel] hrtimer_cpu_notify: Hi!
> Mar 11 20:03:56 [kernel] cpu_callback: Hi!
> Mar 11 20:03:56 [kernel] workqueue_cpu_callback: Hi!
> Mar 11 20:03:56 [kernel] topology_cpu_callback: Hi!
> 
> Guess what? Hang!

Hm, you may be hitting the problem described in this thread:
http://lkml.org/lkml/2007/3/6/364 .

Can you please apply the patch at http://lkml.org/lkml/2007/3/7/255
and see if that helps?

Rafael


* Re: SwSusp to disk doesn't work - Try 2
  2007-03-11 19:50       ` Rafael J. Wysocki
@ 2007-03-11 20:04         ` Thomas Meyer
  0 siblings, 0 replies; 19+ messages in thread
From: Thomas Meyer @ 2007-03-11 20:04 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Milan Broz, linux-kernel, Pavel Machek

Sorry, the system hangs here:

Mar 11 20:55:46 [kernel] CPU 1 is now offline
Mar 11 20:55:46 [kernel] SMP alternatives: switching to UP code
Mar 11 20:55:46 [kernel] PM: Removing info for No Bus:cpu1
Mar 11 20:55:46 [kernel] PM: Removing info for No Bus:msr1
Mar 11 20:55:46 [kernel] CPU1 is down
Mar 11 20:55:46 [kernel] swsusp debug: Waiting for 5 seconds.
Mar 11 20:55:46 [kernel] Enabling non-boot CPUs ...
Mar 11 20:55:46 [kernel] _cpu_up: before notifier CPU_UP_PREPARE.
Mar 11 20:55:46 [kernel] migration_call: Hi!
Mar 11 20:55:46 [kernel] rcu_cpu_notify: Hi!
Mar 11 20:55:46 [kernel] timer_cpu_notify: Hi!
Mar 11 20:55:46 [kernel] hrtimer_cpu_notify: Hi!
Mar 11 20:55:46 [kernel] cpu_callback: Hi!
Mar 11 20:55:46 [kernel] synchronize_rcu: Befor wait

---> System hangs here. After the first "Before wait" message.

                - Last output repeated twice -
Mar 11 20:55:46 [kernel] synchronize_rcu: After wait
Mar 11 20:55:46 [kernel] workqueue_cpu_callback: Hi!
Mar 11 20:55:46 [kernel] topology_cpu_callback: Hi!
Mar 11 20:55:46 [kernel] _cpu_up: after notifier CPU_UP_PREPARE.
Mar 11 20:55:46 [kernel] SMP alternatives: switching to SMP code
Mar 11 20:55:46 [kernel] Booting processor 1/1 eip 3000
Mar 11 20:55:46 [kernel] CPU 1 irqstacks, hard=c0388000 soft=c0386000
Mar 11 20:55:46 [kernel] Initializing CPU#1



* Re: SwSusp to disk doesn't work - Try 2
  2007-03-11 19:38     ` Rafael J. Wysocki
@ 2007-03-11 20:23       ` Milan Broz
  2007-03-11 20:32         ` Rafael J. Wysocki
  0 siblings, 1 reply; 19+ messages in thread
From: Milan Broz @ 2007-03-11 20:23 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Thomas Meyer, linux-kernel, Pavel Machek, Thomas Gleixner

Rafael J. Wysocki:
> Ah, NO_HZ.  Thomas Gleixner's address added to the Cc list.
> 
>> short printk trace 
>>
>> enable_nonboot_cpus
>>  _cpu_up
>>   raw_notifier_callchain (CPU_UP_PREPARE)
>>     ...
>>     update_sched_domains
>>      detach_destroy_domains
>>        [waits here] --> synchronize_sched (==synchronize_rcu)
> 
> Well, I think the call to wait_for_completion() does not return, probably
> because the task supposed to complete the completion is frozen at this
> point.  Can you please try to confirm that it gets stuck on
> wait_for_completion() in synchronize_rcu()?

Yes, it's in wait_for_completion() in synchronize_rcu().
As noted in an earlier mail, it wakes up after an
event (key press etc.).

The patch in http://lkml.org/lkml/2007/3/7/255 solves a different problem.
I added it to my quilt series and applied it anyway: no change.

Milan



* Re: SwSusp to disk doesn't work - Try 2
  2007-03-11 20:32         ` Rafael J. Wysocki
@ 2007-03-11 20:28           ` Thomas Meyer
  2007-03-11 20:45             ` Rafael J. Wysocki
  2007-03-11 20:57             ` Milan Broz
  0 siblings, 2 replies; 19+ messages in thread
From: Thomas Meyer @ 2007-03-11 20:28 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Milan Broz, linux-kernel, Pavel Machek, Thomas Gleixner

Rafael J. Wysocki wrote:
> On Sunday, 11 March 2007 21:23, Milan Broz wrote:
>   
>> Rafael J. Wysocki:
>>     
>>> Ah, NO_HZ.  Thomas Gleixner's address added to the Cc list.
>>>
>>>       
>>>> short printk trace 
>>>>
>>>> enable_nonboot_cpus
>>>>  _cpu_up
>>>>   raw_notifier_callchain (CPU_UP_PREPARE)
>>>>     ...
>>>>     update_sched_domains
>>>>      detach_destroy_domains
>>>>        [waits here] --> synchronize_sched (==synchronize_rcu)
>>>>         
>>> Well, I think the call to wait_for_completion() does not return, probably
>>> because the task supposed to complete the completion is frozen at this
>>> point.  Can you please try to confirm that it gets stuck on
>>> wait_for_completion() in synchronize_rcu()?
>>>       
>> Yes, it's in wait_for_completion() in synchronize_rcu().
>> As noted in some previous mail, it will wake up after
>> event - key press etc.
>>
>> Patch in http://lkml.org/lkml/2007/3/7/255 solves different problem.
>> I added it to my quilt and applied anyway -> no change.
>>     
>
> Does the problem go away if NO_HZ is unset?
>   

I tried booting with nohz=off, but the problem persisted.



* Re: SwSusp to disk doesn't work - Try 2
  2007-03-11 20:23       ` Milan Broz
@ 2007-03-11 20:32         ` Rafael J. Wysocki
  2007-03-11 20:28           ` Thomas Meyer
  0 siblings, 1 reply; 19+ messages in thread
From: Rafael J. Wysocki @ 2007-03-11 20:32 UTC (permalink / raw)
  To: Milan Broz; +Cc: Thomas Meyer, linux-kernel, Pavel Machek, Thomas Gleixner

On Sunday, 11 March 2007 21:23, Milan Broz wrote:
> Rafael J. Wysocki:
> > Ah, NO_HZ.  Thomas Gleixner's address added to the Cc list.
> > 
> >> short printk trace 
> >>
> >> enable_nonboot_cpus
> >>  _cpu_up
> >>   raw_notifier_callchain (CPU_UP_PREPARE)
> >>     ...
> >>     update_sched_domains
> >>      detach_destroy_domains
> >>        [waits here] --> synchronize_sched (==synchronize_rcu)
> > 
> > Well, I think the call to wait_for_completion() does not return, probably
> > because the task supposed to complete the completion is frozen at this
> > point.  Can you please try to confirm that it gets stuck on
> > wait_for_completion() in synchronize_rcu()?
> 
> Yes, it's in wait_for_completion() in synchronize_rcu().
> As noted in some previous mail, it will wake up after
> event - key press etc.
> 
> Patch in http://lkml.org/lkml/2007/3/7/255 solves different problem.
> I added it to my quilt and applied anyway -> no change.

Does the problem go away if NO_HZ is unset?

Rafael


* Re: SwSusp to disk doesn't work - Try 2
  2007-03-11 20:28           ` Thomas Meyer
@ 2007-03-11 20:45             ` Rafael J. Wysocki
  2007-03-11 20:49               ` Thomas Meyer
  2007-03-11 20:57             ` Milan Broz
  1 sibling, 1 reply; 19+ messages in thread
From: Rafael J. Wysocki @ 2007-03-11 20:45 UTC (permalink / raw)
  To: Thomas Meyer; +Cc: Milan Broz, linux-kernel, Pavel Machek, Thomas Gleixner

On Sunday, 11 March 2007 21:28, Thomas Meyer wrote:
> Rafael J. Wysocki wrote:
> > On Sunday, 11 March 2007 21:23, Milan Broz wrote:
> >   
> >> Rafael J. Wysocki:
> >>     
> >>> Ah, NO_HZ.  Thomas Gleixner's address added to the Cc list.
> >>>
> >>>       
> >>>> short printk trace 
> >>>>
> >>>> enable_nonboot_cpus
> >>>>  _cpu_up
> >>>>   raw_notifier_callchain (CPU_UP_PREPARE)
> >>>>     ...
> >>>>     update_sched_domains
> >>>>      detach_destroy_domains
> >>>>        [waits here] --> synchronize_sched (==synchronize_rcu)
> >>>>         
> >>> Well, I think the call to wait_for_completion() does not return, probably
> >>> because the task supposed to complete the completion is frozen at this
> >>> point.  Can you please try to confirm that it gets stuck on
> >>> wait_for_completion() in synchronize_rcu()?
> >>>       
> >> Yes, it's in wait_for_completion() in synchronize_rcu().
> >> As noted in some previous mail, it will wake up after
> >> event - key press etc.
> >>
> >> Patch in http://lkml.org/lkml/2007/3/7/255 solves different problem.
> >> I added it to my quilt and applied anyway -> no change.
> >>     
> >
> > Does the problem go away if NO_HZ is unset?
> >   
> 
> i tried to boot with nohz=off, but the problem did persist.

Okay, but could you please compile the kernel without NO_HZ and retest?

Rafael


* Re: SwSusp to disk doesn't work - Try 2
  2007-03-11 20:45             ` Rafael J. Wysocki
@ 2007-03-11 20:49               ` Thomas Meyer
  0 siblings, 0 replies; 19+ messages in thread
From: Thomas Meyer @ 2007-03-11 20:49 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Milan Broz, linux-kernel, Pavel Machek, Thomas Gleixner

Rafael J. Wysocki wrote:
>
> Okay, but could you please compile the kernel without NO_HZ and retest?
>
>   


Sure.
But I get the same behaviour:

Mar 11 21:42:07 [kernel] processor ACPI0007:00: freeze
Mar 11 21:42:07 [kernel] button button_power:00: freeze
Mar 11 21:42:07 [kernel] acpi acpi_system:00: freeze
Mar 11 21:42:07 [kernel] Disabling non-boot CPUs ...
Mar 11 21:42:07 [kernel] kvm: disabling virtualization on CPU1
Mar 11 21:42:07 [kernel] synchronize_rcu: Befor wait
Mar 11 21:42:07 [kernel] synchronize_rcu: After wait
Mar 11 21:42:07 [kernel] CPU 1 is now offline
Mar 11 21:42:07 [kernel] SMP alternatives: switching to UP code
Mar 11 21:42:07 [kernel] PM: Removing info for No Bus:cpu1
Mar 11 21:42:07 [kernel] PM: Removing info for No Bus:msr1
Mar 11 21:42:07 [kernel] CPU1 is down
Mar 11 21:42:07 [kernel] swsusp debug: Waiting for 5 seconds.
Mar 11 21:42:07 [kernel] Enabling non-boot CPUs ...
Mar 11 21:42:07 [kernel] _cpu_up: before notifier CPU_UP_PREPARE.
Mar 11 21:42:07 [kernel] migration_call: Hi!
Mar 11 21:42:07 [kernel] rcu_cpu_notify: Hi!
Mar 11 21:42:07 [kernel] timer_cpu_notify: Hi!
Mar 11 21:42:07 [kernel] hrtimer_cpu_notify: Hi!
Mar 11 21:42:07 [kernel] cpu_callback: Hi!
Mar 11 21:42:07 [kernel] synchronize_rcu: Befor wait


----> Hang
Why does this message appear two times?

                - Last output repeated twice -
Mar 11 21:42:07 [kernel] synchronize_rcu: After wait
Mar 11 21:42:07 [kernel] workqueue_cpu_callback: Hi!
Mar 11 21:42:07 [kernel] topology_cpu_callback: Hi!
Mar 11 21:42:07 [kernel] _cpu_up: after notifier CPU_UP_PREPARE.
Mar 11 21:42:07 [kernel] SMP alternatives: switching to SMP code
Mar 11 21:42:07 [kernel] Booting processor 1/1 eip 3000
Mar 11 21:42:07 [kernel] CPU 1 irqstacks, hard=c0387000 soft=c0385000
Mar 11 21:42:07 [kernel] Initializing CPU#1
Mar 11 21:42:07 [kernel] Calibrating delay using timer specific 
routine.. 3662.97 BogoMIPS (lpj=6102352)
Mar 11 21:42:07 [kernel] CPU: After generic identify, caps: bfe9fbff 
00100000 00000000 00000000 0000c1a9 00000000 00000000
Mar 11 21:42:07 [kernel] monitor/mwait feature present.
Mar 11 21:42:07 [kernel] CPU: L1 I cache: 32K, L1 D cache: 32K
Mar 11 21:42:07 [kernel] CPU: L2 cache: 2048K
Mar 11 21:42:07 [kernel] CPU: Physical Processor ID: 0
Mar 11 21:42:07 [kernel] CPU: Processor Core ID: 1
Mar 11 21:42:07 [kernel] CPU: After all inits, caps: bfe9fbff 00100000 
00000000 00002940 0000c1a9 00000000 00000000
Mar 11 21:42:07 [kernel] CPU1: Intel Genuine Intel(R) CPU           
T2400  @ 1.83GHz stepping 08
Mar 11 21:42:07 [kernel] synchronize_rcu: After wait
Mar 11 21:42:07 [kernel] _cpu_up: after __cpu_up
Mar 11 21:42:07 [kernel] _cpu_up: before notifier CPU_ONLINE.



* Re: SwSusp to disk doesn't work - Try 2
  2007-03-11 20:28           ` Thomas Meyer
  2007-03-11 20:45             ` Rafael J. Wysocki
@ 2007-03-11 20:57             ` Milan Broz
  2007-03-11 21:02               ` Thomas Meyer
  2007-03-11 21:09               ` Rafael J. Wysocki
  1 sibling, 2 replies; 19+ messages in thread
From: Milan Broz @ 2007-03-11 20:57 UTC (permalink / raw)
  To: Thomas Meyer
  Cc: Rafael J. Wysocki, linux-kernel, Pavel Machek, Thomas Gleixner

Thomas Meyer wrote:
> Rafael J. Wysocki wrote:
>> On Sunday, 11 March 2007 21:23, Milan Broz wrote:
>>   
>>> Rafael J. Wysocki:
>>>     
>>>> Ah, NO_HZ.  Thomas Gleixner's address added to the Cc list.
>>>>
>>>>       
>>>>> short printk trace 
>>>>>
>>>>> enable_nonboot_cpus
>>>>>  _cpu_up
>>>>>   raw_notifier_callchain (CPU_UP_PREPARE)
>>>>>     ...
>>>>>     update_sched_domains
>>>>>      detach_destroy_domains
>>>>>        [waits here] --> synchronize_sched (==synchronize_rcu)
>>>>>         
>>>> Well, I think the call to wait_for_completion() does not return, probably
>>>> because the task supposed to complete the completion is frozen at this
>>>> point.  Can you please try to confirm that it gets stuck on
>>>> wait_for_completion() in synchronize_rcu()?
>>>>       
>>> Yes, it's in wait_for_completion() in synchronize_rcu().
>>> As noted in some previous mail, it will wake up after
>>> event - key press etc.
>>>
>>> Patch in http://lkml.org/lkml/2007/3/7/255 solves different problem.
>>> I added it to my quilt and applied anyway -> no change.
>>>     
>> Does the problem go away if NO_HZ is unset?
>>   
> 
> i tried to boot with nohz=off, but the problem did persist.

Hmmmm, both variants (nohz=off or recompiled kernel without NO_HZ) work for me.

Milan




* Re: SwSusp to disk doesn't work - Try 2
  2007-03-11 20:57             ` Milan Broz
@ 2007-03-11 21:02               ` Thomas Meyer
  2007-03-11 21:09               ` Rafael J. Wysocki
  1 sibling, 0 replies; 19+ messages in thread
From: Thomas Meyer @ 2007-03-11 21:02 UTC (permalink / raw)
  To: Milan Broz; +Cc: Rafael J. Wysocki, linux-kernel, Pavel Machek, Thomas Gleixner

Milan Broz wrote:
> Thomas Meyer wrote:
>   
>> Rafael J. Wysocki wrote:
>>     
>>> On Sunday, 11 March 2007 21:23, Milan Broz wrote:
>>>   
>>>       
>>>> Rafael J. Wysocki:
>>>>     
>>>>         
>>>>> Ah, NO_HZ.  Thomas Gleixner's address added to the Cc list.
>>>>>
>>>>>       
>>>>>           
>>>>>> short printk trace 
>>>>>>
>>>>>> enable_nonboot_cpus
>>>>>>  _cpu_up
>>>>>>   raw_notifier_callchain (CPU_UP_PREPARE)
>>>>>>     ...
>>>>>>     update_sched_domains
>>>>>>      detach_destroy_domains
>>>>>>        [waits here] --> synchronize_sched (==synchronize_rcu)
>>>>>>         
>>>>>>             
>>>>> Well, I think the call to wait_for_completion() does not return, probably
>>>>> because the task supposed to complete the completion is frozen at this
>>>>> point.  Can you please try to confirm that it gets stuck on
>>>>> wait_for_completion() in synchronize_rcu()?
>>>>>       
>>>>>           
>>>> Yes, it's in wait_for_completion() in synchronize_rcu().
>>>> As noted in some previous mail, it will wake up after
>>>> event - key press etc.
>>>>
>>>> Patch in http://lkml.org/lkml/2007/3/7/255 solves different problem.
>>>> I added it to my quilt and applied anyway -> no change.
>>>>     
>>>>         
>>> Does the problem go away if NO_HZ is unset?
>>>   
>>>       
>> i tried to boot with nohz=off, but the problem did persist.
>>     
>
> Hmmmm, both variants (nohz=off or a kernel recompiled without NO_HZ) work for me.
>
> Milan
>   

I found a working config:

Without hrtimers and without nohz: suspend works!

With hrtimers and without nohz: suspend does not work.
With hrtimers and with nohz: suspend does not work.

Next I want to test: without hrtimers and with nohz.







* Re: SwSusp to disk doesn't work - Try 2
  2007-03-11 20:57             ` Milan Broz
  2007-03-11 21:02               ` Thomas Meyer
@ 2007-03-11 21:09               ` Rafael J. Wysocki
  2007-03-11 21:56                 ` Thomas Gleixner
  1 sibling, 1 reply; 19+ messages in thread
From: Rafael J. Wysocki @ 2007-03-11 21:09 UTC (permalink / raw)
  To: Milan Broz, Andrew Morton
  Cc: Thomas Meyer, linux-kernel, Pavel Machek, Thomas Gleixner

On Sunday, 11 March 2007 21:57, Milan Broz wrote:
> Thomas Meyer wrote:
> > Rafael J. Wysocki wrote:
> >> On Sunday, 11 March 2007 21:23, Milan Broz wrote:
> >>   
> >>> Rafael J. Wysocki:
> >>>     
> >>>> Ah, NO_HZ.  Thomas Gleixner's address added to the Cc list.
> >>>>
> >>>>       
> >>>>> short printk trace 
> >>>>>
> >>>>> enable_nonboot_cpus
> >>>>>  _cpu_up
> >>>>>   raw_notifier_callchain (CPU_UP_PREPARE)
> >>>>>     ...
> >>>>>     update_sched_domains
> >>>>>      detach_destroy_domains
> >>>>>        [waits here] --> synchronize_sched (==synchronize_rcu)
> >>>>>         
> >>>> Well, I think the call to wait_for_completion() does not return, probably
> >>>> because the task supposed to complete the completion is frozen at this
> >>>> point.  Can you please try to confirm that it gets stuck on
> >>>> wait_for_completion() in synchronize_rcu()?
> >>>>       
> >>> Yes, it's in wait_for_completion() in synchronize_rcu().
> >>> As noted in some previous mail, it will wake up after
> >>> event - key press etc.
> >>>
> >>> Patch in http://lkml.org/lkml/2007/3/7/255 solves different problem.
> >>> I added it to my quilt and applied anyway -> no change.
> >>>     
> >> Does the problem go away if NO_HZ is unset?
> >>   
> > 
> > i tried to boot with nohz=off, but the problem did persist.
> 
> Hmmmm, both variants (nohz=off or a kernel recompiled without NO_HZ) work for me.

Definitely something strange is going on here.

I think we need advice from someone who knows the RCU internals.

Andrew, could you please tell me whom I should ask?

Rafael


* Re: SwSusp to disk doesn't work - Try 2
  2007-03-11 21:09               ` Rafael J. Wysocki
@ 2007-03-11 21:56                 ` Thomas Gleixner
  2007-03-11 21:57                   ` Thomas Meyer
  0 siblings, 1 reply; 19+ messages in thread
From: Thomas Gleixner @ 2007-03-11 21:56 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Milan Broz, Andrew Morton, Thomas Meyer, linux-kernel,
	Pavel Machek

On Sun, 2007-03-11 at 22:09 +0100, Rafael J. Wysocki wrote:
> > >>>>>     update_sched_domains
> > >>>>>      detach_destroy_domains
> > >>>>>        [waits here] --> synchronize_sched (==synchronize_rcu)
> > >>>>>         
> > >>>> Well, I think the call to wait_for_completion() does not return, probably
> > >>>> because the task supposed to complete the completion is frozen at this
> > >>>> point.  Can you please try to confirm that it gets stuck on
> > >>>> wait_for_completion() in synchronize_rcu()?
> > >>>>       
> > >>> Yes, it's in wait_for_completion() in synchronize_rcu().
> > >>> As noted in some previous mail, it will wake up after
> > >>> event - key press etc.
> > >>>
> > >>> Patch in http://lkml.org/lkml/2007/3/7/255 solves different problem.
> > >>> I added it to my quilt and applied anyway -> no change.
> > >>>     
> > >> Does the problem go away if NO_HZ is unset?
> > >>   
> > > 
> > > i tried to boot with nohz=off, but the problem did persist.
> > 
> > Hmmmm, both variants (nohz=off or a kernel recompiled without NO_HZ) work for me.
> 
> Definitely something strange is going on here.
> 
> I think we need advice from someone who knows the RCU internals.

RCU synchronization depends on the timer interrupt. Which kernel version
are you guys talking about?

	tglx
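
The dependency described above can be illustrated with a toy userspace
model. This is only a sketch under stated assumptions, not the kernel's RCU
code: the completion is emulated with a pthread mutex and condition variable,
and tick_thread() and model_synchronize() are hypothetical names standing in
for "the timer interrupt ends the grace period" and "a synchronize_rcu()-style
wait". A timeout replaces the indefinite hang so that the model terminates.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Userspace stand-in for the kernel's struct completion (assumption). */
struct completion {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool done;
};

static void init_completion(struct completion *c)
{
	pthread_mutex_init(&c->lock, NULL);
	pthread_cond_init(&c->cond, NULL);
	c->done = false;
}

static void complete(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	c->done = true;
	pthread_cond_signal(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

/* Returns true if the completion was signalled before the timeout expired. */
static bool wait_for_completion_timeout_ms(struct completion *c, long ms)
{
	struct timespec deadline;
	bool done;

	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += ms / 1000;
	deadline.tv_nsec += (ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	pthread_mutex_lock(&c->lock);
	while (!c->done) {
		/* A non-zero return (e.g. ETIMEDOUT) ends the wait. */
		if (pthread_cond_timedwait(&c->cond, &c->lock, &deadline) != 0)
			break;
	}
	done = c->done;
	pthread_mutex_unlock(&c->lock);
	return done;
}

/* Hypothetical "timer tick": the only thing that ends the modelled wait. */
static void *tick_thread(void *arg)
{
	complete(arg);
	return NULL;
}

/*
 * Hypothetical model of a synchronize_rcu()-style wait.
 * tick_running selects whether the tick ever fires.
 */
static bool model_synchronize(bool tick_running)
{
	struct completion c;
	pthread_t t;
	bool ok;

	init_completion(&c);
	if (tick_running)
		pthread_create(&t, NULL, tick_thread, &c);
	ok = wait_for_completion_timeout_ms(&c, 200);
	if (tick_running)
		pthread_join(t, NULL);
	pthread_cond_destroy(&c.cond);
	pthread_mutex_destroy(&c.lock);
	return ok;
}
```

In this model, model_synchronize(true) returns true because the tick
completes the wait, while model_synchronize(false) returns false after the
timeout. That mirrors the reported symptom only loosely: the waiter makes no
progress until some other event arrives to finish the job.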





* Re: SwSusp to disk doesn't work - Try 2
  2007-03-11 21:56                 ` Thomas Gleixner
@ 2007-03-11 21:57                   ` Thomas Meyer
  0 siblings, 0 replies; 19+ messages in thread
From: Thomas Meyer @ 2007-03-11 21:57 UTC (permalink / raw)
  To: tglx
  Cc: Rafael J. Wysocki, Milan Broz, Andrew Morton, linux-kernel,
	Pavel Machek

Thomas Gleixner wrote:
> On Sun, 2007-03-11 at 22:09 +0100, Rafael J. Wysocki wrote:
>   
>>>>>>>>     update_sched_domains
>>>>>>>>      detach_destroy_domains
>>>>>>>>        [waits here] --> synchronize_sched (==synchronize_rcu)
>>>>>>>>         
>>>>>>>>                 
>>>>>>> Well, I think the call to wait_for_completion() does not return, probably
>>>>>>> because the task supposed to complete the completion is frozen at this
>>>>>>> point.  Can you please try to confirm that it gets stuck on
>>>>>>> wait_for_completion() in synchronize_rcu()?
>>>>>>>       
>>>>>>>               
>>>>>> Yes, it's in wait_for_completion() in synchronize_rcu().
>>>>>> As noted in some previous mail, it will wake up after
>>>>>> event - key press etc.
>>>>>>
>>>>>> Patch in http://lkml.org/lkml/2007/3/7/255 solves different problem.
>>>>>> I added it to my quilt and applied anyway -> no change.
>>>>>>     
>>>>>>             
>>>>> Does the problem go away if NO_HZ is unset?
>>>>>   
>>>>>           
>>>> i tried to boot with nohz=off, but the problem did persist.
>>>>         
>>> Hmmmm, both variants (nohz=off or a kernel recompiled without NO_HZ) work for me.
>>>       
>> Definitely something strange is going on here.
>>
>> I think we need advice from someone who knows the RCU internals.
>>     
>
> RCU synchronization depends on the timer interrupt. Which kernel version
> are you guys talking about?
>
> 	tglx
>   

I am talking about be521466feb3bb1cd89de82a2b1d080e9ebd3cb6 (2.6.21-rc3+).

The worst config is with nohz and without hrtimers: the kernel doesn't
even come back after pressing the power key.

But I am staying with the config without nohz and without hrtimers for
now, because suspend to disk works there.




Thread overview: 19+ messages
2007-03-11 18:08 SwSusp to disk doesn't work - Try 2 Thomas Meyer
2007-03-11 18:26 ` Rafael J. Wysocki
2007-03-11 18:37   ` Thomas Meyer
2007-03-11 19:27     ` Rafael J. Wysocki
2007-03-11 19:04   ` Milan Broz
2007-03-11 19:16     ` Thomas Meyer
2007-03-11 19:50       ` Rafael J. Wysocki
2007-03-11 20:04         ` Thomas Meyer
2007-03-11 19:38     ` Rafael J. Wysocki
2007-03-11 20:23       ` Milan Broz
2007-03-11 20:32         ` Rafael J. Wysocki
2007-03-11 20:28           ` Thomas Meyer
2007-03-11 20:45             ` Rafael J. Wysocki
2007-03-11 20:49               ` Thomas Meyer
2007-03-11 20:57             ` Milan Broz
2007-03-11 21:02               ` Thomas Meyer
2007-03-11 21:09               ` Rafael J. Wysocki
2007-03-11 21:56                 ` Thomas Gleixner
2007-03-11 21:57                   ` Thomas Meyer
