* SwSusp to disk doesn't work - Try 2
@ 2007-03-11 18:08 Thomas Meyer
2007-03-11 18:26 ` Rafael J. Wysocki
0 siblings, 1 reply; 19+ messages in thread
From: Thomas Meyer @ 2007-03-11 18:08 UTC (permalink / raw)
To: linux-kernel
Suspend to disk doesn't work on my laptop.
The suspend seems to hang while re-enabling the non-boot CPUs.
With platform = "test" and state = "disk" I get this:
"
[cut]
acpi device:02: freeze
video video:00: freeze
acpi device:01: freeze
acpi PNP0C02:00: freeze
pci_root PNP0A08:00: freeze
button PNP0C0E:00: freeze
button PNP0C0C:00: freeze
acpi APP0002:00: freeze
button PNP0C0D:00: freeze
ac ACPI0003:00: freeze
acpi device:00: freeze
processor ACPI0007:01: freeze
processor ACPI0007:00: freeze
button button_power:00: freeze
acpi acpi_system:00: freeze
Disabling non-boot CPUs ...
CPU 1 is now offline
SMP alternatives: switching to UP code
PM: Removing info for No Bus:cpu1
PM: Removing info for No Bus:msr1
CPU1 is down
swsusp debug: Waiting for 5 seconds.
Enabling non-boot CPUs ...
----> Here the process hangs. By chance I found that an ACPI event lets
it continue (pressing the power button a few times, 2x - 4x).
SMP alternatives: switching to SMP code
Booting processor 1/1 eip 3000
CPU 1 irqstacks, hard=c0389000 soft=c0387000
Initializing CPU#1
Calibrating delay using timer specific routine.. 3663.73 BogoMIPS
(lpj=6103576)
CPU: After generic identify, caps: bfe9fbff 00100000 00000000 00000000
0000c1a9 00000000 00000000
monitor/mwait feature present.
CPU: L1 I cache: 32K, L1 D cache: 32K
CPU: L2 cache: 2048K
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 1
CPU: After all inits, caps: bfe9fbff 00100000 00000000 00002940 0000c1a9
00000000 00000000
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#1.
CPU1: Intel Genuine Intel(R) CPU T2400 @ 1.83GHz stepping 08
PM: Adding info for No Bus:cpu1
PM: Adding info for No Bus:msr1
CPU1 is up
acpi acpi_system:00: resuming
button button_power:00: resuming
processor ACPI0007:00: resuming
processor ACPI0007:01: resuming
acpi device:00: resuming
ac ACPI0003:00: resuming
button PNP0C0D:00: resuming
acpi APP0002:00: resuming
button PNP0C0C:00: resuming
button PNP0C0E:00: resuming
pci_root PNP0A08:00: resuming
"
Any ideas?
The same is true for disk = "platform".
With kind regards
thomas
* Re: SwSusp to disk doesn't work - Try 2
2007-03-11 18:08 SwSusp to disk doesn't work - Try 2 Thomas Meyer
@ 2007-03-11 18:26 ` Rafael J. Wysocki
2007-03-11 18:37 ` Thomas Meyer
2007-03-11 19:04 ` Milan Broz
0 siblings, 2 replies; 19+ messages in thread
From: Rafael J. Wysocki @ 2007-03-11 18:26 UTC (permalink / raw)
To: Thomas Meyer; +Cc: linux-kernel, Pavel Machek
On Sunday, 11 March 2007 19:08, Thomas Meyer wrote:
> Suspend to disk doesn't work on my laptop.
>
> The suspend seems to hang while enabling the non-boot cpus again.
>
> with platform = "test" and state = "disk" i get this:
> "
> [cut]
> acpi device:02: freeze
> video video:00: freeze
> acpi device:01: freeze
> acpi PNP0C02:00: freeze
> pci_root PNP0A08:00: freeze
> button PNP0C0E:00: freeze
> button PNP0C0C:00: freeze
> acpi APP0002:00: freeze
> button PNP0C0D:00: freeze
> ac ACPI0003:00: freeze
> acpi device:00: freeze
> processor ACPI0007:01: freeze
> processor ACPI0007:00: freeze
> button button_power:00: freeze
> acpi acpi_system:00: freeze
> Disabling non-boot CPUs ...
> CPU 1 is now offline
> SMP alternatives: switching to UP code
> PM: Removing info for No Bus:cpu1
> PM: Removing info for No Bus:msr1
> CPU1 is down
> swsusp debug: Waiting for 5 seconds.
> Enabling non-boot CPUs ...
>
> ----> Here the process hangs. But a fortunate coincidence showed me that
> an acpi event continues the process (pressing the power off button a few
> times... (2x - 4x) ).
Hm, interesting.
> SMP alternatives: switching to SMP code
> Booting processor 1/1 eip 3000
> CPU 1 irqstacks, hard=c0389000 soft=c0387000
> Initializing CPU#1
> Calibrating delay using timer specific routine.. 3663.73 BogoMIPS
> (lpj=6103576)
> CPU: After generic identify, caps: bfe9fbff 00100000 00000000 00000000
> 0000c1a9 00000000 00000000
> monitor/mwait feature present.
> CPU: L1 I cache: 32K, L1 D cache: 32K
> CPU: L2 cache: 2048K
> CPU: Physical Processor ID: 0
> CPU: Processor Core ID: 1
> CPU: After all inits, caps: bfe9fbff 00100000 00000000 00002940 0000c1a9
> 00000000 00000000
> Intel machine check architecture supported.
> Intel machine check reporting enabled on CPU#1.
> CPU1: Intel Genuine Intel(R) CPU T2400 @ 1.83GHz stepping 08
> PM: Adding info for No Bus:cpu1
> PM: Adding info for No Bus:msr1
> CPU1 is up
> acpi acpi_system:00: resuming
> button button_power:00: resuming
> processor ACPI0007:00: resuming
> processor ACPI0007:01: resuming
> acpi device:00: resuming
> ac ACPI0003:00: resuming
> button PNP0C0D:00: resuming
> acpi APP0002:00: resuming
> button PNP0C0C:00: resuming
> button PNP0C0E:00: resuming
> pci_root PNP0A08:00: resuming
> "
>
> Any ideas?
Could you please put some printk()s in kernel/cpu.c:_cpu_up() to see where
it gets stuck? I bet one of the notifiers goes to sleep (cpufreq, maybe).
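The kind of instrumentation asked for here can be sketched in userspace C. This is only an illustration under stated assumptions: `TRACE`, `fake_notify` and `fake_cpu_up` are made-up names standing in for printk() calls placed around the notifier calls in _cpu_up(), and the real function looks different. The macro prefixes each message with the standard `__func__`, so the last line printed shows which step did not return.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for printk(); also remembers the last message. */
static char last_trace[128];

#define TRACE(msg)                                                 \
	do {                                                       \
		snprintf(last_trace, sizeof(last_trace), "%s: %s", \
			 __func__, (msg));                         \
		printf("%s\n", last_trace);                        \
	} while (0)

/* Hypothetical stand-in for one notifier chain call; 0 means success. */
static int fake_notify(const char *event)
{
	assert(event != NULL);
	return 0;
}

/* Bracket every step so the last printed line shows where things stopped. */
static int fake_cpu_up(void)
{
	int ret;

	TRACE("before notifier CPU_UP_PREPARE.");
	ret = fake_notify("CPU_UP_PREPARE");
	TRACE("after notifier CPU_UP_PREPARE.");
	if (ret)
		return ret;

	TRACE("before notifier CPU_ONLINE.");
	ret = fake_notify("CPU_ONLINE");
	TRACE("after notifier CPU_ONLINE.");
	return ret;
}
```

If a "before" line shows up without its matching "after" line, the call between the two is the one that has not returned.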
Greetings,
Rafael
--
If you don't have the time to read,
you don't have the time or the tools to write.
- Stephen King
* Re: SwSusp to disk doesn't work - Try 2
2007-03-11 18:26 ` Rafael J. Wysocki
@ 2007-03-11 18:37 ` Thomas Meyer
2007-03-11 19:27 ` Rafael J. Wysocki
2007-03-11 19:04 ` Milan Broz
1 sibling, 1 reply; 19+ messages in thread
From: Thomas Meyer @ 2007-03-11 18:37 UTC (permalink / raw)
To: Rafael J. Wysocki; +Cc: linux-kernel, Pavel Machek
Rafael J. Wysocki wrote:
>
> Could you please put some printk()s in kernel/cpu.c:_cpu_up() to see where
> it gets stuck? I bet one of the notifiers goes to sleep (cpufreq, maybe).
>
Here we go (OK, I forgot __FUNCTION__ ...):
Mar 11 19:31:33 [kernel] ac ACPI0003:00: freeze
Mar 11 19:31:33 [kernel] acpi device:00: freeze
Mar 11 19:31:33 [kernel] processor ACPI0007:01: freeze
Mar 11 19:31:33 [kernel] processor ACPI0007:00: freeze
Mar 11 19:31:33 [kernel] button button_power:00: freeze
Mar 11 19:31:33 [kernel] acpi acpi_system:00: freeze
Mar 11 19:31:33 [kernel] Disabling non-boot CPUs ...
Mar 11 19:31:33 [kernel] kvm: disabling virtualization on CPU1
Mar 11 19:31:33 [kernel] CPU 1 is now offline
Mar 11 19:31:33 [kernel] SMP alternatives: switching to UP code
Mar 11 19:31:33 [kernel] PM: Removing info for No Bus:cpu1
Mar 11 19:31:33 [kernel] PM: Removing info for No Bus:msr1
Mar 11 19:31:33 [kernel] CPU1 is down
Mar 11 19:31:33 [kernel] swsusp debug: Waiting for 5 seconds.
Mar 11 19:31:33 [kernel] Enabling non-boot CPUs ...
Mar 11 19:31:33 [kernel] <NULL>: before notifier CPU_UP_PREPARE.
Hung here.
Mar 11 19:31:33 [kernel] <NULL>: after notifier CPU_UP_PREPARE.
Mar 11 19:31:33 [kernel] SMP alternatives: switching to SMP code
Mar 11 19:31:33 [kernel] Booting processor 1/1 eip 3000
Mar 11 19:31:33 [kernel] CPU 1 irqstacks, hard=c0388000 soft=c0386000
Mar 11 19:31:33 [kernel] Initializing CPU#1
Mar 11 19:31:33 [kernel] Calibrating delay using timer specific
routine.. 3663.72 BogoMIPS (lpj=6103555)
Mar 11 19:31:33 [kernel] CPU: After generic identify, caps: bfe9fbff
00100000 00000000 00000000 0000c1a9 00000000 00000000
Mar 11 19:31:33 [kernel] monitor/mwait feature present.
Mar 11 19:31:33 [kernel] CPU: L1 I cache: 32K, L1 D cache: 32K
Mar 11 19:31:33 [kernel] CPU: L2 cache: 2048K
Mar 11 19:31:33 [kernel] CPU: Physical Processor ID: 0
Mar 11 19:31:33 [kernel] CPU: Processor Core ID: 1
Mar 11 19:31:33 [kernel] CPU: After all inits, caps: bfe9fbff 00100000
00000000 00002940 0000c1a9 00000000 00000000
Mar 11 19:31:33 [kernel] CPU1: Intel Genuine Intel(R) CPU
T2400 @ 1.83GHz stepping 08
Mar 11 19:31:33 [kernel] <NULL>: after __cpu_up
Mar 11 19:31:33 [kernel] <NULL>: before notifier CPU_ONLINE.
Mar 11 19:31:33 [kernel] kvm: enabling virtualization on CPU1
Mar 11 19:31:33 [kernel] Switched to high resolution mode on CPU 1
Mar 11 19:31:33 [kernel] PM: Adding info for No Bus:cpu1
Mar 11 19:31:33 [kernel] PM: Adding info for No Bus:msr1
Mar 11 19:31:33 [kernel] <NULL>: after notifier CPU_ONLINE.
Mar 11 19:31:33 [kernel] CPU1 is up
Mar 11 19:31:33 [kernel] acpi acpi_system:00: resuming
Mar 11 19:31:33 [kernel] button button_power:00: resuming
Mar 11 19:31:33 [kernel] processor ACPI0007:00: resuming
Mar 11 19:31:33 [kernel] processor ACPI0007:01: resuming
* Re: SwSusp to disk doesn't work - Try 2
2007-03-11 18:37 ` Thomas Meyer
@ 2007-03-11 19:27 ` Rafael J. Wysocki
0 siblings, 0 replies; 19+ messages in thread
From: Rafael J. Wysocki @ 2007-03-11 19:27 UTC (permalink / raw)
To: Thomas Meyer; +Cc: linux-kernel, Pavel Machek
On Sunday, 11 March 2007 19:37, Thomas Meyer wrote:
> Rafael J. Wysocki wrote:
> >
> > Could you please put some printk()s in kernel/cpu.c:_cpu_up() to see where
> > it gets stuck? I bet one of the notifiers goes to sleep (cpufreq, maybe).
> >
> Here we go (ok. i forgot __FUNCTION__ ...):
>
> Mar 11 19:31:33 [kernel] ac ACPI0003:00: freeze
> Mar 11 19:31:33 [kernel] acpi device:00: freeze
> Mar 11 19:31:33 [kernel] processor ACPI0007:01: freeze
> Mar 11 19:31:33 [kernel] processor ACPI0007:00: freeze
> Mar 11 19:31:33 [kernel] button button_power:00: freeze
> Mar 11 19:31:33 [kernel] acpi acpi_system:00: freeze
> Mar 11 19:31:33 [kernel] Disabling non-boot CPUs ...
> Mar 11 19:31:33 [kernel] kvm: disabling virtualization on CPU1
> Mar 11 19:31:33 [kernel] CPU 1 is now offline
> Mar 11 19:31:33 [kernel] SMP alternatives: switching to UP code
> Mar 11 19:31:33 [kernel] PM: Removing info for No Bus:cpu1
> Mar 11 19:31:33 [kernel] PM: Removing info for No Bus:msr1
> Mar 11 19:31:33 [kernel] CPU1 is down
> Mar 11 19:31:33 [kernel] swsusp debug: Waiting for 5 seconds.
> Mar 11 19:31:33 [kernel] Enabling non-boot CPUs ...
> Mar 11 19:31:33 [kernel] <NULL>: before notifier CPU_UP_PREPARE.
>
> Hung here.
This means that one of the notifiers had not returned before you pressed
the button. Now the question is which one (there are many).
I don't know if there's any nicer way to find that out, but I usually hack
kernel/sys.c:notifier_call_chain() to print nb->notifier_call (as a pointer)
before the call is made. Then I write down the address of the last one
called before the hang/oops and use gdb to check which function it points to.
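The trick described above can be sketched in userspace C. Everything here is a simplified assumption: `struct notifier`, `traced_call_chain` and the two callbacks are hypothetical stand-ins, the kernel's notifier block has more fields, and the real chain has its own stop conditions. The point is only that the callback address is printed before the call, so the last address printed belongs to the callback that did not return.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Simplified, hypothetical notifier block. */
struct notifier {
	int (*call)(struct notifier *nb, unsigned long event);
	struct notifier *next;
};

static int calls_made;

static int first_cb(struct notifier *nb, unsigned long event)
{
	(void)nb;
	(void)event;
	calls_made++;
	return 0;
}

static int second_cb(struct notifier *nb, unsigned long event)
{
	(void)nb;
	(void)event;
	calls_made++;
	return 0;
}

/* Walk the chain and print each callback's address before calling it. */
static int traced_call_chain(struct notifier *head, unsigned long event)
{
	struct notifier *nb;
	int ret = 0;

	for (nb = head; nb != NULL; nb = nb->next) {
		assert(nb->call != NULL);
		printf("calling notifier 0x%" PRIxPTR "\n",
		       (uintptr_t)nb->call);
		fflush(stdout);	/* make sure the line is out before a hang */
		ret = nb->call(nb, event);
		if (ret)	/* simplification: stop on any non-zero return */
			break;
	}
	return ret;
}
```

With an unstripped kernel image, an address printed this way can then be resolved in gdb, for example with `info symbol <address>`.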
Greetings,
Rafael
* Re: SwSusp to disk doesn't work - Try 2
2007-03-11 18:26 ` Rafael J. Wysocki
2007-03-11 18:37 ` Thomas Meyer
@ 2007-03-11 19:04 ` Milan Broz
2007-03-11 19:16 ` Thomas Meyer
2007-03-11 19:38 ` Rafael J. Wysocki
1 sibling, 2 replies; 19+ messages in thread
From: Milan Broz @ 2007-03-11 19:04 UTC (permalink / raw)
To: Rafael J. Wysocki; +Cc: Thomas Meyer, linux-kernel, Pavel Machek
Rafael J. Wysocki wrote:
> On Sunday, 11 March 2007 19:08, Thomas Meyer wrote:
>> Suspend to disk doesn't work on my laptop.
>>
>> The suspend seems to hang while enabling the non-boot cpus again.
>>
>> with platform = "test" and state = "disk" i get this:
>> Enabling non-boot CPUs ...
...
>
> Could you please put some printk()s in kernel/cpu.c:_cpu_up() to see where
> it gets stuck? I bet one of the notifiers goes to sleep (cpufreq, maybe).
Hi,
I see the same problem with 2.6.21-rc3 and NO_HZ set (tickless).
Short printk trace:
enable_nonboot_cpus
_cpu_up
raw_notifier_call_chain (CPU_UP_PREPARE)
...
update_sched_domains
detach_destroy_domains
[waits here] --> synchronize_sched (==synchronize_rcu)
Milan
* Re: SwSusp to disk doesn't work - Try 2
2007-03-11 19:04 ` Milan Broz
@ 2007-03-11 19:16 ` Thomas Meyer
2007-03-11 19:50 ` Rafael J. Wysocki
2007-03-11 19:38 ` Rafael J. Wysocki
1 sibling, 1 reply; 19+ messages in thread
From: Thomas Meyer @ 2007-03-11 19:16 UTC (permalink / raw)
To: Milan Broz; +Cc: Rafael J. Wysocki, linux-kernel, Pavel Machek
Milan Broz wrote:
> Rafael J. Wysocki wrote:
>
>> On Sunday, 11 March 2007 19:08, Thomas Meyer wrote:
>>
>>> Suspend to disk doesn't work on my laptop.
>>>
>>> The suspend seems to hang while enabling the non-boot cpus again.
>>>
>>> with platform = "test" and state = "disk" i get this:
>>>
>
>
> Hi,
> I see the same problem - 2.6.21-rc3 with NO_HZ set (tickless),
> short printk trace
>
> enable_nonboot_cpus
> _cpu_up
> raw_notifier_callchain (CPU_UP_PREPARE)
> ...
> update_sched_domains
> detach_destroy_domains
> [waits here] --> synchronize_sched (==synchronize_rcu)
>
>
Maybe this helps:
Mar 11 20:03:56 [kernel] PM: Removing info for No Bus:msr1
Mar 11 20:03:56 [kernel] CPU1 is down
Mar 11 20:03:56 [kernel] swsusp debug: Waiting for 5 seconds.
Mar 11 20:03:56 [kernel] Enabling non-boot CPUs ...
Mar 11 20:03:56 [kernel] _cpu_up: before notifier CPU_UP_PREPARE.
Mar 11 20:03:56 [kernel] migration_call: Hi!
Mar 11 20:03:56 [kernel] rcu_cpu_notify: Hi!
Mar 11 20:03:56 [kernel] timer_cpu_notify: Hi!
Mar 11 20:03:56 [kernel] hrtimer_cpu_notify: Hi!
Mar 11 20:03:56 [kernel] cpu_callback: Hi!
Mar 11 20:03:56 [kernel] workqueue_cpu_callback: Hi!
Mar 11 20:03:56 [kernel] topology_cpu_callback: Hi!
Guess what? Hang!
Mar 11 20:03:56 [kernel] _cpu_up: after notifier CPU_UP_PREPARE.
Mar 11 20:03:56 [kernel] SMP alternatives: switching to SMP code
Mar 11 20:03:56 [kernel] Booting processor 1/1 eip 3000
Mar 11 20:03:56 [kernel] CPU 1 irqstacks, hard=c0388000 soft=c0386000
Mar 11 20:03:56 [kernel] Initializing CPU#1
I use NO_HZ, too. Will try without it.
With kind regards
thomas
* Re: SwSusp to disk doesn't work - Try 2
2007-03-11 19:16 ` Thomas Meyer
@ 2007-03-11 19:50 ` Rafael J. Wysocki
2007-03-11 20:04 ` Thomas Meyer
0 siblings, 1 reply; 19+ messages in thread
From: Rafael J. Wysocki @ 2007-03-11 19:50 UTC (permalink / raw)
To: Thomas Meyer; +Cc: Milan Broz, linux-kernel, Pavel Machek
On Sunday, 11 March 2007 20:16, Thomas Meyer wrote:
> Milan Broz wrote:
> > Rafael J. Wysocki wrote:
> >
> >> On Sunday, 11 March 2007 19:08, Thomas Meyer wrote:
> >>
> >>> Suspend to disk doesn't work on my laptop.
> >>>
> >>> The suspend seems to hang while enabling the non-boot cpus again.
> >>>
> >>> with platform = "test" and state = "disk" i get this:
> >>>
> >
> >
> > Hi,
> > I see the same problem - 2.6.21-rc3 with NO_HZ set (tickless),
> > short printk trace
> >
> > enable_nonboot_cpus
> > _cpu_up
> > raw_notifier_callchain (CPU_UP_PREPARE)
> > ...
> > update_sched_domains
> > detach_destroy_domains
> > [waits here] --> synchronize_sched (==synchronize_rcu)
> >
> >
>
> Maybe this helps:
>
> Mar 11 20:03:56 [kernel] PM: Removing info for No Bus:msr1
> Mar 11 20:03:56 [kernel] CPU1 is down
> Mar 11 20:03:56 [kernel] swsusp debug: Waiting for 5 seconds.
> Mar 11 20:03:56 [kernel] Enabling non-boot CPUs ...
> Mar 11 20:03:56 [kernel] _cpu_up: before notifier CPU_UP_PREPARE.
> Mar 11 20:03:56 [kernel] migration_call: Hi!
> Mar 11 20:03:56 [kernel] rcu_cpu_notify: Hi!
> Mar 11 20:03:56 [kernel] timer_cpu_notify: Hi!
> Mar 11 20:03:56 [kernel] hrtimer_cpu_notify: Hi!
> Mar 11 20:03:56 [kernel] cpu_callback: Hi!
> Mar 11 20:03:56 [kernel] workqueue_cpu_callback: Hi!
> Mar 11 20:03:56 [kernel] topology_cpu_callback: Hi!
>
> Guess what? Hang!
Hm, you may be hitting the problem described in this thread:
http://lkml.org/lkml/2007/3/6/364 .
Can you please apply the patch at http://lkml.org/lkml/2007/3/7/255
and see if that helps?
Rafael
* Re: SwSusp to disk doesn't work - Try 2
2007-03-11 19:50 ` Rafael J. Wysocki
@ 2007-03-11 20:04 ` Thomas Meyer
0 siblings, 0 replies; 19+ messages in thread
From: Thomas Meyer @ 2007-03-11 20:04 UTC (permalink / raw)
To: Rafael J. Wysocki; +Cc: Milan Broz, linux-kernel, Pavel Machek
Sorry, the system hangs here:
Mar 11 20:55:46 [kernel] CPU 1 is now offline
Mar 11 20:55:46 [kernel] SMP alternatives: switching to UP code
Mar 11 20:55:46 [kernel] PM: Removing info for No Bus:cpu1
Mar 11 20:55:46 [kernel] PM: Removing info for No Bus:msr1
Mar 11 20:55:46 [kernel] CPU1 is down
Mar 11 20:55:46 [kernel] swsusp debug: Waiting for 5 seconds.
Mar 11 20:55:46 [kernel] Enabling non-boot CPUs ...
Mar 11 20:55:46 [kernel] _cpu_up: before notifier CPU_UP_PREPARE.
Mar 11 20:55:46 [kernel] migration_call: Hi!
Mar 11 20:55:46 [kernel] rcu_cpu_notify: Hi!
Mar 11 20:55:46 [kernel] timer_cpu_notify: Hi!
Mar 11 20:55:46 [kernel] hrtimer_cpu_notify: Hi!
Mar 11 20:55:46 [kernel] cpu_callback: Hi!
Mar 11 20:55:46 [kernel] synchronize_rcu: Befor wait
---> The system hangs here, after the first "Before wait" message.
- Last output repeated twice -
Mar 11 20:55:46 [kernel] synchronize_rcu: After wait
Mar 11 20:55:46 [kernel] workqueue_cpu_callback: Hi!
Mar 11 20:55:46 [kernel] topology_cpu_callback: Hi!
Mar 11 20:55:46 [kernel] _cpu_up: after notifier CPU_UP_PREPARE.
Mar 11 20:55:46 [kernel] SMP alternatives: switching to SMP code
Mar 11 20:55:46 [kernel] Booting processor 1/1 eip 3000
Mar 11 20:55:46 [kernel] CPU 1 irqstacks, hard=c0388000 soft=c0386000
Mar 11 20:55:46 [kernel] Initializing CPU#1
* Re: SwSusp to disk doesn't work - Try 2
2007-03-11 19:04 ` Milan Broz
2007-03-11 19:16 ` Thomas Meyer
@ 2007-03-11 19:38 ` Rafael J. Wysocki
2007-03-11 20:23 ` Milan Broz
1 sibling, 1 reply; 19+ messages in thread
From: Rafael J. Wysocki @ 2007-03-11 19:38 UTC (permalink / raw)
To: Milan Broz; +Cc: Thomas Meyer, linux-kernel, Pavel Machek, Thomas Gleixner
On Sunday, 11 March 2007 20:04, Milan Broz wrote:
> Rafael J. Wysocki wrote:
> > On Sunday, 11 March 2007 19:08, Thomas Meyer wrote:
> >> Suspend to disk doesn't work on my laptop.
> >>
> >> The suspend seems to hang while enabling the non-boot cpus again.
> >>
> >> with platform = "test" and state = "disk" i get this:
>
> >> Enabling non-boot CPUs ...
> ...
> >
> > Could you please put some printk()s in kernel/cpu.c:_cpu_up() to see where
> > it gets stuck? I bet one of the notifiers goes to sleep (cpufreq, maybe).
>
> Hi,
> I see the same problem - 2.6.21-rc3 with NO_HZ set (tickless),
Ah, NO_HZ. Thomas Gleixner's address added to the Cc list.
> short printk trace
>
> enable_nonboot_cpus
> _cpu_up
> raw_notifier_callchain (CPU_UP_PREPARE)
> ...
> update_sched_domains
> detach_destroy_domains
> [waits here] --> synchronize_sched (==synchronize_rcu)
Well, I think the call to wait_for_completion() does not return, probably
because the task supposed to complete the completion is frozen at this
point. Can you please try to confirm that it gets stuck on
wait_for_completion() in synchronize_rcu()?
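To make the suspected failure mode concrete, here is a toy, single-threaded model in userspace C. It is not the kernel's implementation and does not claim to identify the root cause: all `toy_*` names are hypothetical, and polling stands in for sleeping in wait_for_completion(). It only illustrates the shape of the problem: the waiter queues a callback that would complete it, and if nothing ever gets around to running that callback, the wait never finishes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy completion: just a flag. */
struct toy_completion {
	bool done;
};

/* Toy deferred callback, queued until something processes the queue. */
struct toy_rcu_head {
	struct toy_rcu_head *next;
	void (*func)(struct toy_rcu_head *head);
	struct toy_completion *completion;
};

static struct toy_rcu_head *pending;	/* callbacks waiting to run */

static void toy_call_rcu(struct toy_rcu_head *head,
			 void (*func)(struct toy_rcu_head *head))
{
	head->func = func;
	head->next = pending;
	pending = head;
}

/* The queued callback: marks the waiter's completion as done. */
static void toy_wake_waiter(struct toy_rcu_head *head)
{
	head->completion->done = true;
}

/* Runs all queued callbacks; in this model only an "event" triggers it. */
static void toy_process_callbacks(void)
{
	while (pending != NULL) {
		struct toy_rcu_head *head = pending;

		pending = head->next;
		head->func(head);
	}
}

/* Returns true if the wait finished within max_polls polls. */
static bool toy_synchronize(bool events_arrive, int max_polls)
{
	struct toy_completion c = { false };
	struct toy_rcu_head head = { NULL, NULL, &c };
	int i;

	assert(max_polls > 0);
	toy_call_rcu(&head, toy_wake_waiter);

	for (i = 0; i < max_polls && !c.done; i++) {
		if (events_arrive)
			toy_process_callbacks();
	}

	if (!c.done)
		pending = head.next;	/* unlink our stack entry again */
	return c.done;
}
```

In this model `toy_synchronize(true, 10)` completes on the first poll, while `toy_synchronize(false, 10)` gives up without ever completing, which mirrors a wait that only ends once some outside event arrives.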
Rafael
* Re: SwSusp to disk doesn't work - Try 2
2007-03-11 19:38 ` Rafael J. Wysocki
@ 2007-03-11 20:23 ` Milan Broz
2007-03-11 20:32 ` Rafael J. Wysocki
0 siblings, 1 reply; 19+ messages in thread
From: Milan Broz @ 2007-03-11 20:23 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Thomas Meyer, linux-kernel, Pavel Machek, Thomas Gleixner
Rafael J. Wysocki:
> Ah, NO_HZ. Thomas Gleixner's address added to the Cc list.
>
>> short printk trace
>>
>> enable_nonboot_cpus
>> _cpu_up
>> raw_notifier_callchain (CPU_UP_PREPARE)
>> ...
>> update_sched_domains
>> detach_destroy_domains
>> [waits here] --> synchronize_sched (==synchronize_rcu)
>
> Well, I think the call to wait_for_completion() does not return, probably
> because the task supposed to complete the completion is frozen at this
> point. Can you please try to confirm that it gets stuck on
> wait_for_completion() in synchronize_rcu()?
Yes, it's in wait_for_completion() in synchronize_rcu().
As noted in an earlier mail, it wakes up after an
event (key press etc.).
The patch in http://lkml.org/lkml/2007/3/7/255 solves a different problem.
I added it to my quilt series and applied it anyway: no change.
Milan
* Re: SwSusp to disk doesn't work - Try 2
2007-03-11 20:23 ` Milan Broz
@ 2007-03-11 20:32 ` Rafael J. Wysocki
2007-03-11 20:28 ` Thomas Meyer
0 siblings, 1 reply; 19+ messages in thread
From: Rafael J. Wysocki @ 2007-03-11 20:32 UTC (permalink / raw)
To: Milan Broz; +Cc: Thomas Meyer, linux-kernel, Pavel Machek, Thomas Gleixner
On Sunday, 11 March 2007 21:23, Milan Broz wrote:
> Rafael J. Wysocki:
> > Ah, NO_HZ. Thomas Gleixner's address added to the Cc list.
> >
> >> short printk trace
> >>
> >> enable_nonboot_cpus
> >> _cpu_up
> >> raw_notifier_callchain (CPU_UP_PREPARE)
> >> ...
> >> update_sched_domains
> >> detach_destroy_domains
> >> [waits here] --> synchronize_sched (==synchronize_rcu)
> >
> > Well, I think the call to wait_for_completion() does not return, probably
> > because the task supposed to complete the completion is frozen at this
> > point. Can you please try to confirm that it gets stuck on
> > wait_for_completion() in synchronize_rcu()?
>
> Yes, it's in wait_for_completion() in synchronize_rcu().
> As noted in some previous mail, it will wake up after
> event - key press etc.
>
> Patch in http://lkml.org/lkml/2007/3/7/255 solves different problem.
> I added it to my quilt and applied anyway -> no change.
Does the problem go away if NO_HZ is unset?
Rafael
* Re: SwSusp to disk doesn't work - Try 2
2007-03-11 20:32 ` Rafael J. Wysocki
@ 2007-03-11 20:28 ` Thomas Meyer
2007-03-11 20:45 ` Rafael J. Wysocki
2007-03-11 20:57 ` Milan Broz
0 siblings, 2 replies; 19+ messages in thread
From: Thomas Meyer @ 2007-03-11 20:28 UTC (permalink / raw)
To: Rafael J. Wysocki; +Cc: Milan Broz, linux-kernel, Pavel Machek, Thomas Gleixner
Rafael J. Wysocki wrote:
> On Sunday, 11 March 2007 21:23, Milan Broz wrote:
>
>> Rafael J. Wysocki:
>>
>>> Ah, NO_HZ. Thomas Gleixner's address added to the Cc list.
>>>
>>>
>>>> short printk trace
>>>>
>>>> enable_nonboot_cpus
>>>> _cpu_up
>>>> raw_notifier_callchain (CPU_UP_PREPARE)
>>>> ...
>>>> update_sched_domains
>>>> detach_destroy_domains
>>>> [waits here] --> synchronize_sched (==synchronize_rcu)
>>>>
>>> Well, I think the call to wait_for_completion() does not return, probably
>>> because the task supposed to complete the completion is frozen at this
>>> point. Can you please try to confirm that it gets stuck on
>>> wait_for_completion() in synchronize_rcu()?
>>>
>> Yes, it's in wait_for_completion() in synchronize_rcu().
>> As noted in some previous mail, it will wake up after
>> event - key press etc.
>>
>> Patch in http://lkml.org/lkml/2007/3/7/255 solves different problem.
>> I added it to my quilt and applied anyway -> no change.
>>
>
> Does the problem go away if NO_HZ is unset?
>
I tried booting with nohz=off, but the problem persisted.
* Re: SwSusp to disk doesn't work - Try 2
2007-03-11 20:28 ` Thomas Meyer
@ 2007-03-11 20:45 ` Rafael J. Wysocki
2007-03-11 20:49 ` Thomas Meyer
2007-03-11 20:57 ` Milan Broz
1 sibling, 1 reply; 19+ messages in thread
From: Rafael J. Wysocki @ 2007-03-11 20:45 UTC (permalink / raw)
To: Thomas Meyer; +Cc: Milan Broz, linux-kernel, Pavel Machek, Thomas Gleixner
On Sunday, 11 March 2007 21:28, Thomas Meyer wrote:
> Rafael J. Wysocki wrote:
> > On Sunday, 11 March 2007 21:23, Milan Broz wrote:
> >
> >> Rafael J. Wysocki:
> >>
> >>> Ah, NO_HZ. Thomas Gleixner's address added to the Cc list.
> >>>
> >>>
> >>>> short printk trace
> >>>>
> >>>> enable_nonboot_cpus
> >>>> _cpu_up
> >>>> raw_notifier_callchain (CPU_UP_PREPARE)
> >>>> ...
> >>>> update_sched_domains
> >>>> detach_destroy_domains
> >>>> [waits here] --> synchronize_sched (==synchronize_rcu)
> >>>>
> >>> Well, I think the call to wait_for_completion() does not return, probably
> >>> because the task supposed to complete the completion is frozen at this
> >>> point. Can you please try to confirm that it gets stuck on
> >>> wait_for_completion() in synchronize_rcu()?
> >>>
> >> Yes, it's in wait_for_completion() in synchronize_rcu().
> >> As noted in some previous mail, it will wake up after
> >> event - key press etc.
> >>
> >> Patch in http://lkml.org/lkml/2007/3/7/255 solves different problem.
> >> I added it to my quilt and applied anyway -> no change.
> >>
> >
> > Does the problem go away if NO_HZ is unset?
> >
>
> i tried to boot with nohz=off, but the problem did persist.
Okay, but could you please compile the kernel without NO_HZ and retest?
Rafael
* Re: SwSusp to disk doesn't work - Try 2
2007-03-11 20:45 ` Rafael J. Wysocki
@ 2007-03-11 20:49 ` Thomas Meyer
0 siblings, 0 replies; 19+ messages in thread
From: Thomas Meyer @ 2007-03-11 20:49 UTC (permalink / raw)
To: Rafael J. Wysocki; +Cc: Milan Broz, linux-kernel, Pavel Machek, Thomas Gleixner
Rafael J. Wysocki wrote:
>
> Okay, but could you please compile the kernel without NO_HZ and retest?
>
>
Sure.
But I get the same behaviour:
Mar 11 21:42:07 [kernel] processor ACPI0007:00: freeze
Mar 11 21:42:07 [kernel] button button_power:00: freeze
Mar 11 21:42:07 [kernel] acpi acpi_system:00: freeze
Mar 11 21:42:07 [kernel] Disabling non-boot CPUs ...
Mar 11 21:42:07 [kernel] kvm: disabling virtualization on CPU1
Mar 11 21:42:07 [kernel] synchronize_rcu: Befor wait
Mar 11 21:42:07 [kernel] synchronize_rcu: After wait
Mar 11 21:42:07 [kernel] CPU 1 is now offline
Mar 11 21:42:07 [kernel] SMP alternatives: switching to UP code
Mar 11 21:42:07 [kernel] PM: Removing info for No Bus:cpu1
Mar 11 21:42:07 [kernel] PM: Removing info for No Bus:msr1
Mar 11 21:42:07 [kernel] CPU1 is down
Mar 11 21:42:07 [kernel] swsusp debug: Waiting for 5 seconds.
Mar 11 21:42:07 [kernel] Enabling non-boot CPUs ...
Mar 11 21:42:07 [kernel] _cpu_up: before notifier CPU_UP_PREPARE.
Mar 11 21:42:07 [kernel] migration_call: Hi!
Mar 11 21:42:07 [kernel] rcu_cpu_notify: Hi!
Mar 11 21:42:07 [kernel] timer_cpu_notify: Hi!
Mar 11 21:42:07 [kernel] hrtimer_cpu_notify: Hi!
Mar 11 21:42:07 [kernel] cpu_callback: Hi!
Mar 11 21:42:07 [kernel] synchronize_rcu: Befor wait
----> Hang
Why does this message appear two times?
- Last output repeated twice -
Mar 11 21:42:07 [kernel] synchronize_rcu: After wait
Mar 11 21:42:07 [kernel] workqueue_cpu_callback: Hi!
Mar 11 21:42:07 [kernel] topology_cpu_callback: Hi!
Mar 11 21:42:07 [kernel] _cpu_up: after notifier CPU_UP_PREPARE.
Mar 11 21:42:07 [kernel] SMP alternatives: switching to SMP code
Mar 11 21:42:07 [kernel] Booting processor 1/1 eip 3000
Mar 11 21:42:07 [kernel] CPU 1 irqstacks, hard=c0387000 soft=c0385000
Mar 11 21:42:07 [kernel] Initializing CPU#1
Mar 11 21:42:07 [kernel] Calibrating delay using timer specific
routine.. 3662.97 BogoMIPS (lpj=6102352)
Mar 11 21:42:07 [kernel] CPU: After generic identify, caps: bfe9fbff
00100000 00000000 00000000 0000c1a9 00000000 00000000
Mar 11 21:42:07 [kernel] monitor/mwait feature present.
Mar 11 21:42:07 [kernel] CPU: L1 I cache: 32K, L1 D cache: 32K
Mar 11 21:42:07 [kernel] CPU: L2 cache: 2048K
Mar 11 21:42:07 [kernel] CPU: Physical Processor ID: 0
Mar 11 21:42:07 [kernel] CPU: Processor Core ID: 1
Mar 11 21:42:07 [kernel] CPU: After all inits, caps: bfe9fbff 00100000
00000000 00002940 0000c1a9 00000000 00000000
Mar 11 21:42:07 [kernel] CPU1: Intel Genuine Intel(R) CPU
T2400 @ 1.83GHz stepping 08
Mar 11 21:42:07 [kernel] synchronize_rcu: After wait
Mar 11 21:42:07 [kernel] _cpu_up: after __cpu_up
Mar 11 21:42:07 [kernel] _cpu_up: before notifier CPU_ONLINE.
* Re: SwSusp to disk doesn't work - Try 2
2007-03-11 20:28 ` Thomas Meyer
2007-03-11 20:45 ` Rafael J. Wysocki
@ 2007-03-11 20:57 ` Milan Broz
2007-03-11 21:02 ` Thomas Meyer
2007-03-11 21:09 ` Rafael J. Wysocki
1 sibling, 2 replies; 19+ messages in thread
From: Milan Broz @ 2007-03-11 20:57 UTC (permalink / raw)
To: Thomas Meyer
Cc: Rafael J. Wysocki, linux-kernel, Pavel Machek, Thomas Gleixner
Thomas Meyer wrote:
> Rafael J. Wysocki wrote:
>> On Sunday, 11 March 2007 21:23, Milan Broz wrote:
>>
>>> Rafael J. Wysocki:
>>>
>>>> Ah, NO_HZ. Thomas Gleixner's address added to the Cc list.
>>>>
>>>>
>>>>> short printk trace
>>>>>
>>>>> enable_nonboot_cpus
>>>>> _cpu_up
>>>>> raw_notifier_callchain (CPU_UP_PREPARE)
>>>>> ...
>>>>> update_sched_domains
>>>>> detach_destroy_domains
>>>>> [waits here] --> synchronize_sched (==synchronize_rcu)
>>>>>
>>>> Well, I think the call to wait_for_completion() does not return, probably
>>>> because the task supposed to complete the completion is frozen at this
>>>> point. Can you please try to confirm that it gets stuck on
>>>> wait_for_completion() in synchronize_rcu()?
>>>>
>>> Yes, it's in wait_for_completion() in synchronize_rcu().
>>> As noted in some previous mail, it will wake up after
>>> event - key press etc.
>>>
>>> Patch in http://lkml.org/lkml/2007/3/7/255 solves different problem.
>>> I added it to my quilt and applied anyway -> no change.
>>>
>> Does the problem go away if NO_HZ is unset?
>>
>
> i tried to boot with nohz=off, but the problem did persist.
Hmmmm, both variants (nohz=off or a kernel recompiled without NO_HZ) work for me.
Milan
* Re: SwSusp to disk doesn't work - Try 2
2007-03-11 20:57 ` Milan Broz
@ 2007-03-11 21:02 ` Thomas Meyer
2007-03-11 21:09 ` Rafael J. Wysocki
1 sibling, 0 replies; 19+ messages in thread
From: Thomas Meyer @ 2007-03-11 21:02 UTC (permalink / raw)
To: Milan Broz; +Cc: Rafael J. Wysocki, linux-kernel, Pavel Machek, Thomas Gleixner
Milan Broz wrote:
> Thomas Meyer wrote:
>
>> Rafael J. Wysocki wrote:
>>
>>> On Sunday, 11 March 2007 21:23, Milan Broz wrote:
>>>
>>>
>>>> Rafael J. Wysocki:
>>>>
>>>>
>>>>> Ah, NO_HZ. Thomas Gleixner's address added to the Cc list.
>>>>>
>>>>>
>>>>>
>>>>>> short printk trace
>>>>>>
>>>>>> enable_nonboot_cpus
>>>>>> _cpu_up
>>>>>> raw_notifier_callchain (CPU_UP_PREPARE)
>>>>>> ...
>>>>>> update_sched_domains
>>>>>> detach_destroy_domains
>>>>>> [waits here] --> synchronize_sched (==synchronize_rcu)
>>>>>>
>>>>>>
>>>>> Well, I think the call to wait_for_completion() does not return, probably
>>>>> because the task supposed to complete the completion is frozen at this
>>>>> point. Can you please try to confirm that it gets stuck on
>>>>> wait_for_completion() in synchronize_rcu()?
>>>>>
>>>>>
>>>> Yes, it's in wait_for_completion() in synchronize_rcu().
>>>> As noted in some previous mail, it will wake up after
>>>> event - key press etc.
>>>>
>>>> Patch in http://lkml.org/lkml/2007/3/7/255 solves different problem.
>>>> I added it to my quilt and applied anyway -> no change.
>>>>
>>>>
>>> Does the problem go away if NO_HZ is unset?
>>>
>>>
>> i tried to boot with nohz=off, but the problem did persist.
>>
>
> Hmmmm, both variants (nohz=off or recompiled kernel without NO_HZ) works for me.
>
> Milan
>
I got a working config:
Without hrtimers and without nohz it works!
With hrtimers and without nohz it does not work.
With hrtimers and with nohz it does not work.
Next I want to test without hrtimers and with nohz.
* Re: SwSusp to disk doesn't work - Try 2
2007-03-11 20:57 ` Milan Broz
2007-03-11 21:02 ` Thomas Meyer
@ 2007-03-11 21:09 ` Rafael J. Wysocki
2007-03-11 21:56 ` Thomas Gleixner
1 sibling, 1 reply; 19+ messages in thread
From: Rafael J. Wysocki @ 2007-03-11 21:09 UTC (permalink / raw)
To: Milan Broz, Andrew Morton
Cc: Thomas Meyer, linux-kernel, Pavel Machek, Thomas Gleixner
On Sunday, 11 March 2007 21:57, Milan Broz wrote:
> Thomas Meyer wrote:
> > Rafael J. Wysocki wrote:
> >> On Sunday, 11 March 2007 21:23, Milan Broz wrote:
> >>
> >>> Rafael J. Wysocki:
> >>>
> >>>> Ah, NO_HZ. Thomas Gleixner's address added to the Cc list.
> >>>>
> >>>>
> >>>>> short printk trace
> >>>>>
> >>>>> enable_nonboot_cpus
> >>>>> _cpu_up
> >>>>> raw_notifier_callchain (CPU_UP_PREPARE)
> >>>>> ...
> >>>>> update_sched_domains
> >>>>> detach_destroy_domains
> >>>>> [waits here] --> synchronize_sched (==synchronize_rcu)
> >>>>>
> >>>> Well, I think the call to wait_for_completion() does not return, probably
> >>>> because the task supposed to complete the completion is frozen at this
> >>>> point. Can you please try to confirm that it gets stuck on
> >>>> wait_for_completion() in synchronize_rcu()?
> >>>>
> >>> Yes, it's in wait_for_completion() in synchronize_rcu().
> >>> As noted in some previous mail, it will wake up after
> >>> event - key press etc.
> >>>
> >>> Patch in http://lkml.org/lkml/2007/3/7/255 solves different problem.
> >>> I added it to my quilt and applied anyway -> no change.
> >>>
> >> Does the problem go away if NO_HZ is unset?
> >>
> >
> > i tried to boot with nohz=off, but the problem did persist.
>
> Hmmmm, both variants (nohz=off or recompiled kernel without NO_HZ) works for me.
Definitely something strange is going on here.
I think we need advice from someone who knows the RCU internals.
Andrew, could you please tell me whom I should ask?
Rafael
* Re: SwSusp to disk doesn't work - Try 2
2007-03-11 21:09 ` Rafael J. Wysocki
@ 2007-03-11 21:56 ` Thomas Gleixner
2007-03-11 21:57 ` Thomas Meyer
0 siblings, 1 reply; 19+ messages in thread
From: Thomas Gleixner @ 2007-03-11 21:56 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Milan Broz, Andrew Morton, Thomas Meyer, linux-kernel,
Pavel Machek
On Sun, 2007-03-11 at 22:09 +0100, Rafael J. Wysocki wrote:
> > >>>>> update_sched_domains
> > >>>>> detach_destroy_domains
> > >>>>> [waits here] --> synchronize_sched (==synchronize_rcu)
> > >>>>>
> > >>>> Well, I think the call to wait_for_completion() does not return, probably
> > >>>> because the task supposed to complete the completion is frozen at this
> > >>>> point. Can you please try to confirm that it gets stuck on
> > >>>> wait_for_completion() in synchronize_rcu()?
> > >>>>
> > >>> Yes, it's in wait_for_completion() in synchronize_rcu().
> > >>> As noted in some previous mail, it will wake up after
> > >>> event - key press etc.
> > >>>
> > >>> Patch in http://lkml.org/lkml/2007/3/7/255 solves different problem.
> > >>> I added it to my quilt and applied anyway -> no change.
> > >>>
> > >> Does the problem go away if NO_HZ is unset?
> > >>
> > >
> > > i tried to boot with nohz=off, but the problem did persist.
> >
> > Hmmmm, both variants (nohz=off or recompiled kernel without NO_HZ) works for me.
>
> Definitely something strange is going on here.
>
> I think we need an advice from someone who knows the RCU internals.
RCU synchronization depends on the timer interrupt. Which kernel version
are you guys talking about?
tglx
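
A toy model of the dependency described above, assuming (as a
simplification) that grace-period bookkeeping advances only from the
per-CPU timer tick. The names ToyRcu, start_synchronize and tick are
hypothetical and are not kernel APIs; the real code in the kernel is
considerably more involved.

```python
class ToyRcu:
    """Toy model: a grace period ends once every online CPU has taken a tick."""

    def __init__(self, cpus):
        self.cpus = set(cpus)
        self.pending = set()   # CPUs that still owe a quiescent state
        self.waiting = False   # True while a synchronize caller is blocked

    def start_synchronize(self):
        # Models a caller entering synchronize_rcu(): it now waits until
        # every CPU has reported a quiescent state.
        self.pending = set(self.cpus)
        self.waiting = True

    def tick(self, cpu):
        # Models the timer interrupt on one CPU. In this toy model it is
        # the only place where a quiescent state is recorded.
        self.pending.discard(cpu)
        if self.waiting and not self.pending:
            self.waiting = False   # the "completion" fires, the waiter resumes


rcu = ToyRcu(cpus=[0])
rcu.start_synchronize()
blocked_without_tick = rcu.waiting   # no tick delivered yet: still blocked
rcu.tick(0)                          # a tick finally arrives on CPU 0
blocked_after_tick = rcu.waiting
print(blocked_without_tick, blocked_after_tick)
```

In this model a tick that never arrives leaves the waiter blocked
indefinitely, which would be consistent with the report that the hang
in wait_for_completion() clears only after an external event such as a
key press.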
* Re: SwSusp to disk doesn't work - Try 2
2007-03-11 21:56 ` Thomas Gleixner
@ 2007-03-11 21:57 ` Thomas Meyer
0 siblings, 0 replies; 19+ messages in thread
From: Thomas Meyer @ 2007-03-11 21:57 UTC (permalink / raw)
To: tglx
Cc: Rafael J. Wysocki, Milan Broz, Andrew Morton, linux-kernel,
Pavel Machek
Thomas Gleixner wrote:
> On Sun, 2007-03-11 at 22:09 +0100, Rafael J. Wysocki wrote:
>
>>>>>>>> update_sched_domains
>>>>>>>> detach_destroy_domains
>>>>>>>> [waits here] --> synchronize_sched (==synchronize_rcu)
>>>>>>>>
>>>>>>>>
>>>>>>> Well, I think the call to wait_for_completion() does not return, probably
>>>>>>> because the task supposed to complete the completion is frozen at this
>>>>>>> point. Can you please try to confirm that it gets stuck on
>>>>>>> wait_for_completion() in synchronize_rcu()?
>>>>>>>
>>>>>>>
>>>>>> Yes, it's in wait_for_completion() in synchronize_rcu().
>>>>>> As noted in some previous mail, it will wake up after
>>>>>> event - key press etc.
>>>>>>
>>>>>> Patch in http://lkml.org/lkml/2007/3/7/255 solves different problem.
>>>>>> I added it to my quilt and applied anyway -> no change.
>>>>>>
>>>>>>
>>>>> Does the problem go away if NO_HZ is unset?
>>>>>
>>>>>
>>>> i tried to boot with nohz=off, but the problem did persist.
>>>>
>>> Hmmmm, both variants (nohz=off or recompiled kernel without NO_HZ) works for me.
>>>
>> Definitely something strange is going on here.
>>
>> I think we need an advice from someone who knows the RCU internals.
>>
>
> RCU synchronization depends on the timer interrupt. Which kernel version
> are you guys talking about ?
>
> tglx
>
I am talking about commit be521466feb3bb1cd89de82a2b1d080e9ebd3cb6 (2.6.21-rc3+).
The worst config is with nohz and without hrtimers: the kernel doesn't
even come back after pressing the power key.
For now I am staying with the config without nohz and without hrtimers,
because suspend to disk works there.
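
The four combinations reported so far in this thread can be written
down as a small table. This is only a sketch that restates the results
above; the variable names are made up for illustration, and the keys
are (hrtimers enabled, nohz enabled).

```python
# Results as reported in this thread for commit
# be521466feb3bb1cd89de82a2b1d080e9ebd3cb6 (2.6.21-rc3+).
# Key: (hrtimers_enabled, nohz_enabled) -> reported outcome.
results = {
    (False, False): "works",
    (True, False): "not working",
    (True, True): "not working",
    (False, True): "not working, no recovery via power key",
}

# Collect the combinations where suspend to disk was reported to work.
working = sorted(cfg for cfg, outcome in results.items() if outcome == "works")

for (hrtimers, nohz), outcome in sorted(results.items()):
    print(f"hrtimers={'on' if hrtimers else 'off':<3} "
          f"nohz={'on' if nohz else 'off':<3} -> {outcome}")
```

Only the combination with both options disabled appears in the working
list, which matches the config I am staying with for now.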
Thread overview: 19+ messages
2007-03-11 18:08 SwSusp to disk doesn't work - Try 2 Thomas Meyer
2007-03-11 18:26 ` Rafael J. Wysocki
2007-03-11 18:37 ` Thomas Meyer
2007-03-11 19:27 ` Rafael J. Wysocki
2007-03-11 19:04 ` Milan Broz
2007-03-11 19:16 ` Thomas Meyer
2007-03-11 19:50 ` Rafael J. Wysocki
2007-03-11 20:04 ` Thomas Meyer
2007-03-11 19:38 ` Rafael J. Wysocki
2007-03-11 20:23 ` Milan Broz
2007-03-11 20:32 ` Rafael J. Wysocki
2007-03-11 20:28 ` Thomas Meyer
2007-03-11 20:45 ` Rafael J. Wysocki
2007-03-11 20:49 ` Thomas Meyer
2007-03-11 20:57 ` Milan Broz
2007-03-11 21:02 ` Thomas Meyer
2007-03-11 21:09 ` Rafael J. Wysocki
2007-03-11 21:56 ` Thomas Gleixner
2007-03-11 21:57 ` Thomas Meyer