Re: Kernel stops at "PM: Preparing system for mem sleep", never makes it to "Freezing user space processes ... "

public inbox for linux-acpi@vger.kernel.org

* Re: Kernel stops at "PM: Preparing system for mem sleep", never makes it to "Freezing user space processes ... "
       [not found]   ` <CACfp433snentBjXMaoLLnemD82-xRCZge9aCCRd1Ao=7yhG=RQ@mail.gmail.com>
@ 2012-08-09  9:41     ` Rafael J. Wysocki
  2012-08-09 17:14       ` Athlion
  0 siblings, 1 reply; 13+ messages in thread
From: Rafael J. Wysocki @ 2012-08-09  9:41 UTC (permalink / raw)
  To: Athlion; +Cc: linux-pm, ACPI Devel Mailing List

On Thursday, August 09, 2012, Athlion wrote:
> Thanks for the swift reply!
> 
> If I choose processors, I get (among other normal things) this:

What happens if you choose "core"?

> [  305.156134] suspend debug: Waiting for 5 seconds.
> [  310.149418] Enabling non-boot CPUs ...
> [  310.152933] Booting Node 0 Processor 1 APIC 0x1
> [  310.163948] Disabled fast string operations
> [  310.166366] ACPI Exception: AE_BAD_PARAMETER, Returned by Handler
> for [EmbeddedControl] (20120320/evregion-501)
> [  310.166369] ACPI Error: Method parse/execution failed
> [\_SB_.PCI0.LPC_.EC__.LPMD] (Node ffff88011824dc58), AE_BAD_PARAMETER
> (20120320/psparse-536)
> [  310.166374] ACPI Error: Method parse/execution failed
> [\_PR_.CPU0._PPC] (Node ffff88011827d118), AE_BAD_PARAMETER
> (20120320/psparse-536)
> [  310.166377] ACPI Error: Method parse/execution failed
> [\_PR_.CPU1._PPC] (Node ffff88011827d938), AE_BAD_PARAMETER
> (20120320/psparse-536)
> [  310.166382] ACPI Exception: AE_BAD_PARAMETER, Evaluating _PPC
> (20120320/processor_perflib-140)
> [  310.166393] CPU1 is up
> [  310.166455] Booting Node 0 Processor 2 APIC 0x2
> [  310.177563] Disabled fast string operations
> [  310.180057] ACPI Exception: AE_BAD_PARAMETER, Returned by Handler
> for [EmbeddedControl] (20120320/evregion-501)
> [  310.180061] ACPI Error: Method parse/execution failed
> [\_SB_.PCI0.LPC_.EC__.LPMD] (Node ffff88011824dc58), AE_BAD_PARAMETER
> (20120320/psparse-536)
> [  310.180066] ACPI Error: Method parse/execution failed
> [\_PR_.CPU0._PPC] (Node ffff88011827d118), AE_BAD_PARAMETER
> (20120320/psparse-536)
> [  310.180069] ACPI Error: Method parse/execution failed
> [\_PR_.CPU2._PPC] (Node ffff88011827daf0), AE_BAD_PARAMETER
> (20120320/psparse-536)
> [  310.180074] ACPI Exception: AE_BAD_PARAMETER, Evaluating _PPC
> (20120320/processor_perflib-140)
> [  310.180116] CPU2 is up
> [  310.180350] Booting Node 0 Processor 3 APIC 0x3
> [  310.191357] Disabled fast string operations
> [  310.193805] ACPI Exception: AE_BAD_PARAMETER, Returned by Handler
> for [EmbeddedControl] (20120320/evregion-501)
> [  310.193809] ACPI Error: Method parse/execution failed
> [\_SB_.PCI0.LPC_.EC__.LPMD] (Node ffff88011824dc58), AE_BAD_PARAMETER
> (20120320/psparse-536)
> [  310.193814] ACPI Error: Method parse/execution failed
> [\_PR_.CPU0._PPC] (Node ffff88011827d118), AE_BAD_PARAMETER
> (20120320/psparse-536)
> [  310.193817] ACPI Error: Method parse/execution failed
> [\_PR_.CPU3._PPC] (Node ffff88011827da50), AE_BAD_PARAMETER
> (20120320/psparse-536)
> [  310.193821] ACPI Exception: AE_BAD_PARAMETER, Evaluating _PPC
> (20120320/processor_perflib-140)
> [  310.193863] CPU3 is up
> [  310.196882] ACPI: Waking up from system sleep state S3
> [  310.444627] i915 0000:00:02.0: power state changed by ACPI to D0
> 
> I do not get these ACPI Error and ACPI Exceptions in a normal run. Is
> this normal?

No, it is not.

> Does it indicate a problem?

A small one.  It indicates that P-states as defined in the BIOS aren't
usable to the ACPI cpufreq driver.  There also seems to be a problem with
the handling of the embedded controller, which may be more serious.

> The dmesg of the pm-suspend after I have
> # echo "processors" > /sys/power/pm_test
> can be found here: https://dl.dropbox.com/u/63420/processors.txt
> 
> The dmesg of a full pm-suspend run is here
> https://dl.dropbox.com/u/63420/full.txt

That seems to be a clean suspend/resume cycle.  What's the problem, then?

Rafael
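
The pm_test runs discussed above can be scripted. What follows is only a
minimal sketch, not something posted in this thread: the function names and
the POWER_DIR variable are made up for illustration, it assumes a kernel
built with PM debugging so that /sys/power/pm_test exists, and it has to run
as root. With a test level selected, writing "mem" to /sys/power/state goes
through the suspend steps up to that level, waits 5 seconds, and resumes.

```shell
# Hypothetical helper names; POWER_DIR defaults to the real sysfs directory
# but can be pointed at a scratch directory for a dry run.
POWER_DIR="${POWER_DIR:-/sys/power}"

# Select a pm_test level; reject anything outside the documented set.
pm_test_set() {
    case "$1" in
        none|core|processors|platform|devices|freezer) ;;
        *) echo "unknown pm_test level: $1" >&2; return 1 ;;
    esac
    printf '%s\n' "$1" > "$POWER_DIR/pm_test"
}

# Run one test suspend at the given level, then switch test mode off again.
pm_test_cycle() {
    pm_test_set "$1" || return 1
    printf 'mem\n' > "$POWER_DIR/state"   # the write returns after the test cycle
    pm_test_set none
}
```

For example, "pm_test_cycle core" followed by "dmesg | tail" should show the
same kind of output as quoted above.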

* Re: Kernel stops at "PM: Preparing system for mem sleep", never makes it to "Freezing user space processes ... "
  2012-08-09  9:41     ` Kernel stops at "PM: Preparing system for mem sleep", never makes it to "Freezing user space processes ... " Rafael J. Wysocki
@ 2012-08-09 17:14       ` Athlion
  2012-08-11 15:18         ` Athlion
  0 siblings, 1 reply; 13+ messages in thread
From: Athlion @ 2012-08-09 17:14 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: linux-pm, ACPI Devel Mailing List

On Thu, Aug 9, 2012 at 12:41 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> On Thursday, August 09, 2012, Athlion wrote:
>> Thanks for the swift reply!
>>
>> If I choose processors, I get (among other normal things) this:
>
> What happens if you choose "core"?

Both core and processors always exhibit the same ACPI errors.

>> I do not get these ACPI Error and ACPI Exceptions in a normal run. Is
>> this normal?
>
> No, it is not.
>
>> Does it indicate a problem?
>
> A small one.  It indicates that P-states as defined in the BIOS aren't
> usable to the ACPI cpufreq driver.  There also seems to be a problem with
> the handling of the embedded controller, which may be more serious.

Should I report this to the ACPI list?

>> The dmesg of the pm-suspend after I have
>> # echo "processors" > /sys/power/pm_test
>> can be found here: https://dl.dropbox.com/u/63420/processors.txt
>>
>> The dmesg of a full pm-suspend run is here
>> https://dl.dropbox.com/u/63420/full.txt
>
> That seems to be a clean suspend/resume cycle.  What's the problem, then?

Yes, I'm sorry if it wasn't clear. Since that was a fresh reboot, the
suspend went OK. I posted it because I thought that these ACPI errors
were directly relevant to the problem I'm having. I am keeping track
of all this info so that I have it when the problem happens again.
My computer hard locked (this hasn't happened before!) when I opened
the lid just 5 minutes ago, and the problem shows up at about 24 hours
of uptime, so I will probably have all the info collected by tomorrow.
Sorry for the misunderstanding and thanks for the interest!

* Re: Kernel stops at "PM: Preparing system for mem sleep", never makes it to "Freezing user space processes ... "
  2012-08-09 17:14       ` Athlion
@ 2012-08-11 15:18         ` Athlion
  2012-08-11 22:08           ` Rafael J. Wysocki
  0 siblings, 1 reply; 13+ messages in thread
From: Athlion @ 2012-08-11 15:18 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: linux-pm, ACPI Devel Mailing List

Ok, this thing happened again! What's more, it happened on pm-suspend
while on [core].

The kernel output this:
Aug 11 17:59:16 localhost logger: LID closed
Aug 11 17:59:16 localhost kernel: [32512.391116] EXT4-fs (sda1):
re-mounted. Opts: discard,barrier=0,commit=0
Aug 11 17:59:17 localhost kernel: [32512.520905] wlan0:
deauthenticating from 00:13:33:a1:23:46 by local choice (reason=3)
Aug 11 17:59:17 localhost kernel: [32512.530786] cfg80211: Calling
CRDA to update world regulatory domain
Aug 11 17:59:17 localhost NetworkManager[794]: <info> (wlan0): now unmanaged
Aug 11 17:59:17 localhost NetworkManager[794]: <info> (wlan0): device
state change: activated -> unmanaged (reason 'removed') [100 10 36]
Aug 11 17:59:17 localhost NetworkManager[794]: <info> (wlan0):
deactivating device (reason 'removed') [36]
Aug 11 17:59:17 localhost dhcpcd[12635]: received SIGTERM, stopping
Aug 11 17:59:17 localhost dhcpcd[12635]: wlan0: removing interface
Aug 11 17:59:17 localhost dhcpcd[12635]: wlan0: del_route: No such device
Aug 11 17:59:17 localhost dhcpcd[12635]: del_address: No such device
Aug 11 17:59:17 localhost kernel: [32512.607181] ehci_hcd
0000:00:1d.0: remove, state 1
Aug 11 17:59:17 localhost kernel: [32512.607212] usb usb2: USB
disconnect, device number 1
Aug 11 17:59:17 localhost kernel: [32512.607214] usb 2-1: USB
disconnect, device number 2
Aug 11 17:59:17 localhost kernel: [32512.612488] ehci_hcd
0000:00:1d.0: USB bus 2 deregistered
Aug 11 17:59:17 localhost kernel: [32512.612528] ehci_hcd
0000:00:1a.0: remove, state 1
Aug 11 17:59:17 localhost kernel: [32512.612533] usb usb1: USB
disconnect, device number 1
Aug 11 17:59:17 localhost kernel: [32512.612534] usb 1-1: USB
disconnect, device number 2
Aug 11 17:59:17 localhost kernel: [32512.617226] ehci_hcd
0000:00:1a.0: USB bus 1 deregistered
Aug 11 17:59:17 localhost NetworkManager[794]: <info> (wlan0):
canceled DHCP transaction, DHCP client pid 12635
Aug 11 17:59:17 localhost NetworkManager[794]: <warn> (44) failed to
find interface name for index
Aug 11 17:59:17 localhost NetworkManager[794]:
nm_system_iface_flush_routes: assertion `iface != NULL' failed
Aug 11 17:59:17 localhost NetworkManager[794]: <warn> (44) failed to
find interface name for index
Aug 11 17:59:17 localhost NetworkManager[794]: <info> (wlan0): cleaning up...
Aug 11 17:59:17 localhost NetworkManager[794]: <warn> (44) failed to
find interface name for index
Aug 11 17:59:17 localhost NetworkManager[794]:
(nm-system.c:685):nm_system_iface_get_flags: runtime check failed:
(iface != NULL)
Aug 11 17:59:17 localhost NetworkManager[794]: <error>
[1344697157.261251] [nm-system.c:687] nm_system_iface_get_flags():
(unknown): failed to get interface link object
Aug 11 17:59:17 localhost dbus[406]: [system] Activating service
name='org.freedesktop.nm_dispatcher' (using servicehelper)
Aug 11 17:59:17 localhost NetworkManager[794]: <info> radio killswitch
/sys/devices/pci0000:00/0000:00:1c.1/0000:03:00.0/ieee80211/phy33/rfkill41
disappeared
Aug 11 17:59:17 localhost NetworkManager[794]: <warn> (pid 12635)
unhandled DHCP event for interface wlan0
Aug 11 17:59:17 localhost dbus[406]: [system] Successfully activated
service 'org.freedesktop.nm_dispatcher'
Aug 11 17:59:17 localhost kernel: [32512.842664] PM: Syncing
filesystems ... done.
Aug 11 17:59:17 localhost kernel: [32512.844894] PM: Preparing system
for mem sleep
Aug 11 17:59:17 localhost NetworkManager[794]: <warn> error requesting
auth for org.freedesktop.NetworkManager.wifi.share.protected: (3)
GDBus.Error:org.freedesktop.DBus.Error.NameHasNoOwner:
GDBus.Error:org.freedesktop.DBus.Error.NameHasNoOwner: Could not get
UID of name ':1.154': no such name
Aug 11 17:59:17 localhost NetworkManager[794]: <warn> error requesting
auth for org.freedesktop.NetworkManager.wifi.share.open: (3)
GDBus.Error:org.freedesktop.DBus.Error.NameHasNoOwner:
GDBus.Error:org.freedesktop.DBus.Error.NameHasNoOwner: Could not get
UID of name ':1.154': no such name
Aug 11 14:59:56 localhost rtkit-daemon[1885]: Successfully made thread
13922 of process 13922 (/usr/bin/pulseaudio) owned by '1000' high
priority at nice level -11.
Aug 11 14:59:56 localhost rtkit-daemon[1885]: Supervising 1 threads of
1 processes of 1 users.
Aug 11 17:59:56 localhost pulseaudio[13922]: [pulseaudio] pid.c: Stale
PID file, overwriting.
Aug 11 17:59:56 localhost dbus[406]: [system] Activating service
name='org.bluez' (using servicehelper)
Aug 11 17:59:56 localhost dbus[406]: [system] Activated service
'org.bluez' failed: Launch helper exited with unknown return code 1
Aug 11 17:59:56 localhost pulseaudio[13922]: [pulseaudio]
bluetooth-util.c: org.bluez.Manager.ListAdapters() failed:
org.freedesktop.DBus.Error.Spawn.ChildExited: Launch helper exited
with unknown return code 1
Aug 11 14:59:57 localhost rtkit-daemon[1885]: Successfully made thread
13987 of process 13987 (/usr/bin/pulseaudio) owned by '1000' high
priority at nice level -11.
Aug 11 14:59:57 localhost rtkit-daemon[1885]: Supervising 2 threads of
2 processes of 1 users.
Aug 11 17:59:57 localhost pulseaudio[13987]: [pulseaudio] pid.c:
Daemon already running.
Aug 11 18:01:01 localhost /USR/SBIN/CROND[14152]: (root) CMD
(run-parts /etc/cron.hourly)

and then nothing more. What's more, X crashed and no more ACPI events
got registered.
I am really clueless regarding this and have nowhere to look. One hint
would be that after I killed and restarted my X server at 12 hours of
uptime, the problem resurfaced at about 33 hours of uptime, so it might
be something with X, but I don't really know how this could affect the
kernel...

* Re: Kernel stops at "PM: Preparing system for mem sleep", never makes it to "Freezing user space processes ... "
  2012-08-11 15:18         ` Athlion
@ 2012-08-11 22:08           ` Rafael J. Wysocki
  2012-08-12 11:01             ` Athlion
  0 siblings, 1 reply; 13+ messages in thread
From: Rafael J. Wysocki @ 2012-08-11 22:08 UTC (permalink / raw)
  To: Athlion; +Cc: linux-pm, ACPI Devel Mailing List

On Saturday, August 11, 2012, Athlion wrote:
> Ok, this thing happened again! What's more, it happened on pm-suspend
> while on [core].
> 
> The kernel output this:

The majority of the messages below are not from the kernel.  The
kernel's messages contain the string "kernel:".
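
A quick way to separate the two, as a sketch only: the helper names below are
hypothetical, and where the syslog file lives depends on the system's logging
setup, so the functions simply read whatever file is passed in (or stdin).
In the log file itself each record is a single line; the wrapping seen in
this mail comes from the mail client.

```shell
# Print only the kernel's own records from a syslog-style file or stdin.
kernel_lines() {
    grep -F ' kernel: ' "$@"
}

# Print the last kernel record, i.e. how far the kernel got.
last_kernel_line() {
    kernel_lines "$@" | tail -n 1
}
```

Running last_kernel_line on the log after a failed suspend shows the last
step the kernel reported before it stopped.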

> Aug 11 17:59:16 localhost logger: LID closed
> Aug 11 17:59:16 localhost kernel: [32512.391116] EXT4-fs (sda1):
> re-mounted. Opts: discard,barrier=0,commit=0
> Aug 11 17:59:17 localhost kernel: [32512.520905] wlan0:
> deauthenticating from 00:13:33:a1:23:46 by local choice (reason=3)
> Aug 11 17:59:17 localhost kernel: [32512.530786] cfg80211: Calling
> CRDA to update world regulatory domain
> Aug 11 17:59:17 localhost NetworkManager[794]: <info> (wlan0): now unmanaged
> Aug 11 17:59:17 localhost NetworkManager[794]: <info> (wlan0): device
> state change: activated -> unmanaged (reason 'removed') [100 10 36]
> Aug 11 17:59:17 localhost NetworkManager[794]: <info> (wlan0):
> deactivating device (reason 'removed') [36]
> Aug 11 17:59:17 localhost dhcpcd[12635]: received SIGTERM, stopping
> Aug 11 17:59:17 localhost dhcpcd[12635]: wlan0: removing interface
> Aug 11 17:59:17 localhost dhcpcd[12635]: wlan0: del_route: No such device
> Aug 11 17:59:17 localhost dhcpcd[12635]: del_address: No such device
> Aug 11 17:59:17 localhost kernel: [32512.607181] ehci_hcd
> 0000:00:1d.0: remove, state 1
> Aug 11 17:59:17 localhost kernel: [32512.607212] usb usb2: USB
> disconnect, device number 1
> Aug 11 17:59:17 localhost kernel: [32512.607214] usb 2-1: USB
> disconnect, device number 2
> Aug 11 17:59:17 localhost kernel: [32512.612488] ehci_hcd
> 0000:00:1d.0: USB bus 2 deregistered
> Aug 11 17:59:17 localhost kernel: [32512.612528] ehci_hcd
> 0000:00:1a.0: remove, state 1
> Aug 11 17:59:17 localhost kernel: [32512.612533] usb usb1: USB
> disconnect, device number 1
> Aug 11 17:59:17 localhost kernel: [32512.612534] usb 1-1: USB
> disconnect, device number 2
> Aug 11 17:59:17 localhost kernel: [32512.617226] ehci_hcd
> 0000:00:1a.0: USB bus 1 deregistered
> Aug 11 17:59:17 localhost NetworkManager[794]: <info> (wlan0):
> canceled DHCP transaction, DHCP client pid 12635
> Aug 11 17:59:17 localhost NetworkManager[794]: <warn> (44) failed to
> find interface name for index
> Aug 11 17:59:17 localhost NetworkManager[794]:
> nm_system_iface_flush_routes: assertion `iface != NULL' failed
> Aug 11 17:59:17 localhost NetworkManager[794]: <warn> (44) failed to
> find interface name for index
> Aug 11 17:59:17 localhost NetworkManager[794]: <info> (wlan0): cleaning up...
> Aug 11 17:59:17 localhost NetworkManager[794]: <warn> (44) failed to
> find interface name for index
> Aug 11 17:59:17 localhost NetworkManager[794]:
> (nm-system.c:685):nm_system_iface_get_flags: runtime check failed:
> (iface != NULL)
> Aug 11 17:59:17 localhost NetworkManager[794]: <error>
> [1344697157.261251] [nm-system.c:687] nm_system_iface_get_flags():
> (unknown): failed to get interface link object
> Aug 11 17:59:17 localhost dbus[406]: [system] Activating service
> name='org.freedesktop.nm_dispatcher' (using servicehelper)
> Aug 11 17:59:17 localhost NetworkManager[794]: <info> radio killswitch
> /sys/devices/pci0000:00/0000:00:1c.1/0000:03:00.0/ieee80211/phy33/rfkill41
> disappeared
> Aug 11 17:59:17 localhost NetworkManager[794]: <warn> (pid 12635)
> unhandled DHCP event for interface wlan0
> Aug 11 17:59:17 localhost dbus[406]: [system] Successfully activated
> service 'org.freedesktop.nm_dispatcher'
> Aug 11 17:59:17 localhost kernel: [32512.842664] PM: Syncing
> filesystems ... done.
> Aug 11 17:59:17 localhost kernel: [32512.844894] PM: Preparing system
> for mem sleep

This seems to be the last kernel message you've got.

It looks like there's a problem with a power management notifier within
the kernel.  Perhaps a race condition, since it is not reproducible 100%
of the time.

Does it happen if you don't use the lid to trigger suspend?

Rafael

* Re: Kernel stops at "PM: Preparing system for mem sleep", never makes it to "Freezing user space processes ... "
  2012-08-11 22:08           ` Rafael J. Wysocki
@ 2012-08-12 11:01             ` Athlion
  2012-08-12 13:26               ` Athlion
  0 siblings, 1 reply; 13+ messages in thread
From: Athlion @ 2012-08-12 11:01 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: linux-pm, ACPI Devel Mailing List

On Sun, Aug 12, 2012 at 1:08 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> This seems to be the last kernel message you've got.
>
> It looks like there's a problem with a power management notifier within
> the kernel.  Perhaps a race condition, since it is not reproducible 100%
> of the time.
>
> Does it happen if you don't use the lid to trigger suspend?
>
> Rafael

No, it does not.

If I don't use the lid, the suspend succeeds 100% of the time. At
least, I have reached over 4 days of uptime using only the
logout/suspend button of Xfce; I could never stand not closing the lid
for longer than that.

What I don't know exactly is how to begin tracking this problem down.

* Re: Kernel stops at "PM: Preparing system for mem sleep", never makes it to "Freezing user space processes ... "
  2012-08-12 11:01             ` Athlion
@ 2012-08-12 13:26               ` Athlion
  2012-08-12 21:03                 ` Rafael J. Wysocki
  0 siblings, 1 reply; 13+ messages in thread
From: Athlion @ 2012-08-12 13:26 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: linux-pm, ACPI Devel Mailing List

On Sun, Aug 12, 2012 at 2:01 PM, Athlion <athlion@gmail.com> wrote:
> On Sun, Aug 12, 2012 at 1:08 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>> This seems to be the last kernel message you've got.
>>
>> It looks like there's a problem with a power management notifier within
>> the kernel.  Perhaps a race condition, since it is not reproducible 100%
>> of the time.
>>
>> Does it happen if you don't use the lid to trigger suspend?
>>
>> Rafael
>
> No, it does not.
>
> If I don't use the lid, the suspend succeeds 100% of the time. At
> least, I have reached over 4 days of uptime using only the
> logout/suspend button of Xfce; I could never stand not closing the lid
> for longer than that.
>
> What I don't know exactly is how to begin tracking this problem down.

Furthermore, the suspend actually *happens* if I initiate a shutdown
or reboot procedure, right after the point where the system says
killing all processes. On resume, the shutdown/reboot resumes
normally.

* Re: Kernel stops at "PM: Preparing system for mem sleep", never makes it to "Freezing user space processes ... "
  2012-08-12 13:26               ` Athlion
@ 2012-08-12 21:03                 ` Rafael J. Wysocki
  2012-08-13  7:13                   ` Athlion
  0 siblings, 1 reply; 13+ messages in thread
From: Rafael J. Wysocki @ 2012-08-12 21:03 UTC (permalink / raw)
  To: Athlion; +Cc: linux-pm, ACPI Devel Mailing List

On Sunday, August 12, 2012, Athlion wrote:
> On Sun, Aug 12, 2012 at 2:01 PM, Athlion <athlion@gmail.com> wrote:
> > On Sun, Aug 12, 2012 at 1:08 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> >> This seems to be the last kernel message you've got.
> >>
> >> It looks like there's a problem with a power management notifier within
> >> the kernel.  Perhaps a race condition, since it is not reproducible 100%
> >> of the time.
> >>
> >> Does it happen if you don't use the lid to trigger suspend?
> >>
> >> Rafael
> >
> > No, it does not.
> >
> > If I don't use the lid, the suspend succeeds 100% of the time. At
> > least, I have reached over 4 days of uptime using only the
> > logout/suspend button of Xfce; I could never stand not closing the lid
> > for longer than that.
> >
> > What I don't know exactly is how to begin tracking this problem down.
> 
> Furthermore, the suspend actually *happens* if I initiate a shutdown
> or reboot procedure, right after the point where the system says
> killing all processes. On resume, the shutdown/reboot resumes
> normally.

There seems to be an input event handling race condition with system suspend
on your machine.  I wonder if it's related to the specific system configuration,
though, because no one else has reported anything like this before.

I'm not sure what to do to debug this further at the moment.

Please attach dmesg output from a clean boot.

Thanks,
Rafael

* Re: Kernel stops at "PM: Preparing system for mem sleep", never makes it to "Freezing user space processes ... "
  2012-08-12 21:03                 ` Rafael J. Wysocki
@ 2012-08-13  7:13                   ` Athlion
  2012-08-13  7:27                     ` Athlion
  0 siblings, 1 reply; 13+ messages in thread
From: Athlion @ 2012-08-13  7:13 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: linux-pm, ACPI Devel Mailing List

Thanks,

Here is my dmesg from a clean boot:

https://dl.dropbox.com/u/63420/dmesg.txt

Now that I have scanned it more thoroughly, I found these:

[    0.363136] [Firmware Bug]: ACPI: BIOS _OSI(Linux) query ignored
and
[    0.387856] PCI: Using host bridge windows from ACPI; if necessary,
use "pci=nocrs" and report a bug

my /proc/cmdline is:
BOOT_IMAGE=/boot/vmlinuz-linux
root=UUID=44cf687d-4827-4765-8758-98d44a745d07 ro quiet
resume=/dev/sda2

Maybe they indicate a lurking problem?

(in parallel, I will try booting with pci=nocrs and report back)
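
To confirm after the reboot that an option such as pci=nocrs really reached
the kernel, /proc/cmdline can be checked. A small sketch follows;
has_boot_param and CMDLINE_FILE are names invented here for illustration,
not existing tools:

```shell
# Defaults to the running kernel's command line; can be overridden for testing.
CMDLINE_FILE="${CMDLINE_FILE:-/proc/cmdline}"

# Succeed if the parameter is present, either bare ("quiet")
# or with a value ("resume=/dev/sda2").
has_boot_param() {
    for word in $(cat "$CMDLINE_FILE"); do
        case "$word" in
            "$1"|"$1"=*) return 0 ;;
        esac
    done
    return 1
}
```

With the command line shown above, "has_boot_param resume" succeeds and
"has_boot_param pci" fails.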

And there are other people having this issue, some from way back, as
can be seen here
https://bbs.archlinux.org/viewtopic.php?id=144381
(Don't be fooled by the linux 3.4.x reference in the title, it happens
with older kernels, too)

Some of them have found the "solution" to be "never close the lid", but
this is unacceptable for me.

Again, thanks!

* Re: Kernel stops at "PM: Preparing system for mem sleep", never makes it to "Freezing user space processes ... "
  2012-08-13  7:13                   ` Athlion
@ 2012-08-13  7:27                     ` Athlion
  2012-08-16 15:01                       ` Athlion
  0 siblings, 1 reply; 13+ messages in thread
From: Athlion @ 2012-08-13  7:27 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: linux-pm, ACPI Devel Mailing List

And this is the dmesg with pci=nocrs acpi_osi=Linux

https://dl.dropbox.com/u/63420/dmesg2.txt

* Re: Kernel stops at "PM: Preparing system for mem sleep", never makes it to "Freezing user space processes ... "
  2012-08-13  7:27                     ` Athlion
@ 2012-08-16 15:01                       ` Athlion
  2012-08-25 17:31                         ` Athlion
  0 siblings, 1 reply; 13+ messages in thread
From: Athlion @ 2012-08-16 15:01 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: linux-pm, ACPI Devel Mailing List

Some new information, if that is helpful at all.

I have managed to circumvent the problem (I am now at 68 hours uptime
with proper suspend/resume by closing the lid) by killing the X server
every now and then (every 10-12 hours). Anyway, this afternoon, my
battery was drained and the system hibernated. On resume I saw this:

Aug 16 17:45:55 localhost kernel: [28755.912618] Uhhuh. NMI received
for unknown reason 3d on CPU 0.
Aug 16 17:45:55 localhost kernel: [28755.912622] Do you have a strange
power saving mode enabled?
Aug 16 17:45:55 localhost kernel: [28755.912623] Dazed and confused,
but trying to continue

Is this maybe related to my problem?

Thanks!

* Re: Kernel stops at "PM: Preparing system for mem sleep", never makes it to "Freezing user space processes ... "
  2012-08-16 15:01                       ` Athlion
@ 2012-08-25 17:31                         ` Athlion
  2012-08-27  7:28                           ` Rafael J. Wysocki
  0 siblings, 1 reply; 13+ messages in thread
From: Athlion @ 2012-08-25 17:31 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: linux-pm, ACPI Devel Mailing List

I have managed to track down where the kernel stops and to generate a sort
of backtrace. The result is this (line numbers are against linux-3.4.9):

drivers/tty/vt/vt_ioctl.c:133: wait_event_interruptible
drivers/tty/vt/vt_ioctl.c:1426: vt_waitactive
kernel/power/console.c:19: vt_move_to_console
kernel/power/suspend.c:98: pm_prepare_console
suspend_prepare called

Execution stops at wait_event_interruptible. Any ideas why this might be?
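
A minimal sketch, assuming the function names in the trace mean what they
suggest: the suspend path switches to another virtual terminal and then sleeps
until that switch completes. The Python below is a rough user-space analogue of
those two steps. It assumes a Linux text console at /dev/tty0, root privileges,
and the VT_ACTIVATE / VT_WAITACTIVE request numbers from <linux/vt.h>. The name
switch_and_wait, the timeout, and the injectable ioctl parameter are
hypothetical and exist only for illustration. If a plain VT switch also times
out on the affected machine, that might hint that the owner of the current VT
(for example the X server) is not releasing it, but that is a guess, not a
diagnosis.

```python
import fcntl
import os
import threading

# ioctl request numbers from <linux/vt.h>
VT_ACTIVATE = 0x5606    # ask the kernel to switch to VT n
VT_WAITACTIVE = 0x5607  # block until VT n is the foreground console


def switch_and_wait(vt, timeout_s=5.0, console="/dev/tty0", ioctl=fcntl.ioctl):
    """Request a switch to `vt` and wait for it to complete.

    Returns True if the switch finished within `timeout_s` seconds and
    False if it was still pending. `ioctl` is injectable (hypothetical
    parameter) so the timeout logic can be exercised without a console.
    """
    fd = os.open(console, os.O_RDWR | os.O_NOCTTY)  # usually needs root
    done = threading.Event()
    errors = []

    def worker():
        try:
            ioctl(fd, VT_ACTIVATE, vt)
            # This is the call that may sleep forever if the switch
            # never completes.
            ioctl(fd, VT_WAITACTIVE, vt)
        except OSError as exc:
            errors.append(exc)
        finally:
            done.set()

    # Daemon thread: a stuck ioctl must not keep the interpreter alive.
    threading.Thread(target=worker, daemon=True).start()
    finished = done.wait(timeout_s)
    if finished:
        os.close(fd)  # on timeout the worker still uses fd, so leave it open
    if errors:
        raise errors[0]
    return finished
```

Calling switch_and_wait(2) from a root shell asks for tty2 and returns False if
the switch has not completed after five seconds. The VT number 2 is arbitrary.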

Thanks!

On Thu, Aug 16, 2012 at 6:01 PM, Athlion <athlion@gmail.com> wrote:
> Some new information, if that is helpful at all.
>
> I have managed to circumvent the problem (I am now at 68 hours uptime
> with proper suspend/resume by closing the lid) by killing the X server
> every now and then (every 10-12 hours). Anyway, this afternoon, my
> battery was drained and the system hibernated. On resume I saw this:
>
> Aug 16 17:45:55 localhost kernel: [28755.912618] Uhhuh. NMI received
> for unknown reason 3d on CPU 0.
> Aug 16 17:45:55 localhost kernel: [28755.912622] Do you have a strange
> power saving mode enabled?
> Aug 16 17:45:55 localhost kernel: [28755.912623] Dazed and confused,
> but trying to continue
>
> Is this maybe related to my problem?
>
> Thanks!
>
> On Mon, Aug 13, 2012 at 10:27 AM, Athlion <athlion@gmail.com> wrote:
>> And this is the dmesg with pci=nocrs acpi_osi=Linux
>>
>> https://dl.dropbox.com/u/63420/dmesg2.txt
>>
>> On Mon, Aug 13, 2012 at 10:13 AM, Athlion <athlion@gmail.com> wrote:
>>> Thanks,
>>>
>>> Here is my dmesg from a clean boot:
>>>
>>> https://dl.dropbox.com/u/63420/dmesg.txt
>>>
>>> Now that I scanned it more thoroughly I found these:
>>>
>>> [    0.363136] [Firmware Bug]: ACPI: BIOS _OSI(Linux) query ignored
>>> and
>>> [    0.387856] PCI: Using host bridge windows from ACPI; if necessary,
>>> use "pci=nocrs" and report a bug
>>>
>>> my /proc/cmdline is:
>>> BOOT_IMAGE=/boot/vmlinuz-linux
>>> root=UUID=44cf687d-4827-4765-8758-98d44a745d07 ro quiet
>>> resume=/dev/sda2
>>>
>>> maybe they indicate a lurking problem?
>>>
>>> (in parallel, I will try booting with pci=nocrs and report back)
>>>
>>> And there are other people having this issue, some from way back, as
>>> can be seen here
>>> https://bbs.archlinux.org/viewtopic.php?id=144381
>>> (Don't be fooled by the linux 3.4.x reference in the title, it happens
>>> with older kernels, too)
>>>
>>> Some of them have found the "solution" to be "never close the lid", but
>>> this is unacceptable for me.
>>>
>>> Again, thanks!
>>>
>>> On Mon, Aug 13, 2012 at 12:03 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>>>> On Sunday, August 12, 2012, Athlion wrote:
>>>>> On Sun, Aug 12, 2012 at 2:01 PM, Athlion <athlion@gmail.com> wrote:
>>>>> > On Sun, Aug 12, 2012 at 1:08 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>>>>> >> This seems to be the last kernel message you've got.
>>>>> >>
>>>>> >> It looks like there's a problem with a power management notifier within
>>>>> >> the kernel.  Perhaps a race condition, since it is not reproducible 100%
>>>>> >> of the time.
>>>>> >>
>>>>> >> Does it happen if you don't use the lid to trigger suspend?
>>>>> >>
>>>>> >> Rafael
>>>>> >
>>>>> > No, it does not.
>>>>> >
>>>>> > If I don't use the lid, the suspend succeeds 100% of the time (at
>>>>> > least, I have achieved over 4 days of uptime by using the
>>>>> > logout/suspend button of xfce, I never could stand not closing the lid
>>>>> > for more...)
>>>>> >
>>>>> > What I don't know exactly is how to begin tracking this problem down.
>>>>>
>>>>> Furthermore, the suspend actually *happens* if I initiate a shutdown
>>>>> or reboot procedure, right after the point where the system says
>>>>> killing all processes. On resume, the shutdown/reboot resumes
>>>>> normally.
>>>>
>>>> There seems to be an input event handling race condition with system suspend
>>>> on your machine.  I wonder if it's related to the specific system configuration,
>>>> though, because no one else has reported anything like this before.
>>>>
>>>> I'm not sure what to do to debug this further at the moment.
>>>>
>>>> Please attach dmesg output from a clean boot.
>>>>
>>>> Thanks,
>>>> Rafael


* Re: Kernel stops at "PM: Preparing system for mem sleep", never makes it to "Freezing user space processes ... "
  2012-08-25 17:31                         ` Athlion
@ 2012-08-27  7:28                           ` Rafael J. Wysocki
  2012-08-27  7:59                             ` Athlion
  0 siblings, 1 reply; 13+ messages in thread
From: Rafael J. Wysocki @ 2012-08-27  7:28 UTC (permalink / raw)
  To: Athlion; +Cc: linux-pm, ACPI Devel Mailing List

On Saturday, August 25, 2012, Athlion wrote:
> I have managed to track where the kernel stops and generate sort of a backtrace.
> The result is this (line numbers against linux-3.4.9)
> 
> drivers/tty/vt/vt_ioctl.c:133: wait_event_interruptible
> drivers/tty/vt/vt_ioctl.c:1426: vt_waitactive
> kernel/power/console.c:19: vt_move_to_console
> kernel/power/suspend.c:98: pm_prepare_console
> suspend_prepare called
> 
> Execution stops at wait_event_interruptible. Any ideas why this might be?

Not at the moment, but this is an important data point.

Thanks a lot for tracing this down!

Rafael





* Re: Kernel stops at "PM: Preparing system for mem sleep", never makes it to "Freezing user space processes ... "
  2012-08-27  7:28                           ` Rafael J. Wysocki
@ 2012-08-27  7:59                             ` Athlion
  0 siblings, 0 replies; 13+ messages in thread
From: Athlion @ 2012-08-27  7:59 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: linux-pm, ACPI Devel Mailing List

On Mon, Aug 27, 2012 at 10:28 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> On Saturday, August 25, 2012, Athlion wrote:
>> I have managed to track where the kernel stops and generate sort of a backtrace.
>> The result is this (line numbers against linux-3.4.9)
>>
>> drivers/tty/vt/vt_ioctl.c:133: wait_event_interruptible
>> drivers/tty/vt/vt_ioctl.c:1426: vt_waitactive
>> kernel/power/console.c:19: vt_move_to_console
>> kernel/power/suspend.c:98: pm_prepare_console
>> suspend_prepare called
>>
>> Execution stops at wait_event_interruptible. Any ideas why this might be?
>
> Not at the moment, but this is an important data point.
>
> Thanks a lot for tracing this down!
>
> Rafael

Feel free to ask for any additional information you might need!



Thread overview: 13+ messages
     [not found] <CACfp4335KJQ8Zjj0TJNj65Va9K-t_iEHLEGEi8QuWyQj5dVT0g@mail.gmail.com>
     [not found] ` <201208090033.52380.rjw@sisk.pl>
     [not found]   ` <CACfp433snentBjXMaoLLnemD82-xRCZge9aCCRd1Ao=7yhG=RQ@mail.gmail.com>
2012-08-09  9:41     ` Kernel stops at "PM: Preparing system for mem sleep", never makes it to "Freezing user space processes ... " Rafael J. Wysocki
2012-08-09 17:14       ` Athlion
2012-08-11 15:18         ` Athlion
2012-08-11 22:08           ` Rafael J. Wysocki
2012-08-12 11:01             ` Athlion
2012-08-12 13:26               ` Athlion
2012-08-12 21:03                 ` Rafael J. Wysocki
2012-08-13  7:13                   ` Athlion
2012-08-13  7:27                     ` Athlion
2012-08-16 15:01                       ` Athlion
2012-08-25 17:31                         ` Athlion
2012-08-27  7:28                           ` Rafael J. Wysocki
2012-08-27  7:59                             ` Athlion
