From mboxrd@z Thu Jan 1 00:00:00 1970
From: Zhang Rui
Subject: Re: Occasional (too common) suspend problem
Date: Fri, 21 Jan 2011 15:26:53 +0800
Message-ID: <1295594813.1866.747.camel@rui>
References:
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
Return-path:
Received: from mga01.intel.com ([192.55.52.88]:50831 "EHLO mga01.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751745Ab1AUH1V (ORCPT ); Fri, 21 Jan 2011 02:27:21 -0500
In-Reply-To:
Sender: linux-acpi-owner@vger.kernel.org
List-Id: linux-acpi@vger.kernel.org
To: Linus Torvalds
Cc: "Rafael J. Wysocki" , Len Brown , Jeff Chua , ACPI Devel Maling List , Linux-pm mailing list

On Fri, 2011-01-21 at 12:50 +0800, Linus Torvalds wrote:
> So I have one remaining problem on my nasty EeePC problem child
> computer, and this one I cannot bisect simply because it's so flaky.
>
> The exact same kernel may suspend and resume many many times in a row,
> and then I reboot it, and it hangs on the first suspend. Very
> occasionally the machine comes back when I press a key, and resumes
> ok. Most of the time it does not - it's just dead to the world, and
> there are no logs to go by. I even tried pm_trace, and that didn't get
> me anywhere, although I once got a hash match:
>
> hash matches drivers/base/power/main.c:555
>
> which is the last part of a device_resume(), but none of the devices
> matched, so that didn't really give any information at all.
>
> So I have very little to go on.
>
> However, at least one time when it failed and came back (remember:
> very rare), I did get that suspend sequence printouts logged. Here's a
> _good_ suspend:
>
> ...
> [ 79.596367] PM: Saving platform NVS memory
> [ 79.596378] Disabling non-boot CPUs ...
> [ 79.700053] CPU 1 is now offline
> [ 79.700565] PM: Restoring platform NVS memory
> [ 79.700565] Enabling non-boot CPUs ...
> [ 79.700565] Booting Node 0 Processor 1 APIC 0x1
> [ 79.597894] Initializing CPU#1
> ...
>
> and here's the one that failed and then ended up coming back on a keypress:
>
> ...
> [ 54.628375] PM: Saving platform NVS memory
> [ 54.628387] Disabling non-boot CPUs ...
> [ 63.554966] ACPI Exception: AE_BAD_PARAMETER, Returned by Handler
> for [EmbeddedControl] (20110112/evregion-474)
> [ 63.554992] ACPI Error: Method parse/execution failed
> [\_SB_.PCI0.SBRG.EC0_.RCTP] (Node f5c2dea0), AE_BAD_PARAMETER
> (20110112/psparse-536)
> [ 63.555022] ACPI Error: Method parse/execution failed
> [\_TZ_.RTMP] (Node f5c32fa8), AE_BAD_PARAMETER (20110112/psparse-536)
> [ 63.555047] ACPI Error: Method parse/execution failed
> [\_TZ_.TZ00._TMP] (Node f5c34018), AE_BAD_PARAMETER
> (20110112/psparse-536)
> [ 63.555079] Thermal: failed to read out thermal zone 0

is this a 2.6.38-rc1 regression?
can you attach the acpidump output of this machine?

thanks,
rui