From: "Ville Syrjälä" <ville.syrjala@linux.intel.com>
To: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>,
Steven Rostedt <rostedt@goodmis.org>,
Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
Thomas Gleixner <tglx@linutronix.de>,
linux-arch@vger.kernel.org, Rik van Riel <riel@redhat.com>,
"Srivatsa S. Bhat" <srivatsa@mit.edu>,
Peter Zijlstra <peterz@infradead.org>,
Arjan van de Ven <arjan@linux.intel.com>,
Rusty Russell <rusty@rustcorp.com.au>,
Oleg Nesterov <oleg@redhat.com>, Tejun Heo <tj@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Paul McKenney <paulmck@linux.vnet.ibm.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
Paul Turner <pjt@google.com>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
"Zhang, Rui" <rui.zhang@intel.com>,
Len Brown <len.brown@intel.com>,
Linux PM <linux-pm@vger.kernel.org>,
Linux ACPI <linux-acpi@vger.kernel.org>
Subject: Re: S3 resume regression [1cf4f629d9d2 ("cpu/hotplug: Move online calls to hotplugged cpu")]
Date: Tue, 31 May 2016 10:26:50 +0300
Message-ID: <20160531072650.GP4329@intel.com>
In-Reply-To: <CAJZ5v0in8YwMBgLd6=KgGVc-f2yVvutOJG_u05OKLK7_EzN0yQ@mail.gmail.com>
On Mon, May 30, 2016 at 10:43:51PM +0200, Rafael J. Wysocki wrote:
> On Thu, May 26, 2016 at 8:32 PM, Ville Syrjälä
> <ville.syrjala@linux.intel.com> wrote:
> > On Wed, May 18, 2016 at 10:24:24AM +0300, Ville Syrjälä wrote:
> >> On Wed, May 18, 2016 at 01:14:42AM +0200, Rafael J. Wysocki wrote:
> >> > On 5/16/2016 9:39 PM, Ville Syrjälä wrote:
> >> > > On Wed, May 11, 2016 at 04:34:06PM +0300, Ville Syrjälä wrote:
> >> > >> On Wed, May 11, 2016 at 08:44:45AM -0400, Steven Rostedt wrote:
> >> > >>> On Wed, 11 May 2016 15:21:16 +0300
> >> > >>> Ville Syrjälä <ville.syrjala@linux.intel.com> wrote:
> >> > >>>
> >> > >>>> Yeah can't get anything from the machine at that point. netconsole
> >> > >>>> didn't help either, and no serial on this machine. And IIRC I've
> >> > >>>> tried ramoops on this thing in the past but unfortunately the memory
> >> > >>>> got cleared on reboot.
> >> > >>>>
> >> > >>> Can you look at the documentation in the kernel code at
> >> > >>>
> >> > >>> Documentation/power/basic-pm-debugging.txt And follow the procedures
> >> > >>> for testing suspend to RAM (although it requires mostly running the
> >> > >>> same tests as for hibernation suspending).
> >> > >>>
> >> > >>> You can also use the tool s2ram for this as well.
> >> > >>>
> >> > >>> See Documentation/power/s2ram.txt
> >> > >>>
> >> > >>> Perhaps this can give us a bit more light onto the problem.
> >> > >>>
> >> > >>> Basically the above does partial suspend and resume, and can pinpoint
> >> > >>> problem areas down to a more select location.
> >> > >> All the pm_test modes work fine. The only difference between them was
> >> > >> that 'platform' required me to manually wake up the machine (hitting a
> >> > >> key was sufficient), whereas the others woke up without help.
> >> > >>
> >> > >> pm_trace gave me
> >> > >> [ 1.306633] Magic number: 0:185:178
> >> > >> [ 1.322880] hash matches ../drivers/base/power/main.c:1070
> >> > >> [ 1.339270] acpi device:0e: hash matches
> >> > >> [ 1.355414] platform: hash matches
> >> > >>
> >> > >> which is the TRACE_SUSPEND in __device_suspend_noirq(), so no help
> >> > >> there.
> >> > >>
> >> > >> I guess I could try to sprinkle more TRACE_RESUMEs around into some
> >> > >> early resume code. If anyone has good ideas where to put them it
> >> > >> might speed things up a bit.
> >> > > So I did a bunch of that and found that it gets stuck somewhere
> >> > > around executing the _WAK method:
> >> > > platform_resume_noirq
> >> > > acpi_pm_finish
> >> > > acpi_leave_sleep_state
> >> > > acpi_hw_sleep_dispatch
> >> > > acpi_hw_legacy_wake
> >> > > acpi_hw_execute_sleep_method
> >> > > acpi_evaluate_object
> >> > > acpi_ns_evaluate
> >> > > acpi_ps_execute_method
> >> > > acpi_ps_parse_aml
> >> > >
> >> > > It also seems that adding a few TRACE_RESUME()s or an msleep() right
> >> > > after enable_nonboot_cpus() can avoid the hang, sometimes.
> >> > >
> >> > > I've attached the DSDT in case anyone is interested in looking at it.
> >> > >
> >> >
> >> > What if you comment out the execution of _WAK (line 318 of
> >> > drivers/acpi/acpica/hwsleep.c in 4.6)? Does that make any difference?
> >>
> >> Indeed it does. Tried with acpi_idle and intel_idle, and both appear to
> >> resume just fine with that hack.
> >>
> >> - acpi_hw_execute_sleep_method(METHOD_PATHNAME__WAK, sleep_state);
> >> + //acpi_hw_execute_sleep_method(METHOD_PATHNAME__WAK, sleep_state);
> >> + printk(KERN_CRIT "skipping _WAK\n");
> >
> > Continuing with my detective work a bit, I decided to hack the DSDT a
> > bit to see if I can narrow it down further, and looks like I found
> > it on the first guess. The following change stops it from hanging.
> >
> > @@ -5056,7 +5056,7 @@
> > If (LEqual (Arg0, 0x03))
> > {
> > Store (0x01, \SPNF)
> > - TRAP (0x46)
> > + //TRAP (0x46)
> > P8XH (0x00, 0x03)
> > }
> >
> > So what does that do? Let's see:
> >
> > OperationRegion (IO_T, SystemIO, 0x0800, 0x10)
> > Field (IO_T, ByteAcc, NoLock, Preserve)
> > {
> > Offset (0x08),
> > TRP0, 8
> > }
> >
> > OperationRegion (GNVS, SystemMemory, 0x3F5E0C7C, 0x0200)
> > Field (GNVS, AnyAcc, Lock, Preserve)
> > {
> > OSYS, 16,
> > SMIF, 8,
> > ...
> >
> > Method (TRAP, 1, Serialized)
> > {
> > Store (Arg0, SMIF) /* \SMIF */
> > Store (0x00, TRP0) /* \TRP0 */
> > Return (SMIF) /* \SMIF */
> > }
> >
> > and a dump of the IOTR registers shows:
> >
> > 0x1e80: 0x0000fe01
> > 0x1e84: 0x00020001
> > 0x1e98: 0x000c0801
> > 0x1e9c: 0x000200f0
> >
> > which seems to be telling me that ports 0x800-0x80f and
> > 0xfe00-0xfe03 would trigger an SMI.
>
> Well, the name of the method kind of suggests that it triggers an SMM trap. :-)
Which is why I wanted to confirm that by looking at the IOTR regs ;)
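
For anyone who wants to double-check the decode, here is a rough sketch.
It assumes the IOTR low-dword layout as I read it from the ICH datasheets
(bit 0 = trap and SMI enable, bits 15:2 = dword-aligned I/O address,
bits 23:18 = address mask for address bits 7:2). The function name and the
layout are my assumptions, nothing here comes from the kernel or the DSDT.

```python
# Sketch only: the bit layout below is assumed from the ICH I/O trap
# register description, not taken from the kernel or the DSDT.

def decode_iotr_low(low):
    """Decode the low dword of an IOTR register.

    Returns (enabled, first_port, last_port).
    """
    enabled = bool(low & 0x1)          # bit 0: trap and SMI enable
    base = low & 0xFFFC                # bits 15:2: dword-aligned I/O address
    mask = ((low >> 18) & 0x3F) << 2   # bits 23:18: ignored address bits 7:2
    first = base & ~mask
    last = base | mask | 0x3           # a trap always covers a whole dword
    return enabled, first, last

# Low dwords from the dump at 0x1e80 and 0x1e98.
IOTR_LOW = {0x1E80: 0x0000FE01, 0x1E98: 0x000C0801}

# TRP0 sits at offset 0x08 of the IO_T region, which starts at 0x0800.
TRP0_PORT = 0x0800 + 0x08

for reg, low in IOTR_LOW.items():
    enabled, first, last = decode_iotr_low(low)
    hit = enabled and first <= TRP0_PORT <= last
    print(f"{reg:#06x}: enabled={enabled} "
          f"ports {first:#06x}-{last:#06x} TRP0 hit={hit}")
```

If that layout is right, the store to TRP0 (port 0x808) lands inside the
0x800-0x80f window, which would be why TRAP() ends up in SMM.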
>
> > So the next question is how do the idle drivers and cpu hotplug
> > fit into this picture. Do we need to force the second HT into
> > a specific C state before the SMI or something?
>
> Or you can ask why exactly someone put that SMM trap into _WAK.
>
> Apparently, it was regarded as necessary or no one would have
> bothered. The only reason I can see why it might be regarded as
> necessary was that Windows did something Linux doesn't do on that
> platform, or, which to me is far more interesting, that Windows didn't
> do something actually done by Linux.
>
> My theory would be that Windows didn't reinitialize the second HT
> properly during resume and the trap was added to let SMM do that. If
> that's the case, the trap may trigger by the time the second HT
> already executes code in Linux and then it will mess up with it and
> crash.
>
> Now, what do idle states have to do with that? IIRC, Windows puts
> nonboot CPUs into idle states before suspend, so the SMM code
> triggered by the trap may make assumptions about the CPU being in such
> a state or similar.
BTW I also tried to move the enable_nonboot_cpus() after _WAK, and I
tried to boot with nosmp, but neither trick helped. If someone could
throw some patches my way to force things into a specific state
before suspend/_WAK I'd be happy to test them out.
--
Ville Syrjälä
Intel OTC