From: "Ville Syrjälä" <ville.syrjala@linux.intel.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Feng Tang <feng.79.tang@gmail.com>,
feng.tang@intel.com, "Rafael J. Wysocki" <rafael@kernel.org>,
"Rafael J. Wysocki" <rafael.j.wysocki@intel.com>,
Steven Rostedt <rostedt@goodmis.org>,
Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
linux-arch@vger.kernel.org, Rik van Riel <riel@redhat.com>,
"Srivatsa S. Bhat" <srivatsa@mit.edu>,
Peter Zijlstra <peterz@infradead.org>,
Arjan van de Ven <arjan@linux.intel.com>,
Rusty Russell <rusty@rustcorp.com.au>,
Oleg Nesterov <oleg@redhat.com>, Tejun Heo <tj@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Paul McKenney <paulmck@linux.vnet.ibm.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
Paul Turner <pjt@google.com>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
"Zhang, Rui" <rui.zhang@intel>
Subject: Re: S3 resume regression [1cf4f629d9d2 ("cpu/hotplug: Move online calls to hotplugged cpu")]
Date: Tue, 1 Nov 2016 22:47:37 +0200
Message-ID: <20161101204737.GB4617@intel.com>
In-Reply-To: <alpine.DEB.2.20.1610282049500.5053@nanos>

On Fri, Oct 28, 2016 at 08:58:41PM +0200, Thomas Gleixner wrote:
> On Fri, 28 Oct 2016, Ville Syrjälä wrote:
> > On Thu, Oct 27, 2016 at 10:41:18PM +0200, Thomas Gleixner wrote:
> > > On Thu, 27 Oct 2016, Ville Syrjälä wrote:
> > > > On Thu, Oct 27, 2016 at 09:25:05PM +0200, Thomas Gleixner wrote:
> > > > > So it would be interesting whether that hunk in resume_broadcast() is
> > > > > sufficient.
> > > >
> > > > So far it looks like the answer is yes.
> > > >
> > > > Looks to be about 5 seconds slower than acpi-idle in resuming, but
> > > > I suppose that's not all that surprising ;)
> > >
> > > Well, set it to 1msec then. If that works reliably then we really can do
> > > that unconditionally. There is no harm in firing a useless timer during
> > > resume once.
> >
> > I narrowed down the required timeout, and looks like 25ms is the
> > minimum that works. With 24ms I already started to have failures. So
> > maybe just bump it up by an order of magnitude to 250ms for some
> > safety margin?

I left the thing running for the weekend and it failed 26 out of 16057
times with the 25ms timeout. Looks like it takes ~5 minutes to resume
when it fails, but eventually it does come back.

>
> Sure, but what puzzles me is that we need a timeout that big. What happens
> between broadcast_resume() and broadcast_resume() + 25ms?
>
> IOW, what is the event/resume function which we need to bridge. We should
> really try to track that down.

My hunch would be the SMM trap in the DSDT/SSDT, since that's where
things ended up the last time I was tracing these resume problems. Though I
can't recall if that was just with acpi-idle or if intel_idle landed in
the same spot as well.

I guess I can try to repeat that test tomorrow, or I'll try your function
tracer method if the other thing fails.

>
> You might try to enable function tracing and do a tracing_off() when that
> 25ms timeout fires.
>
> Something like
>
> 	stop_trace = true;
>
> in broadcast_resume() and then in the broadcast timer function:
>
> 	if (stop_trace) {
> 		stop_trace = false;
> 		tracing_off();
> 	}
>
> Then when the machine is up read the trace, compress and upload it
> somewhere or send it in private mail if it's not that big.
>
> Thanks,
>
> tglx
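
For reference, the one-shot "stop tracing when the timer fires" idea quoted
above can be sketched as a small userspace mock-up. This is only an
illustration under assumptions: tracing_off() is stubbed here with a counter
(in the kernel it is the real function declared in <linux/kernel.h>), and
mark_resume() and broadcast_timer_fired() are hypothetical names standing in
for the broadcast resume path and the broadcast timer function.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stub: counts calls instead of stopping the ftrace ring buffer. */
static int tracing_off_calls;
static void tracing_off(void)
{
	tracing_off_calls++;
}

/* Armed on resume, consumed by the first timer expiry afterwards. */
static bool stop_trace;

/* Hypothetical stand-in for the hunk added to the broadcast resume path. */
static void mark_resume(void)
{
	stop_trace = true;
}

/* Hypothetical stand-in for the hunk added to the broadcast timer function. */
static void broadcast_timer_fired(void)
{
	if (stop_trace) {
		stop_trace = false;	/* one-shot: later expiries do nothing */
		tracing_off();
	}
}
```

Clearing the flag before calling tracing_off() means only the first timer
expiry after resume freezes the trace, so the buffer should end at the point
where the timeout fired rather than being cut off by some later event.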
--
Ville Syrjälä
Intel OTC