From: Clark Williams <williams@redhat.com>
To: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: linux-rt-users <linux-rt-users@vger.kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	LKML <linux-kernel@vger.kernel.org>,
	rostedt@goodmis.org
Subject: Re: Suspend resume problem (WAS Re: [ANNOUNCE] 3.8.10-rt6)
Date: Tue, 30 Apr 2013 16:54:58 -0500
Message-ID: <20130430165458.719d5556@riff.lan>
In-Reply-To: <20130430141824.196bd758@riff.lan>

On Tue, 30 Apr 2013 14:18:24 -0500
Clark Williams <williams@redhat.com> wrote:

> On Tue, 30 Apr 2013 19:09:48 +0200
> Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote:
> 
> > * Clark Williams | 2013-04-29 16:19:25 [-0500]:
> > 
> > >On Mon, 29 Apr 2013 22:12:02 +0200
> > >Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote:
> > >>     - suspend / resume seems to program the timer wrong and waits
> > >>       ages until it continues.
> > >
> > >It has to be something we're doing when we apply RT to v3.8.x, since
> > >v3.8.x suspends/resumes with no issues and I was able to suspend and
> > >resume fine with the 3.6-rt series. 
> > 
> > I think I figured out what is going on, or at least I think I did.
> > 
> > This log snippet is from the resume path (from suspend to mem):
> > 
> > [   15.052115] Enabling non-boot CPUs ...
> > [   15.052115] smpboot: Booting Node 0 Processor 1 APIC 0x1
> > [   14.841378] Initializing CPU#1
> > [   42.840017] [sched_delayed] sched: RT throttling activated
> > [   42.842144] CPU1 is up
> > [   42.842536] smpboot: Booting Node 0 Processor 2 APIC 0x2
> > 
> > Two things happen here:
> > - the time goes backwards from 15.X to 14.X. This is okay because the
> >   14.X is the timestamp from the secondary CPU, which is not yet
> >   synchronized with the boot CPU
> > - the printk with "CPU1 is up" is coming from the boot CPU and,
> >   according to the timestamp, about 28 seconds have passed. But this
> >   did not really happen, as the whole procedure took less time.
> > 
> > The next thing that happens is that RCU assumes nobody is making any
> > progress (for almost 28 seconds) and triggers NMIs & printks to get
> > some attention. I have a trace where
> > - CPU0: arch_trigger_all_cpu_backtrace_handler() => printk()
> >         has "lock" and is spinning for logbuf_lock
> > 
> > - CPU1: print_cpu_stall() => printk() (spinning for the lock) => NMI =>
> >   arch_trigger_all_cpu_backtrace_handler()
> >         it may have logbuf_lock and is spinning for "lock"
> > 
> > I can't tell if CPU1 got the logbuf_lock at this time but it seemed that
> > it made no progress until I ended it.
> > This NMI-related deadlock is a problem which should also trigger on
> > mainline, right?
> > 
> > Now, the time jump on the other hand is the real issue here and is
> > RT-only. It looks like we get a big number of timer updates via
> > tick_do_update_jiffies64() because according to ktime_get() that much
> > time really passed by.
> > 
> > The solution seems as simple as
> > 
> > From c27eb2e0ab0b5acd96a4b62288976f1b72789b3e Mon Sep 17 00:00:00 2001
> > From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> > Date: Tue, 30 Apr 2013 18:53:55 +0200
> > Subject: [PATCH] time/timekeeping: shadow tk->cycle_last together with
> >  clock->cycle_last
> > 
> > Commit ("timekeeping: Store cycle_last value in timekeeper struct as
> > well") introduced a tk-> based cycle_last value which needs to be reset
> > on the resume path as well, or else ktime_get() will think that time
> > increased a lot.
> > 
> > Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> > ---
> >  kernel/time/timekeeping.c |    1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> > index 99f943b..688817f 100644
> > --- a/kernel/time/timekeeping.c
> > +++ b/kernel/time/timekeeping.c
> > @@ -777,6 +777,7 @@ static void timekeeping_resume(void)
> >  	}
> >  	/* re-base the last cycle value */
> >  	tk->clock->cycle_last = tk->clock->read(tk->clock);
> > +	tk->cycle_last = tk->clock->cycle_last;
> >  	tk->ntp_error = 0;
> >  	timekeeping_suspended = 0;
> >  	timekeeping_update(tk, false, true);
> > -- 
> > 1.7.10.4
> > 
> > So Clark, does this patch fix your problem?
> >
> 
> It does seem to! I've got both patches applied right now (your patch to
> vprintk_emit() and the above patch) and it fixes the long delay on my
> lab box. When I get done today (or have a break in the action) I'll try
> it on my laptop to verify. 
> 
> Thanks Sebastian,
> Clark

Tested on my laptop, which now resumes.

Many thanks.

Clark
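
For anyone following along, here is a rough userspace sketch of why the
stale shadow value produces the time jump described in the quoted patch.
It is not kernel code: struct clocksource_sim, struct timekeeper_sim,
sim_read(), sim_delta() and sim_resume() are hypothetical stand-ins, and
the sketch assumes the counter simply keeps advancing across suspend.
It only models the two cycle_last fields and the re-base done in
timekeeping_resume().

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for a clocksource. */
struct clocksource_sim {
	uint64_t counter;	/* free-running hardware counter */
	uint64_t cycle_last;	/* models clock->cycle_last */
};

/* Hypothetical, simplified stand-in for the timekeeper. */
struct timekeeper_sim {
	struct clocksource_sim *clock;
	uint64_t cycle_last;	/* models the shadow copy, tk->cycle_last */
};

/* Read the current counter value. */
static uint64_t sim_read(const struct clocksource_sim *cs)
{
	return cs->counter;
}

/*
 * Cycles elapsed as seen by a reader that uses the shadow copy.
 * If the shadow copy is stale, the result is far too large.
 */
static uint64_t sim_delta(const struct timekeeper_sim *tk)
{
	return sim_read(tk->clock) - tk->cycle_last;
}

/*
 * Re-base on resume. With sync_shadow == 0 only clock->cycle_last is
 * updated (behaviour before the patch); with sync_shadow != 0 the
 * shadow copy is updated too (behaviour with the patch).
 */
static void sim_resume(struct timekeeper_sim *tk, int sync_shadow)
{
	tk->clock->cycle_last = sim_read(tk->clock);
	if (sync_shadow)
		tk->cycle_last = tk->clock->cycle_last;
}
```

If the counter advances by N cycles while suspended, sim_resume(&tk, 0)
leaves sim_delta(&tk) at N, so a reader believes that much time passed
since the last update. After sim_resume(&tk, 1) the delta is 0, which is
what the one-line patch achieves for tk->cycle_last.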
