public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: Steven Rostedt <srostedt@redhat.com>
Cc: dsaxena@plexity.net, Ingo Molnar <mingo@elte.hu>,
	Thomas Gleixner <tglx@linutronix.de>,
	linux-kernel@vger.kernel.org, "Rafael J. Wysocki" <rjw@sisk.pl>,
	Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Subject: Re: TSC not updating after resume: Bug or Feature?
Date: Thu, 18 Dec 2008 10:24:39 +0100	[thread overview]
Message-ID: <1229592279.9487.133.camel@twins> (raw)
In-Reply-To: <1229536210.30177.21.camel@localhost.localdomain>

On Wed, 2008-12-17 at 12:50 -0500, Steven Rostedt wrote:
> I added Peter Zijlstra too, since he works on the cpu clock as well.

Thanks Steve..

> On Wed, 2008-12-17 at 09:27 -0800, Deepak Saxena wrote:
> > Hi,
> > 
> > I am using ftrace to get a detailed timing analysis of the resume process 
> > on the OLPC XO laptop and am seeing that when we start running again, the
> > system timestamep is not being updated for several (hundreds of) thousands 
> > of cycles (~2000 ftrace entries). From following the ftrace, what is 
> > happening is that the clocksource is not updated until we run the cpu_idle() 
> > thread due to an explicit scheduling operation in the resume path that occurs 
> > via call to msleep from pci_set_power_state().  As I'm still fully groking the 
> > timekeeping code, the question(s) I have is whether this is expected behaviour 
> > and I should not assume valid timestamp data in the initial bits of the suspend path, 
> > whether this is an OLPC-specific bug, or whether I've uncovered a generic bug in 
> > the timekeeping implementation.  This is on 2.6.27.7 as I've not gotten 28-rc 
> > up and running on the XO but can move that up in priority if this behaviour
> > is different in newer kernels.

Right, so cpu_clock() relies on gtod and tsc to generate a timestamp. We
use a tick_raw (tsc) and tick_gtod, we then use tick_raw to compute a
delta in tsc and add that to tick_gtod and clip the stuff to a jiffy
window and filter backward motion.

On resume I imagine the TSC has a totally unrelated value to the TSC
before suspend, this would mean the clock would quickly end up either at
whatever place the backward motion filter saw last (new TSC < old TSC)
or at the jiffy window clip (new TSC > old TSC).

Time will only start running correctly again once the jiffy tick starts.

Then there is the acpi idle code and NOHZ stuff, that call the
sched_clock_tick() by hand in order to re-sync time (because an unknown
quantity of time passed and gtod has all the machinery of figuring that
out).

So yes, this is expected behaviour,.. OTOH I really didn't think much
about code relying on this timesource _that_ early and such.

Rafael, would something like this explain why we had to revert Shaggy's
patch? His patch fixes the backward motion filter and I'm at an utter
loss why that would break suspend.

> > 
> > # dsaxena: Return from firmware
> >  do_suspend_trac-1267  [00]   154.925216: restore_processor_state <-ret_point
> >  do_suspend_trac-1267  [00]   154.925216: enable_sep_cpu <-restore_processor_state
> >  do_suspend_trac-1267  [00]   154.925216: olpc_fixup_wakeup <-olpc_pm_enter
> >  do_suspend_trac-1267  [00]   154.925216: geode_get_dev_base <-olpc_fixup_wakeup
> >  do_suspend_trac-1267  [00]   154.925216: olpc_ec_cmd <-olpc_fixup_wakeup
> >  do_suspend_trac-1267  [00]   154.925216: _spin_lock_irqsave <-olpc_ec_cmd
> >  do_suspend_trac-1267  [00]   154.925216: __wait_on_ibf <-olpc_ec_cmd
> >  do_suspend_trac-1267  [00]   154.925216: printk <-olpc_ec_cmd
> >  do_suspend_trac-1267  [00]   154.925216: vprintk <-printk
> >  do_suspend_trac-1267  [00]   154.925216: _spin_lock <-vprintk
> >  do_suspend_trac-1267  [00]   154.925216: emit_log_char <-vprintk
> >  do_suspend_trac-1267  [00]   154.925216: emit_log_char <-vprintk
> > ....
> > # pci_set_power_state() calls msleep(10) as per PCI spec
> >  do_suspend_trac-1267  [00]   154.925216: msleep <-pci_set_power_state
> >  do_suspend_trac-1267  [00]   154.925216: msecs_to_jiffies <-msleep
> >  do_suspend_trac-1267  [00]   154.925216: schedule_timeout_uninterruptible <-msleep
> >  do_suspend_trac-1267  [00]   154.925216: schedule_timeout <-schedule_timeout_uninterruptible
> >  do_suspend_trac-1267  [00]   154.925216: init_timer_on_stack <-schedule_timeout
> >  do_suspend_trac-1267  [00]   154.925216: __init_timer <-init_timer_on_stack
> >  do_suspend_trac-1267  [00]   154.925216: __mod_timer <-schedule_timeout
> >  do_suspend_trac-1267  [00]   154.925216: __timer_stats_timer_set_start_info <-__mod_timer
> >  do_suspend_trac-1267  [00]   154.925216: lock_timer_base <-__mod_timer
> >  do_suspend_trac-1267  [00]   154.925216: _spin_lock_irqsave <-lock_timer_base
> >  do_suspend_trac-1267  [00]   154.925216: internal_add_timer <-__mod_timer
> >  do_suspend_trac-1267  [00]   154.925216: _spin_unlock_irqrestore <-__mod_timer
> >  do_suspend_trac-1267  [00]   154.925216: _spin_lock <-schedule
> >  do_suspend_trac-1267  [00]   154.925216: marker_probe_cb <-schedule
> >  do_suspend_trac-1267  [00]   154.925216: _spin_unlock <-tracing_record_cmdline
> >  do_suspend_trac-1267  [00]   154.925216: __switch_to <-schedule
> >           <idle>-0     [00]   154.925216: _spin_unlock_irq <-finish_task_switch
> >           <idle>-0     [00]   154.925216: tick_nohz_stop_sched_tick <-cpu_idle
> >           <idle>-0     [00]   154.925216: ktime_get <-tick_nohz_stop_sched_tick
> >           <idle>-0     [00]   154.925216: ktime_get_ts <-ktime_get
> >           <idle>-0     [00]   154.925216: getnstimeofday <-ktime_get_ts
> >           <idle>-0     [00]   154.925216: set_normalized_timespec <-ktime_get_ts
> >           <idle>-0     [00]   154.925216: get_next_timer_interrupt <-tick_nohz_stop_sched_tick
> >           <idle>-0     [00]   154.925216: _spin_lock <-get_next_timer_interrupt
> >           <idle>-0     [00]   154.925216: _spin_unlock <-get_next_timer_interrupt
> >           <idle>-0     [00]   154.925216: hrtimer_get_next_event <-get_next_timer_interrupt
> >           <idle>-0     [00]   154.925216: _spin_lock_irqsave <-hrtimer_get_next_event
> >           <idle>-0     [00]   154.925216: _spin_unlock_irqrestore <-hrtimer_get_next_event
> >           <idle>-0     [00]   154.925216: rcu_needs_cpu <-tick_nohz_stop_sched_tick
> >           <idle>-0     [00]   154.925216: rcu_pending <-rcu_needs_cpu
> >           <idle>-0     [00]   154.925216: __rcu_pending <-rcu_pending
> >           <idle>-0     [00]   154.925216: __rcu_pending <-rcu_pending
> >           <idle>-0     [00]   154.925216: rcu_pending <-cpu_idle
> >           <idle>-0     [00]   154.925216: __rcu_pending <-rcu_pending
> >           <idle>-0     [00]   154.925216: __rcu_pending <-rcu_pending
> >           <idle>-0     [00]   154.925216: default_idle <-cpu_idle
> >           <idle>-0     [00]   154.925216: do_IRQ <-common_interrupt
> >           <idle>-0     [00]   154.925216: irq_enter <-do_IRQ
> >           <idle>-0     [00]   154.925216: tick_nohz_stop_idle <-irq_enter
> >           <idle>-0     [00]   154.925216: ktime_get <-tick_nohz_stop_idle
> >           <idle>-0     [00]   154.925216: ktime_get_ts <-ktime_get
> >           <idle>-0     [00]   154.925216: getnstimeofday <-ktime_get_ts
> >           <idle>-0     [00]   154.925216: set_normalized_timespec <-ktime_get_ts
> >           <idle>-0     [00]   154.925216: ktime_get <-sched_clock_tick
> >           <idle>-0     [00]   154.925216: ktime_get_ts <-ktime_get
> >           <idle>-0     [00]   154.925216: getnstimeofday <-ktime_get_ts
> >           <idle>-0     [00]   154.925216: set_normalized_timespec <-ktime_get_ts
> > # ktime_get_ts() calls getnstimeofday() which ends up calling clocksource_read()
> > # thus updating the kernel timestamp. note that we call getnstimeofday()
> > # twice above but for some reason they do not impact the time stamp...
> > # There is an IRQ in here, possibly an HRT-driven IRQ and I'm wondering
> > # if that is kicking the TSC into ticking?
> >           <idle>-0     [00]   154.929837: touch_softlockup_watchdog <-sched_clock_idle_wakeup_event
> >           <idle>-0     [00]   154.929839: tick_nohz_update_jiffies <-irq_enter


  reply	other threads:[~2008-12-18  9:25 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-12-17 17:27 TSC not updating after resume: Bug or Feature? Deepak Saxena
2008-12-17 17:50 ` Steven Rostedt
2008-12-18  9:24   ` Peter Zijlstra [this message]
2008-12-18 12:39     ` Ingo Molnar
2008-12-18 22:19       ` Rafael J. Wysocki
2008-12-18 22:25         ` Dave Kleikamp
2008-12-18 23:39           ` Rafael J. Wysocki
2008-12-21 16:26         ` Thomas Gleixner
2008-12-21 19:43           ` Thomas Gleixner
2008-12-21 22:43             ` Fabio Comolli
2008-12-22 14:48               ` Thomas Gleixner
2008-12-22 15:00                 ` Ingo Molnar
2008-12-22 18:47                   ` Thomas Gleixner
2008-12-22 20:36                     ` Thomas Gleixner
2008-12-22 20:39                       ` Fabio Comolli
2008-12-22 21:13                         ` Thomas Gleixner
2008-12-22 21:35                           ` Fabio Comolli
2008-12-22 22:05                             ` Thomas Gleixner
2008-12-22 22:12                               ` Fabio Comolli
2008-12-22 22:16                                 ` Thomas Gleixner
2008-12-22 22:29                                   ` Fabio Comolli
2008-12-22 22:33                                     ` Thomas Gleixner
2008-12-23  0:28                                       ` Rafael J. Wysocki
2008-12-23 19:42                                       ` Fabio Comolli
2008-12-24 14:09                               ` Pavel Machek
2008-12-26  8:47                                 ` Peter Zijlstra
2008-12-22 20:19                 ` Fabio Comolli
2008-12-21 20:54           ` Thomas Gleixner
2008-12-23  0:47           ` Rafael J. Wysocki
2008-12-23  7:03             ` Thomas Gleixner
2008-12-23 14:15               ` Rafael J. Wysocki
2008-12-21 21:19 ` Thomas Gleixner
2008-12-23 22:03   ` Deepak Saxena

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1229592279.9487.133.camel@twins \
    --to=peterz@infradead.org \
    --cc=dsaxena@plexity.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=rjw@sisk.pl \
    --cc=shaggy@linux.vnet.ibm.com \
    --cc=srostedt@redhat.com \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox