From: Jeremy Fitzhardinge <jeremy@goop.org>
To: Ingo Molnar <mingo@elte.hu>, Thomas Gleixner <tglx@linutronix.de>,
	Rusty Russell <rusty@rustcorp.com.au>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Getting WARN_ON in hres_timers_resume after Xen resume
Date: Tue, 20 May 2008 15:54:39 +0100
Message-ID: <4832E62F.9030901@goop.org>

I'm implementing suspend/resume for Xen at the moment.  It's all going 
well, but I'm getting this WARN_ON:

------------[ cut here ]------------
WARNING: at /home/jeremy/hg/xen/paravirt/linux/kernel/hrtimer.c:635 hres_timers_resume+0x33/0x56()
Modules linked in:
Pid: 1397, comm: kstopmachine Tainted: G        W 2.6.26-rc2-sched-devel.git #94
 [<c102e87d>] warn_on_slowpath+0x41/0x5d
 [<c10477a1>] ? clockevents_program_event+0x105/0x10d
 [<c1047dd3>] ? tick_resume+0x5c/0x61
 [<c100145d>] ? xen_restore_fl+0x2e/0x52
 [<c100145d>] ? xen_restore_fl+0x2e/0x52
 [<c104b8da>] ? trace_hardirqs_off+0xb/0xd
 [<c139b67e>] ? _spin_unlock_irqrestore+0x56/0x6c
 [<c1047dd3>] ? tick_resume+0x5c/0x61
 [<c1047e2d>] ? tick_notify+0x55/0x60
 [<c139db0a>] ? notifier_call_chain+0x32/0x64
 [<c1047960>] ? clockevents_notify+0x42/0x46
 [<c100145d>] ? xen_restore_fl+0x2e/0x52
 [<c104cc50>] ? lock_release+0x71/0x77
 [<c1047960>] ? clockevents_notify+0x42/0x46
 [<c1042192>] hres_timers_resume+0x33/0x56
 [<c1045255>] timekeeping_resume+0x14e/0x157
 [<c11b6ecc>] __sysdev_resume+0x14/0x38
 [<c11b7091>] sysdev_resume+0x36/0x69
 [<c11ba59e>] device_power_up+0x8/0xf
 [<c1183476>] xen_suspend+0x9a/0xb2
 [<c105fd3d>] do_stop+0x17/0x61
 [<c105fd26>] ? do_stop+0x0/0x61
 [<c103f806>] kthread+0x37/0x59
 [<c103f7cf>] ? kthread+0x0/0x59
 [<c100782b>] kernel_thread_helper+0x7/0x10

The WARN_ON is correct, because I do have other CPUs online.  However, 
I'm in the middle of stop_machine, so they're effectively off-line as 
far as the rest of the system is concerned. (Xen suspend doesn't require 
all the CPUs to be offlined, and not doing so makes things a fair bit 
faster and cleaner.)

It seems to me that either:

   1. stop_machine is enough like offlining that we can remove stopped
      cpus from the online map, or
   2. the check in hres_timers_resume is too strong, and can be either
      weakened or removed, or
   3. hres_timers_resume needn't be called here at all, or
   4. I'm missing something, and I'm introducing a bug
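
To make option 2 concrete, here is a rough user-space sketch of how a weakened check might differ from the current one. It is only a model, not kernel code: `struct cpu_state`, `strict_check_warns()` and `weakened_check_warns()` are hypothetical names, and the assumption that the existing check fires whenever more than one CPU is online is inferred from the warning above rather than quoted from hrtimer.c.

```c
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct cpu_state {
    unsigned int online;   /* CPUs set in the online map */
    unsigned int stopped;  /* CPUs currently parked by stop_machine */
};

/* Assumed current behaviour: warn when more than one CPU is online. */
static bool strict_check_warns(const struct cpu_state *s)
{
    return s->online > 1;
}

/* Option 2, weakened: count only CPUs that are online and not stopped. */
static bool weakened_check_warns(const struct cpu_state *s)
{
    if (s->stopped >= s->online)
        return false;      /* nothing is running besides, at most, us */
    return (s->online - s->stopped) > 1;
}
```

With four CPUs online and three of them held in stop_machine, the strict check warns while the weakened one does not, which is the situation in the trace above.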

BTW, once everything is out of stop_machine, I call clock_was_set() to 
make sure that timers are retriggered on all CPUs.

Thoughts?

Thanks,
    J
